IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Otherwise, direct calls to xmlFree() etc. from the application will
use a different set of allocation functions to what was used to allocate
the memory internally.
Building 2.9.0 on MSVC7.1 was failing
This is because HAVE_CONFIG_H is not #defined
The patch addresses the above, adds testrecurse.exe and the
standard "make check" suite of tests to the MSVC makefile, and also
fixes the following (MSVC7.1) warnings:
buf.c(674) : warning C4028: formal parameter 1 different from
declaration
libxml2\timsort.h(71) : warning C4028: formal parameter 1 different from
declaration
cannot compile libxml2-2.9.0 using studio 12.1 compiler on solaris 10
I.M.O. structure initializer (as PTHREAD_ONCE_INIT) cannot be used in
a structure assignment anyway
For https://bugzilla.gnome.org/show_bug.cgi?id=683933
rand_seed should be a static variable in dict.c
We ran into a problem with another library that exports rand_seed as a
function. Combined with 2.7.8 this was not a problem but later versions
have this problem.
For https://bugzilla.gnome.org/show_bug.cgi?id=681822
Regardless if the option HTML_PARSE_NOBLANKS is set or not, blank nodes
are removed from a HTML document, for example:
<html>
<head>
<title>This is a test.</title>
</head>
<body>
<p>This is a test.</p>
</body>
</html>
is read as:
<html><head><title>This is a test.</title></head><body>
<p>This is a test.</p>
</body></html>
This changes the default behaviour but the old behaviour is available
as expected when using the parser flag HTML_PARSE_NOBLANKS
Based on original patch from Igor Ignatyuk <igor_ignatiouk@hotmail.com>
* HTMLparser.c: change various places in the parser where ignorable_space
SAX callback was called without checking for the parser flag preference
* xmllint.c: make sure we use the new flag even for HTML parsing
* result/HTML/*: this modifies the output of a number of tests
configure.am:
* Explicitly disallow --enable-rebuild-docs when builddir != srcdir, per
what you said about needing to build docs with an in-source build
doc/Makefile.am:
* Ensure that xmlversion.h is in the source tree before running
apibuild.py, to avoid generating an incomplete libxml2-api.xml
* Update the .PHONY target (forgot to do this earlier)
doc/devhelp/Makefile.am:
* Wrap the doc-generating rule in an "if REBUILD_DOCS" conditional so it
doesn't cause trouble for regular users
* Added a handy-dandy "rebuild" target
doc/examples/index.py:
* NOTE: You need to run this script to regenerate the files it creates,
and then commit the newly-updated files! The generated files currently
in git master (e.g. doc/examples/Makefile.am) are out of date even
before this patch!
* index.html really needs to be in EXTRA_DIST
* Wrap the doc-generating rules in an "if REBUILD_DOCS" conditional,
because they shouldn't be active otherwise
so we've got this patch to libxml2 2.7.6 in the LibreOffice code base,
inherited from OOo. it fixes a definite problem, which is that Windows
has a rather low maximum path length restriction, and there is a special
trick on NT whereby path names can be prefixed with "\\?\", in which
case the maximum length is 32k, which ought to be sufficient even for
bloated office suites :)
I'll attach the patch to the xmlCanonicPath function. note that i
didn't write this and am by no means an expert on either Microsoftean
platforms or libxml so maybe it's not the best way to do it.
looping 1000 time on an error stating that a nodeset has
grown out of control is useless, make sure we percolate
error up to the various loops and break when errors occurs
Handle special cases of &{...} constructs as hinted in the spec
http://www.w3.org/TR/html401/appendix/notes.html#h-B.7.1
and special values as comment <!-- ... --> used for server side includes
This is limited to attribute values in HTML content.
While xmlCleanupParser() should not be used unless complete control
is insured over the programe making sure libxml2 is not in use anywhere
It should still be usable, and allow a sequence of
xmlInitParser();
xmlCleanupParser();
calls if needed, the problem is that the thread key wasn't reallocated
on subsequent xmlinitParser() calls leading to corruption of pthread
keys used by the program.
* threads.c: make sure xmlCleanupParser() reset the pthread_once()
global variable driving thread key allocation.
Related to https://bugs.launchpad.net/lxml/+bug/502959
Basically the core of the issue is that if an entity references another
entity, then in case we are replacing entities content, we should always
do so by copying the referenced content as long as the reference is
done within the entity. Otherwise, if for some reason there is a later
parsing error that entity content may be freed.
Complex scenario exposed by command:
thinkpad:~/XML/diveintopython-5.4/xml -> valgrind --db-attach=yes
../../xmllint --loaddtd --noout --noent diveintopython.xml
Document references &a;
a references &b;
we references b content directly in by linking in the a content
a has an error further down
we free a, freeing the chunk from b
Document references &b; after &a;
we try to copy b content, but it was freed already => segfault
* parser.c: never reference directly entity content without copying if
we aren't in the document main entity
There is a spurious ".l" in the xml2-config.1 man page. This line can
simply be removed.
$ mandoc -Tlint -Werror xml2-config.1
xml2-config.1:12:2: ERROR: skipping unknown macro: .l
For https://bugzilla.gnome.org/show_bug.cgi?id=677606
For https://bugs.gentoo.org/show_bug.cgi?id=417539
If libxml2-2.8.0 is built with --with-icu --with-python on a system that has an
older version of libxml2 installed, then during "make install", libxml2mod.so
gets relinked to the systemwide version of libxml2.so.2 instead of libxml2.so.2
from the build tree, and fails at runtime if symbol versions from the older
libxml2.so.2 are not available. This effectively makes it impossible to build a
libxml2-2.8.0 binary package on a system that does not already have
libxml2-2.8.0 installed.
Investigation by Rafał Mużyło and Arfrever Frehtes Taifersar Arahesis revealed
the cause of the problem to be that libxml2's configure was adding ICU_LIBS to
LDFLAGS instead of to LIBADD. This resulted in GNU libtool using the wrong
argument order in its relinking command that gets run during "make install".
Example xmlXPathNormalizeFunction() would do CHECK_ARITY(1)
and the expect valuePop(ctxt); to return an object, except
now valuePop() looks at the XPath stack frames and fails returning
NULL, and we end up crashing dereferencing the object.
Real solution is to exten CHECK_ARITY() and recompile all
XPath functions using it.
It seems that setting up both xmlTextReaderSetStructuredErrorHandler and
xmlSetStructuredErrorFunc confuses the code around error.c:592 and following
This patch works with any combinations of using xmlSetStructuredErrorFunc,
xmlTextReaderSetStructuredErrorHandler, both, or none.
I use libxml xpath engine on quite large (and mostly "flat") xml files.
It seems that Shellsort, that is used in xmlXPathNodeSetSort is a
performance bottleneck for my case. I have read some posts about sorting
in libxml in the libxml archive, but I agree that qsort was not the way
to go. I experimented with Timsort instead and my results were good for
me. For about 10000 nodes, my test was about 5x faster with Timsort,
for 1000 nodes about 10% faster, for small data files, the difference
was not measurable.
* timsort.h: the algorithm, kept in a separate header
* xpath.c: plug in the new algorithm in xmlXPathNodeSetSort
* Makefile.am: add the header to the EXTRA_DIST
* doc/apibuild.py: avoid indexing the new header
When investigating the libxslt performance problem reported in bug
#657665, I found that '//' in XPath expressions can be very slow when
working on large subtrees.
One of the reasons is the seemingly quadratic time complexity of the
duplicate checks when merging result nodes. The other is a missed
optimization for expressions of the form
'descendant-or-self::node()/axis::test'. Since '//' is expanded to
'/descendant-or-self::node()/', this type of expression is quite common.
Depending on the axis of the expression following the
'descendant-or-self' step, the following replacements can be made:
from descendant-or-self::node()/child::test
to descendant::test
from descendant-or-self::node()/descendant::test
to descendant::test
from descendant-or-self::node()/self::test
to descendant-or-self::test
from descendant-or-self::node()/descendant-or-self::test
to descendant-or-self::test
'test' can be any kind of node test.
With these replacements the possibly huge result of
'descendant-or-self::node()' doesn't have to be stored temporarily, but
can be processsed in one pass. If the resulting nodeset is small, the
duplicate checks aren't a problem.
I found that there already is a function called
xmlXPathRewriteDOSExpression which performs this optimization for a very
limited set of cases. It employs a complicated iteration scheme for
rewritten expressions. AFAICS, this can be avoided by simply changing
the axis of the expression like described above.
With the attached patch against libxml2 and the files from bug #657665 I
got the following results.
Before:
$ time xsltproc/xsltproc --noout service-names-port-numbers.xsl
service-names-port-numbers.xml
real 2m56.213s
user 2m56.123s
sys 0m0.080s
After:
$ time xsltproc/xsltproc --noout service-names-port-numbers.xsl
service-names-port-numbers.xml
real 0m3.836s
user 0m3.764s
sys 0m0.060s
I also ran the libxml2 and libxslt test suites with the patch and
couldn't detect any breakage.
Nick
>From e0f5a8261760e4f257b90410be27657e984237c8 Mon Sep 17 00:00:00 2001
From: Nick Wellnhofer <wellnhofer@aevum.de>
Date: Sun, 19 Aug 2012 18:20:22 +0200
Subject: [PATCH] Optimizations for descendant-or-self::node()
Currently, the function xmlXPathRewriteDOSExpression optimizes expressions
of type '//child'. Instead of adding a 'rewriteType' and doing a compound
traversal, the same can be achieved simply by setting the axis of the node
test from 'child' to 'descendant'.
There are also many other cases that can be optimized similarly. This
commit augments xmlXPathRewriteDOSExpression to essentially rewrite the
following subexpressions:
- descendant-or-self::node()/child:: to descendant::
- descendant-or-self::node()/descendant:: to descendant::
- descendant-or-self::node()/self:: to descendant-or-self::
- descendant-or-self::node()/descendant-or-self:: to descendant-or-self::
Since the '//' shortcut in XPath is translated to
'/descendant-or-self::node()/', this greatly speeds up expressions using
'//' on large subtrees.
When generating a sequence add an extra epsilon transition
to avoid further constructs from entering via the last state
Bug reported by Johan Corveleyn <jcorvel@gmail.com>
As suggested by Andrew W. Nosenko:
Proposal: expose the new xmlBufShrink() to the "public" API for
compatibility with xmlBufUse().
Reason: the following scenario:
1. Read something into xmlParserInputBuffer (e.g. using
xmlParserInputBufferRead())
2. Extract content through xmlBufContent()
3. Extract content length through xmlBufUse(). Result have type
'size_t'.
4. Use this content
5. Now, you need to shrink the buffer. How to do it? Doing that
through legacy xmlBufferShrink() is unsafe because it uses 'unsigned
int' and the whole point of introducing the new API was handling the
cases, when 'unsigned int' is not enough. Therefore, need to use the
new xmlBufShrink(). But it is "private".
Therefore, I propose to expose the new xmlBufShrink() in the same way,
as xmlBufContent() and xmlBufUse() are exposed.