The serializer sets doc->encoding to a temporary value and restores
the original value when it's done. This overwrites the encoding value
set in xmlBufAttrSerializeTxtContent, causing a memory leak.
Don't mess with doc->encoding if invalid UTF-8 is encountered.
Found with libFuzzer and ASan.
xmlSnprintfElementContent failed to correctly check the available
buffer space in two locations.
Fixes bug 781333 (CVE-2017-9047) and bug 781701 (CVE-2017-9048).
Thanks to Marcel Böhme and Thuan Pham for the report.
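The kind of check that was missing can be sketched as follows. This is
a minimal illustration only, not the code from xmlSnprintfElementContent:
the helper name appendBounded and its return convention are assumptions
made for this example. The point is that the remaining space is checked
before every write into the buffer.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of libxml2: append `s` to the
 * NUL-terminated string in `buf` (capacity `size` bytes) only if all
 * of `s` plus the terminator fits. Returns 1 on success and 0 if the
 * text was rejected, in which case `buf` is left unchanged.
 */
static int
appendBounded(char *buf, size_t size, const char *s)
{
    size_t used = strlen(buf);
    size_t need = strlen(s);

    /* Check the remaining space before every write. */
    if (used >= size || need >= size - used)
        return 0;
    memcpy(buf + used, s, need + 1);    /* copies the terminator too */
    return 1;
}
```

With an 8-byte buffer, appending "abc" and then "defg" succeeds and
fills the buffer exactly; a further one-character append is rejected.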
There were two bugs where parameter-entity references could lead to an
unexpected change of the input buffer in xmlParseNameComplex and
xmlDictLookup being called with an invalid pointer.
Percent sign in DTD Names
=========================
The NEXTL macro used to call xmlParserHandlePEReference. When parsing
"complex" names inside the DTD, this could result in entity expansion
which created a new input buffer. The fix is to simply remove the call
to xmlParserHandlePEReference from the NEXTL macro. This is safe because
no users of the macro require expansion of parameter entities.
- xmlParseNameComplex
- xmlParseNCNameComplex
- xmlParseNmtoken
  The percent sign is not allowed in names, which are grammatical tokens.
- xmlParseEntityValue
  Parameter-entity references in entity values are expanded but this
  happens in a separate step in this function.
- xmlParseSystemLiteral
  Parameter-entity references are ignored in the system literal.
- xmlParseAttValueComplex
- xmlParseCharDataComplex
- xmlParseCommentComplex
- xmlParsePI
- xmlParseCDSect
  Parameter-entity references are ignored outside the DTD.
- xmlLoadEntityContent
  This function is only called from xmlStringLenDecodeEntities and
  entities are replaced in a separate step immediately after the function
  call.
This bug could also be triggered with an internal subset and double
entity expansion.
This fixes bug 766956 initially reported by Wei Lei and independently by
Chromium's ClusterFuzz, Hanno Böck, and Marco Grassi. Thanks to everyone
involved.
xmlParseNameComplex with XML_PARSE_OLD10
========================================
When parsing Names inside an expanded parameter entity with the
XML_PARSE_OLD10 option, xmlParseNameComplex would call xmlGROW via the
GROW macro if the input buffer was exhausted. At the end of the
parameter entity's replacement text, this function would then call
xmlPopInput which invalidated the input buffer.
There should be no need to invoke GROW in this situation because the
buffer is grown periodically every XML_PARSER_CHUNK_SIZE characters and,
at least for UTF-8, in xmlCurrentChar. This also matches the code path
executed when XML_PARSE_OLD10 is not set.
This fixes bugs 781205 (CVE-2017-9049) and 781361 (CVE-2017-9050).
Thanks to Marcel Böhme and Thuan Pham for the report.
Additional hardening
====================
A separate check was added in xmlParseNameComplex to validate the
buffer size.
For now this is mainly useful if you work on a fork of the libxml2
mirror on GitHub:
https://github.com/GNOME/libxml2
Start with two build setups:
- GCC with as many GNU extensions disabled as possible, trying to
emulate a C89 compiler on a POSIX system.
- clang with ASan and UBSan.
The Python tests don't set an exit code, so Travis won't detect
failures. The same goes for "make tests", but we only run "make check"
anyway.
The code in xmlParseStartTag2 must handle the case that the input
buffer was grown and reallocated, which can invalidate pointers to
attribute values. Before, this was handled by detecting changes of
the input buffer "base" pointer and, in case of a change, jumping
back to the beginning of the function and reparsing the start tag.
The major problem of this approach is that whether an input buffer is
reallocated is nondeterministic, resulting in seemingly random test
failures. See the mailing list thread "runtest mystery bug: name2.xml
error case regression test" from 2012, for example.
If a reallocation was detected, the code also made no attempt to
continue parsing in case of errors, which makes a difference in
the lax "recover" mode.
Now we store the current input buffer "base" pointer for each (not
separately allocated) attribute in the namespace URI field, which isn't
used until later. After the whole start tag was parsed, the pointers
to the attribute values are reconstructed using the offset between the
new and the old input buffer. This relies on arithmetic on dangling
pointers which is technically undefined behavior. But it seems like
the easiest and most efficient fix and a similar approach is used in
xmlParserInputGrow.
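The idea of rebuilding pointers after a reallocation can be sketched as
follows. Note that this is a variant, not the approach taken in
xmlParseStartTag2: instead of remembering the old "base" pointer and
applying the difference afterwards, the sketch stores each value as an
offset from "base", which sidesteps arithmetic on dangling pointers. The
InputBuf type and all function names are assumptions made for this
example.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable input buffer; not the libxml2 structure. */
typedef struct {
    char *base;
    size_t size;
} InputBuf;

/* Record where an attribute value starts as an offset from `base`. */
static size_t
valueOffset(const InputBuf *in, const char *value)
{
    return (size_t)(value - in->base);
}

/* Grow the buffer; `base` may move. Returns 0 on success, -1 on failure. */
static int
growInput(InputBuf *in, size_t newSize)
{
    char *tmp = realloc(in->base, newSize);

    if (tmp == NULL)
        return -1;
    in->base = tmp;
    in->size = newSize;
    return 0;
}

/* Rebuild a pointer to the value from the (possibly new) base. */
static const char *
valueAt(const InputBuf *in, size_t offset)
{
    return in->base + offset;
}
```

An offset taken before growInput stays valid afterwards, whether or not
realloc moved the buffer, so no reparsing is needed.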
This changes the error output of several tests, typically making it
more verbose because we try harder to continue parsing in case of
errors.
(Another possible solution is to check not only the "base" pointer
but the size of the input buffer as well. But this would result in
even more reparsing.)
The API tests combine string buffers with arbitrary length values which
makes ASan detect out-of-bounds array accesses. Even without ASan, this
could lead to unwanted test failures.
Add a check for "len", "size", and "start" arguments, assuming they
apply to the nearest char pointer. Skip the test if they exceed the
buffer size. This is a somewhat naive heuristic but it seems to work
well.
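The heuristic can be sketched roughly as follows. The function name
lenFitsBuffer is an assumption for this example, as is the choice to let
NULL strings and negative lengths through; the generated test code may
differ in detail.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical check, not the generated test code: return 1 if a
 * "len"-style argument stays within the NUL-terminated string `str`
 * (terminator included), 0 if the test case should be skipped.
 */
static int
lenFitsBuffer(const char *str, int len)
{
    if (str == NULL)
        return 1;   /* assumed: no buffer to overrun */
    if (len < 0)
        return 1;   /* assumed: left to the API under test */
    return (size_t)len <= strlen(str) + 1;
}
```

A test loop would skip the current combination of arguments whenever
the check returns 0.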
Don't count leading zeros towards the fraction size limit. This allows
parsing numbers like
0.0000000000000000000000000000000000000000000000000000000001
which is the only standard-conformant way to represent such numbers, as
scientific notation isn't allowed in XPath 1.0. (It is allowed in XPath
2.0 and in libxml2 as an extension, though.)
Overall accuracy is still bad, see bug 783238.
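The leading-zero rule can be sketched as follows. This is an
illustration only, not the libxml2 number parser: the function name
parseFraction and the MAX_FRAC value are assumptions for this example,
and the repeated division makes no attempt at correct rounding.

```c
#include <stddef.h>

#define MAX_FRAC 20     /* assumed limit on significant fraction digits */

/*
 * Hypothetical parser for the digits after the decimal point. Leading
 * zeros only scale the result and are not counted against MAX_FRAC.
 */
static double
parseFraction(const char *s)
{
    double fraction = 0.0;
    double scale = 1.0;
    int digits = 0;

    /* Leading zeros: adjust the scale, don't use up the digit budget. */
    while (*s == '0') {
        scale /= 10.0;
        s++;
    }
    /* Significant digits: accumulate at most MAX_FRAC of them. */
    while (*s >= '0' && *s <= '9' && digits < MAX_FRAC) {
        fraction = fraction * 10.0 + (*s - '0');
        scale /= 10.0;
        digits++;
        s++;
    }
    return fraction * scale;
}
```

If the leading zeros were counted towards MAX_FRAC, a fraction with more
than MAX_FRAC leading zeros would stop before its first significant
digit and evaluate to zero.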
Use the C library's floor and ceil functions. The old code was overly
complicated for no apparent reason and could result in undefined
behavior when handling NaNs (found with afl-fuzz and UBSan).
Fix wrong comment in xmlXPathRoundFunction. The implementation was
already following the spec and rounding half up.
When traversing the "preceding" axis from an attribute node, we must
first go up to the attribute's containing element. Otherwise, text
children of other attributes could be returned. This made it possible
to hit a code path in xmlXPathNextAncestor which contained another bug:
The attribute node was initialized with the context node instead of the
current node. Normally, this code path is only hit via
xmlXPathNextAncestorOrSelf in which case the current and context node
are the same.
The combination of the two bugs could result in an infinite loop, found
with libFuzzer.
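The first step of the fix can be sketched as follows. The Node type and
the function name precedingStart are assumptions for this example and
much simpler than the libxml2 node structures. The sketch also looks at
the current node rather than the context node, which is the distinction
the second bug got wrong.

```c
#include <stddef.h>

/* Hypothetical, simplified node model; not the libxml2 types. */
typedef enum { NODE_ELEMENT, NODE_ATTRIBUTE, NODE_TEXT } NodeType;

typedef struct Node {
    NodeType type;
    struct Node *parent;
} Node;

/*
 * Pick the node from which a "preceding" traversal should continue.
 * For an attribute, step up to its containing element first so that
 * text children of sibling attributes are never visited.
 */
static Node *
precedingStart(Node *cur)
{
    if (cur != NULL && cur->type == NODE_ATTRIBUTE)
        return cur->parent;
    return cur;
}
```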
Traversing the "following" and the "preceding" axis from namespace nodes
should be handled similarly. This wasn't supported at all previously.
Move the check for trailing characters from xmlXPathEval to
xmlXPathEvalExpr. Otherwise, a valid portion of a syntactically invalid
expression would be evaluated before returning an error.
Move cleanup of XPath stack to xmlXPathFreeParserContext. This avoids
memory leaks if valuePop fails in some error cases. Found with
libFuzzer and ASan.
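Tying stack cleanup to the context's free function can be sketched as
follows. The EvalCtxt type and all function names are assumptions for
this example, not the XPath parser context. The point is that values
left on the stack after an error are released together with the
context, whichever error path was taken.

```c
#include <stdlib.h>

/* Hypothetical evaluation context with a value stack. */
typedef struct {
    int **values;       /* each entry is separately allocated */
    int count;
    int capacity;
} EvalCtxt;

/* Create a context; `capacity` is assumed to be greater than zero. */
static EvalCtxt *
newEvalCtxt(int capacity)
{
    EvalCtxt *ctxt = malloc(sizeof(*ctxt));

    if (ctxt == NULL)
        return NULL;
    ctxt->values = calloc((size_t)capacity, sizeof(*ctxt->values));
    if (ctxt->values == NULL) {
        free(ctxt);
        return NULL;
    }
    ctxt->count = 0;
    ctxt->capacity = capacity;
    return ctxt;
}

/* Push a copy of `v`. Returns 0 on success, -1 on failure. */
static int
pushValue(EvalCtxt *ctxt, int v)
{
    int *slot;

    if (ctxt->count >= ctxt->capacity)
        return -1;
    slot = malloc(sizeof(*slot));
    if (slot == NULL)
        return -1;
    *slot = v;
    ctxt->values[ctxt->count++] = slot;
    return 0;
}

/*
 * Free the context and anything still on its stack. Returns the
 * number of leftover values that were released.
 */
static int
freeEvalCtxt(EvalCtxt *ctxt)
{
    int leftover;

    if (ctxt == NULL)
        return 0;
    leftover = ctxt->count;
    while (ctxt->count > 0)
        free(ctxt->values[--ctxt->count]);
    free(ctxt->values);
    free(ctxt);
    return leftover;
}
```

Callers that bail out early only need to call freeEvalCtxt; they don't
have to pop and free each remaining value themselves.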
Rework handling of the final XPath result object in
xmlXPathCompiledEvalInternal and xmlXPathEval to avoid useless error
messages.
Triggered in mixed content ELEMENT declarations if there's an invalid
name after the first valid name:
<!ELEMENT para (#PCDATA|a|<invalid>)*>
Found with libFuzzer and ASan.
For https://bugzilla.gnome.org/show_bug.cgi?id=772726
* include/libxml/parser.h: Add a new parser flag XML_PARSE_NOXXE
* elfgcchack.h, xmlIO.h, xmlIO.c: associated loading routine
* include/libxml/xmlerror.h: new error raised
* xmllint.c: adds --noxxe flag to activate the option
Namespace nodes must be copied to avoid use-after-free errors.
But they don't necessarily have a physical representation in a
document, so simply disallow them in XPointer ranges.
Found with afl-fuzz.
Fixes CVE-2016-4658.
The old code would invoke the broken xmlXPtrRangeToFunction. range-to
isn't really a function but a special kind of location step. Remove
this function and always handle range-to in the XPath code.
The old xmlXPtrRangeToFunction could also be abused to trigger a
use-after-free error with the potential for remote code execution.
Found with afl-fuzz.
Fixes CVE-2016-5131.
For https://bugzilla.gnome.org/show_bug.cgi?id=766834
vctxt->parserCtxt is always NULL in xmlSchemaSAXHandleStartElementNs,
so this function can't call xmlStringLenDecodeEntities to decode the
entities.