libxml2

mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-01-26 10:03:34 +03:00

Author	SHA1	Message	Date
Nick Wellnhofer	f9ea1a24ed	Fix copying of entities in xmlParseReference Before, reader mode would end up in a branch that didn't handle entities with multiple children and failed to update ent->last, so the hack copying the "extra" reader data wouldn't trigger. Consequently, some empty nodes in entities are correctly detected now in the test suite. (The detection of empty nodes in entities is still buggy, though.)	2020-02-11 16:37:52 +01:00
Jared Yanovich	2a350ee9b4	Large batch of typo fixes Closes #109.	2019-09-30 18:04:38 +02:00
Nick Wellnhofer	c2f209c09f	Disallow conditional sections in internal subset Conditional sections are only allowed in external parameter entities referenced from the internal subset.	2019-09-30 15:47:30 +02:00
Nick Wellnhofer	c51e38cb3a	Make xmlParseConditionalSections non-recursive Avoid call stack overflow in deeply nested conditional sections. Found by OSS-Fuzz.	2019-09-30 15:47:30 +02:00
Nick Wellnhofer	99a864a1f7	Fix Regextests - One of the bug316338 test cases is expected to succeed. - Memory leak in testRegexp.c. - Refcount handling in xmlExpHashGetEntry.	2019-09-25 15:27:45 +02:00
Nick Wellnhofer	c2b0a184a9	Fix empty branch in regex Fixes bug 649244: https://bugzilla.gnome.org/show_bug.cgi?id=649244 Closes #57.	2019-09-25 14:22:47 +02:00
Nick Wellnhofer	62150ed2ab	Make xmlParseContent and xmlParseElement non-recursive Split xmlParseElement into subfunctions. Use nameNsPush to store prefix, URI and nsNr on the heap, similar to the push parser. Closes #84.	2019-09-23 17:45:50 +02:00
Nick Wellnhofer	6705f4d28e	Remove executable bit from non-executable files	2019-09-16 15:48:59 +02:00
Nick Wellnhofer	eee1dd5acf	Fix expected output of test/schemas/any4 libxml2 correctly rejects any4_0.xsd as invalid schema. I can't figure out what the intent behind this test case was. Simply adjust the expected output to match the current behavior. Closes #92.	2019-09-16 15:36:44 +02:00
Nick Wellnhofer	e8c9cd5c7a	Fix Schema determinism check of ##other namespaces Non-compound (##local) and compound string atoms are always disjoint regardless of whether the compound atom is negated (##other). Closes #40.	2019-09-16 15:36:02 +02:00
bettermanzzy	01d8cf07d9	Misleading error message with xs:{min\|max}Inclusive Closes #53.	2019-08-25 14:12:34 +02:00
Jan Pokorný	ea695ac0d6	Fix unability to RelaxNG-validate grammar with choice-based name class Previously, test/relaxng/ambig_name-class2.xml would fail to validate against test/relaxng/ambig_name-class2.rng: > test/relaxng/ambig_name-class2.rng:4: > element attribute: Relax-NG parser error : > Found anyName attribute without oneOrMore ancestor > Relax-NG schema test/relaxng/ambig_name-class2.rng failed to compile Signed-off-by: Jan Pokorný <jpokorny@redhat.com>	2019-08-25 13:29:04 +02:00
Jan Pokorný	8074b88179	Fix unability to validate ambiguously constructed interleave for RelaxNG Previously, test/relaxng/ambig_name-class.xml would fail to validate for a simple reason -- interleave within "open-name-class" context is supposed to be fine with whatever else is pending the consumption, since effectively, it's unrelated from a higher parsing perspective. Signed-off-by: Jan Pokorný <jpokorny@redhat.com>	2019-08-25 13:29:04 +02:00
Nick Wellnhofer	f9fce96313	Fix unsigned integer overflow It's defined behavior but -fsanitize=unsigned-integer-overflow is useful to discover bugs.	2019-05-20 13:38:22 +02:00
Nick Wellnhofer	c2f4da1a93	Improve XPath predicate and filter evaluation Consolidate code paths evaluating XPath predicates and filters. Don't push context node on stack when evaluating predicates. I have no idea why this was done. It seems completely useless and trying to pop the context node from a corrupted stack has already caused security issues. Filter nodesets in-place and don't create node sets with NULL gaps which allows to simplify merging a great deal. Simply move matched nodes backward and create a compact node set. Merge xmlXPathCompOpEvalPositionalPredicate into xmlXPathCompOpEvalPredicate.	2019-04-22 14:48:46 +02:00
Nick Wellnhofer	30a6533e01	Fix float casts in xmlXPathSubstringFunction Rewrite conversion of double to int in xmlXPathSubstringFunction, adding range checks to avoid undefined behavior. Make sure to add start and length as floating-point numbers before converting to int. Fix a bug when rounding negative start indices. Remove unneeded calls to xmlXPathIs{Inf,NaN} and rely on IEEE math instead. Avoid computing the string length. xmlUTF8Strsub works as expected if the length of the requested substring exceeds the input. Found with libFuzzer and UBSan.	2019-03-08 14:29:59 +01:00
Nikolai Weibull	c64d4efb31	Remove redefined starts and defines inside include elements When including a grammar from another grammar, we need to make sure that any redefines of starts and includes that that grammar does inside any of its include elements are also removed.	2018-11-29 21:06:06 +01:00
Nikolai Weibull	46da8fc529	Allow choice within choice in nameClass in RELAX NG The pattern nameClass allows for nested choice elements, for example <name> <choice> <choice> <name>a</name> <name>b</name> </choice> <name>c</name> </choice> </name> which is semantically equivalent to <name> <choice> <name>a</name> <name>b</name> <name>c</name> </choice> </name> The old code didn’t handle this correctly, as it never expected a choice inside another choice. This patch fixes this by flattening any nested choices. This pattern of nested choice elements comes up in RELAX NG simplification, where all choice elements are rewritten in this nested manner, see section 4.12 of the RELAX NG specification.	2018-11-29 21:03:11 +01:00
Nikolai Weibull	4338c310eb	Look inside divs for starts and defines inside include RELAX NG allows for div elements inside of include elements. We need to look inside those div elements for start and define elements that may be redefining start and define elements in the included grammar.	2018-11-29 21:00:46 +01:00
Nick Wellnhofer	123234f2cf	Free input buffer in xmlHaltParser This avoids miscalculation of available bytes. Thanks to Yunho Kim for the report. Closes: #26	2018-09-11 15:06:17 +02:00
Nick Wellnhofer	7218255092	Add test for ICU flush and pivot buffer	2017-11-04 15:38:58 +01:00
Nick Wellnhofer	5af594d8bc	Fix comparison of nodesets to strings Fix two bugs in xmlXPathNodeValHash which could lead to errors when comparing nodesets to strings: - Only use contents of text nodes to compute the hash for element nodes. Comments, PIs, and other node types don't affect the string-value and must be ignored. - Reset `string` to NULL for node types other than text. Reported by Aleksei on the mailing list: https://mail.gnome.org/archives/xml/2017-September/msg00016.html	2017-10-07 15:22:57 +02:00
Nick Wellnhofer	69936b129f	Revert "Print error messages for truncated UTF-8 sequences" This reverts commit 79c8a6b which caused a serious regression in streaming mode. Also reverts part of commit 52ceced "Fix infinite loops with push parser in recovery mode". Fixes bug 786554.	2017-08-30 14:19:06 +02:00
Nick Wellnhofer	899a5d9f0e	Detect infinite recursion in parameter entities When expanding a parameter entity in a DTD, infinite recursion could lead to an infinite loop or memory exhaustion. Thanks to Wei Lei for the first of many reports. Fixes bug 759579.	2017-07-25 15:21:12 +02:00
Nick Wellnhofer	872fea9485	Get rid of "blanks wrapper" for parameter entities Now that replacement of parameter entities goes exclusively through xmlSkipBlankChars, we can account for the surrounding space characters there and remove the "blanks wrapper" hack.	2017-06-20 13:19:47 +02:00
Nick Wellnhofer	24246c7626	Fix xmlHaltParser Pop all extra input streams before resetting the input. Otherwise, a call to xmlPopInput could make input available again. Also set input->end to input->cur. Changes the test output for some error tests. Unfortunately, some fuzzed test cases were added to the test suite without manual cleanup. This makes it almost impossible to review the impact of later changes on the test output.	2017-06-20 13:15:43 +02:00
Nick Wellnhofer	8bbe4508ef	Spelling and grammar fixes Fixes bug 743172, bug 743489, bug 769632, bug 782400 and a few other misspellings.	2017-06-17 16:34:23 +02:00
Nick Wellnhofer	5f440d8cad	Rework entity boundary checks Make sure to finish all entities in the internal subset. Nevertheless, readd a sanity check in xmlParseStartTag2 that was lost in my previous commit. Also add a sanity check in xmlPopInput. Popping an input unexpectedly was the source of many recent memory bugs. The check doesn't mitigate such issues but helps with diagnosis. Always base entity boundary checks on the input ID, not the input pointer. The pointer could have been reallocated to the old address. Always throw a well-formedness error if a boundary check fails. In a few places, a validity error was thrown. Fix a few error codes and improve indentation.	2017-06-17 13:25:53 +02:00
Nick Wellnhofer	dbaab1f369	Test SAX2 callbacks with entity substitution This detects regressions like bug 760367.	2017-06-16 21:38:57 +02:00
Nick Wellnhofer	67f9f9d6c8	Misc fixes for 'make tests' - Silence test output. - Clean up after doc/examples tests. - Adjust expected output for script tests. - Add missing results for relaxng/pattern3 There are still two test failures I can't comment on: - regexp/bug316338 - schemas/any4_0	2017-06-12 19:46:56 +02:00
Nick Wellnhofer	0b2d5c48e3	Initialize keepBlanks in HTML parser This caused failures in the HTML push tests but the fix required to change the expected output of the HTML SAX tests.	2017-06-12 19:11:54 +02:00
David Kilzer	85c112a082	Add test cases for bug 758518 test/HTML/758518-entity.html exposed a bug in pushParseTest() in runtest.c which assumed that an input file was at least 4 bytes long. That test case is only 3 bytes, so we now take the minimum of 4 bytes or the length of the test input. We also now use 'chunkSize' in place of the hard-coded value '1024' later in the function.	2017-06-12 18:26:11 +02:00
Nick Wellnhofer	79c8a6b105	Print error messages for truncated UTF-8 sequences Before, truncated UTF-8 sequences at the end of a file were treated as EOF. Create an error message containing the offending bytes. xmlStringCurrentChar would also print characters from the input stream, not the string it's working on.	2017-06-10 18:11:58 +02:00
Nick Wellnhofer	932cc9896a	Fix buffer size checks in xmlSnprintfElementContent xmlSnprintfElementContent failed to correctly check the available buffer space in two locations. Fixes bug 781333 (CVE-2017-9047) and bug 781701 (CVE-2017-9048). Thanks to Marcel Böhme and Thuan Pham for the report.	2017-06-05 19:38:19 +02:00
Nick Wellnhofer	e26630548e	Fix handling of parameter-entity references There were two bugs where parameter-entity references could lead to an unexpected change of the input buffer in xmlParseNameComplex and xmlDictLookup being called with an invalid pointer. Percent sign in DTD Names ========================= The NEXTL macro used to call xmlParserHandlePEReference. When parsing "complex" names inside the DTD, this could result in entity expansion which created a new input buffer. The fix is to simply remove the call to xmlParserHandlePEReference from the NEXTL macro. This is safe because no users of the macro require expansion of parameter entities. - xmlParseNameComplex - xmlParseNCNameComplex - xmlParseNmtoken The percent sign is not allowed in names, which are grammatical tokens. - xmlParseEntityValue Parameter-entity references in entity values are expanded but this happens in a separate step in this function. - xmlParseSystemLiteral Parameter-entity references are ignored in the system literal. - xmlParseAttValueComplex - xmlParseCharDataComplex - xmlParseCommentComplex - xmlParsePI - xmlParseCDSect Parameter-entity references are ignored outside the DTD. - xmlLoadEntityContent This function is only called from xmlStringLenDecodeEntities and entities are replaced in a separate step immediately after the function call. This bug could also be triggered with an internal subset and double entity expansion. This fixes bug 766956 initially reported by Wei Lei and independently by Chromium's ClusterFuzz, Hanno Böck, and Marco Grassi. Thanks to everyone involved. xmlParseNameComplex with XML_PARSE_OLD10 ======================================== When parsing Names inside an expanded parameter entity with the XML_PARSE_OLD10 option, xmlParseNameComplex would call xmlGROW via the GROW macro if the input buffer was exhausted. At the end of the parameter entity's replacement text, this function would then call xmlPopInput which invalidated the input buffer. There should be no need to invoke GROW in this situation because the buffer is grown periodically every XML_PARSER_CHUNK_SIZE characters and, at least for UTF-8, in xmlCurrentChar. This also matches the code path executed when XML_PARSE_OLD10 is not set. This fixes bugs 781205 (CVE-2017-9049) and 781361 (CVE-2017-9050). Thanks to Marcel Böhme and Thuan Pham for the report. Additional hardening ==================== A separate check was added in xmlParseNameComplex to validate the buffer size.	2017-06-05 18:38:33 +02:00
Nick Wellnhofer	7482f41f61	Check for integer overflow in xmlXPathFormatNumber Check for overflow before casting double to int. Found with afl-fuzz and UBSan.	2017-06-01 22:00:19 +02:00
Nick Wellnhofer	855c19efb7	Avoid reparsing in xmlParseStartTag2 The code in xmlParseStartTag2 must handle the case that the input buffer was grown and reallocated which can invalidate pointers to attribute values. Before, this was handled by detecting changes of the input buffer "base" pointer and, in case of a change, jumping back to the beginning of the function and reparsing the start tag. The major problem of this approach is that whether an input buffer is reallocated is nondeterministic, resulting in seemingly random test failures. See the mailing list thread "runtest mystery bug: name2.xml error case regression test" from 2012, for example. If a reallocation was detected, the code also made no attempts to continue parsing in case of errors which makes a difference in the lax "recover" mode. Now we store the current input buffer "base" pointer for each (not separately allocated) attribute in the namespace URI field, which isn't used until later. After the whole start tag was parsed, the pointers to the attribute values are reconstructed using the offset between the new and the old input buffer. This relies on arithmetic on dangling pointers which is technically undefined behavior. But it seems like the easiest and most efficient fix and a similar approach is used in xmlParserInputGrow. This changes the error output of several tests, typically making it more verbose because we try harder to continue parsing in case of errors. (Another possible solution is to check not only the "base" pointer but the size of the input buffer as well. But this would result in even more reparsing.)	2017-06-01 14:31:28 +02:00
Nick Wellnhofer	f4029cd413	Check XPath exponents for overflow Avoid undefined behavior and wrong results with huge exponents. Found with afl-fuzz and UBSan.	2017-05-31 16:04:37 +02:00
Nick Wellnhofer	a58331a6ee	Check for overflow in xmlXPathIsPositionalPredicate Avoid undefined behavior when casting from double to int. Found with afl-fuzz and UBSan.	2017-05-31 16:04:26 +02:00
Nick Wellnhofer	a851868a75	Parse small XPath numbers more accurately Don't count leading zeros towards the fraction size limit. This allows to parse numbers like 0.0000000000000000000000000000000000000000000000000000000001 which is the only standard-conformant way to represent such numbers, as scientific notation isn't allowed in XPath 1.0. (It is allowed in XPath 2.0 and in libxml2 as an extension, though.) Overall accuracy is still bad, see bug 783238.	2017-05-31 15:46:29 +02:00
Nick Wellnhofer	4bebb030db	Rework XPath rounding functions Use the C library's floor and ceil functions. The old code was overly complicated for no apparent reason and could result in undefined behavior when handling NaNs (found with afl-fuzz and UBSan). Fix wrong comment in xmlXPathRoundFunction. The implementation was already following the spec and rounding half up.	2017-05-31 15:38:42 +02:00
Nick Wellnhofer	40f5852149	Fix axis traversal from attribute and namespace nodes When traversing the "preceding" axis from an attribute node, we must first go up to the attribute's containing element. Otherwise, text children of other attributes could be returned. This made it possible to hit a code path in xmlXPathNextAncestor which contained another bug: The attribute node was initialized with the context node instead of the current node. Normally, this code path is only hit via xmlXPathNextAncestorOrSelf in which case the current and context node are the same. The combination of the two bugs could result in an infinite loop, found with libFuzzer. Traversing the "following" and the "preceding" axis from namespace nodes should be handled similarly. This wasn't supported at all previously.	2017-05-31 14:57:46 +02:00
Nick Wellnhofer	9ab01a277d	Fix XPointer paths beginning with range-to The old code would invoke the broken xmlXPtrRangeToFunction. range-to isn't really a function but a special kind of location step. Remove this function and always handle range-to in the XPath code. The old xmlXPtrRangeToFunction could also be abused to trigger a use-after-free error with the potential for remote code execution. Found with afl-fuzz. Fixes CVE-2016-5131.	2016-10-12 13:12:18 +02:00
Nick Wellnhofer	d8083bf779	Fix NULL pointer deref in XPointer range-to - Check for errors after evaluating first operand. - Add sanity check for empty stack. Found with afl-fuzz.	2016-06-25 14:24:51 +02:00
Pranjal Jumde	0bcd05c5cd	Heap-based buffer overread in htmlCurrentChar For https://bugzilla.gnome.org/show_bug.cgi?id=758606 * parserInternals.c: (xmlNextChar): Add an test to catch other issues on ctxt->input corruption proactively. For non-UTF-8 charsets, xmlNextChar() failed to check for the end of the input buffer and would continuing reading. Fix this by pulling out the check for the end of the input buffer into common code, and return if we reach the end of the input buffer prematurely. * result/HTML/758606.html: Added. * result/HTML/758606.html.err: Added. * result/HTML/758606.html.sax: Added. * result/HTML/758606_2.html: Added. * result/HTML/758606_2.html.err: Added. * result/HTML/758606_2.html.sax: Added. * test/HTML/758606.html: Added test case. * test/HTML/758606_2.html: Added test case.	2016-05-23 15:01:07 +08:00
David Kilzer	0090675905	Heap-based buffer-underreads due to xmlParseName For https://bugzilla.gnome.org/show_bug.cgi?id=759573 * parser.c: (xmlParseElementDecl): Return early on invalid input to fix non-minimized test case (759573-2.xml). Otherwise the parser gets into a bad state in SKIP(3) at the end of the function. (xmlParseConditionalSections): Halt parsing when hitting invalid input that would otherwise caused xmlParserHandlePEReference() to recurse unexpectedly. This fixes the minimized test case (759573.xml). * result/errors/759573-2.xml: Add. * result/errors/759573-2.xml.err: Add. * result/errors/759573-2.xml.str: Add. * result/errors/759573.xml: Add. * result/errors/759573.xml.err: Add. * result/errors/759573.xml.str: Add. * test/errors/759573-2.xml: Add. * test/errors/759573.xml: Add.	2016-05-23 15:01:07 +08:00
Pranjal Jumde	38eae57111	Heap use-after-free in xmlSAX2AttributeNs For https://bugzilla.gnome.org/show_bug.cgi?id=759020 * parser.c: (xmlParseStartTag2): Attribute strings are only valid if the base does not change, so add another check where the base may change. Make sure to set 'attvalue' to NULL after freeing it. * result/errors/759020.xml: Added. * result/errors/759020.xml.err: Added. * result/errors/759020.xml.str: Added. * test/errors/759020.xml: Added test case.	2016-05-23 15:01:07 +08:00
Hugh Davenport	beca86e8c8	Detect change of encoding when parsing HTML names From https://bugzilla.gnome.org/show_bug.cgi?id=758518 Happens when a file has a name getting parsed, but no valid encoding set, so libxml has to guess what the encoding is. This patch detects when the buffer location changes, and if it does, restarts the parsing of the name. This slightly change a couple of regression tests output	2016-05-23 15:01:07 +08:00
Pranjal Jumde	45752d2c33	Bug 759398: Heap use-after-free in xmlDictComputeFastKey <https://bugzilla.gnome.org/show_bug.cgi?id=759398 > * parser.c: (xmlParseNCNameComplex): Store start position instead of a pointer to the name since the underlying buffer may change, resulting in a stale pointer being used. * result/errors/759398.xml: Added. * result/errors/759398.xml.err: Added. * result/errors/759398.xml.str: Added. * test/errors/759398.xml: Added test case.	2016-05-23 15:01:07 +08:00
Pranjal Jumde	a820dbeac2	Bug 758605: Heap-based buffer overread in xmlDictAddString <https://bugzilla.gnome.org/show_bug.cgi?id=758605 > Reviewed by David Kilzer. * HTMLparser.c: (htmlParseName): Add bounds check. (htmlParseNameComplex): Ditto. * result/HTML/758605.html: Added. * result/HTML/758605.html.err: Added. * result/HTML/758605.html.sax: Added. * runtest.c: (pushParseTest): The input for the new test case was so small (4 bytes) that htmlParseChunk() was never called after htmlCreatePushParserCtxt(), thereby creating a false positive test failure. Fixed by using a do-while loop so we always call htmlParseChunk() at least once. * test/HTML/758605.html: Added.	2016-05-23 15:01:07 +08:00

1 2 3 4 5 ...

493 Commits