libxml2

mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2024-10-26 20:25:14 +03:00

Author	SHA1	Message	Date
Nick Wellnhofer	e986d09cf5	Skip incorrectly opened HTML comments Commit `4fd69f3e` fixed handling of '<' characters not followed by an ASCII letter. But a '<!' sequence followed by invalid characters should be treated as bogus comment and skipped. Fixes #380.	2022-08-02 14:38:09 +02:00
Nick Wellnhofer	145170125a	Fix parsing of subtracted regex character classes Fixes #370.	2022-04-23 19:22:42 +02:00
Nick Wellnhofer	4612ce3031	Implement xpath1() XPointer scheme See https://www.w3.org/2005/04/xpointer-schemes/	2022-04-21 04:26:52 +02:00
Nick Wellnhofer	41afa89fc9	Fix short-lived regression in xmlStaticCopyNode Commit `7618a3b1` didn't account for coalesced text nodes. I think it would be better if xmlStaticCopyNode didn't try to coalesce text nodes at all. This code path can only be triggered if some other code doesn't coalesce text nodes properly. In this case, OSS-Fuzz found such behavior in xinclude.c.	2022-04-10 14:17:31 +02:00
Nick Wellnhofer	57b81c208c	Normalize XPath strings in-place Simplify the code and fix a potential memory leak. Fixes #343.	2022-03-05 18:22:51 +01:00
Nick Wellnhofer	bc06a522c1	Fix recursion check in xinclude.c Compare the included URL with the document's URL to detect local inclusions. Fixes #348.	2022-03-02 20:44:41 +01:00
Mike Dalessio	24cdc89006	test coverage for abruptly-closed comments These establish baseline behavior so that the subsequent commit is clear about the behavior it will modify.	2022-03-02 14:42:47 +00:00
Damjan Jovanovic	966b0f21c1	Add whitespace folding for some atomic data types that it's missing on. XSD validation fails when some atomic types contain surrounding whitespace even though XML Schema Part 2: Datatypes Second Edition, section 4.3.6 says they should be collapsed. Fix this. (I am not sure whether the test is correct.) Issue: #278	2022-03-02 14:05:51 +00:00
Nick Wellnhofer	ea6e8f998d	Fix certain combinations of regex range quantifiers Fix regex transitions that have both min/max and a counter. In this case, we want to save the regex state before incrementing the counter. Fixes #301 and the issue reported here: https://mail.gnome.org/archives/xml/2016-April/msg00017.html	2022-02-28 16:56:02 +01:00
Nick Wellnhofer	382fb056b5	Fix range quantifier on subregex Make sure to add counted exit transitions before other counter transitions. Otherwise, we won't backtrack correctly. Fixes #65.	2022-02-28 16:56:02 +01:00
Nick Wellnhofer	ce0871e15c	Only warn on invalid redeclarations of predefined entities Downgrade the error message to a warning since the error was ignored, anyway. Also print the name of redeclared entity. For a proper fix that also shows filename and line number of the invalid redeclaration, we'd have to - pass the parser context to the entity functions somehow, or - make these functions return distinct error codes. Partial fix for #308.	2022-02-20 21:49:04 +01:00
Nick Wellnhofer	9edc20c154	Fix double counting of CRLF in comments Fixes #151.	2022-02-07 20:54:07 +01:00
Nick Wellnhofer	5408c10c37	Don't normalize namespace URIs in XPointer xmlns() scheme Namespace URIs should be compared without escaping or unescaping: https://www.w3.org/TR/REC-xml-names/#NSNameComparison Fixes #289.	2022-02-04 14:00:09 +01:00
Nick Wellnhofer	1c7d91abe4	Fix handling of XSD with empty namespace An empty namespace means no default namespace. Fixes #303.	2022-02-03 23:31:19 +01:00
Nick Wellnhofer	f480f7509c	Update NewsML DTD in test suite Switch to version 1.2 which has a clearer license. Fixes #291.	2022-02-03 14:43:17 +01:00
Nick Wellnhofer	d85245f934	Fix regression with PEs in external DTD Fix a regression introduced with commit `a28f7d87`. In some cases, parameter entity references in external DTDs wouldn't be expanded. Fixes #306.	2022-01-16 21:56:10 +01:00
David Kilzer	03bb929390	Fix parse failure when 4-byte character in UTF-16 BE is split across a chunk This makes the logic in UTF16BEToUTF8() match UTF16LEToUTF8(). * encoding.c: (UTF16LEToUTF8): - Fix comment to describe what the code does. (UTF16BEToUTF8): - Fix undefined behavior which was applied to UTF16LEToUTF8() in `2f9382033e`. - Add bounds check to while() loop which was applied to UTF16LEToUTF8() in `be803967db`. - Do not return -2 when (in >= inend) to fix the bug. This was applied to UTF16LEToUTF8() in `496a1cf592`. - Inline (<< 8) statements to match UTF16LEToUTF8(). Add the following tests and results: test/text-4-byte-UTF-16-BE-offset.xml test/text-4-byte-UTF-16-BE.xml test/text-4-byte-UTF-16-LE-offset.xml test/text-4-byte-UTF-16-LE.xml	2022-01-16 14:07:17 +01:00
Nick Wellnhofer	2732b23466	Fix regression parsing public IDs literals in HTML Fix regression introduced when reworking htmlParsePubidLiteral in commit `93ce33c2`. Fixes #318.	2022-01-10 13:37:59 +01:00
Nick Wellnhofer	01411e7c5e	Check for invalid redeclarations of predefined entities Implement section "4.6 Predefined Entities" of the XML 1.0 spec and check whether redeclarations of predefined entities match the original definitions. Note that some test cases declared <!ENTITY lt "<"> But the XML spec clearly states that this is illegal: > If the entities lt or amp are declared, they MUST be declared as > internal entities whose replacement text is a character reference to > the respective character (less-than sign or ampersand) being escaped; > the double escaping is REQUIRED for these entities so that references > to them produce a well-formed result. Also fixes #217 but the connection is only tangential. The integer overflow discovered by fuzzing was more related to the fact that various parts of the parser disagreed on whether to prefer predefined entities over their redeclarations. The whole situation is a mess and even depends on legacy parser options. But now that redeclarations are validated, it shouldn't make a difference. As noted in the added comment, this is also one of the cases where overly defensive checks can hide interesting logic bugs from fuzzers.	2021-02-08 21:51:26 +01:00
Mike Dalessio	e28d9347bc	add test coverage for incorrectly-closed comments this establishes the baseline behavior so that subsequent commits which modify this behavior are clear about what's being changed.	2020-12-16 16:12:07 +01:00
Nick Wellnhofer	87d20b554c	Fix regression introduced with commit `74dcc10b` The code wasn't dead after all, but I can see no reason in delaying the XPointer evaluation. This could lead to nodes included earlier appearing in XPointer results.	2020-08-19 13:52:08 +02:00
Nick Wellnhofer	d88df4bd48	Fix corner case with empty xi:fallback xi:fallback could become empty after recursive expansion. Use a flag to track whether nodes should be skipped.	2020-08-17 01:17:39 +02:00
Nick Wellnhofer	1abf2967f9	Fix exponential runtime and memory in xi:fallback processing When creating XML_XINCLUDE_START nodes, the children of the original xi:include node must be freed, otherwise fallback content is copied twice, doubling runtime and memory consumption for each nested xi:fallback/xi:include pair. Found with libFuzzer.	2020-08-07 19:59:07 +02:00
Nick Wellnhofer	0f9817c75b	Don't recurse into xi:include children in xmlXIncludeDoProcess Otherwise, nested xi:include nodes might result in a use-after-free if XML_PARSE_NOXINCNODE is specified. Found with libFuzzer and ASan.	2020-08-06 14:29:33 +02:00
David Kilzer	6b4717d61d	Add regexp regression tests - Bug 757711: heap-buffer-overflow in xmlFAParsePosCharGroup <https://bugzilla.gnome.org/show_bug.cgi?id=757711> - Bug 783015 - Integer-overflow in xmlFAParseQuantExact <https://bugzilla.gnome.org/show_bug.cgi?id=783015> (Regexptests): Add support for checking stderr output when running regexp tests. This makes it possible to check in test cases that fail and not see false-positive error output when running the tests. Unlike other libxml2 test suites, if there is no stderr output, no *.err file needs to be created.	2020-07-06 12:37:53 +02:00
Nick Wellnhofer	477c7f6aff	Fix quadratic runtime in HTML parser Commit `eeb99329` removed an important optimization avoiding quadratic runtime when repeatedly scanning the input buffer for terminating characters in the HTML push parser. The related bug is https://bugzilla.gnome.org/show_bug.cgi?id=444994 Make sure that ctxt->checkIndex is always written and store additional parser state in ctxt->inSubset which is unused in the HTML parser. Found by OSS-Fuzz.	2020-07-06 12:17:20 +02:00
Nick Wellnhofer	32cb5dccda	Add test case for recursive external parsed entities	2020-02-11 17:36:43 +01:00
Jared Yanovich	2a350ee9b4	Large batch of typo fixes Closes #109.	2019-09-30 18:04:38 +02:00
Nick Wellnhofer	c51e38cb3a	Make xmlParseConditionalSections non-recursive Avoid call stack overflow in deeply nested conditional sections. Found by OSS-Fuzz.	2019-09-30 15:47:30 +02:00
Nick Wellnhofer	c2b0a184a9	Fix empty branch in regex Fixes bug 649244: https://bugzilla.gnome.org/show_bug.cgi?id=649244 Closes #57.	2019-09-25 14:22:47 +02:00
Nick Wellnhofer	6705f4d28e	Remove executable bit from non-executable files	2019-09-16 15:48:59 +02:00
Nick Wellnhofer	e8c9cd5c7a	Fix Schema determinism check of ##other namespaces Non-compound (##local) and compound string atoms are always disjoint regardless of whether the compound atom is negated (##other). Closes #40.	2019-09-16 15:36:02 +02:00
Nick Wellnhofer	8efc5b283c	14:00 is a valid timezone for xs:dateTime Closes #100	2019-09-13 12:24:23 +02:00
Jan Pokorný	ea695ac0d6	Fix inability to RelaxNG-validate grammar with choice-based name class Previously, test/relaxng/ambig_name-class2.xml would fail to validate against test/relaxng/ambig_name-class2.rng: > test/relaxng/ambig_name-class2.rng:4: > element attribute: Relax-NG parser error : > Found anyName attribute without oneOrMore ancestor > Relax-NG schema test/relaxng/ambig_name-class2.rng failed to compile Signed-off-by: Jan Pokorný <jpokorny@redhat.com>	2019-08-25 13:29:04 +02:00
Jan Pokorný	8074b88179	Fix inability to validate ambiguously constructed interleave for RelaxNG Previously, test/relaxng/ambig_name-class.xml would fail to validate for a simple reason -- interleave within "open-name-class" context is supposed to be fine with whatever else is pending the consumption, since effectively, it's unrelated from a higher parsing perspective. Signed-off-by: Jan Pokorný <jpokorny@redhat.com>	2019-08-25 13:29:04 +02:00
Nick Wellnhofer	c2f4da1a93	Improve XPath predicate and filter evaluation Consolidate code paths evaluating XPath predicates and filters. Don't push context node on stack when evaluating predicates. I have no idea why this was done. It seems completely useless and trying to pop the context node from a corrupted stack has already caused security issues. Filter nodesets in-place and don't create node sets with NULL gaps which allows to simplify merging a great deal. Simply move matched nodes backward and create a compact node set. Merge xmlXPathCompOpEvalPositionalPredicate into xmlXPathCompOpEvalPredicate.	2019-04-22 14:48:46 +02:00
Nick Wellnhofer	30a6533e01	Fix float casts in xmlXPathSubstringFunction Rewrite conversion of double to int in xmlXPathSubstringFunction, adding range checks to avoid undefined behavior. Make sure to add start and length as floating-point numbers before converting to int. Fix a bug when rounding negative start indices. Remove unneeded calls to xmlXPathIs{Inf,NaN} and rely on IEEE math instead. Avoid computing the string length. xmlUTF8Strsub works as expected if the length of the requested substring exceeds the input. Found with libFuzzer and UBSan.	2019-03-08 14:29:59 +01:00
Nikolai Weibull	c64d4efb31	Remove redefined starts and defines inside include elements When including a grammar from another grammar, we need to make sure that any redefines of starts and includes that that grammar does inside any of its include elements are also removed.	2018-11-29 21:06:06 +01:00
Nikolai Weibull	46da8fc529	Allow choice within choice in nameClass in RELAX NG The pattern nameClass allows for nested choice elements, for example <name> <choice> <choice> <name>a</name> <name>b</name> </choice> <name>c</name> </choice> </name> which is semantically equivalent to <name> <choice> <name>a</name> <name>b</name> <name>c</name> </choice> </name> The old code didn’t handle this correctly, as it never expected a choice inside another choice. This patch fixes this by flattening any nested choices. This pattern of nested choice elements comes up in RELAX NG simplification, where all choice elements are rewritten in this nested manner, see section 4.12 of the RELAX NG specification.	2018-11-29 21:03:11 +01:00
Nikolai Weibull	4338c310eb	Look inside divs for starts and defines inside include RELAX NG allows for div elements inside of include elements. We need to look inside those div elements for start and define elements that may be redefining start and define elements in the included grammar.	2018-11-29 21:00:46 +01:00
Nick Wellnhofer	7218255092	Add test for ICU flush and pivot buffer	2017-11-04 15:38:58 +01:00
Nick Wellnhofer	5af594d8bc	Fix comparison of nodesets to strings Fix two bugs in xmlXPathNodeValHash which could lead to errors when comparing nodesets to strings: - Only use contents of text nodes to compute the hash for element nodes. Comments, PIs, and other node types don't affect the string-value and must be ignored. - Reset `string` to NULL for node types other than text. Reported by Aleksei on the mailing list: https://mail.gnome.org/archives/xml/2017-September/msg00016.html	2017-10-07 15:22:57 +02:00
Nick Wellnhofer	69936b129f	Revert "Print error messages for truncated UTF-8 sequences" This reverts commit `79c8a6b` which caused a serious regression in streaming mode. Also reverts part of commit `52ceced` "Fix infinite loops with push parser in recovery mode". Fixes bug 786554.	2017-08-30 14:19:06 +02:00
Nick Wellnhofer	899a5d9f0e	Detect infinite recursion in parameter entities When expanding a parameter entity in a DTD, infinite recursion could lead to an infinite loop or memory exhaustion. Thanks to Wei Lei for the first of many reports. Fixes bug 759579.	2017-07-25 15:21:12 +02:00
Nick Wellnhofer	5f440d8cad	Rework entity boundary checks Make sure to finish all entities in the internal subset. Nevertheless, readd a sanity check in xmlParseStartTag2 that was lost in my previous commit. Also add a sanity check in xmlPopInput. Popping an input unexpectedly was the source of many recent memory bugs. The check doesn't mitigate such issues but helps with diagnosis. Always base entity boundary checks on the input ID, not the input pointer. The pointer could have been reallocated to the old address. Always throw a well-formedness error if a boundary check fails. In a few places, a validity error was thrown. Fix a few error codes and improve indentation.	2017-06-17 13:25:53 +02:00
David Kilzer	85c112a082	Add test cases for bug 758518 test/HTML/758518-entity.html exposed a bug in pushParseTest() in runtest.c which assumed that an input file was at least 4 bytes long. That test case is only 3 bytes, so we now take the minimum of 4 bytes or the length of the test input. We also now use 'chunkSize' in place of the hard-coded value '1024' later in the function.	2017-06-12 18:26:11 +02:00
Nick Wellnhofer	79c8a6b105	Print error messages for truncated UTF-8 sequences Before, truncated UTF-8 sequences at the end of a file were treated as EOF. Create an error message containing the offending bytes. xmlStringCurrentChar would also print characters from the input stream, not the string it's working on.	2017-06-10 18:11:58 +02:00
Nick Wellnhofer	932cc9896a	Fix buffer size checks in xmlSnprintfElementContent xmlSnprintfElementContent failed to correctly check the available buffer space in two locations. Fixes bug 781333 (CVE-2017-9047) and bug 781701 (CVE-2017-9048). Thanks to Marcel Böhme and Thuan Pham for the report.	2017-06-05 19:38:19 +02:00
Nick Wellnhofer	e26630548e	Fix handling of parameter-entity references There were two bugs where parameter-entity references could lead to an unexpected change of the input buffer in xmlParseNameComplex and xmlDictLookup being called with an invalid pointer. Percent sign in DTD Names ========================= The NEXTL macro used to call xmlParserHandlePEReference. When parsing "complex" names inside the DTD, this could result in entity expansion which created a new input buffer. The fix is to simply remove the call to xmlParserHandlePEReference from the NEXTL macro. This is safe because no users of the macro require expansion of parameter entities. - xmlParseNameComplex - xmlParseNCNameComplex - xmlParseNmtoken The percent sign is not allowed in names, which are grammatical tokens. - xmlParseEntityValue Parameter-entity references in entity values are expanded but this happens in a separate step in this function. - xmlParseSystemLiteral Parameter-entity references are ignored in the system literal. - xmlParseAttValueComplex - xmlParseCharDataComplex - xmlParseCommentComplex - xmlParsePI - xmlParseCDSect Parameter-entity references are ignored outside the DTD. - xmlLoadEntityContent This function is only called from xmlStringLenDecodeEntities and entities are replaced in a separate step immediately after the function call. This bug could also be triggered with an internal subset and double entity expansion. This fixes bug 766956 initially reported by Wei Lei and independently by Chromium's ClusterFuzz, Hanno Böck, and Marco Grassi. Thanks to everyone involved. xmlParseNameComplex with XML_PARSE_OLD10 ======================================== When parsing Names inside an expanded parameter entity with the XML_PARSE_OLD10 option, xmlParseNameComplex would call xmlGROW via the GROW macro if the input buffer was exhausted. At the end of the parameter entity's replacement text, this function would then call xmlPopInput which invalidated the input buffer. There should be no need to invoke GROW in this situation because the buffer is grown periodically every XML_PARSER_CHUNK_SIZE characters and, at least for UTF-8, in xmlCurrentChar. This also matches the code path executed when XML_PARSE_OLD10 is not set. This fixes bugs 781205 (CVE-2017-9049) and 781361 (CVE-2017-9050). Thanks to Marcel Böhme and Thuan Pham for the report. Additional hardening ==================== A separate check was added in xmlParseNameComplex to validate the buffer size.	2017-06-05 18:38:33 +02:00
Nick Wellnhofer	7482f41f61	Check for integer overflow in xmlXPathFormatNumber Check for overflow before casting double to int. Found with afl-fuzz and UBSan.	2017-06-01 22:00:19 +02:00


379 Commits