libxml2

mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-01-12 09:17:37 +03:00

Author	SHA1	Message	Date
Nick Wellnhofer	c6083a32d6	parser: Improve error handling in push parser - Report errors earlier - Align error messages with pull parser	2023-08-29 18:41:05 +02:00
Nick Wellnhofer	855818bd2b	parser: Check for truncated multi-byte sequences When decoding input data, check whether the "raw" buffer is empty after parsing the document. Otherwise, the input ends with a truncated multi-byte sequence which shouldn't be silently ignored.	2023-08-08 15:21:37 +02:00
Nick Wellnhofer	0ffc2d82b5	runtest: Skip element name in schema error messages This makes sure that memory and streaming tests will report the same messages.	2023-04-30 21:45:39 +02:00
Nick Wellnhofer	e4f85f1bd2	[CVE-2023-28484] Fix null deref in xmlSchemaFixupComplexType Fix a null pointer dereference when parsing (invalid) XML schemas. Thanks to Robby Simpson for the report! Fixes #491.	2023-04-11 14:29:50 +02:00
David Kilzer	cb1b8b8516	xmlValidatePopElement() can return invalid value (-1) Covered by: test/VC/ElementValid5 This only affects XML Reader API with LIBXML_REGEXP_ENABLED and LIBXML_VALID_ENABLED turned on. * result/VC/ElementValid5.rdr: - Update result to add missing error message. * python/tests/reader2.py: * result/VC/ElementValid6.rdr: * result/VC/ElementValid7.rdr: * result/valid/781333.xml.err.rdr: - Update result to fix grammar issue. * valid.c: (xmlValidatePopElement): - Check return value of xmlRegExecPushString() to handle -1, and assign 'ret = 0;' to return 0 from xmlValidatePopElement(). This change affects xmlTextReaderValidatePop() from xmlreader.c. - Fix grammar of error message by changing 'child' to 'children'.	2023-04-10 13:21:53 -07:00
Nick Wellnhofer	d7d0bc6581	SAX2: Ignore namespaces in HTML documents In commit `21ca8829`, we started to ignore namespaces in HTML element names but we still called xmlSplitQName, effectively stripping the namespace prefix. This would cause elements like <o:p> being parsed as <p>. Now we leave the name untouched. Fixes #508.	2023-03-31 17:08:43 +02:00
Nick Wellnhofer	e20f4d7a65	xinclude: Fix quadratic behavior in xmlXIncludeLoadTxt Also make text inclusions work with memory buffers, for example when using a custom entity loader, and fix a memory leak in case of invalid characters. Fixes #483.	2023-02-14 12:25:07 +01:00
Nick Wellnhofer	be0ec005f3	xinclude: Abort immediately if max depth was exceeded Avoids resource exhaustion if the maximum recursion depth was exceeded. Note that the XInclude engine offers no protection against other "billion laughs"-style amplification attacks as long as they stay below the maximum depth.	2023-02-13 11:29:26 +01:00
Nick Wellnhofer	74aa61e0bd	parser: Halt parser on DTD errors If we try to continue parsing after an error in the internal or external subset, entity expansion accounting gets more complicated. Simply halt the parser. Found with libFuzzer.	2023-01-24 11:32:15 +01:00
Nick Wellnhofer	608c65bb8e	xpath: number('-') should return NaN Fixes https://gitlab.gnome.org/GNOME/libxslt/-/issues/81	2023-01-18 15:15:41 +01:00
Nick Wellnhofer	d320a683d1	parser: Fix entity check in attributes Don't set the "checked" flag when checking entities in default attribute values. These entities could reference other entities which weren't defined yet, so the check isn't reliable. This fixes a short-lived regression which could lead to a call stack overflow later in xmlStringGetNodeList.	2023-01-17 13:59:24 +01:00
Nick Wellnhofer	a41b09c739	parser: Improve detection of entity loops Set a flag to detect entity loops at once instead of processing until the depth limit is exceeded.	2022-12-23 22:11:18 +01:00
Nick Wellnhofer	d972393f30	parser: Only report a single entity error Don't report errors multiple times for nested entity references.	2022-12-23 22:10:39 +01:00
Nick Wellnhofer	ae0c9cfa05	uri: Fix handling of port numbers Allow port number without host, real fix for #71. Also compare port numbers in xmlBuildRelativeURI. Fix handling of port numbers in xmlUriEscape.	2022-12-13 01:43:49 +01:00
Nick Wellnhofer	76c6da4209	error: Make sure that error messages are valid UTF-8 This has caused issues with the Python bindings for a long time. Should fix #64.	2022-12-04 23:34:19 +01:00
Nick Wellnhofer	9c63cea5a6	test: Add test for push parser boundaries	2022-11-20 21:27:59 +01:00
Nick Wellnhofer	68a6518c45	parser: Rewrite push parser boundary checks Remove inaccurate xmlParseCheckTransition check. Remove non-incremental xmlParseGetLasts check. Add functions that check for several boundary constructs more accurately, keeping track of progress in ctxt->checkIndex. Fixes #439.	2022-11-20 21:27:08 +01:00
Nick Wellnhofer	76d6b0d768	html: Don't escape ASCII chars in href attributes In several cases, href attributes can contain ASCII characters which are illegal in URIs. Escaping them often does more harm than good. Fixes #321.	2022-11-20 21:16:03 +01:00
Nick Wellnhofer	f61b8a6233	parser: Fix DTD parser progress checks This is another attempt at fixing parser progress checks. Instead of relying on in->consumed, which could overflow, change some DTD parser functions to make guaranteed progress on certain byte sequences.	2022-11-20 21:16:03 +01:00
Nick Wellnhofer	b456e3bb42	xinclude: Always allow XPtr expressions in external documents	2022-10-31 16:49:36 +01:00
Nick Wellnhofer	eef0a7395c	xinclude: Implement "streaming" mode When using xmlreader, XPointer expressions in XIncludes simply cannot work. Expressions can reference nodes which weren't parsed yet or which were already deleted. After fixing nested XIncludes, we reference includes which were parsed previously. When streaming, these nodes could have been deleted, leading to use-after-free errors. Disallow XPointer expressions and truncate the include table in streaming mode.	2022-10-30 14:12:55 +01:00
Nick Wellnhofer	20e2fb4c1c	xinclude: Avoid creation of subcontexts Don't create subcontext in xmlXIncludeRecurseDoc. Save and restore 'doc' and 'incTab' instead. Make xmlXIncludeLoadFallback call xmlXIncludeCopyNode which seems safer than xmlXIncludeDoProcess since the latter may modify the document. This should also be more performant since we need to copy the whole fallback subtree anyway. Also make sure to avoid replacements in fallback elements in xmlXIncludeDoProcess.	2022-10-25 19:34:38 +02:00
Nick Wellnhofer	d2ed1e4f99	xinclude: Limit recursion depth This avoids call stack overflows.	2022-10-23 18:52:56 +02:00
Nick Wellnhofer	34496f26db	xinclude: Test for inclusion loops	2022-10-23 14:27:05 +02:00
Nick Wellnhofer	bc267cb9bc	xinclude: Expand includes in xmlXIncludeCopyNode This should make nested includes work reliably. Fixes #424.	2022-10-23 14:27:05 +02:00
Nick Wellnhofer	ea7c9fb5dd	xinclude: Don't create result doc for test with errors	2022-10-23 14:27:05 +02:00
Nick Wellnhofer	c99cde3f21	xinclude: Also test error messages The reader interface with XIncludes is somewhat broken and can generate different error messages. Start to move tests which are sketchy with reader to a separate directory.	2022-10-23 14:26:59 +02:00
Nick Wellnhofer	938105b572	Revert "xinclude: Fix regression with nested includes" This reverts commit `7f04e29731` which caused memory errors. See #424.	2022-10-21 15:56:12 +02:00
Nick Wellnhofer	7f04e29731	xinclude: Fix regression with nested includes This reverts commits `74dcc10b` and `87d20b55`. Fixes #424.	2022-10-18 19:17:45 +02:00
Nick Wellnhofer	1d4f5d24ac	schemas: Fix null-pointer-deref in xmlSchemaCheckCOSSTDerivedOK Found by OSS-Fuzz.	2022-09-13 16:56:59 +02:00
Nick Wellnhofer	c714979293	Fix --with-valid --without-regexps build This build config resulted in segfaults in 'runtest' because a special xmlElementContentPtr showed up in a few places. I'm not sure if this is the right fix. An error message was changed to conform to the --with-regexps build. There are still a few missing validity errors, so the tests don't pass.	2022-09-02 18:33:35 +02:00
Nick Wellnhofer	e986d09cf5	Skip incorrectly opened HTML comments Commit `4fd69f3e` fixed handling of '<' characters not followed by an ASCII letter. But a '<!' sequence followed by invalid characters should be treated as bogus comment and skipped. Fixes #380.	2022-08-02 14:38:09 +02:00
Nick Wellnhofer	145170125a	Fix parsing of subtracted regex character classes Fixes #370.	2022-04-23 19:22:42 +02:00
Nick Wellnhofer	4612ce3031	Implement xpath1() XPointer scheme See https://www.w3.org/2005/04/xpointer-schemes/	2022-04-21 04:26:52 +02:00
Nick Wellnhofer	41afa89fc9	Fix short-lived regression in xmlStaticCopyNode Commit `7618a3b1` didn't account for coalesced text nodes. I think it would be better if xmlStaticCopyNode didn't try to coalesce text nodes at all. This code path can only be triggered if some other code doesn't coalesce text nodes properly. In this case, OSS-Fuzz found such behavior in xinclude.c.	2022-04-10 14:17:31 +02:00
Nick Wellnhofer	4de7f2acfe	Remove unused result files	2022-04-04 04:28:15 +02:00
Nick Wellnhofer	f1c32b4c78	Allow missing result files in runtest Treat missing files as empty.	2022-04-04 04:28:15 +02:00
Nick Wellnhofer	95c7f315ab	Move SVG tests to runtest.c Also update the test results for the first time since 2000.	2022-04-04 04:18:07 +02:00
Nick Wellnhofer	48b03c8479	Remove major parts of old test suite Remove all the parts of the old test suite which are covered by runtest.c for quite some time. The following test programs are removed: - testC14N - testHTML - testReader - testRelax - testSAX - testSchemas - testURI - testXPath This also removes a few results of unimportant tests only run by the old test suite.	2022-04-04 04:14:55 +02:00
Nick Wellnhofer	57b81c208c	Normalize XPath strings in-place Simplify the code and fix a potential memory leak. Fixes #343.	2022-03-05 18:22:51 +01:00
Nick Wellnhofer	bc06a522c1	Fix recursion check in xinclude.c Compare the included URL with the document's URL to detect local inclusions. Fixes #348.	2022-03-02 20:44:41 +01:00
Mike Dalessio	d7b287b94c	htmlParseComment: handle abruptly-closed comments See guidance provided on abruptly-closed comments here: https://html.spec.whatwg.org/multipage/parsing.html#parse-error-abrupt-closing-of-empty-comment	2022-03-02 14:42:47 +00:00
Mike Dalessio	24cdc89006	test coverage for abruptly-closed comments These establish baseline behavior so that the subsequent commit is clear about the behavior it will modify.	2022-03-02 14:42:47 +00:00
Nick Wellnhofer	ea6e8f998d	Fix certain combinations of regex range quantifiers Fix regex transitions that have both min/max and a counter. In this case, we want to save the regex state before incrementing the counter. Fixes #301 and the issue reported here: https://mail.gnome.org/archives/xml/2016-April/msg00017.html	2022-02-28 16:56:02 +01:00
Nick Wellnhofer	382fb056b5	Fix range quantifier on subregex Make sure to add counted exit transitions before other counter transitions. Otherwise, we won't backtrack correctly. Fixes #65.	2022-02-28 16:56:02 +01:00
Nick Wellnhofer	ce0871e15c	Only warn on invalid redeclarations of predefined entities Downgrade the error message to a warning since the error was ignored, anyway. Also print the name of redeclared entity. For a proper fix that also shows filename and line number of the invalid redeclaration, we'd have to - pass the parser context to the entity functions somehow, or - make these functions return distinct error codes. Partial fix for #308.	2022-02-20 21:49:04 +01:00
Nick Wellnhofer	652dd12a85	[CVE-2022-23308] Use-after-free of ID and IDREF attributes If a document is parsed with XML_PARSE_DTDVALID and without XML_PARSE_NOENT, the value of ID attributes has to be normalized after potentially expanding entities in xmlRemoveID. Otherwise, later calls to xmlGetID can return a pointer to previously freed memory. ID attributes which are empty or contain only whitespace after entity expansion are affected in a similar way. This is fixed by not storing such attributes in the ID table. The test to detect streaming mode when validating against a DTD was broken. In connection with the defects above, this could result in a use-after-free when using the xmlReader interface with validation. Fix detection of streaming mode to avoid similar issues. (This changes the expected result of a test case. But as far as I can tell, using the XML reader with XIncludes referencing the root document never worked properly, anyway.) All of these issues can result in denial of service. Using xmlReader with validation could result in disclosure of memory via the error channel, typically stderr. The security impact of xmlGetID returning a pointer to freed memory depends on the application. The typical use case of calling xmlGetID on an unmodified document is not affected.	2022-02-19 19:26:42 +01:00
Nick Wellnhofer	9edc20c154	Fix double counting of CRLF in comments Fixes #151.	2022-02-07 20:54:07 +01:00
Nick Wellnhofer	5408c10c37	Don't normalize namespace URIs in XPointer xmlns() scheme Namespace URIs should be compared without escaping or unescaping: https://www.w3.org/TR/REC-xml-names/#NSNameComparison Fixes #289.	2022-02-04 14:00:09 +01:00
Nick Wellnhofer	1c7d91abe4	Fix handling of XSD with empty namespace An empty namespace means no default namespace. Fixes #303.	2022-02-03 23:31:19 +01:00
Nick Wellnhofer	f480f7509c	Update NewsML DTD in test suite Switch to version 1.2 which has a clearer license. Fixes #291.	2022-02-03 14:43:17 +01:00
Nick Wellnhofer	d85245f934	Fix regression with PEs in external DTD Fix a regression introduced with commit `a28f7d87`. In some cases, parameter entity references in external DTDs wouldn't be expanded. Fixes #306.	2022-01-16 21:56:10 +01:00
David Kilzer	03bb929390	Fix parse failure when 4-byte character in UTF-16 BE is split across a chunk This makes the logic in UTF16BEToUTF8() match UTF16LEToUTF8(). * encoding.c: (UTF16LEToUTF8): - Fix comment to describe what the code does. (UTF16BEToUTF8): - Fix undefined behavior which was applied to UTF16LEToUTF8() in `2f9382033e`. - Add bounds check to while() loop which was applied to UTF16LEToUTF8() in `be803967db`. - Do not return -2 when (in >= inend) to fix the bug. This was applied to UTF16LEToUTF8() in `496a1cf592`. - Inline (<< 8) statements to match UTF16LEToUTF8(). Add the following tests and results: test/text-4-byte-UTF-16-BE-offset.xml test/text-4-byte-UTF-16-BE.xml test/text-4-byte-UTF-16-LE-offset.xml test/text-4-byte-UTF-16-LE.xml	2022-01-16 14:07:17 +01:00
Nick Wellnhofer	2732b23466	Fix regression parsing public IDs literals in HTML Fix regression introduced when reworking htmlParsePubidLiteral in commit `93ce33c2`. Fixes #318.	2022-01-10 13:37:59 +01:00
Nick Wellnhofer	de5b624f10	Fix handling of unexpected EOF in xmlParseContent Readd the XML_ERR_TAG_NOT_FINISHED error on unexpected EOF which was removed in commit `62150ed2`. This commit also introduced a regression for direct users of xmlParseContent. Unclosed tags weren't checked.	2021-05-08 20:47:36 +02:00
Nick Wellnhofer	3e80560d4b	Fix line numbers in error messages for mismatched tags Commit `62150ed2` introduced a small regression in the error messages for mismatched tags. This typically only affected messages after the first mismatch, but with custom SAX handlers all line numbers would be off. This also fixes line numbers in the SAX push parser which were never handled correctly.	2021-05-07 11:48:11 +02:00
Nick Wellnhofer	01411e7c5e	Check for invalid redeclarations of predefined entities Implement section "4.6 Predefined Entities" of the XML 1.0 spec and check whether redeclarations of predefined entities match the original definitions. Note that some test cases declared <!ENTITY lt "<"> But the XML spec clearly states that this is illegal: > If the entities lt or amp are declared, they MUST be declared as > internal entities whose replacement text is a character reference to > the respective character (less-than sign or ampersand) being escaped; > the double escaping is REQUIRED for these entities so that references > to them produce a well-formed result. Also fixes #217 but the connection is only tangential. The integer overflow discovered by fuzzing was more related to the fact that various parts of the parser disagreed on whether to prefer predefined entities over their redeclarations. The whole situation is a mess and even depends on legacy parser options. But now that redeclarations are validated, it shouldn't make a difference. As noted in the added comment, this is also one of the cases where overly defensive checks can hide interesting logic bugs from fuzzers.	2021-02-08 21:51:26 +01:00
Nick Wellnhofer	79301d3d5e	Fix timeout when handling recursive entities Abort parsing early to avoid an almost infinite loop in certain error cases involving recursive entities. Found with libFuzzer.	2020-12-18 14:13:46 +01:00
Mike Dalessio	a67b63d183	use new htmlParseLookupCommentEnd to find comment ends Note that the caret in error messages generated during comment parsing may have moved by one byte. See guidance provided on incorrectly-closed comments here: https://html.spec.whatwg.org/multipage/parsing.html#parse-error-incorrectly-closed-comment	2020-12-16 16:12:07 +01:00
Mike Dalessio	29f5d20e84	htmlParseComment: treat `--!>` as if it closed the comment See guidance provided on incorrectly-closed comments here: https://html.spec.whatwg.org/multipage/parsing.html#parse-error-incorrectly-closed-comment	2020-12-16 16:12:07 +01:00
Mike Dalessio	e28d9347bc	add test coverage for incorrectly-closed comments this establishes the baseline behavior so that subsequent commits which modify this behavior are clear about what's being changed.	2020-12-16 16:12:07 +01:00
Nick Wellnhofer	87d20b554c	Fix regression introduced with commit `74dcc10b` The code wasn't dead after all, but I can see no reason in delaying the XPointer evaluation. This could lead to nodes included earlier appearing in XPointer results.	2020-08-19 13:52:08 +02:00
Nick Wellnhofer	d88df4bd48	Fix corner case with empty xi:fallback xi:fallback could become empty after recursive expansion. Use a flag to track whether nodes should be skipped.	2020-08-17 01:17:39 +02:00
Nick Wellnhofer	1abf2967f9	Fix exponential runtime and memory in xi:fallback processing When creating XML_XINCLUDE_START nodes, the children of the original xi:include node must be freed, otherwise fallback content is copied twice, doubling runtime and memory consumption for each nested xi:fallback/xi:include pair. Found with libFuzzer.	2020-08-07 19:59:07 +02:00
Nick Wellnhofer	0f9817c75b	Don't recurse into xi:include children in xmlXIncludeDoProcess Otherwise, nested xi:include nodes might result in a use-after-free if XML_PARSE_NOXINCNODE is specified. Found with libFuzzer and ASan.	2020-08-06 14:29:33 +02:00
Nick Wellnhofer	93ce33c2b8	Fix several quadratic runtime issues in HTML push parser Fix a few remaining cases where the HTML push parser would scan more content during lookahead than being parsed later. Make sure that htmlParseDocTypeDecl consumes all content up to the final '>' in case of errors. The old comment said "We shouldn't try to resynchronize", but ignoring invalid content is also what the HTML5 spec mandates. Likewise, make htmlParseEndTag skip to the final '>' in invalid end tags even if not in recovery mode. This is probably the most visible change in practice and leads to different output for some tests but is also more in line with HTML5. Make sure that htmlParsePI and htmlParseComment don't abort if invalid characters are encountered but log an error and ignore the character. Change some other end-of-buffer checks to test for a zero byte instead of relying on IS_CHAR. Fix usage of IS_CHAR macro in htmlParseScript.	2020-07-23 20:47:35 +02:00
David Kilzer	6b4717d61d	Add regexp regression tests - Bug 757711: heap-buffer-overflow in xmlFAParsePosCharGroup <https://bugzilla.gnome.org/show_bug.cgi?id=757711> - Bug 783015 - Integer-overflow in xmlFAParseQuantExact <https://bugzilla.gnome.org/show_bug.cgi?id=783015> (Regexptests): Add support for checking stderr output when running regexp tests. This makes it possible to check in test cases that fail and not see false-positive error output when running the tests. Unlike other libxml2 test suites, if there is no stderr output, no *.err file needs to be created.	2020-07-06 12:37:53 +02:00
Nick Wellnhofer	477c7f6aff	Fix quadratic runtime in HTML parser Commit `eeb99329` removed an important optimization avoiding quadratic runtime when repeatedly scanning the input buffer for terminating characters in the HTML push parser. The related bug is https://bugzilla.gnome.org/show_bug.cgi?id=444994 Make sure that ctxt->checkIndex is always written and store additional parser state in ctxt->inSubset which is unused in the HTML parser. Found by OSS-Fuzz.	2020-07-06 12:17:20 +02:00
Nick Wellnhofer	32cb5dccda	Add test case for recursive external parsed entities	2020-02-11 17:36:43 +01:00
Nick Wellnhofer	f20daa9e51	Enable error tests with entity substitution	2020-02-11 17:36:43 +01:00
Nick Wellnhofer	eddfbc38fa	Don't load external entity from xmlSAX2GetEntity Despite the comment, I can't see a reason why external entities must be loaded in the SAX handler. For external entities, the handler is typically first invoked via xmlParseReference which will later load the entity on its own if it wasn't loaded yet. The old code also led to duplicated SAX events which makes it basically impossible to reuse xmlSAX2GetEntity for a custom SAX parser. See the change to the expected test output. Note that xmlSAX2GetEntity was loading the entity via xmlParseCtxtExternalEntity while xmlParseReference uses xmlParseExternalEntityPrivate. In the previous commit, the two functions were merged, trying to compensate for some slight differences between the two mostly identical implementations. But the more urgent reason for this change is that xmlParseReference has the facility to abort early when recursive entities are detected, avoiding what could practically amount to an infinite loop. If you want to backport this change, note that the previous three commits are required as well: `f9ea1a24` Fix copying of entities in xmlParseReference `5c7e0a9a` Copy some XMLReader option flags to parser context `1a3e584a` Merge code paths loading external entities Found by OSS-Fuzz.	2020-02-11 17:35:42 +01:00
Nick Wellnhofer	f9ea1a24ed	Fix copying of entities in xmlParseReference Before, reader mode would end up in a branch that didn't handle entities with multiple children and failed to update ent->last, so the hack copying the "extra" reader data wouldn't trigger. Consequently, some empty nodes in entities are correctly detected now in the test suite. (The detection of empty nodes in entities is still buggy, though.)	2020-02-11 16:37:52 +01:00
Jared Yanovich	2a350ee9b4	Large batch of typo fixes Closes #109.	2019-09-30 18:04:38 +02:00
Nick Wellnhofer	c2f209c09f	Disallow conditional sections in internal subset Conditional sections are only allowed in external parameter entities referenced from the internal subset.	2019-09-30 15:47:30 +02:00
Nick Wellnhofer	c51e38cb3a	Make xmlParseConditionalSections non-recursive Avoid call stack overflow in deeply nested conditional sections. Found by OSS-Fuzz.	2019-09-30 15:47:30 +02:00
Nick Wellnhofer	99a864a1f7	Fix Regextests - One of the bug316338 test cases is expected to succeed. - Memory leak in testRegexp.c. - Refcount handling in xmlExpHashGetEntry.	2019-09-25 15:27:45 +02:00
Nick Wellnhofer	c2b0a184a9	Fix empty branch in regex Fixes bug 649244: https://bugzilla.gnome.org/show_bug.cgi?id=649244 Closes #57.	2019-09-25 14:22:47 +02:00
Nick Wellnhofer	62150ed2ab	Make xmlParseContent and xmlParseElement non-recursive Split xmlParseElement into subfunctions. Use nameNsPush to store prefix, URI and nsNr on the heap, similar to the push parser. Closes #84.	2019-09-23 17:45:50 +02:00
Nick Wellnhofer	6705f4d28e	Remove executable bit from non-executable files	2019-09-16 15:48:59 +02:00
Nick Wellnhofer	eee1dd5acf	Fix expected output of test/schemas/any4 libxml2 correctly rejects any4_0.xsd as invalid schema. I can't figure out what the intent behind this test case was. Simply adjust the expected output to match the current behavior. Closes #92.	2019-09-16 15:36:44 +02:00
Nick Wellnhofer	e8c9cd5c7a	Fix Schema determinism check of ##other namespaces Non-compound (##local) and compound string atoms are always disjoint regardless of whether the compound atom is negated (##other). Closes #40.	2019-09-16 15:36:02 +02:00
bettermanzzy	01d8cf07d9	Misleading error message with xs:{min|max}Inclusive Closes #53.	2019-08-25 14:12:34 +02:00
Jan Pokorný	ea695ac0d6	Fix inability to RelaxNG-validate grammar with choice-based name class Previously, test/relaxng/ambig_name-class2.xml would fail to validate against test/relaxng/ambig_name-class2.rng: > test/relaxng/ambig_name-class2.rng:4: > element attribute: Relax-NG parser error : > Found anyName attribute without oneOrMore ancestor > Relax-NG schema test/relaxng/ambig_name-class2.rng failed to compile Signed-off-by: Jan Pokorný <jpokorny@redhat.com>	2019-08-25 13:29:04 +02:00
Jan Pokorný	8074b88179	Fix inability to validate ambiguously constructed interleave for RelaxNG Previously, test/relaxng/ambig_name-class.xml would fail to validate for a simple reason -- interleave within "open-name-class" context is supposed to be fine with whatever else is pending the consumption, since effectively, it's unrelated from a higher parsing perspective. Signed-off-by: Jan Pokorný <jpokorny@redhat.com>	2019-08-25 13:29:04 +02:00
Nick Wellnhofer	f9fce96313	Fix unsigned integer overflow It's defined behavior but -fsanitize=unsigned-integer-overflow is useful to discover bugs.	2019-05-20 13:38:22 +02:00
Nick Wellnhofer	c2f4da1a93	Improve XPath predicate and filter evaluation Consolidate code paths evaluating XPath predicates and filters. Don't push context node on stack when evaluating predicates. I have no idea why this was done. It seems completely useless and trying to pop the context node from a corrupted stack has already caused security issues. Filter nodesets in-place and don't create node sets with NULL gaps which allows to simplify merging a great deal. Simply move matched nodes backward and create a compact node set. Merge xmlXPathCompOpEvalPositionalPredicate into xmlXPathCompOpEvalPredicate.	2019-04-22 14:48:46 +02:00
Nick Wellnhofer	30a6533e01	Fix float casts in xmlXPathSubstringFunction Rewrite conversion of double to int in xmlXPathSubstringFunction, adding range checks to avoid undefined behavior. Make sure to add start and length as floating-point numbers before converting to int. Fix a bug when rounding negative start indices. Remove unneeded calls to xmlXPathIs{Inf,NaN} and rely on IEEE math instead. Avoid computing the string length. xmlUTF8Strsub works as expected if the length of the requested substring exceeds the input. Found with libFuzzer and UBSan.	2019-03-08 14:29:59 +01:00
Nikolai Weibull	c64d4efb31	Remove redefined starts and defines inside include elements When including a grammar from another grammar, we need to make sure that any redefines of starts and includes that that grammar does inside any of its include elements are also removed.	2018-11-29 21:06:06 +01:00
Nikolai Weibull	46da8fc529	Allow choice within choice in nameClass in RELAX NG The pattern nameClass allows for nested choice elements, for example <name> <choice> <choice> <name>a</name> <name>b</name> </choice> <name>c</name> </choice> </name> which is semantically equivalent to <name> <choice> <name>a</name> <name>b</name> <name>c</name> </choice> </name> The old code didn’t handle this correctly, as it never expected a choice inside another choice. This patch fixes this by flattening any nested choices. This pattern of nested choice elements comes up in RELAX NG simplification, where all choice elements are rewritten in this nested manner, see section 4.12 of the RELAX NG specification.	2018-11-29 21:03:11 +01:00
Nikolai Weibull	4338c310eb	Look inside divs for starts and defines inside include RELAX NG allows for div elements inside of include elements. We need to look inside those div elements for start and define elements that may be redefining start and define elements in the included grammar.	2018-11-29 21:00:46 +01:00
Nick Wellnhofer	123234f2cf	Free input buffer in xmlHaltParser This avoids miscalculation of available bytes. Thanks to Yunho Kim for the report. Closes: #26	2018-09-11 15:06:17 +02:00
Nick Wellnhofer	7218255092	Add test for ICU flush and pivot buffer	2017-11-04 15:38:58 +01:00
Nick Wellnhofer	5af594d8bc	Fix comparison of nodesets to strings Fix two bugs in xmlXPathNodeValHash which could lead to errors when comparing nodesets to strings: - Only use contents of text nodes to compute the hash for element nodes. Comments, PIs, and other node types don't affect the string-value and must be ignored. - Reset `string` to NULL for node types other than text. Reported by Aleksei on the mailing list: https://mail.gnome.org/archives/xml/2017-September/msg00016.html	2017-10-07 15:22:57 +02:00
Nick Wellnhofer	69936b129f	Revert "Print error messages for truncated UTF-8 sequences" This reverts commit `79c8a6b` which caused a serious regression in streaming mode. Also reverts part of commit `52ceced` "Fix infinite loops with push parser in recovery mode". Fixes bug 786554.	2017-08-30 14:19:06 +02:00
Nick Wellnhofer	899a5d9f0e	Detect infinite recursion in parameter entities When expanding a parameter entity in a DTD, infinite recursion could lead to an infinite loop or memory exhaustion. Thanks to Wei Lei for the first of many reports. Fixes bug 759579.	2017-07-25 15:21:12 +02:00
Nick Wellnhofer	872fea9485	Get rid of "blanks wrapper" for parameter entities Now that replacement of parameter entities goes exclusively through xmlSkipBlankChars, we can account for the surrounding space characters there and remove the "blanks wrapper" hack.	2017-06-20 13:19:47 +02:00
Nick Wellnhofer	24246c7626	Fix xmlHaltParser Pop all extra input streams before resetting the input. Otherwise, a call to xmlPopInput could make input available again. Also set input->end to input->cur. Changes the test output for some error tests. Unfortunately, some fuzzed test cases were added to the test suite without manual cleanup. This makes it almost impossible to review the impact of later changes on the test output.	2017-06-20 13:15:43 +02:00
Nick Wellnhofer	8bbe4508ef	Spelling and grammar fixes Fixes bug 743172, bug 743489, bug 769632, bug 782400 and a few other misspellings.	2017-06-17 16:34:23 +02:00
Nick Wellnhofer	5f440d8cad	Rework entity boundary checks Make sure to finish all entities in the internal subset. Nevertheless, readd a sanity check in xmlParseStartTag2 that was lost in my previous commit. Also add a sanity check in xmlPopInput. Popping an input unexpectedly was the source of many recent memory bugs. The check doesn't mitigate such issues but helps with diagnosis. Always base entity boundary checks on the input ID, not the input pointer. The pointer could have been reallocated to the old address. Always throw a well-formedness error if a boundary check fails. In a few places, a validity error was thrown. Fix a few error codes and improve indentation.	2017-06-17 13:25:53 +02:00
Nick Wellnhofer	dbaab1f369	Test SAX2 callbacks with entity substitution This detects regressions like bug 760367.	2017-06-16 21:38:57 +02:00
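
Commit `855818bd2b` above describes checking whether the "raw" buffer is empty once the document has been parsed, so that input ending in a truncated multi-byte sequence is not silently ignored. The idea can be sketched outside libxml2 with Python's standard incremental decoder. This is only an illustration of the technique, not libxml2's code; the function name `ends_with_truncated_sequence` is hypothetical, and the sketch assumes the input is otherwise valid in the given encoding.

```python
import codecs


def ends_with_truncated_sequence(chunks, encoding="utf-8"):
    """Return True if the byte stream stops in the middle of a multi-byte sequence.

    Hypothetical helper for illustration only; it is not part of libxml2.
    """
    decoder = codecs.getincrementaldecoder(encoding)()
    for chunk in chunks:
        # Bytes that form an incomplete trailing sequence stay buffered
        # inside the decoder until the next chunk arrives.
        decoder.decode(chunk)
    # getstate() returns (pending_bytes, extra_state). Leftover pending bytes
    # after the last chunk mean the input was cut off mid-character.
    pending, _ = decoder.getstate()
    return len(pending) > 0
```

A sequence split across two chunks is fine as long as the second chunk completes it; only bytes still pending after the final chunk indicate truncation.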

614 Commits