libxml2

mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2024-10-26 20:25:14 +03:00

Author	SHA1	Message	Date
Nick Wellnhofer	250faf3c83	parser: Fix regression in xmlParserNodeInfo accounting Commit `62150ed2` broke begin_pos and begin_line when extra node info was recorded. Fixes #523.	2023-04-20 15:38:00 +02:00
Nick Wellnhofer	d7d0bc6581	SAX2: Ignore namespaces in HTML documents In commit `21ca8829`, we started to ignore namespaces in HTML element names but we still called xmlSplitQName, effectively stripping the namespace prefix. This would cause elements like <o:p> to be parsed as <p>. Now we leave the name untouched. Fixes #508.	2023-03-31 17:08:43 +02:00
Nick Wellnhofer	cb4334b7ab	malloc-fail: Fix memory leak in xmlSAX2StartElementNs Found with libFuzzer, see #344.	2023-02-17 17:16:51 +01:00
Nick Wellnhofer	0c5f40b788	malloc-fail: Fix null deref in xmlSAX2AttributeInternal Found with libFuzzer, see #344.	2023-01-24 11:32:15 +01:00
Nick Wellnhofer	b3b53dcce4	malloc-fail: Fix null deref in xmlSAX2Text Found with libFuzzer, see #344.	2023-01-24 11:32:15 +01:00
Nick Wellnhofer	463bbeeca1	entities: Rework entity amplification checks This commit implements robust detection of entity amplification attacks, better known as the "billion laughs" attack. We now limit the size of the document after substitution of entities to 10 times the size before expansion. This guarantees linear behavior by definition. There already was a similar check before, but the accounting of "sizeentities" (size of external entities) and "sizeentcopy" (size of all copies created by entity references) wasn't accurate. We also need saturation arithmetic since we're historically limited to "unsigned long" which is 32-bit on many platforms. A maximum of 10 MB of substitutions is always allowed. This should make use cases like DITA work which have caused problems in the past. The old checks based on the number of entities were removed. This is accounted for by adding a fixed cost to each entity reference. Entity amplification checks are now enabled even if XML_PARSE_HUGE is set. This option is mainly used to allow larger text nodes. Most users were unaware that it also disabled entity expansion checks. Some of the limits might be adjusted later. If this change turns out to affect legitimate use cases, we can add a separate parser option to disable the checks. Fixes #294. Fixes #345.	2022-12-21 20:19:10 +01:00
Nick Wellnhofer	cecd364dd2	parser: Don't call *DefaultSAXHandlerInit from xmlInitParser Change the default handler definitions to match the result after calling the initialization functions. This makes sure that no thread-local variables are accessed when calling xmlInitParser.	2022-11-25 15:02:04 +01:00
Nick Wellnhofer	68a6518c45	parser: Rewrite push parser boundary checks Remove inaccurate xmlParseCheckTransition check. Remove non-incremental xmlParseGetLasts check. Add functions that check for several boundary constructs more accurately, keeping track of progress in ctxt->checkIndex. Fixes #439.	2022-11-20 21:27:08 +01:00
Nick Wellnhofer	7ceaee9430	malloc-fail: Fix memory leak in xmlSAX2ExternalSubset Found with libFuzzer, see #344.	2022-11-02 16:05:05 +01:00
Nick Wellnhofer	81621b1fe4	Fix compiler warnings in SAX2.c	2022-09-02 18:44:59 +02:00
Nick Wellnhofer	ad338ca737	Remove explicit integer casts Remove explicit integer casts as final operation - in assignments - when passing arguments - when returning values Remove casts - to the same type - from certain range-bound values The main motivation is that these explicit casts don't change the result of operations and only render UBSan's implicit-conversion checks useless. Removing these casts allows UBSan to detect cases where truncation or sign-changes occur unexpectedly. Document some explicit casts as truncating and add a few missing ones.	2022-09-01 02:33:57 +02:00
Nick Wellnhofer	aeb69fd357	Fix overflow check in SAX2.c	2022-09-01 02:33:57 +02:00
Nick Wellnhofer	0f568c0b73	Consolidate private header files Private functions were previously declared - in header files in the root directory - in public headers guarded with IN_LIBXML - in libxml.h - redundantly in source files that used them. Consolidate all private header files in include/private.	2022-08-26 02:11:56 +02:00
Nick Wellnhofer	0e49f8826a	Mark most SAX1 functions as deprecated No compiler warnings generated yet.	2022-08-24 14:07:57 +02:00
Nick Wellnhofer	4b184240be	Remove htmlDefaultSAXHandler from non-SAX1 build This matches long-standing behavior of the XML counterpart.	2022-08-22 14:24:25 +02:00
Nick Wellnhofer	3e7b4f37aa	Avoid calling xmlSetTreeDoc Create text nodes with xmlNewDocText or set the document directly to avoid xmlSetTreeDoc being called when the node is inserted.	2022-06-20 01:49:39 +02:00
Nick Wellnhofer	40483d0ce2	Deprecate module init and cleanup functions These functions shouldn't be part of the public API. Most init functions are only thread-safe when called from xmlInitParser. Global variables should only be cleaned up by calling xmlCleanupParser.	2022-03-06 15:59:43 +01:00
Nick Wellnhofer	4a8c71eb7c	Remove DOCBparser This code has been broken and deprecated since version 2.6.0, released in 2003. Because of a bug in commit `961b535c`, DOCBparser.c was never compiled since 2012. I couldn't find a Debian package using any of its symbols, so it seems safe to remove this module.	2022-03-04 22:56:21 +01:00
Nick Wellnhofer	c41bc10da3	Fix unused variable warnings with disabled features	2022-02-22 19:57:12 +01:00
Nick Wellnhofer	346c3a930c	Remove elfgcchack.h The same optimization can be enabled with -fno-semantic-interposition since GCC 5. clang has always used this option by default.	2022-02-20 21:49:04 +01:00
Nick Wellnhofer	e03590c9ad	Don't add IDs containing unexpanded entity references When parsing without entity substitution, IDs or IDREFs containing unexpanded entity references like "abc&x;def" could be created. We could try to expand these entities like in validation mode, but it seems safer to honor the request not to expand entities. We silently ignore such IDs for now.	2022-02-20 21:49:04 +01:00
Nick Wellnhofer	d7cb33cf44	Rework validation context flags Use a bitmask instead of magic values to - keep track whether the validation context is part of a parser context - keep track whether xmlValidateDtdFinal was called This allows adding additional flags later. Note that this deliberately changes the name of a public struct member, assuming that this was always private data never to be used by client code.	2022-02-20 21:49:04 +01:00
Nick Wellnhofer	a647e43025	Fix casting of line numbers in SAX2.c The line member is an unsigned short. Avoids integer conversion warnings with UBSan. Also use USHRT_MAX instead of hard-coded constant.	2022-01-25 03:20:28 +01:00
David King	92bce68c0d	Fix memory leak in xmlSAX2AttributeDecl Found by Coverity. https://bugzilla.redhat.com/show_bug.cgi?id=1938806	2022-01-16 14:11:28 +01:00
Nick Wellnhofer	acb3566739	Fix quadratic runtime when parsing CDATA sections Use optimized concatenation for CDATA sections in addition to normal text. This also affects HTML script content. Found by OSS-Fuzz.	2021-02-03 13:57:26 +01:00
Nick Wellnhofer	21ca8829a7	Don't try to handle namespaces when building HTML documents Don't try to resolve namespace in xmlSAX2StartElement when parsing HTML documents. This useless operation could slow down the parser considerably. Found by OSS-Fuzz.	2020-07-25 17:57:29 +02:00
Nick Wellnhofer	20c60886e4	Fix typos Resolves #133.	2020-03-08 17:41:53 +01:00
Nick Wellnhofer	eddfbc38fa	Don't load external entity from xmlSAX2GetEntity Despite the comment, I can't see a reason why external entities must be loaded in the SAX handler. For external entities, the handler is typically first invoked via xmlParseReference which will later load the entity on its own if it wasn't loaded yet. The old code also led to duplicated SAX events which makes it basically impossible to reuse xmlSAX2GetEntity for a custom SAX parser. See the change to the expected test output. Note that xmlSAX2GetEntity was loading the entity via xmlParseCtxtExternalEntity while xmlParseReference uses xmlParseExternalEntityPrivate. In the previous commit, the two functions were merged, trying to compensate for some slight differences between the two mostly identical implementations. But the more urgent reason for this change is that xmlParseReference has the facility to abort early when recursive entities are detected, avoiding what could practically amount to an infinite loop. If you want to backport this change, note that the previous three commits are required as well: `f9ea1a24` Fix copying of entities in xmlParseReference `5c7e0a9a` Copy some XMLReader option flags to parser context `1a3e584a` Merge code paths loading external entities Found by OSS-Fuzz.	2020-02-11 17:35:42 +01:00
Jared Yanovich	2a350ee9b4	Large batch of typo fixes Closes #109.	2019-09-30 18:04:38 +02:00
Nick Wellnhofer	6b49db2cb2	Fix memory leak in xmlSAX2StartElement Introduced by a recent commit. Only happens if max depth is exceeded in SAX1 mode. Found by OSS-Fuzz.	2019-01-07 18:07:00 +01:00
Nick Wellnhofer	1567b55b72	Set doc on element obtained from freeElems In commit `8c9daf79`, a call to xmlFreeNode was added in xmlSAX2StartElementNs. If a node was obtained from the freeElems list, make sure to set the doc, otherwise xmlFreeNode wouldn't realize that the node name might be in the dictionary, causing an invalid free. Note that the issue fixed in commit `8c9daf79` requires commit `0ed6addb` and this one to work properly. Found by OSS-Fuzz.	2018-11-22 16:28:46 +01:00
Nick Wellnhofer	0ed6addb8f	Unlink node before freeing it in xmlSAX2StartElement The node may have been added to the document already, so it must be unlinked first. Thanks to David Kilzer for spotting this.	2018-09-22 15:41:01 +02:00
Nick Wellnhofer	8c9daf790a	Check return value of nodePush in xmlSAX2StartElement If the maximum depth is exceeded, nodePush halts the parser which results in freeing the input buffer since the previous commit. This invalidates the attribute pointers, so the error condition must be checked. Found by OSS-Fuzz.	2018-09-12 13:52:47 +02:00
Nick Wellnhofer	d422b954be	Fix pointer/int cast warnings on 64-bit Windows On 64-bit Windows, `long` is 32 bits wide and can't hold a pointer. Switch to ptrdiff_t instead which should be the same size as a pointer on every somewhat sane platform without requiring C99 types like intptr_t. Fixes bug 788312. Thanks to J. Peter Mugaas for the report and initial patch.	2017-10-09 13:47:49 +02:00
Nick Wellnhofer	83fb4119a9	Fix memory leaks in SAX1 parser Found by OSS-Fuzz. I could only reproduce this with the (obsolete) SAX1 parser. One leak is caused by duplicate namespaced attribute names and can be reproduced in memory mode (testcase 4556417027538944): $ cat file <d xmlns:a="ns" a:x="v" xmlns:b="ns" b:x="v"/> $ xmllint --sax1 --memory file The other is caused by ATTLISTs with a normalized default for "xmlns" if they're processed after the entity recursion limit was hit (testcase 5580750034305024). $ cat file <!DOCTYPE d [ <!ENTITY a '<d>&a;'> <!ATTLIST d xmlns NMTOKEN 't'> ]> <d>&a; $ xmllint --sax1 --valid file Also see https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=2461	2017-09-06 01:12:34 +02:00
Nick Wellnhofer	8bbe4508ef	Spelling and grammar fixes Fixes bug 743172, bug 743489, bug 769632, bug 782400 and a few other misspellings.	2017-06-17 16:34:23 +02:00
David Tardon	074180119f	Do not leak the new CData node if adding fails For https://bugzilla.gnome.org/show_bug.cgi?id=780918	2017-04-07 18:24:52 +02:00
David Kilzer	4472c3a5a5	Fix some format string warnings with possible format string vulnerability For https://bugzilla.gnome.org/show_bug.cgi?id=761029 Decorate every method in libxml2 with the appropriate LIBXML_ATTR_FORMAT(fmt,args) macro and add some cleanups following the reports.	2016-05-23 15:01:07 +08:00
Daniel Veillard	a6ea72ad19	Fix processing in SAX2 in case of an allocation failure Related to https://bugzilla.gnome.org/show_bug.cgi?id=731360	2014-07-14 20:29:34 +08:00
Gaurav	3e0eec4319	Adding some missing NULL checks in SAX2 DOM building code and in the HTML parser	2014-06-13 14:45:20 +08:00
Nicolas Le Cam	52010c639a	Compile out use of xmlValidateNCName() when not available. Fix compilation with minimum and valid.	2014-02-10 10:36:20 +08:00
Nicolas Le Cam	77b5b46409	Legacy needs xmlSAX2StartElement() and xmlSAX2EndElement(). Fix compilation with minimum and legacy.	2014-02-10 10:32:45 +08:00
Gaurav	a885f13a67	Fix a possible NULL dereference https://bugzilla.gnome.org/show_bug.cgi?id=705400 In case of allocation error the pointer was dereferenced before the test for a failure	2013-08-03 22:16:02 +08:00
Daniel Veillard	ab0e35044c	Activate detection of encoding in external subset https://bugzilla.gnome.org/show_bug.cgi?id=694228 the ctxt->encoding was percolated down when parsing the external subset leading to failures	2013-03-27 13:21:38 +08:00
Daniel Veillard	cff2546f13	Cache presence of '<' in entities content slightly modify how ent->checked is used, and use the lowest bit to keep the information	2013-03-11 15:59:22 +08:00
Daniel Veillard	a3f1e3e571	Avoid extra processing on entities If an entity has already been checked for correctness no need to check it on every reference	2013-03-11 15:59:21 +08:00
Daniel Veillard	6c91aa384f	Fix a regression in 2.9.0 breaking validation while streaming https://bugzilla.gnome.org/show_bug.cgi?id=684774 with help from Kjell Ahlstedt <kjell.ahlstedt@bredband.net>	2012-10-25 15:33:59 +08:00
Daniel Veillard	7651606f31	Various cleanups to avoid compiler warnings	2012-09-11 14:02:08 +08:00
Daniel Veillard	f8e3db0445	Big space and tab cleanup Remove all space before tabs and space and tabs at end of lines.	2012-09-11 13:26:36 +08:00
Daniel Veillard	968a03a2e5	Add support for big line numbers in error reporting Fix the lack of line number as reported by Johan Corveleyn <jcorvel@gmail.com> * parser.c include/libxml/parser.h: add an XML_PARSE_BIG_LINES parser option not switched on by default, it's an opt-in * SAX2.c: if XML_PARSE_BIG_LINES is set store the long line numbers in the psvi field of text nodes * tree.c: expand xmlGetLineNo to extract that information, also make sure we can't fail on recursive behaviour * error.c: in __xmlRaiseError, if a node is provided, call xmlGetLineNo() if we can't get a valid line number. * xmllint.c: switch on XML_PARSE_BIG_LINES in xmllint	2012-08-13 12:41:33 +08:00
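
The entity amplification rule described in commit `463bbeeca1` can be illustrated with a minimal sketch. This is not libxml2's code: every name below (`amp_counter`, `amp_account`, `sat_add`, `sat_mul`) is hypothetical, and the fixed per-reference cost of 20 is an assumed placeholder, since the commit message gives no value. Only the factor of 10, the 10 MB allowance, and the use of saturating `unsigned long` arithmetic come from the message.

```c
#include <limits.h>

/* Numbers taken from the commit message. */
#define MAX_AMPLIFICATION 10UL                      /* expanded <= 10x input */
#define ALLOWED_EXPANSION (10UL * 1000UL * 1000UL)  /* 10 MB always allowed */
/* Assumed placeholder: the message mentions a fixed cost but gives no value. */
#define ENTITY_FIXED_COST 20UL

/* Saturating add: clamp at ULONG_MAX instead of wrapping around. */
static unsigned long sat_add(unsigned long a, unsigned long b) {
    return (a > ULONG_MAX - b) ? ULONG_MAX : a + b;
}

/* Saturating multiply: clamp at ULONG_MAX instead of wrapping around. */
static unsigned long sat_mul(unsigned long a, unsigned long b) {
    if (a != 0 && b > ULONG_MAX / a)
        return ULONG_MAX;
    return a * b;
}

/* Hypothetical running totals for one parse. */
typedef struct {
    unsigned long input_size;    /* bytes of input before expansion */
    unsigned long expanded_size; /* bytes produced by entity references */
} amp_counter;

/*
 * Account for one entity reference expanding to `len` bytes.
 * Returns 0 while the totals stay within the limits, -1 otherwise.
 */
static int amp_account(amp_counter *c, unsigned long len) {
    unsigned long cost = sat_add(len, ENTITY_FIXED_COST);

    c->expanded_size = sat_add(c->expanded_size, cost);

    /* Small documents are never rejected. */
    if (c->expanded_size <= ALLOWED_EXPANSION)
        return 0;

    /* Otherwise the expansion must stay within 10x the input size. */
    if (c->expanded_size <= sat_mul(c->input_size, MAX_AMPLIFICATION))
        return 0;

    return -1;
}
```

In this sketch, a 1000-byte input whose references expand to 20 MB is rejected, while the same expansion on a 5 MB input is accepted because it stays below the tenfold bound. Saturation matters on platforms where `unsigned long` is 32 bits wide: a wrapped counter would look small and pass the check.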

128 Commits