libxml2

mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-04-25 22:50:08 +03:00

Author	SHA1	Message	Date
Nick Wellnhofer	463bbeeca1	entities: Rework entity amplification checks This commit implements robust detection of entity amplification attacks, better known as the "billion laughs" attack. We now limit the size of the document after substitution of entities to 10 times the size before expansion. This guarantees linear behavior by definition. There already was a similar check before, but the accounting of "sizeentities" (size of external entities) and "sizeentcopy" (size of all copies created by entity references) wasn't accurate. We also need saturation arithmetic since we're historically limited to "unsigned long" which is 32-bit on many platforms. A maximum of 10 MB of substitutions is always allowed. This should make use cases like DITA work which have caused problems in the past. The old checks based on the number of entities were removed. This is accounted for by adding a fixed cost to each entity reference. Entity amplification checks are now enabled even if XML_PARSE_HUGE is set. This option is mainly used to allow larger text nodes. Most users were unaware that it also disabled entity expansion checks. Some of the limits might be adjusted later. If this change turns out to affect legitimate use cases, we can add a separate parser option to disable the checks. Fixes #294. Fixes #345.	2022-12-21 20:19:10 +01:00
Nick Wellnhofer	7e3f469be9	entities: Use flags to store '<' check results Instead of abusing the LSB of the "checked" member, store the result of testing for occurrence of '<' character in "flags". Also use the flags in xmlParseStringEntityRef instead of rescanning every time.	2022-12-19 15:59:49 +01:00
Nick Wellnhofer	481d79d44c	entities: Add XML_ENT_PARSED flag To check whether an entity was already parsed, the code previously tested whether "checked" was non-zero or "children" was non-null. The "children" check could be unreliable because an empty entity also results in an empty (NULL) node list. Use a separate flag to make this check more reliable.	2022-12-19 15:26:46 +01:00
Alex Richardson	4b959ee168	Remove hacky heuristic from b2dc5675e94aa6b5557ba63f7d66b0f08dd17e4d Checking whether the context is close to the parent context by hardcoding 250 is not portable (I noticed tests were failing on Morello since the value is 288 there due to pointers being 128 bits). Instead we should ensure that the XML_VCTXT_USE_PCTXT flag is not set in cases where the user data is not actually a parser context (or ideally add a separate field but that would be an ABI break. From what I can see in the source, the XML_VCTXT_USE_PCTXT is only set if the userData field points to a valid context, and if this is not the case the flag should be cleared when changing userData rather than relying on the offset between the two. Looking at the history, I think d7cb33cf44aa688f24215c9cd398c1a26f0d25ff fixed most of the need for this workaround, but it looks like there are a few more locations that need updating; This commit changes two more places to set/clear/copy the XML_VCTXT_USE_PCTXT flag, so this heuristic should not be needed anymore. I've also drop two = NULL assignment in xmllint since this is not needed after a call to memset(). There was also an uninitialized vctxt.flags (and other fields) in `xmlShellValidate()`, which I've fixed by adding a memset() call.	2022-12-01 15:31:25 +00:00
Alex Richardson	c62c0d82cc	Correctly relocate internal pointers after realloc() Adding an offset to a deallocated pointer and assuming that it can be dereferenced is undefined behaviour. When running libxml2 on CHERI-enabled systems such as Arm Morello this results in the creation of an out-of-bounds pointer that cannot be dereferenced and therefore crashes at runtime. The effect of this UB is not just limited to architectures such as CHERI, incorrect relocation of pointers after realloc can in fact cause FORTIFY_SOURCE errors with recent GCC: https://developers.redhat.com/articles/2022/09/17/gccs-new-fortification-level	2022-12-01 15:14:40 +00:00
Nick Wellnhofer	c16fd705bb	xpath: Make init function private	2022-11-27 02:11:07 +01:00
Nick Wellnhofer	53ab38408d	encoding: Make init function private	2022-11-27 02:11:07 +01:00
Nick Wellnhofer	05c3a458aa	tests: Check that xmlInitParser doesn't allocate memory	2022-11-27 02:11:07 +01:00
Nick Wellnhofer	78c0391bc7	parser: Register atexit handler in locked section	2022-11-25 15:12:56 +01:00
Nick Wellnhofer	ed053c50cf	dict: Make init/cleanup functions private	2022-11-25 15:02:04 +01:00
Nick Wellnhofer	7010d8779b	threads: Rework initialization Make init/cleanup functions private. Merge xmlOnceInit into xmlInitThreadsInternal.	2022-11-25 15:02:04 +01:00
Nick Wellnhofer	9dbf137455	parser: Make some module init/cleanup functions private	2022-11-25 15:02:04 +01:00
Nick Wellnhofer	cecd364dd2	parser: Don't call *DefaultSAXHandlerInit from xmlInitParser Change the default handler definitions to match the result after calling the initialization functions. This makes sure that no thread-local variables are accessed when calling xmlInitParser.	2022-11-25 15:02:04 +01:00
Nick Wellnhofer	b1f9c19383	parser: Fix push parser with unterminated CDATA sections Short-lived regression found by OSS-Fuzz.	2022-11-22 21:39:01 +01:00
Nick Wellnhofer	0e193f0d61	parser: Remove dangerous check in xmlParseCharData If this check succeeds, xmlParseCharData could be called over and over again without making progress, resulting in an infinite loop. It's only important to check for XML_PARSER_EOF which is done later. Related to #441.	2022-11-21 22:09:19 +01:00
Nick Wellnhofer	94ca36c2c4	parser: Restore parser state in xmlParseCDSect Fixes #441.	2022-11-21 22:07:11 +01:00
Nick Wellnhofer	a8b31e68c2	parser: Fix progress check when parsing character data Skip over zero bytes to guarantee progress. Short-lived regression.	2022-11-21 21:39:10 +01:00
Nick Wellnhofer	c63900fbc1	parser: Check terminate flag when push parsing CDATA sections Found by OSS-Fuzz.	2022-11-21 20:39:17 +01:00
Nick Wellnhofer	a781ee3395	Revert "parser: Add overflow checks to xmlParseLookup functions" This reverts commit bfc55d688427972d093be010a8c2ef265375fcb2. It's better to fix the root cause.	2022-11-21 20:11:14 +01:00
Nick Wellnhofer	bfc55d6884	parser: Add overflow checks to xmlParseLookup functions Short-lived regression found by OSS-Fuzz.	2022-11-21 18:29:54 +01:00
Nick Wellnhofer	9e4a46ace6	parser: Merge misc, prolog and epilog cases in push parser	2022-11-20 22:03:08 +01:00
Nick Wellnhofer	55fb8f72ac	parser: Fix push parser with 1-3 byte initial chunk Make sure that ctxt->charset is initialized properly.	2022-11-20 21:27:59 +01:00
Nick Wellnhofer	68a6518c45	parser: Rewrite push parser boundary checks Remove inaccurate xmlParseCheckTransition check. Remove non-incremental xmlParseGetLasts check. Add functions that check for several boundary constructs more accurately, keeping track of progress in ctxt->checkIndex. Fixes #439.	2022-11-20 21:27:08 +01:00
Nick Wellnhofer	2059df5358	buf: Deprecate static/immutable buffers	2022-11-20 21:16:03 +01:00
Nick Wellnhofer	4955e0c9e1	io: Don't shrink memory input buffers	2022-11-20 21:16:03 +01:00
Nick Wellnhofer	117bab2256	parser: Don't call xmlSHRINK from push parser xmlSHRINK also calls xmlParserInputGrow which isn't needed in the push parser.	2022-11-20 21:16:03 +01:00
Nick Wellnhofer	f00739c12e	parser: Ignore cdata argument in xmlParseCharData It never could be used to parse CDATA sections.	2022-11-20 21:16:03 +01:00
Nick Wellnhofer	e4f56a7213	parser: Simplify xmlParseConditionalSections	2022-11-20 21:16:03 +01:00
Nick Wellnhofer	3582b07bd2	parser: Fix content parser progress checks This is another attempt at fixing parser progress checks. Instead of relying on in->consumed, which could overflow, change some content parser functions to make guaranteed progress on certain byte sequences.	2022-11-20 21:16:03 +01:00
Nick Wellnhofer	f7ad338e09	parser: Fix attribute parser progress checks This is another attempt at fixing parser progress checks. Instead of relying on in->consumed, which could overflow, make the attribute parser functions return a NULL name only if they don't make progress.	2022-11-20 21:16:03 +01:00
Nick Wellnhofer	f61b8a6233	parser: Fix DTD parser progress checks This is another attempt at fixing parser progress checks. Instead of relying on in->consumed, which could overflow, change some DTD parser functions to make guaranteed progress on certain byte sequences.	2022-11-20 21:16:03 +01:00
Nick Wellnhofer	46cd7d224e	io: Remove xmlInputReadCallbackNop In some cases, for example when using encoders, the read callback was set to NULL, in other cases it was set to xmlInputReadCallbackNop. xmlGROW only tested for xmlInputReadCallbackNop, resulting in errors when parsing large encoded content from memory. Always use a NULL callback for memory buffers to avoid ambiguities. Fixes #262.	2022-11-20 21:12:18 +01:00
Nick Wellnhofer	a70f7d4715	parser: Fix error message in xmlParseCommentComplex Fixes #421.	2022-11-04 14:03:31 +01:00
Nick Wellnhofer	afc7e3a7f4	malloc-fail: Fix memory leak in xmlParseReference Found with libFuzzer, see #344.	2022-11-02 16:11:00 +01:00
Nick Wellnhofer	e129c1d1a2	malloc-fail: Fix infinite loop in xmlSkipBlankChars Found with libFuzzer, see #344.	2022-11-02 16:02:39 +01:00
Nick Wellnhofer	865e142c41	malloc-fail: Fix memory leak in xmlCreatePushParserCtxt Found with libFuzzer, see #344.	2022-11-02 15:57:53 +01:00
Nick Wellnhofer	ffaec75809	Fix integer overflows with XML_PARSE_HUGE Also impose size limits when XML_PARSE_HUGE is set. Limit size of names to XML_MAX_TEXT_LENGTH (10 million bytes) and other content to XML_MAX_HUGE_LENGTH (1 billion bytes). Move some of the length checks to the end of the respective loop to make them strict. xmlParseEntityValue didn't have a length limitation at all. But without XML_PARSE_HUGE, this should eventually trigger an error in xmlGROW. Thanks to Maddie Stone working with Google Project Zero for the report!	2022-10-14 15:01:46 +02:00
Nick Wellnhofer	1a2d8ddc06	parser: Fix potential memory leak in xmlParseAttValueInternal Fix memory leak in case xmlParseAttValueInternal is called with a NULL `len` and a non-NULL `alloc` argument. This static function is never called with such arguments internally, but the misleading code should be fixed nevertheless. Fixes #422.	2022-10-11 13:14:37 +02:00
Nick Wellnhofer	a9669679f5	error: Don't use initGenericErrorDefaultFunc The code in xmlInitParser did only set the error handler if it was NULL which should never happen.	2022-09-09 13:52:48 +02:00
Nick Wellnhofer	59f2f60e3e	Remove "runtime debugging" This doesn't seem useful as configuration option.	2022-09-02 18:33:35 +02:00
Nick Wellnhofer	884e142dc5	Fix --with-schemas --without-xpath build xmlXPathInit must be called for schemas.	2022-09-02 18:33:35 +02:00
Nick Wellnhofer	6843fc726f	Remove or annotate char casts	2022-09-01 04:31:30 +02:00
Nick Wellnhofer	2cac626976	Don't use sizeof(xmlChar) or sizeof(char)	2022-09-01 03:35:19 +02:00
Nick Wellnhofer	ad338ca737	Remove explicit integer casts Remove explicit integer casts as final operation: in assignments, when passing arguments, and when returning values. Remove casts to the same type and from certain range-bound values. The main motivation is that these explicit casts don't change the result of operations and only render UBSan's implicit-conversion checks useless. Removing these casts allows UBSan to detect cases where truncation or sign-changes occur unexpectedly. Document some explicit casts as truncating and add a few missing ones.	2022-09-01 02:33:57 +02:00
Nick Wellnhofer	0f568c0b73	Consolidate private header files Private functions were previously declared in header files in the root directory, in public headers guarded with IN_LIBXML, in libxml.h, and redundantly in source files that used them. Consolidate all private header files in include/private.	2022-08-26 02:11:56 +02:00
Nick Wellnhofer	48f84ea8ed	Remove internal macros from parserInternals.h Replace MOVETO_ENDTAG with code that updates line and column numbers.	2022-08-25 21:31:08 +02:00
Nick Wellnhofer	58fc89e8a9	Deprecate internal parser functions	2022-08-25 21:04:57 +02:00
Nick Wellnhofer	34a050cdee	Move some HTML functions to correct header file	2022-08-24 16:44:39 +02:00
Nick Wellnhofer	fd85b566f7	Mark more parser functions as deprecated No compiler warnings generated yet.	2022-08-24 15:12:24 +02:00
Nick Wellnhofer	0e49f8826a	Mark most SAX1 functions as deprecated No compiler warnings generated yet.	2022-08-24 14:07:57 +02:00
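
The entity amplification rule described in commit 463bbeeca1 (expanded size limited to 10 times the input size, with 10 MB of substitutions always allowed, a fixed cost per entity reference, and saturation arithmetic on "unsigned long") can be sketched roughly as follows. This is an illustration only, not the code from the commit: the names sketch_ctxt, sketch_entity_ref, sat_add, and sat_mul are hypothetical, the fixed cost of 20 is an assumed value, and "10 MB" is interpreted here as 10,000,000 bytes.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Values 10 and "10 MB" come from the commit message; the fixed
 * per-reference cost is an assumption made for this sketch. */
#define SKETCH_MAX_AMPLIFICATION 10UL
#define SKETCH_ALLOWED_EXPANSION (10UL * 1000UL * 1000UL)
#define SKETCH_ENT_FIXED_COST    20UL

/* Saturating addition: clamps to ULONG_MAX instead of wrapping. */
static unsigned long sat_add(unsigned long a, unsigned long b) {
    return (b > ULONG_MAX - a) ? ULONG_MAX : a + b;
}

/* Saturating multiplication: clamps to ULONG_MAX instead of wrapping. */
static unsigned long sat_mul(unsigned long a, unsigned long b) {
    if (a != 0 && b > ULONG_MAX / a)
        return ULONG_MAX;
    return a * b;
}

/* Hypothetical accounting state, not libxml2's parser context. */
typedef struct {
    unsigned long input_size;    /* bytes read before expansion */
    unsigned long expanded_size; /* bytes produced by entity references */
} sketch_ctxt;

/* Account for one reference to an entity of ent_size bytes.
 * Returns 1 if the reference exceeds the amplification limit, else 0. */
int sketch_entity_ref(sketch_ctxt *c, unsigned long ent_size) {
    assert(c != NULL);

    /* Each reference costs its size plus a fixed amount, so many
     * references to tiny entities are still accounted for. */
    unsigned long cost = sat_add(ent_size, SKETCH_ENT_FIXED_COST);
    c->expanded_size = sat_add(c->expanded_size, cost);

    /* A fixed budget of substitutions is always allowed. */
    if (c->expanded_size <= SKETCH_ALLOWED_EXPANSION)
        return 0;

    /* Beyond that, expansion must stay within a constant factor of
     * the input size, which keeps total work linear in the input. */
    if (c->expanded_size <= sat_mul(c->input_size, SKETCH_MAX_AMPLIFICATION))
        return 0;

    return 1;
}
```

A caller would invoke sketch_entity_ref once per entity reference and stop parsing when it returns 1. Clamping at ULONG_MAX means a 32-bit counter can only err toward rejecting input, never wrap around and accept it.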


801 Commits
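
Commit c62c0d82cc concerns relocating internal pointers after realloc(). One way to avoid the undefined behaviour it describes is to record each internal pointer as an offset while the old allocation is still valid, then rebuild the pointer from the new base. The sketch below illustrates that general pattern and is not the code from the commit; sketch_buf and sketch_buf_grow are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer with an internal cursor pointing into it. */
typedef struct {
    char  *base; /* start of the allocation */
    char  *cur;  /* internal pointer into [base, base + size] */
    size_t size; /* allocated size in bytes */
} sketch_buf;

/* Grow the buffer to new_size bytes. Returns 0 on success, -1 on failure
 * (in which case the buffer is left unchanged). */
int sketch_buf_grow(sketch_buf *b, size_t new_size) {
    assert(b != NULL && b->base != NULL);

    /* Compute the offset BEFORE realloc, while b->base is still valid. */
    size_t cur_off = (size_t)(b->cur - b->base);
    if (new_size < cur_off)
        return -1; /* the cursor would fall outside the new buffer */

    char *tmp = realloc(b->base, new_size);
    if (tmp == NULL)
        return -1;

    /* Derive the new cursor from the new base. Computing
     * b->cur + (tmp - old_base) instead would do arithmetic on a
     * pointer to freed memory, which is undefined behaviour. */
    b->base = tmp;
    b->cur  = tmp + cur_off;
    b->size = new_size;
    return 0;
}
```

Deriving every internal pointer from the pointer returned by realloc() also keeps the result within the bounds of the new allocation, which is what capability systems such as CHERI and fortified builds check.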