libxml2

mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-04-24 18:50:07 +03:00

Author	SHA1	Message	Date
Nick Wellnhofer	3ffcc03b16	parser: Deprecate more internal functions	2023-04-26 20:23:23 +02:00
Nick Wellnhofer	250faf3c83	parser: Fix regression in xmlParserNodeInfo accounting Commit 62150ed2 broke begin_pos and begin_line when extra node info was recorded. Fixes #523.	2023-04-20 15:38:00 +02:00
Nick Wellnhofer	9282b08431	parser: Fix regression in memory pull parser with encoding Revert another change from commit 98840d40. Decode the whole buffer when reading from memory and switching to the initial encoding. Add some comments about potential improvements.	2023-04-19 22:32:19 +02:00
David Kilzer	86105c0493	Fix use-after-free in xmlParseContentInternal() * parser.c: (xmlParseCharData): - Check if the parser has stopped before advancing `ctxt->input->cur`. This only occurs if a custom SAX error handler calls xmlStopParser() on fatal errors. Fixes #518.	2023-04-16 12:01:05 -07:00
Nick Wellnhofer	b4d46cee80	parser: Remove first line handling in xmlParseChunk After reworking EBCDIC detection, this isn't necessary.	2023-04-12 15:10:01 +02:00
Nick Wellnhofer	98840d40da	parser: Rework EBCDIC code page detection To detect EBCDIC code pages, we used to switch the encoding twice and had to be very careful not to decode data after the XML declaration before the second switch. This relied on a hard-coded expected size of the XML declaration and was complicated and unreliable. Now we convert the first 200 bytes to EBCDIC-US and parse the encoding declaration manually.	2023-03-21 21:35:15 +01:00
Nick Wellnhofer	3eb9f5ca4e	parser: Limit name length in xmlParseEncName	2023-03-21 13:19:31 +01:00
Nick Wellnhofer	04d1bedd8c	parser: Rework shrinking of input buffers Don't try to grow the input buffer in xmlParserShrink. This makes sure that no memory allocations are made and the function always succeeds. Remove unnecessary invocations of SHRINK. Invoke SHRINK at the end of DTD parsing loops. Shrink before growing.	2023-03-21 13:19:18 +01:00
Nick Wellnhofer	067986fa67	parser: Fix regressions from previous commits: fix memory leak in xmlParseNmtoken; fix buffer overread after htmlParseCharDataInternal.	2023-03-18 16:51:40 +01:00
Nick Wellnhofer	3e85d7b7ab	parser: Rely on CUR_CHAR/NEXT to grow the input buffer The input buffer is now grown reliably when calling CUR_CHAR (xmlCurrentChar) or NEXT (xmlNextChar). This allows removing many other invocations of GROW.	2023-03-17 14:02:23 +01:00
Nick Wellnhofer	c81d0d04bf	malloc-fail: Add more error checks when parsing names xmlParseName and similar functions must return NULL if an error occurs. Found by OSS-Fuzz, see #344.	2023-03-17 12:39:35 +01:00
Nick Wellnhofer	b167c73144	parser: Fix short-lived regression causing infinite loops Fix 3eb6bf03. We really have to halt the parser, so the input buffer gets reset.	2023-03-14 15:16:04 +01:00
Nick Wellnhofer	2099441f32	parser: Stop calling xmlParserInputShrink Introduce xmlParserShrink which takes a parser context to simplify error handling.	2023-03-13 17:51:13 +01:00
Nick Wellnhofer	cabde70f8b	parser: Simplify calculation of available buffer space	2023-03-12 19:07:23 +01:00
Nick Wellnhofer	b75976e029	parser: Use size_t when subtracting input buffer pointers Avoid integer overflows.	2023-03-12 19:06:19 +01:00
Nick Wellnhofer	9a6ca81612	parser: Check for integer overflow when updating checkIndex Unfortunately, checkIndex is a long, not a size_t. Check for integer overflow before updating the value.	2023-03-12 19:03:11 +01:00
Nick Wellnhofer	bd63d730b8	html: Impose some length limits Impose length limits on names, attribute values, PIs and comments, similar to the XML parser.	2023-03-12 17:40:55 +01:00
Nick Wellnhofer	3eb6bf0386	parser: Stop calling xmlParserInputGrow Introduce xmlParserGrow which takes a parser context to simplify error handling.	2023-03-12 17:05:51 +01:00
Nick Wellnhofer	207ebdfd2a	malloc-fail: Fix out-of-bounds read in xmlGROW Short-lived regression from 56cc2211.	2023-03-12 14:43:01 +01:00
Nick Wellnhofer	56cc2211bc	parser: Merge xmlParserInputGrow into xmlGROW Simplifies the code and makes error handling easier.	2023-03-09 22:27:58 +01:00
Nick Wellnhofer	14604a446e	malloc-fail: Fix out-of-bounds read in xmlCurrentChar Found by OSS-Fuzz.	2023-03-09 22:10:44 +01:00
Nick Wellnhofer	3f69fc805c	parser: Tighten expansion limits: lower the amount of expansion which is always allowed from 10MB to 1MB; lower the maximum amplification factor from 10 to 5; lower the "fixed cost" from 50 to 20.	2023-03-08 13:58:49 +01:00
Nick Wellnhofer	5d55315e32	parser: Fix OOB read when formatting error message Don't try to print characters beyond the end of the buffer. Found by OSS-Fuzz.	2023-02-18 17:29:07 +01:00
Nick Wellnhofer	f8852184a1	malloc-fail: Fix memory leak in xmlParseEntityDecl Found with libFuzzer, see #344.	2023-02-17 17:16:50 +01:00
Nick Wellnhofer	e6d22f925a	malloc-fail: Fix reallocation in inputPush Store xmlRealloc result in temporary variable to avoid null deref in error handler. Found with libFuzzer, see #344.	2023-01-24 11:47:33 +01:00
Nick Wellnhofer	6fd8904108	malloc-fail: Fix use-after-free in xmlParseStartTag2 Fix error handling in xmlCtxtGrowAttrs. Found with libFuzzer, see #344.	2023-01-24 11:47:33 +01:00
Nick Wellnhofer	d1b8785693	malloc-fail: Fix infinite loop in xmlParseTextDecl Memory errors can set `instate` to `XML_PARSER_EOF` which results in `NEXT` making no progress. Found with libFuzzer, see #344.	2023-01-24 11:32:15 +01:00
Nick Wellnhofer	bd9de3a31f	malloc-fail: Fix null deref in xmlAddDefAttrs Found with libFuzzer, see #344.	2023-01-24 11:32:15 +01:00
Nick Wellnhofer	33d4a0fe40	parser: Fix progress check in xmlParseExternalSubset Avoid infinite loop. Short-lived regression from f61b8a62. Found with libFuzzer.	2023-01-24 11:32:15 +01:00
Nick Wellnhofer	74aa61e0bd	parser: Halt parser on DTD errors If we try to continue parsing after an error in the internal or external subset, entity expansion accounting gets more complicated. Simply halt the parser. Found with libFuzzer.	2023-01-24 11:32:15 +01:00
Nick Wellnhofer	d320a683d1	parser: Fix entity check in attributes Don't set the "checked" flag when checking entities in default attribute values. These entities could reference other entities which weren't defined yet, so the check isn't reliable. This fixes a short-lived regression which could lead to a call stack overflow later in xmlStringGetNodeList.	2023-01-17 13:59:24 +01:00
Nick Wellnhofer	59b3366178	error: Limit number of parser errors Reporting errors is expensive and some abusive test cases can generate an error for each invalid input byte. This causes the parser to spend most of the time with error handling. Limit the number of errors and warnings to 100.	2022-12-27 14:41:19 +01:00
Nick Wellnhofer	66e9fd66e8	parser: Fix infinite loop with push parser in recovery mode Short-lived regression from commit b1f9c193. Found by OSS-Fuzz.	2022-12-25 21:30:32 +01:00
Nick Wellnhofer	49b54d7e2b	parser: Fix null deref in xmlStringDecodeEntitiesInt Short-lived regression.	2022-12-25 15:06:51 +01:00
Nick Wellnhofer	1865668b61	parser: Fix accounting of consumed input bytes Only add consumed bytes if: (a) we're not parsing an entity; (b) we're parsing external parameter entities for the first time. Always ignore internal parameter entities.	2022-12-23 23:11:11 +01:00
Nick Wellnhofer	bc18f4a67c	parser: Lower entity nesting limit with XML_PARSE_HUGE The old limit of 1024 could lead to excessively deep call stacks. This could probably be set much lower without causing issues.	2022-12-23 22:11:18 +01:00
Nick Wellnhofer	dd62e541ec	parser: Don't increase depth twice when parsing internal entities Fix xmlParseBalancedChunkMemoryInternal.	2022-12-23 22:11:18 +01:00
Nick Wellnhofer	a41b09c739	parser: Improve detection of entity loops Set a flag to detect entity loops at once instead of processing until the depth limit is exceeded.	2022-12-23 22:11:18 +01:00
Nick Wellnhofer	d972393f30	parser: Only report a single entity error Don't report errors multiple times for nested entity references.	2022-12-23 22:10:39 +01:00
Nick Wellnhofer	077df27eb1	parser: Fix integer overflow of input ID Applies a patch from Chromium. Also stop incrementing input ID of subcontexts. This isn't necessary. Fixes #465.	2022-12-22 15:22:01 +01:00
David Kilzer	0bd4e4e032	xmlParseStartTag2() contains typo when checking for default definitions for an attribute in a namespace * parser.c: (xmlParseStartTag2): - Fix index into defaults->values. It is only correct the first time through the loop when i == 0. Fixes #467.	2022-12-21 19:35:33 -08:00
Nick Wellnhofer	b47ebf047e	parser: Deprecate xmlString*DecodeEntities These are internal functions.	2022-12-21 21:06:03 +01:00
Nick Wellnhofer	ec6633afae	parser: Remove useless ent->etype test in xmlParseReference If ent->etype is invalid, ret can't equal XML_ERR_OK.	2022-12-21 20:35:31 +01:00
Nick Wellnhofer	7ee7f0360a	parser: Remove useless ent->children tests in xmlParseReference The if-block before always returns if ent->children == NULL.	2022-12-21 20:35:31 +01:00
Nick Wellnhofer	ce76ebfd13	entities: Stop counting entities This was only used in the old version of xmlParserEntityCheck.	2022-12-21 20:19:10 +01:00
Nick Wellnhofer	a3c8b1805e	entities: Add entity flag for loop check	2022-12-21 20:19:10 +01:00
Nick Wellnhofer	463bbeeca1	entities: Rework entity amplification checks This commit implements robust detection of entity amplification attacks, better known as the "billion laughs" attack. We now limit the size of the document after substitution of entities to 10 times the size before expansion. This guarantees linear behavior by definition. There already was a similar check before, but the accounting of "sizeentities" (size of external entities) and "sizeentcopy" (size of all copies created by entity references) wasn't accurate. We also need saturation arithmetic since we're historically limited to "unsigned long" which is 32-bit on many platforms. A maximum of 10 MB of substitutions is always allowed. This should make use cases like DITA work which have caused problems in the past. The old checks based on the number of entities were removed. This is accounted for by adding a fixed cost to each entity reference. Entity amplification checks are now enabled even if XML_PARSE_HUGE is set. This option is mainly used to allow larger text nodes. Most users were unaware that it also disabled entity expansion checks. Some of the limits might be adjusted later. If this change turns out to affect legitimate use cases, we can add a separate parser option to disable the checks. Fixes #294. Fixes #345.	2022-12-21 20:19:10 +01:00
Nick Wellnhofer	7e3f469be9	entities: Use flags to store '<' check results Instead of abusing the LSB of the "checked" member, store the result of testing for occurrence of '<' character in "flags". Also use the flags in xmlParseStringEntityRef instead of rescanning every time.	2022-12-19 15:59:49 +01:00
Nick Wellnhofer	481d79d44c	entities: Add XML_ENT_PARSED flag To check whether an entity was already parsed, the code previously tested whether "checked" was non-zero or "children" was non-null. The "children" check could be unreliable because an empty entity also results in an empty (NULL) node list. Use a separate flag to make this check more reliable.	2022-12-19 15:26:46 +01:00
Alex Richardson	4b959ee168	Remove hacky heuristic from b2dc5675e94aa6b5557ba63f7d66b0f08dd17e4d Checking whether the context is close to the parent context by hardcoding 250 is not portable (I noticed tests were failing on Morello since the value is 288 there due to pointers being 128 bits). Instead we should ensure that the XML_VCTXT_USE_PCTXT flag is not set in cases where the user data is not actually a parser context (or ideally add a separate field, but that would be an ABI break). From what I can see in the source, the XML_VCTXT_USE_PCTXT flag is only set if the userData field points to a valid context, and if this is not the case the flag should be cleared when changing userData rather than relying on the offset between the two. Looking at the history, I think d7cb33cf44aa688f24215c9cd398c1a26f0d25ff fixed most of the need for this workaround, but it looks like there are a few more locations that need updating. This commit changes two more places to set/clear/copy the XML_VCTXT_USE_PCTXT flag, so this heuristic should not be needed anymore. I've also dropped two `= NULL` assignments in xmllint since they are not needed after a call to memset(). There was also an uninitialized vctxt.flags (and other fields) in `xmlShellValidate()`, which I've fixed by adding a memset() call.	2022-12-01 15:31:25 +00:00
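
The entity amplification limits described in commits 463bbeeca1 and 3f69fc805c can be illustrated with a minimal sketch. This is not libxml2's code: the names `sat_add`, `charge_reference` and `expansion_allowed` are hypothetical, the exact comparison is an assumption, and "1MB" is read here as 1,000,000 bytes. Only the numbers (1 MB always allowed, amplification factor 5, fixed cost 20 per entity reference) and the use of saturating `unsigned long` arithmetic come from the commit messages.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical constants; the values follow commit 3f69fc805c. */
#define ALWAYS_ALLOWED    1000000UL /* assumption: "1MB" taken as 10^6 bytes */
#define MAX_AMPLIFICATION 5UL
#define FIXED_COST        20UL

/* Add b to *acc, clamping at ULONG_MAX instead of wrapping around. */
static void sat_add(unsigned long *acc, unsigned long b) {
    assert(acc != NULL);
    if (b > ULONG_MAX - *acc)
        *acc = ULONG_MAX;
    else
        *acc += b;
}

/* Charge one entity reference: its replacement size plus a fixed cost. */
static void charge_reference(unsigned long *expanded, unsigned long entity_size) {
    sat_add(expanded, entity_size);
    sat_add(expanded, FIXED_COST);
}

/* Return 1 if the expansion is within limits, 0 if it looks excessive.
 * consumed: input bytes read so far; expanded: bytes charged so far. */
static int expansion_allowed(unsigned long consumed, unsigned long expanded) {
    if (expanded <= ALWAYS_ALLOWED)
        return 1;
    /* Divide instead of multiplying so consumed * factor cannot overflow.
     * Integer division rounds down, which is good enough for a sketch. */
    return expanded / MAX_AMPLIFICATION <= consumed;
}
```

A caller would invoke `charge_reference` each time an entity reference is substituted and stop parsing once `expansion_allowed` returns 0. Because the charged total is bounded by a constant multiple of the input size, the work stays linear in the input.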


847 Commits
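
Commit e6d22f925a stores the result of xmlRealloc in a temporary variable so that a failed reallocation does not overwrite the only pointer to the old block. The general C pattern looks roughly like the sketch below. It uses the standard `realloc` and a hypothetical `int_stack` type, not libxml2's `inputPush` or its data structures.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable stack; not a libxml2 type. */
typedef struct {
    int *items;
    size_t len;
    size_t cap;
} int_stack;

/* Push a value. Returns 0 on success, -1 on allocation failure.
 * On failure the stack is left unchanged and still usable. */
static int int_stack_push(int_stack *s, int value) {
    assert(s != NULL);
    if (s->len == s->cap) {
        size_t new_cap;
        int *tmp;

        /* Refuse to grow if doubling would overflow the byte count. */
        if (s->cap > SIZE_MAX / sizeof(int) / 2)
            return -1;
        new_cap = s->cap ? s->cap * 2 : 4;

        /* Keep the result in a temporary: if realloc fails, s->items
         * still points at the old, valid block. */
        tmp = realloc(s->items, new_cap * sizeof(int));
        if (tmp == NULL)
            return -1;
        s->items = tmp;
        s->cap = new_cap;
    }
    s->items[s->len++] = value;
    return 0;
}
```

Writing `s->items = realloc(s->items, ...)` directly would set `s->items` to NULL on failure, leaking the old block and inviting a null dereference in any later error handling that walks the stack.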