1
0
mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2024-10-26 20:25:14 +03:00
Commit Graph

403 Commits

Author SHA1 Message Date
Nick Wellnhofer
e4f85f1bd2 [CVE-2023-28484] Fix null deref in xmlSchemaFixupComplexType
Fix a null pointer dereference when parsing (invalid) XML schemas.

Thanks to Robby Simpson for the report!

Fixes #491.
2023-04-11 14:29:50 +02:00
Nick Wellnhofer
d7d0bc6581 SAX2: Ignore namespaces in HTML documents
In commit 21ca8829, we started to ignore namespaces in HTML element
names but we still called xmlSplitQName, effectively stripping the
namespace prefix. This would cause elements like <o:p> being parsed
as <p>. Now we leave the name untouched.

Fixes #508.
2023-03-31 17:08:43 +02:00
Nick Wellnhofer
3f69fc805c parser: Tighten expansion limits
- Lower the amount of expansion which is always allowed from
  10MB to 1MB.
- Lower the maximum amplification factor from 10 to 5.
- Lower the "fixed cost" from 50 to 20.
2023-03-08 13:58:49 +01:00
Nick Wellnhofer
e20f4d7a65 xinclude: Fix quadratic behavior in xmlXIncludeLoadTxt
Also make text inclusions work with memory buffers, for example when
using a custom entity loader, and fix a memory leak in case of invalid
characters.

Fixes #483.
2023-02-14 12:25:07 +01:00
Nick Wellnhofer
608c65bb8e xpath: number('-') should return NaN
Fixes https://gitlab.gnome.org/GNOME/libxslt/-/issues/81
2023-01-18 15:15:41 +01:00
Nick Wellnhofer
d320a683d1 parser: Fix entity check in attributes
Don't set the "checked" flag when checking entities in default attribute
values. These entities could reference other entities which weren't
defined yet, so the check isn't reliable.

This fixes a short-lived regression which could lead to a call stack
overflow later in xmlStringGetNodeList.
2023-01-17 13:59:24 +01:00
Nick Wellnhofer
cfc036bda8 testrecurse: Test parameter entity accounting 2022-12-21 20:35:31 +01:00
Nick Wellnhofer
079da5b26d testrecurse: Add external entities to huge test 2022-12-21 20:21:51 +01:00
Nick Wellnhofer
01bcb23de1 testrecurse: Add test cases for external entities
Add test cases for external general and parameter entities.
2022-12-21 20:21:51 +01:00
Nick Wellnhofer
046f99c543 testrecurse: Add lol_param.xml
Add test case contributed by Sebastian Pipping for CVE-2021-3541.
2022-12-21 20:20:11 +01:00
Nick Wellnhofer
fafa025209 testrecurse: Rename test files 2022-12-21 20:20:11 +01:00
Nick Wellnhofer
ae0c9cfa05 uri: Fix handling of port numbers
Allow port number without host, real fix for #71.

Also compare port numbers in xmlBuildRelativeURI.

Fix handling of port numbers in xmlUriEscape.
2022-12-13 01:43:49 +01:00
Nick Wellnhofer
76c6da4209 error: Make sure that error messages are valid UTF-8
This has caused issues with the Python bindings for a long time.

Should fix #64.
2022-12-04 23:34:19 +01:00
Nick Wellnhofer
9c63cea5a6 test: Add test for push parser boundaries 2022-11-20 21:27:59 +01:00
Nick Wellnhofer
b456e3bb42 xinclude: Always allow XPtr expressions in external documents 2022-10-31 16:49:36 +01:00
Nick Wellnhofer
eef0a7395c xinclude: Implement "streaming" mode
When using xmlreader, XPointer expressions in XIncludes simply cannot
work. Expressions can reference nodes which weren't parsed yet or which
were already deleted.

After fixing nested XIncludes, we reference includes which were parsed
previously. When streaming, these nodes could have been deleted, leading
to use-after-free errors.

Disallow XPointer expressions and truncate the include table in
streaming mode.
2022-10-30 14:12:55 +01:00
Nick Wellnhofer
20e2fb4c1c xinclude: Avoid creation of subcontexts
Don't create subcontext in xmlXIncludeRecurseDoc. Save and restore 'doc'
and 'incTab' instead.

Make xmlXIncludeLoadFallback call xmlXIncludeCopyNode which seems safer
than xmlXIncludeDoProcess since the latter may modify the document.
This should also be more performant since we need to copy the whole
fallback subtree anyway. Also make sure to avoid replacements in
fallback elements in xmlXIncludeDoProcess.
2022-10-25 19:34:38 +02:00
Nick Wellnhofer
d2ed1e4f99 xinclude: Limit recursion depth
This avoids call stack overflows.
2022-10-23 18:52:56 +02:00
Nick Wellnhofer
34496f26db xinclude: Test for inclusion loops 2022-10-23 14:27:05 +02:00
Nick Wellnhofer
bc267cb9bc xinclude: Expand includes in xmlXIncludeCopyNode
This should make nested includes work reliably.

Fixes #424.
2022-10-23 14:27:05 +02:00
Nick Wellnhofer
c99cde3f21 xinclude: Also test error messages
The reader interface with XIncludes is somewhat broken and can generate
different error messages. Start to move tests which are sketchy with
reader to a separate directory.
2022-10-23 14:26:59 +02:00
Nick Wellnhofer
938105b572 Revert "xinclude: Fix regression with nested includes"
This reverts commit 7f04e29731 which
caused memory errors.

See #424.
2022-10-21 15:56:12 +02:00
Nick Wellnhofer
7f04e29731 xinclude: Fix regression with nested includes
This reverts commits 74dcc10b and 87d20b55.

Fixes #424.
2022-10-18 19:17:45 +02:00
Nick Wellnhofer
1d4f5d24ac schemas: Fix null-pointer-deref in xmlSchemaCheckCOSSTDerivedOK
Found by OSS-Fuzz.
2022-09-13 16:56:59 +02:00
Nick Wellnhofer
e986d09cf5 Skip incorrectly opened HTML comments
Commit 4fd69f3e fixed handling of '<' characters not followed by an
ASCII letter. But a '<!' sequence followed by invalid characters should
be treated as bogus comment and skipped.

Fixes #380.
2022-08-02 14:38:09 +02:00
Nick Wellnhofer
145170125a Fix parsing of subtracted regex character classes
Fixes #370.
2022-04-23 19:22:42 +02:00
Nick Wellnhofer
4612ce3031 Implement xpath1() XPointer scheme
See https://www.w3.org/2005/04/xpointer-schemes/
2022-04-21 04:26:52 +02:00
Nick Wellnhofer
41afa89fc9 Fix short-lived regression in xmlStaticCopyNode
Commit 7618a3b1 didn't account for coalesced text nodes.

I think it would be better if xmlStaticCopyNode didn't try to coalesce
text nodes at all. This code path can only be triggered if some other
code doesn't coalesce text nodes properly. In this case, OSS-Fuzz found
such behavior in xinclude.c.
2022-04-10 14:17:31 +02:00
Nick Wellnhofer
57b81c208c Normalize XPath strings in-place
Simplify the code and fix a potential memory leak.

Fixes #343.
2022-03-05 18:22:51 +01:00
Nick Wellnhofer
bc06a522c1 Fix recursion check in xinclude.c
Compare the included URL with the document's URL to detect local
inclusions.

Fixes #348.
2022-03-02 20:44:41 +01:00
Mike Dalessio
24cdc89006 test coverage for abruptly-closed comments
These establish baseline behavior so that the subsequent commit is
clear about the behavior it will modify.
2022-03-02 14:42:47 +00:00
Damjan Jovanovic
966b0f21c1 Add whitespace folding for some atomic data types that it's missing on.
XSD validation fails when some atomic types contain surrounding whitespace
even though XML Schema Part 2: Datatypes Second Edition, section 4.3.6
says they should be collapsed. Fix this.

(I am not sure whether the test is correct.)

Issue: #278
2022-03-02 14:05:51 +00:00
Nick Wellnhofer
ea6e8f998d Fix certain combinations of regex range quantifiers
Fix regex transitions that have both min/max and a counter. In this
case, we want to save the regex state before incrementing the counter.

Fixes #301 and the issue reported here:

https://mail.gnome.org/archives/xml/2016-April/msg00017.html
2022-02-28 16:56:02 +01:00
Nick Wellnhofer
382fb056b5 Fix range quantifier on subregex
Make sure to add counted exit transitions before other counter
transitions. Otherwise, we won't backtrack correctly.

Fixes #65.
2022-02-28 16:56:02 +01:00
Nick Wellnhofer
ce0871e15c Only warn on invalid redeclarations of predefined entities
Downgrade the error message to a warning since the error was ignored,
anyway. Also print the name of redeclared entity. For a proper fix that
also shows filename and line number of the invalid redeclaration, we'd
have to

- pass the parser context to the entity functions somehow, or
- make these functions return distinct error codes.

Partial fix for #308.
2022-02-20 21:49:04 +01:00
Nick Wellnhofer
9edc20c154 Fix double counting of CRLF in comments
Fixes #151.
2022-02-07 20:54:07 +01:00
Nick Wellnhofer
5408c10c37 Don't normalize namespace URIs in XPointer xmlns() scheme
Namespace URIs should be compared without escaping or unescaping:

https://www.w3.org/TR/REC-xml-names/#NSNameComparison

Fixes #289.
2022-02-04 14:00:09 +01:00
Nick Wellnhofer
1c7d91abe4 Fix handling of XSD with empty namespace
An empty namespace means no default namespace.

Fixes #303.
2022-02-03 23:31:19 +01:00
Nick Wellnhofer
f480f7509c Update NewsML DTD in test suite
Switch to version 1.2 which has a clearer license.

Fixes #291.
2022-02-03 14:43:17 +01:00
Nick Wellnhofer
d85245f934 Fix regression with PEs in external DTD
Fix a regression introduced with commit a28f7d87. In some cases,
parameter entity references in external DTDs wouldn't be expanded.

Fixes #306.
2022-01-16 21:56:10 +01:00
David Kilzer
03bb929390 Fix parse failure when 4-byte character in UTF-16 BE is split across a chunk
This makes the logic in UTF16BEToUTF8() match UTF16LEToUTF8().

* encoding.c:
(UTF16LEToUTF8):
- Fix comment to describe what the code does.
(UTF16BEToUTF8):
- Fix undefined behavior which was applied to UTF16LEToUTF8() in
  2f9382033e.
- Add bounds check to while() loop which was applied to
  UTF16LEToUTF8() in be803967db.
- Do not return -2 when (in >= inend) to fix the bug.  This was
  applied to UTF16LEToUTF8() in 496a1cf592.
- Inline (<< 8) statements to match UTF16LEToUTF8().

Add the following tests and results:

  test/text-4-byte-UTF-16-BE-offset.xml
  test/text-4-byte-UTF-16-BE.xml
  test/text-4-byte-UTF-16-LE-offset.xml
  test/text-4-byte-UTF-16-LE.xml
2022-01-16 14:07:17 +01:00
Nick Wellnhofer
2732b23466 Fix regression parsing public IDs literals in HTML
Fix regression introduced when reworking htmlParsePubidLiteral in
commit 93ce33c2.

Fixes #318.
2022-01-10 13:37:59 +01:00
Nick Wellnhofer
01411e7c5e Check for invalid redeclarations of predefined entities
Implement section "4.6 Predefined Entities" of the XML 1.0 spec and
check whether redeclarations of predefined entities match the original
definitions.

Note that some test cases declared

    <!ENTITY lt "<">

But the XML spec clearly states that this is illegal:

> If the entities lt or amp are declared, they MUST be declared as
> internal entities whose replacement text is a character reference to
> the respective character (less-than sign or ampersand) being escaped;
> the double escaping is REQUIRED for these entities so that references
> to them produce a well-formed result.

Also fixes #217 but the connection is only tangential. The integer
overflow discovered by fuzzing was more related to the fact that various
parts of the parser disagreed on whether to prefer predefined entities
over their redeclarations. The whole situation is a mess and even
depends on legacy parser options. But now that redeclarations are
validated, it shouldn't make a difference.

As noted in the added comment, this is also one of the cases where
overly defensive checks can hide interesting logic bugs from fuzzers.
2021-02-08 21:51:26 +01:00
Mike Dalessio
e28d9347bc add test coverage for incorrectly-closed comments
this establishes the baseline behavior so that subsequent commits
which modify this behavior are clear about what's being changed.
2020-12-16 16:12:07 +01:00
Nick Wellnhofer
87d20b554c Fix regression introduced with commit 74dcc10b
The code wasn't dead after all, but I can see no reason in delaying
the XPointer evaluation. This could lead to nodes included earlier
appearing in XPointer results.
2020-08-19 13:52:08 +02:00
Nick Wellnhofer
d88df4bd48 Fix corner case with empty xi:fallback
xi:fallback could become empty after recursive expansion. Use a flag
to track whether nodes should be skipped.
2020-08-17 01:17:39 +02:00
Nick Wellnhofer
1abf2967f9 Fix exponential runtime and memory in xi:fallback processing
When creating XML_XINCLUDE_START nodes, the children of the original
xi:include node must be freed, otherwise fallback content is copied
twice, doubling runtime and memory consumption for each nested
xi:fallback/xi:include pair.

Found with libFuzzer.
2020-08-07 19:59:07 +02:00
Nick Wellnhofer
0f9817c75b Don't recurse into xi:include children in xmlXIncludeDoProcess
Otherwise, nested xi:include nodes might result in a use-after-free
if XML_PARSE_NOXINCNODE is specified.

Found with libFuzzer and ASan.
2020-08-06 14:29:33 +02:00
David Kilzer
6b4717d61d Add regexp regression tests
- Bug 757711: heap-buffer-overflow in xmlFAParsePosCharGroup
  <https://bugzilla.gnome.org/show_bug.cgi?id=757711>
- Bug 783015 - Integer-overflow in xmlFAParseQuantExact
  <https://bugzilla.gnome.org/show_bug.cgi?id=783015>

(Regexptests): Add support for checking stderr output when
running regexp tests.  This makes it possible to check in test
cases that fail and not see false-positive error output when
running the tests.  Unlike other libxml2 test suites, if there
is no stderr output, no *.err file needs to be created.
2020-07-06 12:37:53 +02:00
Nick Wellnhofer
477c7f6aff Fix quadratic runtime in HTML parser
Commit eeb99329 removed an important optimization avoiding quadratic
runtime when repeatedly scanning the input buffer for terminating
characters in the HTML push parser. The related bug is

    https://bugzilla.gnome.org/show_bug.cgi?id=444994

Make sure that ctxt->checkIndex is always written and store additional
parser state in ctxt->inSubset which is unused in the HTML parser.

Found by OSS-Fuzz.
2020-07-06 12:17:20 +02:00