mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-03-10 08:58:16 +03:00

5231 Commits

Author SHA1 Message Date
Damjan Jovanovic
ec8ff95ce3 Add support for some non-standard escapes in regular expressions.
This adds support for some non-standard escape sequences observed
in Microsoft's MSXML DLLs and used by Windows apps, and thus
needed by Wine. Some are also used in other XML implementations,
e.g. Java's.

This isn't intended to be final. We probably wish to toggle these
non-standard escape sequences on and off somehow, as needed by
the caller.

Further discussion: https://gitlab.gnome.org/GNOME/libxml2/-/issues/260
2022-03-02 15:25:21 +00:00
Mike Dalessio
d7b287b94c htmlParseComment: handle abruptly-closed comments
See guidance provided on abruptly-closed comments here:

https://html.spec.whatwg.org/multipage/parsing.html#parse-error-abrupt-closing-of-empty-comment
2022-03-02 14:42:47 +00:00
Mike Dalessio
24cdc89006 test coverage for abruptly-closed comments
These establish baseline behavior so that the subsequent commit is
clear about the behavior it will modify.
2022-03-02 14:42:47 +00:00
Damjan Jovanovic
2fe372a0aa Properly fold whitespace around the QName value when validating an XSD schema.
(May also need fixing in other places.)

Issue: 239
2022-03-02 14:22:36 +00:00
Damjan Jovanovic
966b0f21c1 Add whitespace folding for some atomic data types that it's missing on.
XSD validation fails when some atomic types contain surrounding whitespace
even though XML Schema Part 2: Datatypes Second Edition, section 4.3.6
says they should be collapsed. Fix this.

(I am not sure whether the test is correct.)

Issue: #278
2022-03-02 14:05:51 +00:00
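The "collapse" whitespace rule cited in commit 966b0f21c1 above can be illustrated with a minimal sketch. This is not the libxml2 implementation; `sketch_collapse` and `is_xsd_space` are hypothetical names invented for this example. It assumes the usual reading of the rule: tabs, line feeds and carriage returns count as spaces, runs of spaces become a single space, and leading and trailing spaces are dropped.

```c
#include <stddef.h>

/* Hypothetical helper, not part of libxml2. */
static int is_xsd_space(char c) {
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

/* Collapse whitespace in place and return the same buffer. */
char *sketch_collapse(char *s) {
    char *out = s;
    const char *in = s;
    int pending_space = 0;

    if (s == NULL)
        return NULL;
    while (*in != '\0') {
        if (is_xsd_space(*in)) {
            /* Remember the gap only if some output precedes it,
             * which drops leading whitespace. */
            if (out != s)
                pending_space = 1;
        } else {
            /* Emit one space for the whole preceding run. */
            if (pending_space) {
                *out++ = ' ';
                pending_space = 0;
            }
            *out++ = *in;
        }
        in++;
    }
    /* A pending space at the end is never written, which drops
     * trailing whitespace. */
    *out = '\0';
    return s;
}
```

Under this sketch, a lexical value such as `"  12 \t\n 34  "` becomes `"12 34"` before it is handed to the type-specific parser.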
Oliver Diehl
e5cdb02d64 Add let variable tag support 2022-03-02 14:45:43 +01:00
Oliver Diehl
2cc93f7754 Add value-of tag support 2022-03-02 14:42:47 +01:00
Oliver Diehl
85cb388ff1 Replaced tabs by 4 spaces 2022-03-02 14:42:47 +01:00
Nick Wellnhofer
5c009c668b Remove obsolete AC_HEADER checks 2022-03-02 01:31:56 +01:00
Nick Wellnhofer
72119afe00 Don't check for standard C89 library functions
Don't check for

- fprintf
- localtime
- printf
- rand
- sprintf
- srand
- sscanf
- strftime
- time
- vfprintf
- vsprintf

If the C99 functions snprintf and vsnprintf are missing, Trio is
enabled.
2022-03-02 01:14:08 +01:00
Nick Wellnhofer
776d15d383 Don't check for standard C89 headers
Don't check for

- ctype.h
- errno.h
- float.h
- limits.h
- math.h
- signal.h
- stdarg.h
- stdlib.h
- string.h
- time.h

Stop including non-standard headers

- malloc.h
- strings.h
2022-03-02 00:43:54 +01:00
Nick Wellnhofer
8f3bd26241 Remove broken VxWorks support 2022-03-01 17:18:56 +01:00
Nick Wellnhofer
041ed3d6b0 Remove broken Mac OS 9 support 2022-03-01 17:17:19 +01:00
Nick Wellnhofer
551b558db0 Remove useless call to xmlRelaxNGCleanupTypes
xmlRelaxNGCleanupTypes is called from xmlCleanupParser later.
2022-03-01 17:15:12 +01:00
Nick Wellnhofer
89d9ef3ee8 Reset last error in xmlCleanupGlobals
Before, we tried to reset the last error in xmlCleanupParser. But if
xmlCleanupParser wasn't called from the main thread, this would reset
the thread-local error object. xmlCleanupGlobals has access to the
error object of the main thread and can reset it reliably.
2022-03-01 15:14:00 +01:00
Nick Wellnhofer
ebc5009793 Warn when using deprecated functions from Python bindings
This requires Python code to be run with -Wd.
2022-03-01 13:57:16 +01:00
Nick Wellnhofer
b66ce0bba8 Don't include ICU headers in public headers
There's no need to make these implementation details public.
2022-03-01 13:02:49 +01:00
Nick Wellnhofer
50f6feb9c9 Remove broken bakefile support 2022-03-01 00:05:54 +01:00
Nick Wellnhofer
d7c7425cd1 Remove broken Visual Studio 2010 support 2022-03-01 00:03:24 +01:00
Nick Wellnhofer
b094e814fa Remove broken Windows CE support 2022-03-01 00:02:59 +01:00
Nick Wellnhofer
655cf3f46f Always fopen files with "rb"
We never want translation of newlines when reading files, so it should
be safe to always specify "rb". On sane platforms, the "b" flag is
simply ignored.
2022-02-28 23:39:00 +01:00
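As an illustration of the point made in commit 655cf3f46f above, here is a minimal sketch of reading a file with the "rb" mode. `sketch_count_bytes` is a hypothetical helper written for this example, not a libxml2 function. With "rb", a CR LF pair on disk is read as two bytes on every platform.

```c
#include <stdio.h>

/* Hypothetical helper: count the bytes in a file, with no newline
 * translation. Returns -1 if the file cannot be opened. */
long sketch_count_bytes(const char *path) {
    FILE *fp = fopen(path, "rb");
    long n = 0;

    if (fp == NULL)
        return -1;
    while (fgetc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}
```

On platforms that distinguish text and binary streams, opening the same file with plain "r" could report fewer bytes, because CR LF sequences may be translated to a single newline.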
Nick Wellnhofer
3f8655db97 Remove __DJGPP__ checks
Drop broken support for DJGPP.
2022-02-28 23:22:50 +01:00
Nick Wellnhofer
2489c1d024 Remove useless __CYGWIN__ checks
From what I can tell, some really early Cygwin versions from around
1998-2000 used to erroneously define _WIN32. This was eventually fixed,
but these days, the `defined(_WIN32) && !defined(__CYGWIN__)` idiom is
unnecessary.

Now, we only check for __CYGWIN__ in xmlexports.h when deciding whether
to use __declspec.
2022-02-28 22:58:35 +01:00
Nick Wellnhofer
ea6e8f998d Fix certain combinations of regex range quantifiers
Fix regex transitions that have both min/max and a counter. In this
case, we want to save the regex state before incrementing the counter.

Fixes #301 and the issue reported here:

https://mail.gnome.org/archives/xml/2016-April/msg00017.html
2022-02-28 16:56:02 +01:00
Nick Wellnhofer
382fb056b5 Fix range quantifier on subregex
Make sure to add counted exit transitions before other counter
transitions. Otherwise, we won't backtrack correctly.

Fixes #65.
2022-02-28 16:56:02 +01:00
Mike Dalessio
48ed5a74bd Update xmlStrlen() to use POSIX / ISO C strlen()
This should be faster on a wide range of platforms.

Closes #212
2022-02-26 16:20:32 +00:00
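A rough sketch of the idea in commit 48ed5a74bd above: delegate the length computation to the C library instead of looping by hand. `sketch_strlen` is a hypothetical stand-in, not the actual xmlStrlen code. Returning 0 for NULL and clamping to INT_MAX are assumptions made for this example.

```c
#include <limits.h>
#include <string.h>

/* Hypothetical stand-in for a strlen()-based length function on
 * unsigned char strings. */
int sketch_strlen(const unsigned char *str) {
    size_t len;

    if (str == NULL)
        return 0;               /* assumption: NULL counts as empty */
    len = strlen((const char *) str);
    /* Assumption: clamp so the result fits the int return type. */
    return len > (size_t) INT_MAX ? INT_MAX : (int) len;
}
```

The speedup mentioned in the commit message comes from strlen() typically being optimized per platform, which a byte-by-byte loop in portable C cannot match.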
Nick Wellnhofer
5bc5f0762f Fix build with older Python versions
ModuleNotFoundError is only available since Python 3.6. Use the
superclass ImportError instead. Fixes commit 3cc64a89.

Fixes #347.
2022-02-24 18:41:23 +01:00
Nick Wellnhofer
c41bc10da3 Fix unused variable warnings with disabled features 2022-02-22 19:57:12 +01:00
Nick Wellnhofer
4fd69f3e27 Fix recovery from invalid HTML start tags
Only try to parse a start tag if there's a '<' followed by an ASCII
letter. This is more in line with HTML5 and the old behavior in
recovery mode. Emit a literal '<' if the following character is
invalid.

Fixes #101.
Fixes #339.
2022-02-22 18:41:00 +01:00
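The check described in commit 4fd69f3e27 above can be sketched as follows. `sketch_is_start_tag` is a hypothetical name, and the real parser works on its own input buffer rather than a plain C string. The sketch only shows the decision itself: a '<' begins a start tag only if an ASCII letter follows.

```c
#include <stddef.h>

/* Hypothetical check: does the input begin with '<' followed by an
 * ASCII letter? Returns 1 if so, 0 otherwise. */
int sketch_is_start_tag(const char *s) {
    char c;

    if (s == NULL || s[0] != '<')
        return 0;
    c = s[1];
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}
```

When the check fails, as for input like `"< 3"` or `"<1"`, a caller following the commit message would emit the '<' as literal text and continue parsing after it.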
Nick Wellnhofer
b057239b3f More fixes to --without-valid build
Fix runtest and Python bindings when building --without-valid.

The Python tests still fail. There doesn't seem to be a mechanism to
disable tests depending on feature flags.
2022-02-22 11:52:38 +01:00
Nick Wellnhofer
d05317cee5 Fix --without-valid build
Regressed in commit 652dd12a.
2022-02-22 11:51:08 +01:00
Nick Wellnhofer
f550977295 Fix documentation in entities.c 2022-02-20 22:06:16 +01:00
Nick Wellnhofer
b26d581d66 Add note about optimization flags 2022-02-20 21:49:05 +01:00
Nick Wellnhofer
6117700e2c Remove special configuration for certain maintainers 2022-02-20 21:49:05 +01:00
Nick Wellnhofer
004fe9de53 Deprecate IDREF-related functions in valid.h
These functions are only needed internally for validation.

xmlGetRefs is inherently unsafe because the ref table isn't updated
if attributes are removed (unlike the ids table).

None of the Ubuntu 20.04 packages depending on libxml2 use any of these
functions (except xmlFreeRefTable in libxslt), so it seems perfectly
safe to deprecate them.

Remove xmlIsRef and xmlRemoveRef from the Python bindings.
2022-02-20 21:49:05 +01:00
Nick Wellnhofer
61de92979b Deprecate all functions in DOCBparser.h 2022-02-20 21:49:05 +01:00
Nick Wellnhofer
aeaf02c0a3 Disable docbook support by default
The docbook code is broken and has been deprecated for years.
2022-02-20 21:49:04 +01:00
Nick Wellnhofer
cf4893f7b3 Deprecate legacy functions 2022-02-20 21:49:04 +01:00
Nick Wellnhofer
96889d195b Disable legacy support by default
If you need support for legacy APIs, you have to enable it explicitly:

    ./configure --with-legacy
2022-02-20 21:49:04 +01:00
Nick Wellnhofer
9e0ca5a19f Deprecate all functions in nanoftp.h 2022-02-20 21:49:04 +01:00
Nick Wellnhofer
a0a0f3be93 Disable FTP support by default
In the unlikely case that you really need FTP support, you have to
enable it explicitly with:

    ./configure --with-ftp
2022-02-20 21:49:04 +01:00
Nick Wellnhofer
a2fe74c08a Add XML_DEPRECATED macro
__attribute__((deprecated)) is available since at least GCC 3.1, so an
exact version check is probably unnecessary.
2022-02-20 21:49:04 +01:00
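A deprecation macro of the kind described in commit a2fe74c08a above might look like the following sketch. `SKETCH_DEPRECATED`, `sketch_old_api` and `sketch_new_api` are hypothetical names, and the actual XML_DEPRECATED definition in libxml2 may differ. The sketch enables the attribute for any compiler that defines `__GNUC__`, without an exact version check.

```c
/* Hypothetical macro: expands to the GCC attribute where available
 * and to nothing elsewhere. */
#if defined(__GNUC__)
#  define SKETCH_DEPRECATED __attribute__((deprecated))
#else
#  define SKETCH_DEPRECATED
#endif

/* The attribute goes on the declaration; callers of the old function
 * get a compile-time warning. */
SKETCH_DEPRECATED int sketch_old_api(int x);
int sketch_new_api(int x);

int sketch_new_api(int x) {
    return x * 2;
}

/* Defining a deprecated function does not itself trigger a warning. */
int sketch_old_api(int x) {
    return sketch_new_api(x);
}
```

Code that still calls `sketch_old_api` keeps compiling and linking, so the macro signals intent without breaking existing users.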
Nick Wellnhofer
346c3a930c Remove elfgcchack.h
The same optimization can be enabled with -fno-semantic-interposition
since GCC 5. clang has always used this option by default.
2022-02-20 21:49:04 +01:00
Nick Wellnhofer
ce0871e15c Only warn on invalid redeclarations of predefined entities
Downgrade the error message to a warning since the error was ignored,
anyway. Also print the name of the redeclared entity. For a proper fix that
also shows filename and line number of the invalid redeclaration, we'd
have to

- pass the parser context to the entity functions somehow, or
- make these functions return distinct error codes.

Partial fix for #308.
2022-02-20 21:49:04 +01:00
Nick Wellnhofer
e03590c9ad Don't add IDs containing unexpanded entity references
When parsing without entity substitution, IDs or IDREFs containing
unexpanded entity references like "abc&x;def" could be created. We could
try to expand these entities like in validation mode, but it seems
safer to honor the request not to expand entities. We silently ignore
such IDs for now.
2022-02-20 21:49:04 +01:00
Nick Wellnhofer
274a1b5bec Remove unneeded code in xmlreader.c
Now that no references to ID and IDREF attributes are stored in
streaming validation mode, there's no need to try and remove them.

Also remove xmlTextReaderFreeIDTable which was identical to
xmlFreeIDTable.
2022-02-20 21:49:04 +01:00
Nick Wellnhofer
d7cb33cf44 Rework validation context flags
Use a bitmask instead of magic values to

- keep track of whether the validation context is part of a parser context
- keep track of whether xmlValidateDtdFinal was called

This makes it possible to add additional flags later.

Note that this deliberately changes the name of a public struct member,
assuming that this was always private data never to be used by client
code.
2022-02-20 21:49:04 +01:00
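The bitmask approach from commit d7cb33cf44 above can be sketched like this. All names here (`SKETCH_VCTXT_IN_PARSER`, `SKETCH_VCTXT_DTD_FINAL`, `sketch_valid_ctxt`) are hypothetical and chosen for illustration only. The flag names and struct layout actually used in libxml2 are not shown in the commit message.

```c
/* Hypothetical flag bits, one per tracked fact. */
#define SKETCH_VCTXT_IN_PARSER  (1u << 0)  /* part of a parser context */
#define SKETCH_VCTXT_DTD_FINAL  (1u << 1)  /* final DTD validation ran */

/* Hypothetical stand-in for the validation context. */
typedef struct {
    unsigned int flags;
} sketch_valid_ctxt;

/* Set one or more flag bits. */
void sketch_set_flag(sketch_valid_ctxt *ctxt, unsigned int flag) {
    ctxt->flags |= flag;
}

/* Return 1 if any of the given flag bits is set, 0 otherwise. */
int sketch_has_flag(const sketch_valid_ctxt *ctxt, unsigned int flag) {
    return (ctxt->flags & flag) != 0;
}
```

Each fact occupies its own bit, so a later flag only needs a new `(1u << n)` constant and leaves existing checks unchanged, unlike magic values that must be compared for equality.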
Nick Wellnhofer
a075d256fd Release v2.9.13 v2.9.13 2022-02-19 19:26:42 +01:00
Nick Wellnhofer
04d4124c15 Update news and rebuild documentation 2022-02-19 19:26:42 +01:00
Nick Wellnhofer
652dd12a85 [CVE-2022-23308] Use-after-free of ID and IDREF attributes
If a document is parsed with XML_PARSE_DTDVALID and without
XML_PARSE_NOENT, the value of ID attributes has to be normalized after
potentially expanding entities in xmlRemoveID. Otherwise, later calls
to xmlGetID can return a pointer to previously freed memory.

ID attributes which are empty or contain only whitespace after
entity expansion are affected in a similar way. This is fixed by
not storing such attributes in the ID table.

The test to detect streaming mode when validating against a DTD was
broken. In connection with the defects above, this could result in a
use-after-free when using the xmlReader interface with validation.
Fix detection of streaming mode to avoid similar issues. (This changes
the expected result of a test case. But as far as I can tell, using the
XML reader with XIncludes referencing the root document never worked
properly, anyway.)

All of these issues can result in denial of service. Using xmlReader
with validation could result in disclosure of memory via the error
channel, typically stderr. The security impact of xmlGetID returning
a pointer to freed memory depends on the application. The typical use
case of calling xmlGetID on an unmodified document is not affected.
2022-02-19 19:26:42 +01:00