mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2024-10-26 12:25:09 +03:00
1022 Commits

Author SHA1 Message Date
Nick Wellnhofer
b52a3044aa parser: Use counted_by attribute if supported
We only have a single struct with a flexible array member.
2024-10-24 18:18:47 +02:00
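
As a rough illustration of what "use counted_by if supported" can look like, the sketch below guards the attribute behind a feature test so that older compilers see an empty macro. The names DEMO_COUNTED_BY, DemoArray, demoArrayNew and demoArraySum are hypothetical and are not taken from libxml2; the actual macro and struct in the commit may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical macro: expands to the attribute only where the compiler
 * reports support for it, and to nothing everywhere else. */
#if defined(__has_attribute)
#  if __has_attribute(counted_by)
#    define DEMO_COUNTED_BY(member) __attribute__((counted_by(member)))
#  endif
#endif
#ifndef DEMO_COUNTED_BY
#  define DEMO_COUNTED_BY(member)
#endif

/* A struct with a flexible array member whose length lives in `size`.
 * The count member has to be declared before the array. */
typedef struct {
    size_t size;
    int items[] DEMO_COUNTED_BY(size);
} DemoArray;

/* Allocate a zero-filled array of n items; returns NULL on failure. */
DemoArray *
demoArrayNew(size_t n) {
    DemoArray *a;

    if (n > (SIZE_MAX - sizeof(DemoArray)) / sizeof(int))
        return NULL;                /* size computation would overflow */
    a = calloc(1, sizeof(DemoArray) + n * sizeof(int));
    if (a != NULL)
        a->size = n;                /* set the count before touching items */
    return a;
}

/* Sum all items, iterating only up to the recorded count. */
long
demoArraySum(const DemoArray *a) {
    long sum = 0;
    size_t i;

    assert(a != NULL);
    for (i = 0; i < a->size; i++)
        sum += a->items[i];
    return sum;
}
```

With the attribute active, bounds-checking tools can relate accesses to items to the value stored in size; without it, the code compiles and behaves the same.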
Nick Wellnhofer
74dfc49b5f parser: Clarify logic in xmlParseStartTag2 2024-10-06 20:04:00 +02:00
Nick Wellnhofer
0bc4608c50 html: Use hash table to check for duplicate attributes 2024-10-06 20:04:00 +02:00
Nick Wellnhofer
0ce7bfe559 html: Try to avoid passing XML options to HTML parser 2024-10-06 20:04:00 +02:00
Nick Wellnhofer
16de1346eb parser: Make new options actually work 2024-10-06 20:04:00 +02:00
Nick Wellnhofer
dde62ae5d5 parser: Align push parsing of CDATA sections with pull parser
Remove special handling of CDATA sections in push parser. This makes
sure that only a single callback is generated for large sections.

Fixes #22 and needed for #412.
2024-08-29 01:28:49 +02:00
Nick Wellnhofer
4d10e53af1 parser: Make sure to set and increment input id
Revert part of commits 410931e3 and b9d2f3c9.
2024-08-28 22:47:20 +02:00
Nick Wellnhofer
6d365ca02c doc: XML_PARSE_NO_XXE is available since 2.13.0 2024-08-28 22:09:30 +02:00
makise-homura
103aadbc66 parser: Suppress EDG maybe-uninitialized warning 2024-08-16 22:26:07 +03:00
Nick Wellnhofer
02fcb1effb parser: Make xmlParseChunk return an error if parser was stopped
This regressed after enhancing the disableSAX member in 2.13.

Should fix #777.
2024-07-25 17:07:18 +02:00
Nick Wellnhofer
1a89323039 [CVE-2024-40896] Fix XXE protection in downstream code
Some users set an entity's children manually in the getEntity SAX
callback to restrict entity expansion. This stopped working after
renaming the "checked" member of xmlEntity, making at least one
downstream project and its dependants susceptible to XXE attacks.

See #761.
2024-07-24 17:19:32 +02:00
Nick Wellnhofer
6a3c0b0d93 parser: Increase XML_MAX_DICTIONARY_LIMIT
This limit is somewhat arbitrary and can be reached when fuzzing
documents up to 1 MB.

Increase limit to 100 MB and disable limit if XML_PARSE_HUGE is set.
2024-07-22 12:53:00 +02:00
Nick Wellnhofer
5d36664fc9 memory: Deprecate xmlGcMemSetup 2024-07-16 17:42:10 +02:00
Nick Wellnhofer
7148b77820 parser: Optimize memory buffer I/O
Reenable zero-copy IO for zero-terminated static memory buffers.

Don't stream zero-terminated dynamic memory buffers on top of creating
a copy.
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
34c9108f15 encoding: Add sizeOut argument to xmlCharEncInput
When push parsing, we want to convert as much of the input as possible.
When pull parsing memory buffers, we want to convert data chunk by chunk
to save memory.
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
6be79014d7 Remove unused code 2024-07-15 16:33:38 +02:00
Nick Wellnhofer
fee0006a06 parser: Fix memory leak after malloc failure in xml*ParseDTD 2024-07-15 13:03:55 +02:00
Nick Wellnhofer
8af55c8d20 parser: Rename new input API functions
These weren't made public yet.
2024-07-11 01:33:29 +02:00
Nick Wellnhofer
d74ca59491 parser: Rename internal xmlNewInput functions 2024-07-11 01:31:50 +02:00
Nick Wellnhofer
4f329dc524 parser: Implement xmlCtxtParseContent
This implements xmlCtxtParseContent, a better alternative to
xmlParseInNodeContext or xmlParseBalancedChunkMemory. It accepts a
parser context and a parser input, making it a lot more versatile.

xmlParseInNodeContext is now implemented in terms of
xmlCtxtParseContent. This makes sure that xmlParseInNodeContext never
modifies the target document, improving thread safety.
xmlParseInNodeContext is also more lenient now with regard to undeclared
entities.

Fixes #727.
2024-07-11 01:26:32 +02:00
Nick Wellnhofer
f51ad063a7 parser: Fix error return of xmlParseBalancedChunkMemory
Only return an error code if the chunk is not well-formed to match the
2.12 behavior. Return 0 on non-fatal errors like invalid namespaces.

Fixes #765.
2024-07-08 11:28:33 +02:00
Nick Wellnhofer
2e63656ec6 parser: Check return value of inputPush
inputPush typically doesn't fail because we pre-allocate the input
table. The return value should be checked nevertheless.
2024-07-08 11:27:52 +02:00
Nick Wellnhofer
1e5375c1b4 SAX2: Check return value of xmlPushInput
Fix null deref in case of malloc failure.
2024-07-06 15:33:06 +02:00
Nick Wellnhofer
38195cf596 parser: Don't produce names with invalid UTF-8 in recovery mode 2024-07-06 15:33:06 +02:00
Nick Wellnhofer
fdfeecfe5e parser: Reenable ctxt->directory
Unused internally, but used in downstream code.

Should fix #753.
2024-07-02 22:06:53 +02:00
Nick Wellnhofer
606f410891 parser: Allow to disable catalogs with parser options
Implement XML_PARSE_NO_SYS_CATALOG and XML_PARSE_NO_CATALOG_PI.

Fixes #735.
2024-07-02 22:06:53 +02:00
Nick Wellnhofer
866be54e22 parser: Don't use deprecated xmlSplitQName 2024-07-02 13:34:11 +02:00
Nick Wellnhofer
bc793390d5 parser: Update documentation 2024-06-27 16:23:14 +02:00
Nick Wellnhofer
eca972e682 parser: Add getters for XML declaration to parser context
Access to struct members will be deprecated.
2024-06-27 14:44:49 +02:00
Mike Dalessio
bbbbbb4649 parser: implement xmlCtxtGetOptions
In 712a31ab, the `options` struct member was deprecated. To allow
callers to check the status of options bits, introduce
xmlCtxtGetOptions.
2024-06-20 20:39:54 +00:00
Rosen Penev
217e9b7af2 clang-tidy: don't return in void functions
Found with readability-redundant-control-flow

Signed-off-by: Rosen Penev <rosenp@gmail.com>
2024-06-20 20:37:34 +00:00
Nick Wellnhofer
32cac377c8 parser: Selectively reenable reading from "-"
Make filename "-" mean stdin for legacy SAX1 functions and xmlReadFile.
This should hopefully fix most command line utilities.

See #737.
2024-06-17 18:08:31 +02:00
Nick Wellnhofer
33a1f8978d legacy: Merge SAX.c into legacy.c 2024-06-16 19:17:41 +02:00
Nick Wellnhofer
10d60d15d6 regexp: Stop using LIBXML_AUTOMATA_ENABLED
This macro always equals LIBXML_REGEXP_ENABLED.
2024-06-16 18:47:12 +02:00
Nick Wellnhofer
b0fc67aa22 build: Remove --with-tree configuration option
This option would allow for a smaller, but mostly useless minimal build.
But it complicates the symbol availability logic in an insane way and
requires specialized tools like our custom C parser in doc/apibuild.py.

See #717.
2024-06-16 18:47:12 +02:00
Nick Wellnhofer
039ce1e821 parser: Pass global object to sax->setDocumentLocator
Revert part of commit c011e760.

Fixes #732.
2024-06-14 16:41:43 +02:00
Nick Wellnhofer
dba1ed85a3 ftp: Remove FTP support
Remove the built-in FTP client. If you configure --with-legacy, old
symbols are retained for ABI compatibility.
2024-06-12 18:19:55 +02:00
Nick Wellnhofer
5238404325 parser: Pass resource type to resource loader 2024-06-12 16:36:12 +02:00
Nick Wellnhofer
89fcae4dfd parser: Don't report malloc failures when creating context
We don't want messages to stderr before an error handler could be set on
a parser context.
2024-06-12 16:36:12 +02:00
Nick Wellnhofer
410931e385 parser: Only set input ID for PE refs
Other input streams don't require IDs.
2024-06-12 16:22:52 +02:00
Nick Wellnhofer
ff3b091910 parser: Implement XML_PARSE_NO_UNZIP option 2024-06-12 16:14:15 +02:00
Nick Wellnhofer
47cbb6bb3c doc: Don't mention xmlNewInputURL 2024-06-12 16:04:45 +02:00
Nick Wellnhofer
8318b5a634 parser: Fix NULL checks for output arguments 2024-06-09 15:08:43 +02:00
Nick Wellnhofer
0cde1b78d6 parser: Fix "Truncated multi-byte sequence" error
Don't raise the error if decoding failed.
2024-06-07 00:02:31 +02:00
Nick Wellnhofer
122b61309f parser: Fix performance regression when parsing namespaces
The namespace hash table didn't reuse deleted buckets, leading to
quadratic behavior.

Also ignore deleted buckets when resizing.

Fixes #726.
2024-06-06 15:52:09 +02:00
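
The two ideas in this message, reusing deleted buckets on insert and ignoring them when resizing, can be sketched with a generic open-addressing table. This is not libxml2's namespace hash table; every name below (DemoTable, demoPut, demoRemove, and so on) is hypothetical, and linear probing with a multiplicative hash is an assumption made for brevity.

```c
#include <assert.h>
#include <stdlib.h>

enum { SLOT_EMPTY = 0, SLOT_USED, SLOT_DELETED };

typedef struct { int state; unsigned key; int value; } DemoSlot;

typedef struct {
    DemoSlot *slots;
    size_t cap;      /* always a power of two */
    size_t used;     /* live entries */
    size_t deleted;  /* deleted buckets (tombstones) */
} DemoTable;

static size_t
demoHash(unsigned key, size_t cap) {
    return (size_t) (key * 2654435761u) & (cap - 1);
}

int
demoInit(DemoTable *t, size_t cap) {
    assert(cap >= 4 && (cap & (cap - 1)) == 0);
    t->slots = calloc(cap, sizeof(DemoSlot));
    t->cap = cap;
    t->used = t->deleted = 0;
    return t->slots == NULL ? -1 : 0;
}

/* Find the slot holding key, or NULL. Tombstones do not end the probe. */
DemoSlot *
demoFind(const DemoTable *t, unsigned key) {
    size_t i = demoHash(key, t->cap);

    while (t->slots[i].state != SLOT_EMPTY) {
        if (t->slots[i].state == SLOT_USED && t->slots[i].key == key)
            return &t->slots[i];
        i = (i + 1) & (t->cap - 1);
    }
    return NULL;
}

/* Rebuild the table, copying live entries only; tombstones are dropped. */
static int
demoRebuild(DemoTable *t) {
    DemoSlot *old = t->slots;
    size_t oldCap = t->cap, newCap = t->cap, i, j;

    while ((t->used + 1) * 2 > newCap)
        newCap *= 2;
    t->slots = calloc(newCap, sizeof(DemoSlot));
    if (t->slots == NULL) { t->slots = old; return -1; }
    t->cap = newCap;
    t->deleted = 0;
    for (i = 0; i < oldCap; i++) {
        if (old[i].state != SLOT_USED)
            continue;                    /* skip empty and deleted buckets */
        j = demoHash(old[i].key, newCap);
        while (t->slots[j].state != SLOT_EMPTY)
            j = (j + 1) & (newCap - 1);
        t->slots[j] = old[i];
    }
    free(old);
    return 0;
}

int
demoPut(DemoTable *t, unsigned key, int value) {
    size_t i, reuse;

    /* Keep at least half of the buckets empty so probes terminate. */
    if ((t->used + t->deleted + 1) * 2 > t->cap && demoRebuild(t) != 0)
        return -1;
    i = demoHash(key, t->cap);
    reuse = t->cap;                      /* sentinel: no tombstone seen yet */
    while (t->slots[i].state != SLOT_EMPTY) {
        if (t->slots[i].state == SLOT_DELETED) {
            if (reuse == t->cap)
                reuse = i;               /* remember first deleted bucket */
        } else if (t->slots[i].key == key) {
            t->slots[i].value = value;   /* key already present: update */
            return 0;
        }
        i = (i + 1) & (t->cap - 1);
    }
    if (reuse != t->cap) { i = reuse; t->deleted--; }
    t->slots[i].state = SLOT_USED;
    t->slots[i].key = key;
    t->slots[i].value = value;
    t->used++;
    return 0;
}

/* Returns 1 if the key was present and is now marked deleted, else 0. */
int
demoRemove(DemoTable *t, unsigned key) {
    DemoSlot *s = demoFind(t, key);

    if (s == NULL)
        return 0;
    s->state = SLOT_DELETED;
    t->used--;
    t->deleted++;
    return 1;
}
```

If inserts never reclaimed tombstones, a workload that repeatedly adds and removes entries would keep lengthening probe sequences, which is the kind of quadratic slowdown the message describes.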
Nick Wellnhofer
a7e26707be parser: Don't overwrite OOM errors in xmlSBuf 2024-06-03 14:04:44 +02:00
Nick Wellnhofer
e75e878e02 doc: Update and fix documentation 2024-05-20 14:23:39 +02:00
Nick Wellnhofer
4fefba4cf6 parser: Rework handling of undeclared entities
Throw an error if entity substitution was requested.

Now we only downgrade to a warning if

- XML_PARSE_DTDLOAD wasn't specified, and
- entities aren't substituted or XML_PARSE_NO_XXE was specified.

Should fix #724.
2024-05-15 17:58:48 +02:00
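
Read literally, the rule in this message is a small predicate over parser options. The sketch below encodes only what the message states; the DEMO_* flags are hypothetical stand-ins (entity substitution is requested with XML_PARSE_NOENT in libxml2), their values are illustrative, and the real parser may apply further conditions that the message does not mention.

```c
#include <assert.h>

/* Hypothetical stand-ins for XML_PARSE_NOENT, XML_PARSE_DTDLOAD and
 * XML_PARSE_NO_XXE; the numeric values are illustrative only. */
enum {
    DEMO_SUBST_ENTITIES = 1 << 0,
    DEMO_DTDLOAD        = 1 << 1,
    DEMO_NO_XXE         = 1 << 2
};

/* Returns 1 if an undeclared entity is downgraded to a warning,
 * 0 if it is reported as an error. */
int
demoUndeclaredIsWarning(int options) {
    if (options & DEMO_DTDLOAD)
        return 0;                   /* DTD loading requested: error */
    /* Warning only if entities aren't substituted or NO_XXE is set. */
    return !(options & DEMO_SUBST_ENTITIES) || (options & DEMO_NO_XXE) != 0;
}

/* Usage: spot-check the cases named in the message. */
void
demoCheckRule(void) {
    assert(demoUndeclaredIsWarning(0) == 1);
    assert(demoUndeclaredIsWarning(DEMO_SUBST_ENTITIES) == 0);
    assert(demoUndeclaredIsWarning(DEMO_SUBST_ENTITIES | DEMO_NO_XXE) == 1);
}
```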
Nick Wellnhofer
4ff2dccf9f SAX2: Warn if URI resolution failed 2024-05-13 12:50:08 +02:00
Nick Wellnhofer
4fe116ebd3 parser: Don't report error on invalid URI
Only fragment identifiers are an error.

This removes the last user of xmlErrMsg*. Now every error reported by
the parser should result in one of ctxt->wellFormed, ctxt->nsWellFormed
or ctxt->valid being set to zero.
2024-05-13 12:50:08 +02:00