mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2024-10-26 12:25:09 +03:00

333 Commits

Author SHA1 Message Date
Nick Wellnhofer
bd9eed4694 parser: Make unsupported encodings an error in declarations
This was changed in 45157261, but in encoding declarations, unsupported
encodings should raise a fatal error.

Fixes #794.
2024-09-02 19:29:39 +02:00
Nick Wellnhofer
1d009fe35d parser: Report at least one fatal error 2024-08-05 15:14:21 +02:00
Nick Wellnhofer
bfed6e6ae8 parser: Fix error handling after reaching limit
Mark document as non-wellformed and stop parser even if error limit was
reached.

Regressed in abd74186.
2024-08-05 14:58:37 +02:00
Nick Wellnhofer
6a3c0b0d93 parser: Increase XML_MAX_DICTIONARY_LIMIT
This limit is somewhat arbitrary and can be reached when fuzzing
documents up to 1 MB.

Increase limit to 100 MB and disable limit if XML_PARSE_HUGE is set.
2024-07-22 12:53:00 +02:00
Nick Wellnhofer
a6f54f055b io: Fine-tune initial IO buffer size 2024-07-16 17:42:10 +02:00
Nick Wellnhofer
34c9108f15 encoding: Add sizeOut argument to xmlCharEncInput
When push parsing, we want to convert as much of the input as possible.
When pull parsing memory buffers, we want to convert data chunk by chunk
to save memory.
2024-07-16 17:42:10 +02:00
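
The chunk-by-chunk idea in the commit above can be illustrated with a minimal, self-contained sketch. This is not the libxml2 code and does not reproduce the real signature of xmlCharEncInput: the helper name `latin1ToUtf8Chunk` and its parameters are hypothetical, and the assumption is that a `sizeOut`-style argument caps how many output bytes a single call may produce.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of libxml2: convert ISO-8859-1 input to
 * UTF-8, writing at most sizeOut bytes to out. On return, *inLen holds the
 * number of input bytes consumed; the return value is the number of output
 * bytes written. A caller converting chunk by chunk passes a small sizeOut
 * and calls again with the remaining input.
 */
static size_t
latin1ToUtf8Chunk(const unsigned char *in, size_t *inLen,
                  unsigned char *out, size_t sizeOut) {
    size_t i = 0; /* input bytes consumed */
    size_t o = 0; /* output bytes written */

    while (i < *inLen) {
        unsigned char c = in[i];

        if (c < 0x80) {
            /* ASCII maps to a single byte. */
            if (o + 1 > sizeOut)
                break;
            out[o++] = c;
        } else {
            /* 0x80..0xFF map to a two-byte UTF-8 sequence. */
            if (o + 2 > sizeOut)
                break;
            out[o++] = (unsigned char) (0xC0 | (c >> 6));
            out[o++] = (unsigned char) (0x80 | (c & 0x3F));
        }
        i++;
    }

    *inLen = i;
    return o;
}
```

With a large `sizeOut` a single call converts everything available, which matches the push-parsing case; with a small `sizeOut` the caller only ever holds one converted chunk in memory.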
Nick Wellnhofer
92f30711de parser: Optimize buffer shrinking
Remove checks now that we can shrink memory buffers efficiently.

Shrink more aggressively.
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
a221cd7849 buf: Rework xmlBuf code
Always use what the old implementation called the "IO" allocation
scheme, which allows the content pointer to move past the start of the
initial allocation. This is inexpensive and allows efficient shrinking.

Optimize xmlBufGrow, reusing shrunken memory as much as possible.

Simplify xmlBufAdd.

Make xmlBufBackToBuffer return an error on overflow.

Make "size" exclude the terminating NUL byte.

Always provide an initial size.

Reintroduce static buffers.

Remove xmlBufResize and several other functions.
2024-07-16 17:42:10 +02:00
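
The allocation scheme described in the commit above can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual xmlBuf implementation: the `SketchBuf` type and the `sketchBuf*` functions are hypothetical names, and the growth policy is deliberately naive. It shows a content pointer that may move past the start of the allocation so that shrinking is O(1), a grow step that reuses the shrunken space before reallocating, and a "size" that excludes the terminating NUL byte.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer type, loosely modeled on the commit description. */
typedef struct {
    unsigned char *mem;     /* start of the allocation */
    unsigned char *content; /* start of live data, may be past mem */
    size_t use;             /* live bytes, excluding the terminator */
    size_t size;            /* bytes usable from content, excluding terminator */
} SketchBuf;

/* Always start with an explicit initial size. */
static int
sketchBufInit(SketchBuf *buf, size_t size) {
    if (size == SIZE_MAX)
        return -1;
    buf->mem = malloc(size + 1);
    if (buf->mem == NULL)
        return -1;
    buf->content = buf->mem;
    buf->use = 0;
    buf->size = size;
    buf->mem[0] = 0;
    return 0;
}

/* Drop up to len bytes from the front by advancing the content pointer. */
static size_t
sketchBufShrink(SketchBuf *buf, size_t len) {
    assert(buf->use <= buf->size);
    if (len > buf->use)
        len = buf->use;
    buf->content += len;
    buf->use -= len;
    buf->size -= len;
    return len;
}

/* Make room for len more bytes, reusing shrunken space when possible. */
static int
sketchBufGrow(SketchBuf *buf, size_t len) {
    size_t start = (size_t) (buf->content - buf->mem);
    unsigned char *newMem;

    if (buf->size - buf->use >= len)
        return 0;

    if (start + (buf->size - buf->use) >= len) {
        /* Slide live data (and its terminator) back to the start. */
        memmove(buf->mem, buf->content, buf->use + 1);
        buf->content = buf->mem;
        buf->size += start;
        return 0;
    }

    /* Report overflow as an error instead of wrapping around. */
    if (len > SIZE_MAX - 1 - buf->use)
        return -1;
    newMem = malloc(buf->use + len + 1);
    if (newMem == NULL)
        return -1;
    memcpy(newMem, buf->content, buf->use);
    newMem[buf->use] = 0;
    free(buf->mem);
    buf->mem = buf->content = newMem;
    buf->size = buf->use + len;
    return 0;
}

/* Append len bytes and keep the data NUL-terminated. */
static int
sketchBufAdd(SketchBuf *buf, const char *data, size_t len) {
    if (sketchBufGrow(buf, len) < 0)
        return -1;
    memcpy(buf->content + buf->use, data, len);
    buf->use += len;
    buf->content[buf->use] = 0;
    return 0;
}
```

In this sketch, consuming input from the front never copies data; the copy is deferred until more room is needed, and a reallocation only happens when the whole allocation is too small.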
Nick Wellnhofer
728869809e error: Add helper functions to print errors and abort 2024-07-15 16:33:38 +02:00
Nick Wellnhofer
aa6aec19b0 parser: Fix xmlInputSetEncodingHandler again
Short-lived regression.
2024-07-11 12:42:13 +02:00
Nick Wellnhofer
8af55c8d20 parser: Rename new input API functions
These weren't made public yet.
2024-07-11 01:33:29 +02:00
Nick Wellnhofer
d74ca59491 parser: Rename internal xmlNewInput functions 2024-07-11 01:31:50 +02:00
Nick Wellnhofer
4f329dc524 parser: Implement xmlCtxtParseContent
This implements xmlCtxtParseContent, a better alternative to
xmlParseInNodeContext or xmlParseBalancedChunkMemory. It accepts a
parser context and a parser input, making it a lot more versatile.

xmlParseInNodeContext is now implemented in terms of
xmlCtxtParseContent. This makes sure that xmlParseInNodeContext never
modifies the target document, improving thread safety.
xmlParseInNodeContext is also more lenient now with regard to undeclared
entities.

Fixes #727.
2024-07-11 01:26:32 +02:00
Nick Wellnhofer
4fec0889e0 parser: Fix memory leak in xmlInputSetEncodingHandler
Short-lived regression.
2024-07-10 22:32:33 +02:00
Nick Wellnhofer
5935471732 parser: Fix malloc failure handling in xmlInputSetEncodingHandler
Don't set encoder if allocating buffer failed. This could lead to
xmlByteConsumed processing invalid UTF-8.
2024-07-09 14:11:28 +02:00
Nick Wellnhofer
ea31ac5bba fuzz: Fix spaceMax 2024-07-07 04:19:09 +02:00
Nick Wellnhofer
29e3ab92f0 fuzz: Make reallocs more likely 2024-07-06 15:48:43 +02:00
Nick Wellnhofer
38195cf596 parser: Don't produce names with invalid UTF-8 in recovery mode 2024-07-06 15:33:06 +02:00
Nick Wellnhofer
ec0881099b parser: Upgrade XML_IO_NETWORK_ATTEMPT to error
Fixes XML::LibXML test suite.
2024-07-04 15:47:20 +02:00
Nick Wellnhofer
fdfeecfe5e parser: Reenable ctxt->directory
Unused internally, but used in downstream code.

Should fix #753.
2024-07-02 22:06:53 +02:00
Nick Wellnhofer
606f410891 parser: Allow to disable catalogs with parser options
Implement XML_PARSE_NO_SYS_CATALOG and XML_PARSE_NO_CATALOG_PI.

Fixes #735.
2024-07-02 22:06:53 +02:00
Nick Wellnhofer
197e09d5c5 parser: Fix xmlLoadResource
Short-lived regression.
2024-07-02 20:03:23 +02:00
Nick Wellnhofer
ede5d99af3 parser: Fix typo 2024-07-02 16:38:15 +02:00
Nick Wellnhofer
30ef77554b parser: Don't use deprecated xmlCopyChar 2024-07-02 13:34:11 +02:00
Nick Wellnhofer
751ba00e00 parser: Don't use deprecated xmlSwitchInputEncoding 2024-07-02 13:34:04 +02:00
Nick Wellnhofer
9a4770ef84 doc: Improve documentation 2024-07-02 13:34:04 +02:00
Nick Wellnhofer
0b0dd98983 parser: Fix EBCDIC detection 2024-07-01 18:05:40 +02:00
Nick Wellnhofer
221df37529 parser: Support custom charset conversion implementations
Implement xmlCtxtSetCharEncConvImpl. I agree that the name is terrible.
2024-07-01 18:05:40 +02:00
Nick Wellnhofer
e72eda101e parser: Add NULL check in xmlNewIOInputStream 2024-06-29 01:22:02 +02:00
Nick Wellnhofer
bc793390d5 parser: Update documentation 2024-06-27 16:23:14 +02:00
Nick Wellnhofer
193f4653a5 parser: Implement xmlCtxtGetStatus
This allows access to ctxt->wellFormed, ctxt->nsWellFormed and
ctxt->valid. It also detects several fatal non-parser errors which
really should be another error level.
2024-06-27 15:17:40 +02:00
Nick Wellnhofer
cc0cc2d3b7 parser: Add more parser context accessors 2024-06-27 14:45:33 +02:00
Nick Wellnhofer
eca972e682 parser: Add getters for XML declaration to parser context
Access to struct members will be deprecated.
2024-06-27 14:44:49 +02:00
Nick Wellnhofer
3ff8a2c4b8 parser: Deprecate xmlIsLetter 2024-06-27 14:43:10 +02:00
Nick Wellnhofer
fa50be923b parser: Move implementation of xmlCtxtGetLastError 2024-06-27 14:37:53 +02:00
Rosen Penev
217e9b7af2 clang-tidy: don't return in void functions
Found with readability-redundant-control-flow

Signed-off-by: Rosen Penev <rosenp@gmail.com>
2024-06-20 20:37:34 +00:00
Nick Wellnhofer
c5e9a5b2c9 parser: Use catalogs with resource loader 2024-06-17 15:49:25 +02:00
Nick Wellnhofer
6deebe036a parser: Make xmlInputCreateUrl handle HTTP input 2024-06-17 15:47:43 +02:00
Nick Wellnhofer
d2fd9d37b0 parser: Fix swapped arguments 2024-06-17 15:47:43 +02:00
Nick Wellnhofer
2608baaf92 parser: Make failure to load main document a warning
Revert the change that made failures to load the main document an error.

This fixes the --path option of xmllint and xsltproc.

Should fix #733.
2024-06-14 20:06:07 +02:00
Nick Wellnhofer
dba1ed85a3 ftp: Remove FTP support
Remove the built-in FTP client. If you configure --with-legacy, old
symbols are retained for ABI compatibility.
2024-06-12 18:19:55 +02:00
Nick Wellnhofer
5238404325 parser: Pass resource type to resource loader 2024-06-12 16:36:12 +02:00
Nick Wellnhofer
ab5e6debd1 parser: Introduce XML_INPUT_NETWORK input flag
This allows disabling network access when creating parser inputs with
xmlInputCreateUrl.
2024-06-12 16:36:12 +02:00
Nick Wellnhofer
89fcae4dfd parser: Don't report malloc failures when creating context
We don't want messages to stderr before an error handler could be set on
a parser context.
2024-06-12 16:36:12 +02:00
Nick Wellnhofer
64ad272525 parser: Introduce per-context resource loader 2024-06-12 16:22:52 +02:00
Nick Wellnhofer
b9d2f3c911 parser: Introduce new input API
- xmlInputCreateUrl
- xmlInputCreateMemory
- xmlInputCreateString
- xmlInputCreateFd
- xmlInputCreateIO
- xmlInputSetEncoding

These functions don't take a parser context and work on xmlParserInputs,
replacing functions working on xmlParserInputBuffers.

xmlInputCreateUrl and xmlInputSetEncoding offer fine-grained error
handling.

Several XML_INPUT_* flags offer additional control.
2024-06-12 16:22:52 +02:00
Nick Wellnhofer
410931e385 parser: Only set input ID for PE refs
Other input streams don't require IDs.
2024-06-12 16:22:52 +02:00
Nick Wellnhofer
a3b2baeb67 parser: Simplify xmlNewInputFromFile 2024-06-12 16:22:52 +02:00
Nick Wellnhofer
0b58838764 parser: Rework XML_PARSE_NONET handling 2024-06-12 16:22:52 +02:00
Nick Wellnhofer
ff3b091910 parser: Implement XML_PARSE_NO_UNZIP option 2024-06-12 16:14:15 +02:00