1
0
mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2024-10-26 12:25:09 +03:00
Commit Graph

122 Commits

Author SHA1 Message Date
Nick Wellnhofer
5e7874015e save: Make xmlEscapeTab signed
Fixes issues in platforms where char is unsigned.

Fixes #797.
2024-09-10 17:50:08 +02:00
Nick Wellnhofer
a530ff125d io: Always consume encoding handler when creating output buffers
Also free encoding handler in error case.

Remove xmlAllocOutputBufferInternal which was identical to
xmlAllocOutputBuffer.
2024-07-29 14:25:39 +02:00
Nick Wellnhofer
bc14d70f49 xmlsave: Improve "unsupported encoding" error message
Incomplete support of XML_SAVE_* error codes was removed. Error handling
still needs work. xmlOutputBufferCreateFilename should return an error
code.
2024-07-25 00:26:48 +02:00
Nick Wellnhofer
5862e9dd37 Add NULL checks
Short-lived regression.
2024-07-18 02:10:16 +02:00
Nick Wellnhofer
a221cd7849 buf: Rework xmlBuf code
Always use what the old implementation called the "IO" allocation
scheme, allowing to move the content pointer past the initial
allocation. This is inexpensive and allows efficient shrinking.

Optimize xmlBufGrow, reusing shrunken memory as much as possible.

Simplify xmlBufAdd.

Make xmlBufBackToBuffer return an error on overflow.

Make "size" exclude the terminating NULL byte.

Always provide an initial size.

Reintroduce static buffers.

Remove xmlBufResize and several other functions.
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
2adcde3920 save: Optimize xmlSerializeText
Use lookup tables.
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
1b06708271 save: Always serialize CR as decimal "
"
We used to serialize CR as "
" when there was no encoding and we
weren't in an attribute. This was somewhat inconsistent.
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
1cfc5b8089 entities: Rework serialization of numeric character references 2024-07-16 17:42:10 +02:00
Nick Wellnhofer
8d1606265d entities: Rework text escaping 2024-07-16 17:42:10 +02:00
Nick Wellnhofer
cc45f618ae save: Rework text escaping
Stop using xmlOutputBufferWriteEscape except when using deprecated
xmlSaveSetEscape. Rewrite xmlOutputBufferWriteEscape to use an extra
buffer and call xmlOutputBufferWrite.

Introduce xmlSerializeText to serialize both text and attribute content.

Don't read encoding from document when serializing and remove all hacks
that temporarily changed the document's encoding.
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
e488695b1a save: Deprecate xmlSaveSet*Escape
xmlSaveSetAttrEscape never had an effect.
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
673ca0edaf tests: Regenerate testapi.c 2024-07-11 01:23:57 +02:00
Nick Wellnhofer
96d850c3cb save: Fix "Factor out xmlSaveWriteIndent" 2024-07-02 22:43:49 +02:00
Nick Wellnhofer
35146ff31c save: Implement xmlSaveSetIndentString
Allow to set indent string without using global xmlTreeIndentString.

See #736.
2024-07-02 20:03:23 +02:00
Nick Wellnhofer
7cc619d568 save: Implement save options for indenting
Implement XML_SAVE_NO_INDENT to disable and XML_SAVE_INDENT to enable
indenting regardless of the global xmlIndentTreeOutput.

Implement XML_SAVE_EMPTY to enable empty tags regardless of the global
xmlSaveNoEmptyTags.

See #736.
2024-07-02 20:03:23 +02:00
Nick Wellnhofer
2c4204ecee save: Factor out xmlSaveWriteIndent 2024-07-02 20:03:23 +02:00
Nick Wellnhofer
202045f8df save: Pass options to xmlSaveCtxtInit 2024-07-02 20:03:23 +02:00
Nick Wellnhofer
598ee0d2c6 error: Remove underscores from xmlRaiseError 2024-06-27 14:43:10 +02:00
Nick Wellnhofer
5b893fa999 encoding: Fix encoding lookup with xmlOpenCharEncodingHandler
Make xmlOpenCharEncodingHandler call xmlParseCharEncoding first so we
prefer our own handlers for names like "UTF8". Only UTF-16 needs an
exception.

Make callers check the return value. For UTF-8, a NULL encoding doesn't
mean an error.

Remove unnecessary UTF-8 check from htmlFindOutputEncoder. Don't try to
look up ASCII handler since the HTML handler is always available.

Fix return code of xmlParseCharEncoding.

Should fix #744.
2024-06-22 21:59:03 +02:00
Rosen Penev
2def7b4b28 clang-tidy: move assignments out of if
Found with bugprone-assignment-in-if-condition

Signed-off-by: Rosen Penev <rosenp@gmail.com>
2024-06-20 21:11:44 -07:00
Nick Wellnhofer
e75e878e02 doc: Update and fix documentation 2024-05-20 14:23:39 +02:00
Nick Wellnhofer
f506ec6654 parser: Always decode entities in namespace URIs
Also decode entities in namespace URIs if entity substitution wasn't
requested. This should fix some corner cases when comparing namespace
URIs. The Namespaces in XML 1.0 spec says:

> In a namespace declaration, the URI reference is the normalized value
> of the attribute, so replacement of XML character and entity
> references has already been done before any comparison.

Make the serialization code escape special characters in namespace URIs
like in attribute values. This fixes serialization if entities were
substituted when parsing.

Fixes https://gitlab.gnome.org/GNOME/libxslt/-/issues/106
2024-04-15 12:34:26 +02:00
Nick Wellnhofer
20fca2bb3d save: Report malloc failure in xmlAttrSerializeTxtContent
Flush buffer before checking for errors.
2024-04-09 16:53:57 +02:00
Nick Wellnhofer
86c27206f9 save: Handle invalid parent pointers in xhtmlNodeDumpOutput
See #255 and commit 85b1792e.
2024-04-02 15:34:45 +02:00
Nick Wellnhofer
fb1e63025b save: Check for NULL node->name in xhtmlIsEmpty 2024-03-17 19:42:59 +01:00
Nick Wellnhofer
ee0c1f87c0 fuzz: New tree API fuzzer 2024-03-15 19:54:27 +01:00
Nick Wellnhofer
10c202f9dc malloc-fail: Check for NULL pointer in xmlSaveNotation* 2024-03-15 19:47:08 +01:00
Nick Wellnhofer
b1e75a9191 save: Report malloc failure in xmlAttrSerializeTxtContent 2024-03-15 19:47:08 +01:00
Nick Wellnhofer
3494aa4fd5 save: Cast return code of xmlBufNodeDump
Avoid implicit sign change.
2024-03-15 19:47:08 +01:00
Nick Wellnhofer
1d392fabb9 save: Check for output buffer errors
Report more error conditions.
2024-03-15 19:47:08 +01:00
Nick Wellnhofer
d2f7ca5305 save: Add range check for level in xmlNodeDump 2024-03-15 19:47:08 +01:00
Nick Wellnhofer
e314109ad1 save: Don't write directly to internal buffer
Make sure that OOM errors are reported.
2024-02-16 16:14:05 +01:00
Nick Wellnhofer
fbe10a466f save: Move DTD serialization code to xmlsave.c 2024-02-04 14:33:19 +01:00
Nick Wellnhofer
c2b3294f60 fuzz: Abort on invalid UTF-8
The parser should never generate invalid UTF-8 these days even in
recovery mode.
2024-01-04 21:20:51 +01:00
Nick Wellnhofer
ca5965d594 save: Report more malloc failures 2024-01-02 23:43:06 +01:00
Nick Wellnhofer
0821efc8ee encoding: Check whether encoding handlers support input/output
The "HTML" encoding handler doesn't support input which could lead to a
wrong error report.
2024-01-02 19:48:23 +01:00
Nick Wellnhofer
4dcc2d743e save: Output U+FFFD replacement characters
This degrades more gracefully and helps to diagnose errors.

We stop raising errors for now, since there's no way to report malloc
failures during error handling yet.
2024-01-02 15:39:11 +01:00
Nick Wellnhofer
bc1e030664 save: Improve error handling
Handle malloc failrue from xmlRaiseError.

Use xmlRaiseMemoryError.

Stop using xmlGenericError.

Remove argument from memory error handler.

Remove TODO macro.
2023-12-21 15:02:24 +01:00
Nick Wellnhofer
6c8acdecd2 save: Fix build --without-html
Fixes #646
2023-12-14 13:49:08 +01:00
Nick Wellnhofer
0d97e43993 save: Report malloc failures
Fix places where malloc failures aren't report.

Introduce a new API function xmlSaveFinish which returns an error code.
2023-12-11 22:13:05 +01:00
Nick Wellnhofer
8c084ebdc7 doc: Make apibuild.py happy 2023-09-21 22:57:33 +02:00
Nick Wellnhofer
da274bfa55 build: Fix build when certain modules are disabled 2023-09-21 02:26:43 +02:00
Nick Wellnhofer
4e1c13ebfd debug: Remove debugging code
This is barely useful these days and only clutters the code base.
2023-09-19 17:35:09 +02:00
Nick Wellnhofer
c82701ff0b malloc-fail: Fix memory leak in xmlDocDumpFormatMemoryEnc
Found with libFuzzer, see #344.
2023-02-17 17:16:51 +01:00
Nick Wellnhofer
bdcf842cdb Move xmlIsXHTML to tree.c
It's declared in tree.h and not guarded by LIBXML_OUTPUT_ENABLED like
the other functions in xmlsave.c.
2022-09-02 18:33:35 +02:00
Nick Wellnhofer
ad338ca737 Remove explicit integer casts
Remove explicit integer casts as final operation

- in assignments
- when passing arguments
- when returning values

Remove casts

- to the same type
- from certain range-bound values

The main motivation is that these explicit casts don't change the result
of operations and only render UBSan's implicit-conversion checks
useless. Removing these casts allows UBSan to detect cases where
truncation or sign-changes occur unexpectedly.

Document some explicit casts as truncating and add a few missing ones.
2022-09-01 02:33:57 +02:00
Nick Wellnhofer
0f568c0b73 Consolidate private header files
Private functions were previously declared

- in header files in the root directory
- in public headers guarded with IN_LIBXML
- in libxml.h
- redundantly in source files that used them.

Consolidate all private header files in include/private.
2022-08-26 02:11:56 +02:00
Nick Wellnhofer
3e7b4f37aa Avoid calling xmlSetTreeDoc
Create text nodes with xmlNewDocText or set the document directly to
avoid xmlSetTreeDoc being called when the node is inserted.
2022-06-20 01:49:39 +02:00
Nick Wellnhofer
d99ddd9bd5 Improve buffer allocation scheme
In most places, we really need the double-it scheme to avoid quadratic
behavior. The hybrid scheme still can cause many reallocations and the
bounded scheme doesn't seem to provide meaningful protection in
xmlreader.c.
2022-03-06 02:26:22 +01:00
Nick Wellnhofer
346c3a930c Remove elfgcchack.h
The same optimization can be enabled with -fno-semantic-interposition
since GCC 5. clang has always used this option by default.
2022-02-20 21:49:04 +01:00