mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-07-10 00:59:39 +03:00

155 Commits

Author SHA1 Message Date
598ee0d2c6 error: Remove underscores from xmlRaiseError 2024-06-27 14:43:10 +02:00
5b893fa999 encoding: Fix encoding lookup with xmlOpenCharEncodingHandler
Make xmlOpenCharEncodingHandler call xmlParseCharEncoding first so we
prefer our own handlers for names like "UTF8". Only UTF-16 needs an
exception.

Make callers check the return value. For UTF-8, a NULL encoding doesn't
mean an error.

Remove unnecessary UTF-8 check from htmlFindOutputEncoder. Don't try to
look up ASCII handler since the HTML handler is always available.

Fix return code of xmlParseCharEncoding.

Should fix #744.
2024-06-22 21:59:03 +02:00
2def7b4b28 clang-tidy: move assignments out of if
Found with bugprone-assignment-in-if-condition

Signed-off-by: Rosen Penev <rosenp@gmail.com>
2024-06-20 21:11:44 -07:00
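A minimal sketch of the kind of change bugprone-assignment-in-if-condition asks for. The helper dup_string is hypothetical and is not taken from the libxml2 sources; it only illustrates moving an assignment out of an if condition.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not libxml2 code): duplicate a C string. */
static char *dup_string(const char *s) {
    size_t len;
    char *copy;

    assert(s != NULL);
    len = strlen(s) + 1;

    /*
     * Flagged form:
     *     if ((copy = malloc(len)) == NULL)
     *         return NULL;
     * Rewritten form: assign first, then test.
     */
    copy = malloc(len);
    if (copy == NULL)
        return NULL;

    memcpy(copy, s, len);
    return copy;
}
```

The behavior is unchanged; the assignment and the test are simply separate statements, which makes an accidental `=` versus `==` mix-up easier to spot.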
e75e878e02 doc: Update and fix documentation 2024-05-20 14:23:39 +02:00
f506ec6654 parser: Always decode entities in namespace URIs
Also decode entities in namespace URIs if entity substitution wasn't
requested. This should fix some corner cases when comparing namespace
URIs. The Namespaces in XML 1.0 spec says:

> In a namespace declaration, the URI reference is the normalized value
> of the attribute, so replacement of XML character and entity
> references has already been done before any comparison.

Make the serialization code escape special characters in namespace URIs
like in attribute values. This fixes serialization if entities were
substituted when parsing.

Fixes https://gitlab.gnome.org/GNOME/libxslt/-/issues/106
2024-04-15 12:34:26 +02:00
20fca2bb3d save: Report malloc failure in xmlAttrSerializeTxtContent
Flush buffer before checking for errors.
2024-04-09 16:53:57 +02:00
86c27206f9 save: Handle invalid parent pointers in xhtmlNodeDumpOutput
See #255 and commit 85b1792e.
2024-04-02 15:34:45 +02:00
fb1e63025b save: Check for NULL node->name in xhtmlIsEmpty 2024-03-17 19:42:59 +01:00
ee0c1f87c0 fuzz: New tree API fuzzer 2024-03-15 19:54:27 +01:00
10c202f9dc malloc-fail: Check for NULL pointer in xmlSaveNotation* 2024-03-15 19:47:08 +01:00
b1e75a9191 save: Report malloc failure in xmlAttrSerializeTxtContent 2024-03-15 19:47:08 +01:00
3494aa4fd5 save: Cast return code of xmlBufNodeDump
Avoid implicit sign change.
2024-03-15 19:47:08 +01:00
1d392fabb9 save: Check for output buffer errors
Report more error conditions.
2024-03-15 19:47:08 +01:00
d2f7ca5305 save: Add range check for level in xmlNodeDump 2024-03-15 19:47:08 +01:00
e314109ad1 save: Don't write directly to internal buffer
Make sure that OOM errors are reported.
2024-02-16 16:14:05 +01:00
fbe10a466f save: Move DTD serialization code to xmlsave.c 2024-02-04 14:33:19 +01:00
c2b3294f60 fuzz: Abort on invalid UTF-8
The parser should never generate invalid UTF-8 these days even in
recovery mode.
2024-01-04 21:20:51 +01:00
ca5965d594 save: Report more malloc failures 2024-01-02 23:43:06 +01:00
0821efc8ee encoding: Check whether encoding handlers support input/output
The "HTML" encoding handler doesn't support input which could lead to a
wrong error report.
2024-01-02 19:48:23 +01:00
4dcc2d743e save: Output U+FFFD replacement characters
This degrades more gracefully and helps to diagnose errors.

We stop raising errors for now, since there's no way to report malloc
failures during error handling yet.
2024-01-02 15:39:11 +01:00
bc1e030664 save: Improve error handling
Handle malloc failure from xmlRaiseError.

Use xmlRaiseMemoryError.

Stop using xmlGenericError.

Remove argument from memory error handler.

Remove TODO macro.
2023-12-21 15:02:24 +01:00
6c8acdecd2 save: Fix build --without-html
Fixes #646
2023-12-14 13:49:08 +01:00
0d97e43993 save: Report malloc failures
Fix places where malloc failures aren't reported.

Introduce a new API function xmlSaveFinish which returns an error code.
2023-12-11 22:13:05 +01:00
8c084ebdc7 doc: Make apibuild.py happy 2023-09-21 22:57:33 +02:00
da274bfa55 build: Fix build when certain modules are disabled 2023-09-21 02:26:43 +02:00
4e1c13ebfd debug: Remove debugging code
This is barely useful these days and only clutters the code base.
2023-09-19 17:35:09 +02:00
c82701ff0b malloc-fail: Fix memory leak in xmlDocDumpFormatMemoryEnc
Found with libFuzzer, see #344.
2023-02-17 17:16:51 +01:00
bdcf842cdb Move xmlIsXHTML to tree.c
It's declared in tree.h and not guarded by LIBXML_OUTPUT_ENABLED like
the other functions in xmlsave.c.
2022-09-02 18:33:35 +02:00
ad338ca737 Remove explicit integer casts
Remove explicit integer casts as final operation

- in assignments
- when passing arguments
- when returning values

Remove casts

- to the same type
- from certain range-bound values

The main motivation is that these explicit casts don't change the result
of operations and only render UBSan's implicit-conversion checks
useless. Removing these casts allows UBSan to detect cases where
truncation or sign-changes occur unexpectedly.

Document some explicit casts as truncating and add a few missing ones.
2022-09-01 02:33:57 +02:00
0f568c0b73 Consolidate private header files
Private functions were previously declared

- in header files in the root directory
- in public headers guarded with IN_LIBXML
- in libxml.h
- redundantly in source files that used them.

Consolidate all private header files in include/private.
2022-08-26 02:11:56 +02:00
3e7b4f37aa Avoid calling xmlSetTreeDoc
Create text nodes with xmlNewDocText or set the document directly to
avoid xmlSetTreeDoc being called when the node is inserted.
2022-06-20 01:49:39 +02:00
d99ddd9bd5 Improve buffer allocation scheme
In most places, we really need the double-it scheme to avoid quadratic
behavior. The hybrid scheme still can cause many reallocations and the
bounded scheme doesn't seem to provide meaningful protection in
xmlreader.c.
2022-03-06 02:26:22 +01:00
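The double-it scheme mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the libxml2 implementation: the GrowBuf type, growbuf_append, and the initial capacity of 16 are all hypothetical. Doubling keeps the number of reallocations logarithmic in the final size, so the total copying cost stays linear rather than quadratic.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; not the libxml2 xmlBuf type. */
typedef struct {
    char *data;
    size_t len;   /* bytes in use */
    size_t cap;   /* bytes allocated */
} GrowBuf;

/*
 * Append n bytes, doubling the capacity whenever it is too small.
 * Returns 0 on success, -1 on size overflow or allocation failure.
 */
static int growbuf_append(GrowBuf *b, const char *src, size_t n) {
    size_t need;

    assert(b != NULL);
    if (n > SIZE_MAX - b->len)
        return -1;
    need = b->len + n;

    if (need > b->cap) {
        size_t cap = (b->cap > 0) ? b->cap : 16;
        char *p;

        while (cap < need) {
            if (cap > SIZE_MAX / 2) {
                /* Doubling would overflow; fall back to the exact size. */
                cap = need;
                break;
            }
            cap *= 2;
        }
        p = realloc(b->data, cap);
        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = cap;
    }

    if (n > 0)
        memcpy(b->data + b->len, src, n);
    b->len = need;
    return 0;
}
```

Appending 100 single bytes to an empty GrowBuf reallocates only four times (capacities 16, 32, 64, 128), whereas growing by a fixed increment would reallocate in proportion to the final length.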
346c3a930c Remove elfgcchack.h
The same optimization can be enabled with -fno-semantic-interposition
since GCC 5. clang has always used this option by default.
2022-02-20 21:49:04 +01:00
13ad8736d2 Fix regression in xmlNodeDumpOutputInternal
Commit 85b1792e could cause additional whitespace if xmlNodeDump was
called with a non-zero starting level.
2021-05-25 11:16:13 +02:00
85b1792e37 Work around lxml API abuse
Make xmlNodeDumpOutput and htmlNodeDumpFormatOutput work with corrupted
parent pointers. This used to work with the old recursive code but the
non-recursive rewrite required parent pointers to be set correctly.

Unfortunately, lxml relies on the old behavior and passes subtrees with
a corrupted structure. Fall back to a recursive function call if an
invalid parent pointer is detected.

Fixes #255.
2021-05-21 12:19:25 +02:00
0b3c64d9f2 Handle dumps of corrupted documents more gracefully
Check parent pointers for NULL after the non-recursive rewrite of the
serialization code. This avoids segfaults with corrupted documents
which can apparently be seen with lxml, see issue #187.
2020-09-29 18:08:37 +02:00
00a86d414b Don't add formatting newlines to XInclude nodes 2020-08-17 01:17:39 +02:00
1a360c1c2e More *NodeDumpOutput fixes
When leaving nodes, restrict more operations to XML_ELEMENT_NODEs.
2020-07-29 00:39:15 +02:00
7b2e517261 Fix *NodeDumpOutput functions
Only output end tag for elements. Should fix serialization of document
fragments.
2020-07-28 21:52:55 +02:00
dc6f009280 Make xmlNodeDumpOutputInternal non-recursive
Fixes stack overflow with deeply nested documents.
2020-07-28 21:00:09 +02:00
5330153da4 Make xhtmlNodeDumpOutput non-recursive
Fixes stack overflow with deeply nested documents.
2020-07-28 21:00:09 +02:00
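A rough sketch of how a tree can be serialized without recursion by following child, sibling, and parent pointers. The Node type and the functions below are hypothetical and far simpler than the real xmlNode and the libxml2 serializer; the sketch also assumes every non-root node has a correct parent pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical element-only tree node; not the libxml2 xmlNode. */
typedef struct Node {
    const char *name;
    struct Node *parent;
    struct Node *children;   /* first child */
    struct Node *next;       /* next sibling */
} Node;

/* Append s to the NUL-terminated string in out, truncating at cap. */
static void append(char *out, size_t cap, const char *s) {
    size_t used = strlen(out);

    if (used + 1 < cap)
        strncat(out, s, cap - used - 1);
}

/* Write start and end tags for the subtree at root using a loop only. */
static void dump_tree(const Node *root, char *out, size_t cap) {
    const Node *cur = root;

    assert(root != NULL && out != NULL && cap > 0);
    out[0] = '\0';

    for (;;) {
        /* Entering a node: emit the start tag. */
        append(out, cap, "<");
        append(out, cap, cur->name);
        append(out, cap, ">");

        if (cur->children != NULL) {
            cur = cur->children;
            continue;
        }

        /* Leaving nodes: emit end tags while climbing back up. */
        for (;;) {
            append(out, cap, "</");
            append(out, cap, cur->name);
            append(out, cap, ">");

            if (cur == root)
                return;
            if (cur->next != NULL) {
                cur = cur->next;
                break;
            }
            cur = cur->parent;   /* assumes a valid parent pointer */
        }
    }
}
```

Stack usage is constant regardless of nesting depth, but the climb back up depends entirely on the parent pointers being set correctly.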
20c60886e4 Fix typos
Resolves #133.
2020-03-08 17:41:53 +01:00
c9faa29259 Fix overflow check in xmlNodeDump
Store return value of xmlBufNodeDump in a size_t before checking for
integer overflow.

Found by lgtm.com
2020-01-02 14:12:39 +01:00
42942066e1 Fix memory leaks of encoding handlers in xmlsave.c
Fix leak of iconv/ICU encoding handler in xmlSaveToBuffer.

Fix leaks of iconv/ICU encoding handlers in xmlSaveTo* error paths.

Closes #127.
2019-11-11 14:04:57 +01:00
2a350ee9b4 Large batch of typo fixes
Closes #109.
2019-09-30 18:04:38 +02:00
81958b6e94 Doc: do not mislead towards "infeasible" scenario wrt. xmlBufNodeDump
At least when merely public API is to be leveraged, one cannot use
xmlBufCreate function that would otherwise be a clear fit, and relying
on some invariants wrt. how some other struct fields will get
initialized along the construction/filling such parent struct and
(ab)using that instead does not appear clever, either.

Hence, instruct people what's the Right Thing for the moment, that is,
make them use xmlNodeDumpOutput instead (together with likewise public
xmlAllocOutputBuffer).

Going forward, it's questionable what to do with xmlBuf* family of
functions that are once public, since they, for any practical purpose,
cannot be used by the library clients (that's how I've run into this).

Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
2019-08-25 13:23:49 +02:00
96125557b6 Remove unused member doc in xmlSaveCtxt 2019-05-10 12:30:03 +02:00
ee501f5449 Stop using doc->charset outside parser code
doc->charset does not specify the in-memory encoding which is always
UTF-8.
2018-10-13 16:47:01 +02:00
cb5541c9f3 Fix libz and liblzma detection
If libz or liblzma are detected with pkg-config, AC_CHECK_HEADERS must
not be run because the correct CPPFLAGS aren't set. It is actually not
required to have separate checks for LIBXML_ZLIB_ENABLED and HAVE_ZLIB_H.
Only check for LIBXML_ZLIB_ENABLED and remove HAVE_ZLIB_H macro.

Fixes bug 764657, bug 787041.
2017-11-27 14:33:37 +01:00
359e750482 Fix -Wmisleading-indentation warnings 2017-11-27 13:42:30 +01:00