mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-02-05 05:47:00 +03:00

6861 Commits

Author SHA1 Message Date
Nick Wellnhofer
38195cf596 parser: Don't produce names with invalid UTF-8 in recovery mode 2024-07-06 15:33:06 +02:00
Nick Wellnhofer
c45c15f5af ci: Add job for perl-XML-LibXML 2024-07-04 15:47:49 +02:00
Nick Wellnhofer
ec0881099b parser: Upgrade XML_IO_NETWORK_ATTEMPT to error
Fixes XML::LibXML test suite.
2024-07-04 15:47:20 +02:00
Nick Wellnhofer
f86d17c163 encoding: Fix xmlParseCharEncoding
Make "UTF-16" return the UTF16LE handler as before.

Fix error return.
2024-07-04 15:47:20 +02:00
Nick Wellnhofer
10082a3d54 testchar: Don't invoke encoding handler directly 2024-07-04 15:47:20 +02:00
Mike Dalessio
446a3610fd test: add a downstream integration test job for nokogiri
Related to #758
2024-07-04 13:30:48 +00:00
Andrew Potter
67fa4a43f3 meson: Disable python when python is disabled 2024-07-03 13:40:04 -07:00
Nick Wellnhofer
e2a49afe3e build: Read version number from VERSION file 2024-07-03 20:32:23 +02:00
Nick Wellnhofer
c3731347c4 build: Introduce LIBXML_MINOR_COMPAT
This is set to 0 for now but could be used to avoid ABI stability
issues.
2024-07-03 18:33:16 +02:00
Nick Wellnhofer
606310a381 meson: Set soversion 2024-07-03 18:05:05 +02:00
Nick Wellnhofer
944cc23c84 tree: Fix handling of empty strings in xmlNodeParseContent
We shouldn't create an empty text node to match the old behavior.

Fixes #759.
2024-07-03 16:07:10 +02:00
Nick Wellnhofer
46ec621eb7 encoding: Clarify xmlUconvConvert 2024-07-03 16:06:59 +02:00
Nick Wellnhofer
48fec2429b encoding: Remove duplicate code
Fix recent commit.
2024-07-03 15:11:20 +02:00
Nick Wellnhofer
71fb257912 encoding: Fix ICU build 2024-07-03 14:35:49 +02:00
Nick Wellnhofer
80aabea1d6 SAX2: Reenable 'directory' as base URI fallback
Apparently, some users overwrite this member manually to set a base URI
for memory streams.

Fixes #753.
2024-07-03 11:55:38 +02:00
Nick Wellnhofer
842a044831 valid: Restore ID lookup
Revert a change from d025cfbb and don't overwrite ID table entries, so
that the first attribute will be returned if there are duplicate IDs.

This requires two other changes:

- Attributes in entity content are never added to the ID table. This
  seems reasonable.

- Remove the optimization to skip ID lookup when copying and the target
  document has an empty ID table. This also seems more correct since the
  document could have ID declarations nevertheless or we could be
  copying xml:ids into the document for the first time.

Fixes #757.
2024-07-03 11:46:06 +02:00
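
The "first attribute wins" behavior described in 842a044831 can be illustrated with a minimal sketch. This is not libxml2 code: the names demoIdTable, demoAddId and demoLookupId are hypothetical, and a small fixed array stands in for the library's real hash table. The only point shown is that adding a duplicate ID leaves the existing entry untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_IDS 16   /* hypothetical fixed capacity for the sketch */

typedef struct {
    const char *id;       /* ID value, not copied */
    const void *attr;     /* stand-in for the attribute node */
} demoIdEntry;

typedef struct {
    demoIdEntry entries[DEMO_MAX_IDS];
    size_t count;
} demoIdTable;

/* Returns the attribute registered for 'id', or NULL if there is none. */
const void *
demoLookupId(const demoIdTable *table, const char *id)
{
    size_t i;

    assert(table != NULL && id != NULL);
    for (i = 0; i < table->count; i++) {
        if (strcmp(table->entries[i].id, id) == 0)
            return table->entries[i].attr;
    }
    return NULL;
}

/*
 * Returns 1 if the entry was added, 0 if the ID already existed (the
 * existing entry is kept, not overwritten), -1 if the table is full.
 */
int
demoAddId(demoIdTable *table, const char *id, const void *attr)
{
    assert(table != NULL && id != NULL);
    if (demoLookupId(table, id) != NULL)
        return 0;
    if (table->count >= DEMO_MAX_IDS)
        return -1;
    table->entries[table->count].id = id;
    table->entries[table->count].attr = attr;
    table->count++;
    return 1;
}
```

With this policy, a lookup after two additions under the same ID returns the attribute that was registered first.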
Nick Wellnhofer
f906526175 SAX2: Fix HTML IDs
Short-lived regression. Fixes #755.
2024-07-02 23:59:28 +02:00
Nick Wellnhofer
785ed5c4cd meson: Don't auto-enable legacy and tls
These features should be requested explicitly.
2024-07-02 23:03:46 +02:00
Nick Wellnhofer
96d850c3cb save: Fix "Factor out xmlSaveWriteIndent" 2024-07-02 22:43:49 +02:00
Nick Wellnhofer
205e56dafe parser: Undeprecate ctxt->directory 2024-07-02 22:32:43 +02:00
Nick Wellnhofer
8fb1dc9a62 Clarify xpointer() extension removal 2024-07-02 22:17:08 +02:00
Nick Wellnhofer
fdfeecfe5e parser: Reenable ctxt->directory
Unused internally, but used in downstream code.

Should fix #753.
2024-07-02 22:06:53 +02:00
Nick Wellnhofer
c127c89f98 catalog: Deprecate xmlCatalogSetDefaultPrefer 2024-07-02 22:06:53 +02:00
Nick Wellnhofer
606f410891 parser: Allow to disable catalogs with parser options
Implement XML_PARSE_NO_SYS_CATALOG and XML_PARSE_NO_CATALOG_PI.

Fixes #735.
2024-07-02 22:06:53 +02:00
Nick Wellnhofer
6794c1b91d globals: Document remaining thread-local vars as deprecated
See #407.
2024-07-02 20:03:23 +02:00
Nick Wellnhofer
35146ff31c save: Implement xmlSaveSetIndentString
Allow to set indent string without using global xmlTreeIndentString.

See #736.
2024-07-02 20:03:23 +02:00
Nick Wellnhofer
7cc619d568 save: Implement save options for indenting
Implement XML_SAVE_NO_INDENT to disable and XML_SAVE_INDENT to enable
indenting regardless of the global xmlIndentTreeOutput.

Implement XML_SAVE_EMPTY to enable empty tags regardless of the global
xmlSaveNoEmptyTags.

See #736.
2024-07-02 20:03:23 +02:00
Nick Wellnhofer
2c4204ecee save: Factor out xmlSaveWriteIndent 2024-07-02 20:03:23 +02:00
Nick Wellnhofer
202045f8df save: Pass options to xmlSaveCtxtInit 2024-07-02 20:03:23 +02:00
Nick Wellnhofer
197e09d5c5 parser: Fix xmlLoadResource
Short-lived regression.
2024-07-02 20:03:23 +02:00
Nick Wellnhofer
ede5d99af3 parser: Fix typo 2024-07-02 16:38:15 +02:00
Nick Wellnhofer
866be54e22 parser: Don't use deprecated xmlSplitQName 2024-07-02 13:34:11 +02:00
Nick Wellnhofer
30ef77554b parser: Don't use deprecated xmlCopyChar 2024-07-02 13:34:11 +02:00
Nick Wellnhofer
751ba00e00 parser: Don't use deprecated xmlSwitchInputEncoding 2024-07-02 13:34:04 +02:00
Nick Wellnhofer
9a4770ef84 doc: Improve documentation 2024-07-02 13:34:04 +02:00
Nick Wellnhofer
0b0dd98983 parser: Fix EBCDIC detection 2024-07-01 18:05:40 +02:00
Nick Wellnhofer
37a9ff11d8 encoding: Simplify xmlCharEncCloseFunc 2024-07-01 18:05:40 +02:00
Nick Wellnhofer
1167c3340e encoding: Don't include iconv.h from libxml/encoding.h 2024-07-01 18:05:40 +02:00
Nick Wellnhofer
95d3633350 encoding: Rework conversion error codes
This should match the old code more closely. Remove XML_ERR_PARTIAL.

It's unlikely that anyone is using these codes already.
2024-07-01 18:05:40 +02:00
Nick Wellnhofer
dd8e378513 HTML: Rework UTF8ToHtml
Optimize code. Check for XML_ENC_ERR_SPACE. Use error macros.
2024-07-01 18:05:40 +02:00
Nick Wellnhofer
30be984a0f encoding: Rework ISO-8859-X conversion
Optimize code. Pass tables as context parameter. Check for
XML_ENC_ERR_SPACE.
2024-07-01 18:05:40 +02:00
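
A rough sketch of the idea in 30be984a0f: one conversion routine serves every single-byte encoding, with the mapping table passed as a context argument, and a dedicated error is returned when the output buffer is too small. The function name, the size_t-based signature and DEMO_ERR_SPACE are assumptions made for this illustration; they do not reproduce xmlCharEncConvFunc or XML_ENC_ERR_SPACE.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ERR_SPACE (-1)   /* hypothetical: output buffer too small */

/*
 * Converts a single-byte encoding to UTF-8. 'ctx' points to 128 code
 * points (unsigned short) for the bytes 0x80..0xFF; bytes below 0x80 are
 * taken as ASCII. On entry *inlen and *outlen hold the buffer sizes in
 * bytes, on return the number of bytes consumed and written.
 */
int
demoTableToUTF8(const void *ctx, unsigned char *out, size_t *outlen,
                const unsigned char *in, size_t *inlen)
{
    const unsigned short *table = ctx;
    size_t ipos = 0, opos = 0;
    int ret = 0;

    assert(ctx != NULL && out != NULL && outlen != NULL);
    assert(in != NULL && inlen != NULL);

    while (ipos < *inlen) {
        unsigned c = (in[ipos] < 0x80) ? in[ipos] : table[in[ipos] - 0x80];
        size_t need = (c < 0x80) ? 1 : (c < 0x800) ? 2 : 3;

        /* Stop before writing a partial sequence. */
        if (*outlen - opos < need) {
            ret = DEMO_ERR_SPACE;
            break;
        }

        if (need == 1) {
            out[opos++] = (unsigned char) c;
        } else if (need == 2) {
            out[opos++] = (unsigned char) (0xC0 | (c >> 6));
            out[opos++] = (unsigned char) (0x80 | (c & 0x3F));
        } else {
            out[opos++] = (unsigned char) (0xE0 | (c >> 12));
            out[opos++] = (unsigned char) (0x80 | ((c >> 6) & 0x3F));
            out[opos++] = (unsigned char) (0x80 | (c & 0x3F));
        }
        ipos++;
    }

    *inlen = ipos;
    *outlen = opos;
    return ret;
}
```

Because the table is data rather than code, supporting another ISO-8859 variant in this sketch only requires another 128-entry array.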
Nick Wellnhofer
282ec1d548 encoding: Rework xmlCharEncodingHandler layout
Reuse some of the old members.

The "input" and "output" function pointers are actually of type
xmlCharEncConvFunc, accepting an additional argument. For default
handlers, this argument is unused, so this should work with most ABIs.
For iconv handlers, these function pointers used to be NULL but now
point to a function which requires the extra argument.

"iconv_in" and "iconv_out" are made void pointers. "uconv_in" and
"uconv_out" are renamed and made void pointers. This is unlikely to
cause issues.

We now expect that the built-in conversion functions correctly report
XML_ENC_ERR_SPACE. For UTF8ToHtml and the ISO-8859-X code, this will be
done in the following commits.
2024-07-01 18:05:40 +02:00
Nick Wellnhofer
57e37dff4e encoding: Rework UTF-16 conversion functions
Optimize UTF-16 conversion functions. Avoid misaligned memory access.
Don't rely on 'sizeof(short) == 2'. Check for XML_ENC_ERR_SPACE. Add
some tests for UTF-16 conversion.
2024-07-01 18:05:40 +02:00
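
The points listed in 57e37dff4e (no misaligned access, no reliance on 'sizeof(short) == 2', an explicit out-of-space error) can be sketched as follows. This is an independent illustration, not the libxml2 implementation: demoUTF16LEToUTF8, its size_t-based signature and the DEMO_ERR_* codes are hypothetical. Input is read one byte at a time and assembled with shifts, so alignment and the width of short never matter.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ERR_SPACE (-1)   /* hypothetical: output buffer too small */
#define DEMO_ERR_INPUT (-2)   /* hypothetical: malformed surrogate */

/*
 * Converts UTF-16LE to UTF-8. On entry *inlen and *outlen hold the buffer
 * sizes in bytes, on return the number of bytes consumed and written.
 */
int
demoUTF16LEToUTF8(unsigned char *out, size_t *outlen,
                  const unsigned char *in, size_t *inlen)
{
    static const unsigned char lead[] = { 0, 0, 0xC0, 0xE0, 0xF0 };
    size_t ipos = 0, opos = 0;
    size_t iend, oend;
    int ret = 0;

    assert(out != NULL && outlen != NULL && in != NULL && inlen != NULL);
    iend = *inlen - (*inlen % 2);   /* a trailing odd byte stays unconsumed */
    oend = *outlen;

    while (ipos < iend) {
        /* Assemble the code unit from two bytes, low byte first. */
        unsigned long c = in[ipos] | ((unsigned long) in[ipos + 1] << 8);
        size_t used = 2, need, shift;

        if (c >= 0xD800 && c <= 0xDBFF) {
            unsigned long d;

            if (iend - ipos < 4)
                break;              /* incomplete pair, wait for more input */
            d = in[ipos + 2] | ((unsigned long) in[ipos + 3] << 8);
            if (d < 0xDC00 || d > 0xDFFF) {
                ret = DEMO_ERR_INPUT;
                break;
            }
            c = 0x10000 + ((c - 0xD800) << 10) + (d - 0xDC00);
            used = 4;
        } else if (c >= 0xDC00 && c <= 0xDFFF) {
            ret = DEMO_ERR_INPUT;   /* lone low surrogate */
            break;
        }

        need = (c < 0x80) ? 1 : (c < 0x800) ? 2 : (c < 0x10000) ? 3 : 4;
        if (oend - opos < need) {
            ret = DEMO_ERR_SPACE;   /* never write a partial sequence */
            break;
        }

        if (need == 1) {
            out[opos++] = (unsigned char) c;
        } else {
            shift = 6 * (need - 1);
            out[opos++] = (unsigned char) (lead[need] | (c >> shift));
            while (shift > 0) {
                shift -= 6;
                out[opos++] = (unsigned char) (0x80 | ((c >> shift) & 0x3F));
            }
        }
        ipos += used;
    }

    *inlen = ipos;
    *outlen = opos;
    return ret;
}
```

Since the consumed and written counts are reported even on error, a caller in this sketch can enlarge the output buffer and resume from the first unconsumed byte.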
Nick Wellnhofer
bb8e81c788 encoding: Rework simple conversions function
Use a single function for ASCII conversion. Optimize code. Check for
XML_ENC_ERR_SPACE.
2024-07-01 18:05:40 +02:00
Nick Wellnhofer
501e5d195d encoding: Stop using XML_ENC_ERR_PARTIAL 2024-07-01 18:05:40 +02:00
Nick Wellnhofer
221df37529 parser: Support custom charset conversion implementations
Implement xmlCtxtSetCharEncConvImpl. I agree that the name is terrible.
2024-07-01 18:05:40 +02:00
Nick Wellnhofer
c59c24494d encoding: Support custom implementations 2024-07-01 18:05:40 +02:00
Nick Wellnhofer
1e3da9f4d4 encoding: Start with callbacks 2024-07-01 18:05:40 +02:00
Nick Wellnhofer
6d8427dc97 encoding: Rework encoding lookup
Add missing xmlCharEncoding enum values.

Simplify and speed up encoding lookup by using a table mapping names to
xmlCharEncoding enums and binary search. Rearrange the default handler
table to match the enum layout.

For some encodings we now only lookup the provided or most canonical
name instead of trying several names, expecting that iconv or ICU handle
aliases:

- IBM037 (EBCDIC)
- UCS-2
- UCS-4
- Shift_JIS
2024-07-01 18:05:40 +02:00
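
The lookup scheme described in 6d8427dc97, a sorted table mapping names to enum values searched by binary search, might look roughly like the sketch below. Everything here is illustrative: the demoEncoding enum, the table contents and demoLookupEncoding are hypothetical and much smaller than the real xmlCharEncoding table. The "UTF-16" entry maps to the little-endian value only to mirror the behavior mentioned in f86d17c163.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical enum; the real xmlCharEncoding values differ. */
typedef enum {
    DEMO_ENC_NONE = 0,
    DEMO_ENC_ASCII,
    DEMO_ENC_LATIN1,
    DEMO_ENC_UTF16LE,
    DEMO_ENC_UTF8
} demoEncoding;

typedef struct {
    const char *name;     /* upper-case name or alias */
    demoEncoding enc;
} demoEncEntry;

/* Must stay sorted by name in byte order for the binary search to work. */
static const demoEncEntry demoEncTable[] = {
    { "ASCII",      DEMO_ENC_ASCII },
    { "ISO-8859-1", DEMO_ENC_LATIN1 },
    { "LATIN1",     DEMO_ENC_LATIN1 },
    { "US-ASCII",   DEMO_ENC_ASCII },
    { "UTF-16",     DEMO_ENC_UTF16LE },
    { "UTF-16LE",   DEMO_ENC_UTF16LE },
    { "UTF-8",      DEMO_ENC_UTF8 },
    { "UTF8",       DEMO_ENC_UTF8 }
};

/* Case-insensitive comparison that orders like the upper-case table. */
static int
demoCaseCmp(const char *a, const char *b)
{
    for (;; a++, b++) {
        int ca = toupper((unsigned char) *a);
        int cb = toupper((unsigned char) *b);

        if (ca != cb)
            return ca - cb;
        if (ca == 0)
            return 0;
    }
}

/* Returns the encoding for 'name', or DEMO_ENC_NONE if it is unknown. */
demoEncoding
demoLookupEncoding(const char *name)
{
    size_t lo = 0;
    size_t hi = sizeof(demoEncTable) / sizeof(demoEncTable[0]);

    assert(name != NULL);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int cmp = demoCaseCmp(name, demoEncTable[mid].name);

        if (cmp == 0)
            return demoEncTable[mid].enc;
        if (cmp < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return DEMO_ENC_NONE;
}
```

In this sketch, a name that is missing from the table yields DEMO_ENC_NONE, which is the point where a caller could fall back to iconv or ICU for alias handling.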
Nick Wellnhofer
16e7ecd478 xinclude: Check URI length
Don't report long URIs as OOM errors.
2024-07-01 18:03:06 +02:00