mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-04-09 14:50:07 +03:00

6202 Commits

Author SHA1 Message Date
Nick Wellnhofer
80a0580f23 xinclude: Expand comment about fuzz timeouts 2023-09-30 15:47:46 +02:00
Nick Wellnhofer
fa48187304 fuzz: Disable XML_PARSE_SAX1 option in xml fuzzer
There are no plans to fix quadratic behavior in the legacy SAX1 interface.
2023-09-30 14:45:53 +02:00
Nick Wellnhofer
5c150accba doc: Add notes about runtest to MAINTAINERS.md 2023-09-29 16:07:45 +02:00
Nick Wellnhofer
06e2f3a46e legacy: Add private declarations for stubs
Required after 8c084ebd.
2023-09-29 13:19:37 +02:00
Nick Wellnhofer
0533daf5d2 encoding: Fix infinite loop in xmlCharEncInput
Short-lived regression from 95e81a36.
2023-09-29 12:43:46 +02:00
Nick Wellnhofer
e0dd330b8f parser: Use hash tables to avoid quadratic behavior
Use a hash table to look up namespaces by prefix. The hash table stores
an index into the namespace table. Auxiliary data for namespaces is
stored in a separate array alongside the main namespace table.

Use a hash table to verify attribute uniqueness. The hash table stores
an index into the attribute table.

Reuse hash values from the dictionary to avoid computing them twice.

See #346.
2023-09-29 12:43:22 +02:00
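The attribute uniqueness check described in e0dd330b8f can be illustrated with a minimal sketch. This is not the libxml2 code: `attr_set`, `attr_set_init`, `attr_set_add` and the fixed sizes are hypothetical names chosen for illustration, and FNV-1a stands in for whatever hash the dictionary actually provides. The point shown is only that the hash table holds indexes into a separate attribute array, so a duplicate is found in expected constant time instead of by scanning all earlier attributes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ATTRS 64u
#define IDX_SIZE  128u              /* power of two, twice MAX_ATTRS */
#define IDX_MASK  (IDX_SIZE - 1u)
#define IDX_EMPTY (-1)

/* Hypothetical attribute record; the hash is stored so it is computed once. */
typedef struct {
    const char *name;               /* borrowed pointer, not copied */
    uint32_t hash;
} attr;

typedef struct {
    attr attrs[MAX_ATTRS];          /* the attribute table itself */
    size_t nattrs;
    int index[IDX_SIZE];            /* hash slot -> position in attrs[] */
} attr_set;

/* FNV-1a, used here only as a stand-in hash function. */
static uint32_t attr_hash(const char *s) {
    uint32_t h = 2166136261u;
    for (; *s != '\0'; s++) {
        h ^= (unsigned char) *s;
        h *= 16777619u;
    }
    return h;
}

static void attr_set_init(attr_set *s) {
    size_t i;
    s->nattrs = 0;
    for (i = 0; i < IDX_SIZE; i++)
        s->index[i] = IDX_EMPTY;
}

/*
 * Append an attribute name. Returns its index in attrs[], or -1 if an
 * attribute with the same name was already added.
 */
static int attr_set_add(attr_set *s, const char *name) {
    uint32_t h = attr_hash(name);
    uint32_t pos = h & IDX_MASK;

    assert(s->nattrs < MAX_ATTRS);  /* table is at most half full */

    /* Linear probing over index slots; compare hashes before strings. */
    while (s->index[pos] != IDX_EMPTY) {
        const attr *a = &s->attrs[s->index[pos]];
        if (a->hash == h && strcmp(a->name, name) == 0)
            return -1;
        pos = (pos + 1) & IDX_MASK;
    }

    s->attrs[s->nattrs].name = name;
    s->attrs[s->nattrs].hash = h;
    s->index[pos] = (int) s->nattrs;
    return (int) s->nattrs++;
}
```

After `attr_set_init(&s)`, calling `attr_set_add(&s, "id")` twice returns 0 and then -1. A namespace lookup by prefix could follow the same pattern, with the index pointing into a namespace array instead.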
Nick Wellnhofer
e48f3d8e0a tests: Add more tests for redefined attributes 2023-09-29 12:43:08 +02:00
Nick Wellnhofer
a873191cd2 parser: Introduce xmlParseQNameHashed 2023-09-29 12:43:08 +02:00
Nick Wellnhofer
cb927e8519 parser: Don't skip CR in xmlCurrentChar
Skip over carriage returns later in xmlNextChar.
2023-09-29 12:43:08 +02:00
Nick Wellnhofer
19161bab15 dict: Internal API to look up hash values 2023-09-29 12:43:08 +02:00
Nick Wellnhofer
d147f5644e dict: Rewrite dictionary hash table code
Rewrite the dictionary hash table to use open addressing with Robin Hood
probing. See previous commit.
2023-09-29 12:41:37 +02:00
Nick Wellnhofer
4a513d5667 hash: Rewrite hash table code
This is a complete rewrite of the code in hash.c

Move from a chained hash table implementation to open addressing with
Robin Hood probing. This allows increasing the maximum fill factor and
further reducing the growth factor, saving considerable amounts of memory
without sacrificing performance.

To make this work, hash values are now cached in the table entry,
which also avoids many key comparisons.

Tables are created lazily with a smaller minimum size.

Insertion functions now report an error if growing the table resulted in
a memory allocation failure.

Some string comparisons were optimized to call directly into libc
instead of using the xmlstring API.

The length of inserted keys is computed along with the hash, improving
allocation performance.

Bounds checking was made more robust.

In dictionary-based mode, unneeded interning of strings is avoided.
2023-09-29 02:25:57 +02:00
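For readers unfamiliar with the technique named in 4a513d5667, here is a minimal sketch of open addressing with Robin Hood probing and cached hash values. It is not the code in hash.c: the `rh_*` names, the fixed table size, the FNV-1a hash and the use of hash value 0 as the empty marker are all assumptions made for illustration, and growth and removal are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RH_SIZE 16u                 /* power of two; fixed, no growth here */
#define RH_MASK (RH_SIZE - 1u)

typedef struct {
    uint32_t hash;                  /* cached hash; 0 marks an empty slot */
    const char *key;                /* borrowed pointer, not copied */
    int value;
} rh_entry;

typedef struct {
    rh_entry slots[RH_SIZE];        /* must start zero-initialized */
    size_t count;
} rh_table;

/* FNV-1a as a stand-in hash, forced non-zero so 0 can mean "empty". */
static uint32_t rh_hash(const char *s) {
    uint32_t h = 2166136261u;
    for (; *s != '\0'; s++) {
        h ^= (unsigned char) *s;
        h *= 16777619u;
    }
    return h != 0 ? h : 1;
}

/* How far the entry stored at pos sits from its ideal slot. */
static uint32_t rh_displacement(uint32_t hash, uint32_t pos) {
    return (pos - (hash & RH_MASK)) & RH_MASK;
}

/* Returns a pointer to the stored value, or NULL if the key is absent. */
static int *rh_find(rh_table *t, const char *key) {
    uint32_t h = rh_hash(key);
    uint32_t pos = h & RH_MASK;
    uint32_t dist = 0;

    for (;;) {
        rh_entry *e = &t->slots[pos];
        /* Empty slot, or a resident closer to home than we would be. */
        if (e->hash == 0 || rh_displacement(e->hash, pos) < dist)
            return NULL;
        /* Cached hash filters out most string comparisons. */
        if (e->hash == h && strcmp(e->key, key) == 0)
            return &e->value;
        pos = (pos + 1) & RH_MASK;
        dist++;
    }
}

/* Insert or update. Returns 1 if a new key was added, 0 if updated. */
static int rh_insert(rh_table *t, const char *key, int value) {
    int *existing = rh_find(t, key);
    rh_entry cur;
    uint32_t pos, dist = 0;

    if (existing != NULL) {
        *existing = value;
        return 0;
    }
    assert(t->count + 1 < RH_SIZE); /* keep one slot empty so probes end */

    cur.hash = rh_hash(key);
    cur.key = key;
    cur.value = value;
    pos = cur.hash & RH_MASK;

    for (;;) {
        rh_entry *e = &t->slots[pos];
        uint32_t edist;

        if (e->hash == 0) {
            *e = cur;
            t->count++;
            return 1;
        }
        edist = rh_displacement(e->hash, pos);
        if (edist < dist) {
            /* Resident is "richer": take its slot, carry it onward. */
            rh_entry tmp = *e;
            *e = cur;
            cur = tmp;
            dist = edist;
        }
        pos = (pos + 1) & RH_MASK;
        dist++;
    }
}
```

Because entries are kept ordered by displacement, a lookup can stop as soon as it meets a resident that is closer to its ideal slot than the probe is, which is one reason this scheme tolerates a higher fill factor than plain linear probing.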
Nick Wellnhofer
4f221a7748 hash: Add hash table tests
Make sure to properly test removal from hash tables.
2023-09-29 00:15:40 +02:00
Nick Wellnhofer
1425d8f67b dict: Separate RNG code 2023-09-29 00:15:40 +02:00
Nick Wellnhofer
42a0bc6d96 tests: Add ATTRIBUTE_NO_SANITIZE_INTEGER macro 2023-09-29 00:15:40 +02:00
Nick Wellnhofer
845bd99f8b string: Fix UTF-8 validation in xmlGetUTF8Char 2023-09-29 00:15:40 +02:00
Nick Wellnhofer
3e7673bc2d malloc-fail: Report malloc failure in xmlFARegExec 2023-09-29 00:15:40 +02:00
Nick Wellnhofer
b31813e60c include: Add more missing stdio.h includes 2023-09-28 15:34:08 +02:00
Nick Wellnhofer
b8961a75e9 parser: Fix reinitialization 2023-09-27 17:24:46 +02:00
James Le Cuirot
c7ff438b83 cmake: Only use pkg-config for .pc files, not for building binaries
Using `pkg_check_modules(FOO IMPORTED_TARGET foo)` with
`target_link_libraries()` leads to `INTERFACE_LINK_LIBRARIES` in the
resulting export file having `\$<LINK_ONLY:PkgConfig::FOO>` rather than
the currently expected `\$<LINK_ONLY:FOO::FOO>`, leading to breakage.
This can be worked around like so:

    target_link_libraries(UseFoo
      PUBLIC "$<BUILD_INTERFACE:PkgConfig::FOO>"
      INTERFACE "$<INSTALL_INTERFACE:FOO::FOO>"
    )

However, following some discussion, it is preferable to primarily use
find modules as before and only use `pkg_check_modules` for correctly
populating the .pc file.

Also move `find_package()` calls earlier so that builds fail faster when
dependencies are missing.
2023-09-23 16:48:57 +01:00
James Le Cuirot
9d53452206 cmake: Check whether static linking dependencies found in config files
If they were required when building libxml2 then they will also be
required when statically linking against it. Failing to find them will
just lead to undefined references later so detect this early.
2023-09-23 16:48:54 +01:00
James Le Cuirot
8617d8aa10 cmake: Find threads dep early as it may be needed for later checks 2023-09-23 16:48:51 +01:00
Nick Wellnhofer
b7d56ef7f1 malloc-fail: Report malloc failure in xmlRegEpxFromParse
Also check whether malloc failures are reported when fuzzing.
2023-09-22 19:53:11 +02:00
Nick Wellnhofer
d94f0b0ba2 doc: Update MAINTAINERS and NEWS 2023-09-22 19:01:11 +02:00
Nick Wellnhofer
84e1ffc813 doc: Don't document internal macros in xmlversion.h 2023-09-22 19:01:11 +02:00
Nick Wellnhofer
b9db3d7d02 parser: Simplify xmlStringCurrentChar
Start to move away from using this function.
2023-09-22 19:01:11 +02:00
Nick Wellnhofer
f98fa86318 regexp: Fix status codes and handle invalid UTF-8
Fixes #561.
2023-09-22 19:01:11 +02:00
Nick Wellnhofer
b94283fbda regexp: Add missing include 2023-09-22 14:23:27 +02:00
Nick Wellnhofer
bc4e82ff42 globals: Don't use thread-local storage on Darwin
It seems that thread-local storage destructors are run before pthread
thread-specific data destructors on Darwin, defeating our scheme to use
TSD to clean up TLS.

Here's an example program that reports a use-after-free when compiled
with `-fsanitize=address` on macOS:

    #include <pthread.h>

    typedef struct {
        int v;
    } my_struct;

    static _Thread_local my_struct tls;
    pthread_key_t key;

    void dtor(void *tsd) {
        my_struct *s = (my_struct *) tsd;
        /*
         * This will crash ASan, apparently because
         * TLS has already been freed.
         */
        s->v = 1;
    }

    void *thread(void *p) {
        pthread_setspecific(key, &tls);
        return NULL;
    }

    int main(void) {
        pthread_key_create(&key, dtor);

        pthread_t handle;
        pthread_create(&handle, NULL, thread, NULL);
        pthread_join(handle, NULL);

        return 0;
    }
2023-09-22 13:37:28 +02:00
Nick Wellnhofer
45470611b0 error: Make xmlGetLastError return a const error
This is a slight break of the API, but users really shouldn't modify the
global error struct. The goal is to make xmlLastError use static buffers
for its strings eventually. This should warn people if they're abusing
the struct.
2023-09-22 13:29:07 +02:00
Nick Wellnhofer
fc26934eb0 memory: Fix memory debugging with Windows threads
On Windows, malloc hooks can be called after the final call to
xmlCleanupParser in various tests. This means that xmlMemMutex can still
be accessed if memory debugging is enabled, so the mutex should not be
cleaned.

This also means that tests may report spurious memory leaks on Windows.

The old implementation avoided the issue by keeping track of all
global state objects in a doubly linked list, so they could be cleaned
during xmlCleanupParser.

But as far as I can tell all memory will be freed eventually, so this is
mostly an issue with our test suite.
2023-09-21 23:29:18 +02:00
Nick Wellnhofer
6eb2a00da4 tests: Update testapi.c 2023-09-21 22:58:02 +02:00
Nick Wellnhofer
8c084ebdc7 doc: Make apibuild.py happy 2023-09-21 22:57:33 +02:00
Nick Wellnhofer
e4091bcfea doc: Allow 'unsigned' without 'int' 2023-09-21 22:54:57 +02:00
Nick Wellnhofer
46d7aaecff doc: Add ignored tokens to apibuild.py 2023-09-21 22:54:30 +02:00
Nick Wellnhofer
6c4ea468b2 python: Fix tests
Revert part of commit 138213ac.
2023-09-21 21:31:52 +02:00
Nick Wellnhofer
05135536b1 globals: Fix build --with-threads --without-output
Fixes #593.
2023-09-21 20:40:32 +02:00
Nick Wellnhofer
c5890716a6 html: Fix logic in htmlAutoClose
Note that the function is never called with a NULL newtag.

Fixes #591.
2023-09-21 17:01:35 +02:00
Nick Wellnhofer
81741ea4c0 xmlreader: Fix EOF detection in xmlTextReaderPushData 2023-09-21 16:29:28 +02:00
Nick Wellnhofer
89ee0369d2 python: Fix potential crash in tests/thread2.py
Memory debugging must be initialized.
2023-09-21 15:19:42 +02:00
Nick Wellnhofer
72262030a6 parser: Readd some includes to parser.h and xmlreader.h
Fix backward compatibility.
2023-09-21 15:06:05 +02:00
Nick Wellnhofer
9fc5090c05 hash: Clean up libxml/hash.h
Rename variables, fix subincludes, whitespace.
2023-09-21 14:47:25 +02:00
Nick Wellnhofer
de4b270aef autotools: Make --with-minimum disable lzma support
Fix an oversight when handling the --with-minimum option.
2023-09-21 14:31:31 +02:00
Nick Wellnhofer
f9d717af97 fuzz: Allow fuzzing without push, reader or output modules 2023-09-21 13:05:49 +02:00
Nick Wellnhofer
fe1bfb349b gitlab-ci: Add a "medium" config build
Also run CI tests with a build where most modules except a few are
disabled. This is the minimum configuration required for libxslt:

    --with-tree --with-xpath --with-output --with-html

Also add --with-threads.
2023-09-21 12:42:19 +02:00
Nick Wellnhofer
e7f0d88ba4 build: Remove some GCC warnings
-Wnested-externs produces spurious warnings after implicit
declaration of functions.

-Winline is useless since we don't use inlines.

-Wredundant-decls was already removed for autotools.
2023-09-21 02:26:43 +02:00
Nick Wellnhofer
da274bfa55 build: Fix build when certain modules are disabled 2023-09-21 02:26:43 +02:00
Nick Wellnhofer
9b5cce7a71 include: Remove more unnecessary includes 2023-09-21 01:50:53 +02:00
Nick Wellnhofer
f0e8358eae globals: Final fixes 2023-09-20 23:18:21 +02:00
Nick Wellnhofer
d6ba403368 globals: Move remaining declarations to correct places
globals.h is now deprecated. Sanity is restored.
2023-09-20 22:22:51 +02:00