mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2024-10-27 04:55:04 +03:00
Commit Graph

6236 Commits

Author SHA1 Message Date
Nick Wellnhofer
dc3382ef97 globals: Move xmlRegisterNodeDefault to tree.c
Code in globals.c must not try to access globals itself since the
accessor macros aren't defined and we would only see the main
variable.
2023-09-20 22:06:49 +02:00
Nick Wellnhofer
759767423a globals: Add a few comments 2023-09-20 22:06:49 +02:00
Nick Wellnhofer
ecbd634c9f threads: Fix double-checked locking in xmlInitParser
Hopefully work around the classic problem with double-checked locking:
Another thread could read xmlParserInitialized == 1 but not yet see the
other initialization results due to compiler or hardware reordering.
While unlikely, this seems theoretically possible.

The solution is to add a memory barrier after initializing the data but
before setting xmlParserInitialized. It might be enough to use a second
initialization flag which is only used inside the locked section and
update xmlParserInitialized after unlocking. But I haven't seen this
approach in many articles discussing this issue, so it's possibly
flawed as well.
2023-09-20 22:06:49 +02:00
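
The barrier placement described in the commit above can be sketched with C11 atomics. This is a minimal illustration, not the code from the commit: the names `demo_init`, `demo_initialized` and `demo_config` are hypothetical, and libxml2 itself may use a different barrier primitive. The release store publishes the initialized data, and the acquire load on the fast path pairs with it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int demo_initialized = 0;  /* hypothetical flag */
static int demo_config;                  /* stand-in for initialized data */

void demo_init(void) {
    /* Fast path: the acquire load pairs with the release store below,
     * so a thread that reads 1 also sees demo_config. */
    if (atomic_load_explicit(&demo_initialized, memory_order_acquire))
        return;

    int rc = pthread_mutex_lock(&demo_lock);
    assert(rc == 0);
    (void)rc;

    /* Second check under the lock; the mutex already orders this read. */
    if (!atomic_load_explicit(&demo_initialized, memory_order_relaxed)) {
        demo_config = 42;  /* initialize the data first */
        /* The release store acts as the barrier: data before flag. */
        atomic_store_explicit(&demo_initialized, 1, memory_order_release);
    }
    pthread_mutex_unlock(&demo_lock);
}

int demo_get_config(void) {
    demo_init();
    return demo_config;
}
```

Calling `demo_get_config()` from any thread runs the initialization at most once and returns the published value.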
Nick Wellnhofer
f7a403c21f globals: Move xmlIsMainThread to globals.c
xmlIsMainThread is mainly needed for global variables.
2023-09-20 22:06:49 +02:00
Nick Wellnhofer
b173b724d1 globals: Use thread-local storage if available
Also use thread-local storage to store globals on POSIX platforms.

Most importantly, this makes sure that global variable access can't fail
when allocating the global state struct.
2023-09-20 22:06:49 +02:00
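
Why thread-local storage removes the failure case can be sketched as follows. This assumes a C11 compiler with `_Thread_local`; the struct and function names are hypothetical and not libxml2's. Because the per-thread state is a statically allocated object, the accessor never calls an allocator and so has no error path.

```c
#include <assert.h>

/* Hypothetical per-thread state; libxml2's real struct is private. */
struct demo_state {
    int last_error;
};

/* One zero-initialized instance per thread, provided by the toolchain. */
static _Thread_local struct demo_state demo_tls_state;

/* No malloc involved, so this lookup cannot fail. */
struct demo_state *demo_get_state(void) {
    return &demo_tls_state;
}

/* Usage: repeated calls on one thread return the same slot. */
void demo_tls_usage(void) {
    demo_get_state()->last_error = 7;
    assert(demo_get_state()->last_error == 7);
}
```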
Nick Wellnhofer
e7b6ca156f globals: Rework global state destruction on Windows
If DllMain is used, rely on it working as expected. The old code seemed
to attempt to free global state of other threads if, for some reason,
the DllMain mechanism didn't work.

In a static build, register a destructor with
RegisterWaitForSingleObject.

Make public functions xmlGetGlobalState and xmlInitializeGlobalState
no-ops.

Move initialization and registration of global state objects to
xmlInitGlobalState. Lookup global state with xmlGetThreadLocalStorage
which can be inlined nicely.

Also clean up global state when using TLS. xmlLastError must be reset.
2023-09-20 22:06:49 +02:00
Nick Wellnhofer
39a275a541 globals: Define globals using macros
Declare and define globals and helper functions by (ab)using the
preprocessor.
2023-09-20 22:06:49 +02:00
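
One common way to declare and define a family of globals with the preprocessor is the "X-macro" pattern. The sketch below assumes that pattern purely for illustration; the list name `DEMO_GLOBALS` and the variables in it are hypothetical, and the commit's actual macros may look quite different.

```c
#include <assert.h>

/* Single list of (type, name, initial value); all names are hypothetical. */
#define DEMO_GLOBALS(X) \
    X(int, demoIndentOutput, 1) \
    X(int, demoKeepBlanks, 1) \
    X(int, demoLineNumbers, 0)

/* First expansion: define one variable per list entry. */
#define DEMO_DEFINE_VAR(type, name, init) static type name##Value = init;
DEMO_GLOBALS(DEMO_DEFINE_VAR)
#undef DEMO_DEFINE_VAR

/* Second expansion: define one accessor function per list entry. */
#define DEMO_DEFINE_ACCESSOR(type, name, init) \
    type *name##Ptr(void) { return &name##Value; }
DEMO_GLOBALS(DEMO_DEFINE_ACCESSOR)
#undef DEMO_DEFINE_ACCESSOR

/* Usage: read and write through the generated accessors. */
void demo_globals_usage(void) {
    assert(*demoKeepBlanksPtr() == 1);
    *demoLineNumbersPtr() = 5;
    assert(*demoLineNumbersPtr() == 5);
}
```

Adding a global then means adding one line to the list, and every expansion picks it up.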
Nick Wellnhofer
bf6bd16154 globals: Introduce xmlCheckThreadLocalStorage
Checks whether (emulated) thread-local storage could be allocated.
2023-09-20 22:06:43 +02:00
Nick Wellnhofer
89f4976728 globals: Make xmlGlobalState private
This removes a public struct but it seems impossible to use its members
in a sensible way from external code.
2023-09-19 17:36:29 +02:00
Nick Wellnhofer
a07ec7c1a7 threads: Move library initialization code to threads.c
This allows consolidating the initialization code since the global init
lock was already implemented in threads.c.
2023-09-19 17:35:12 +02:00
Nick Wellnhofer
4e1c13ebfd debug: Remove debugging code
This is barely useful these days and only clutters the code base.
2023-09-19 17:35:09 +02:00
Nick Wellnhofer
c19771c1f1 globals: Move code from threads.c to globals.c
Move all code that handles globals to the place where it belongs.
2023-09-19 17:34:38 +02:00
Nick Wellnhofer
2a4b811424 globals: Rename members of xmlGlobalState
This is a deliberate first step to remove some internals from the
public API and to avoid issues when redefining tokens.
2023-09-19 17:34:30 +02:00
Nick Wellnhofer
d7cfe35650 parser: Avoid undefined behavior in xmlParseStartTag2
Instead of using arithmetic on dangling pointers, store ptrdiff_t values
in void pointers which is at least implementation-defined.
2023-09-14 20:52:24 +02:00
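
The idea in the commit above can be illustrated with a small sketch. It assumes a buffer that may be reallocated while positions into it are being remembered; all names are hypothetical, and the cast goes through `intptr_t` from C99 `<stdint.h>`, which libxml2 itself may not use. Storing an offset in the `void *` slot avoids ever doing arithmetic on a pointer into the old, freed buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Pack an offset into a void* slot (implementation-defined conversion). */
static void *demo_offset_to_slot(ptrdiff_t off) {
    assert(off >= 0);
    return (void *)(intptr_t)off;
}

/* Recover the offset from the slot. */
static ptrdiff_t demo_slot_to_offset(void *slot) {
    return (ptrdiff_t)(intptr_t)slot;
}

/* Remember a position as an offset, grow the buffer, then look it up. */
int demo_offset_survives_realloc(void) {
    char *buf = malloc(8);
    if (buf == NULL)
        return -1;
    memcpy(buf, "abcdefg", 8);

    /* Position of 'd', stored relative to the buffer start. */
    void *slot = demo_offset_to_slot(&buf[3] - buf);

    char *grown = realloc(buf, 4096);  /* may move the buffer */
    if (grown == NULL) {
        free(buf);
        return -1;
    }

    /* Rebase against the new buffer; the old pointer is never touched. */
    char c = grown[demo_slot_to_offset(slot)];
    free(grown);
    return c;
}
```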
Nick Wellnhofer
90d5b79958 schemas: Fix memory leak of annotations in notations
Found by OSS-Fuzz.
2023-09-14 15:30:38 +02:00
Markus Rickert
99cba4b37b Handle NOCONFIG case when setting locations from CMake target properties 2023-09-09 17:46:34 +02:00
Nick Wellnhofer
4aa08c80b7 xinclude: Fix 'last' pointer in xmlXIncludeCopyNode
Also set the 'last' pointer for the root node.

Fixes https://gitlab.gnome.org/GNOME/libxslt/-/issues/93
2023-09-08 14:52:22 +02:00
James Le Cuirot
f369154fce cmake: Generate better pkg-config file for SYSROOT builds under CMake
I recently fixed this for Autotools but said that fixing this for CMake
was not feasible due to it using `find_package` rather than
`pkg_check_modules`. I then thought about it and couldn't find any
reason why CMake couldn't try `pkg_check_modules` first and then fall
back to `find_package`, as that's basically what Autotools does.

I had wanted to use the linker flags generated by CMake when it does
fall back to `find_package`, but it only returns direct paths to the
libraries, as opposed to `-l` flags. Baking these library paths into the
pkg-config and xml2-config files would break static linking and
cross-compiling, so I've stuck with the `-l` flags we already have.

There is no need to set `CMAKE_REQUIRED_LIBRARIES` because we already
add the dependencies to the library target.
2023-09-04 22:14:16 +01:00
James Le Cuirot
5a18c505a7 autoconf: Include non-pkg-config dependency flags in the pkg-config file
These were present before, but I accidentally dropped them in my recent
build improvements.
2023-09-04 22:14:13 +01:00
James Le Cuirot
6864d92f6c autoconf: Don't bake build time CFLAGS into pkg-config file
Having slept on it, I've realised that baking the dependency CFLAGS into
the pkg-config file is pointless when it is only used to link against
them. It may even cause problems.
2023-09-04 22:14:02 +01:00
Nick Wellnhofer
efcaeadc3e hash: Fix use-of-uninitialized-value
Short-lived regression.
2023-09-04 16:07:40 +02:00
Nick Wellnhofer
05c283052d dict: Stop using uint32_t
stdint.h is a C99 header.
2023-09-04 16:07:40 +02:00
Nick Wellnhofer
f45abbd3e5 dict: Fix integer overflow of string lengths
Fixes #546.
2023-09-04 16:07:40 +02:00
Nick Wellnhofer
edc2dd48cb dict: Update hash function
Update hash function from classic Jenkins OAAT (dict.c) and a variant of
DJB2 (hash.c) to "GoodOAAT" taken from the SMHasher repo. This hash
function passes all SMHasher tests.
2023-09-04 16:07:23 +02:00
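
For reference, the textbook form of the Jenkins one-at-a-time hash that the commit above moves away from looks roughly like this. It is the generic published algorithm, not a copy of the old dict.c code (which may have differed, for example by seeding), and it uses `uint32_t` from C99 for brevity. GoodOAAT itself is not reproduced here; the SMHasher repository is the place to look for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Classic Jenkins one-at-a-time hash over a byte string. */
uint32_t demo_jenkins_oaat(const unsigned char *key, size_t len) {
    uint32_t hash = 0;
    size_t i;

    assert(key != NULL || len == 0);

    /* Mix in one byte at a time. */
    for (i = 0; i < len; i++) {
        hash += key[i];
        hash += hash << 10;
        hash ^= hash >> 6;
    }

    /* Final avalanche. */
    hash += hash << 3;
    hash ^= hash >> 11;
    hash += hash << 15;
    return hash;
}
```

Hashing the single byte `"a"` with this function gives `0xca2e9442`.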
James Le Cuirot
93e8bb2a40 build: Generate better pkg-config files for static-only builds
pkg-config supports `Requires.private` and `Libs.private` fields for
static linking. However, if you're building a dynamic binary, then
pkg-config will use the non-private fields, even if just the static
libxml2 is available. This will result in libxml2 being underlinked,
causing the build to fail. The solution is to fold the private fields
into the non-private fields when the shared libxml2 is not being built.

This works for Autotools and CMake. Meson also knows how to handle this
when it automatically generates pkg-config files.
2023-09-03 08:52:36 +01:00
James Le Cuirot
4640ccac85 build: Generate better pkg-config file for SYSROOT builds
The -I and -L flags you use to build should not necessarily be the same
ones you bake into installed files. If you are building with
dependencies located under a SYSROOT then the installed files should
have no knowledge of that SYSROOT. For example, if the build requires
`-L/path/to/sysroot/usr/lib/foo` then only `-L/usr/lib/foo` should be
baked into the installed files.

pkg-config is SYSROOT-aware, so this issue can be sidestepped by using
the `Requires` field rather than the `Libs` and `Cflags` fields. This is
easily resolved if you rely solely on pkg-config, but this project falls
back to standard Autoconf checks, so a little more effort is required.

Unfortunately, this issue cannot feasibly be resolved for CMake.
`find_package` is used rather than `pkg_check_modules`, so we cannot
tell whether a pkg-config file for each dependency is present or not,
even if `find_package` uses pkg-config behind the scenes. The CMake
build does not record any dependency -I or -L flags into the pkg-config
file anyway. This is a problem in itself, although these dependencies
are most likely installed to standard locations.

Meson is very much better at handling this, as it generates the
pkg-config file automatically using the correct logic.
2023-09-03 08:52:22 +01:00
Nick Wellnhofer
54a0b19a9f autoconf: Allow custom --with-icu configure option 2023-09-01 14:52:14 +02:00
Nick Wellnhofer
c5989473b9 dict: Use thread-local storage for PRNG state 2023-09-01 14:52:11 +02:00
Nick Wellnhofer
57cfd221a6 dict: Use xoroshiro64** as PRNG
Stop using rand_r. This enables hash randomization on all platforms.
2023-09-01 14:52:04 +02:00
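
The generator named in the commit above is small enough to sketch. The following follows the published xoroshiro64** reference algorithm by Blackman and Vigna; the struct and function names are hypothetical, and how libxml2 seeds and stores the state is not shown. It uses `uint32_t` from C99 for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Two 32-bit words of state; must not both be zero. */
struct demo_rng {
    uint32_t s[2];
};

static uint32_t demo_rotl(uint32_t x, int k) {
    return (x << k) | (x >> (32 - k));
}

/* One step of xoroshiro64**: scramble s[0] for output, then advance. */
uint32_t demo_rng_next(struct demo_rng *rng) {
    uint32_t s0 = rng->s[0];
    uint32_t s1 = rng->s[1];
    uint32_t result;

    assert(s0 != 0 || s1 != 0);  /* the all-zero state is a fixed point */

    result = demo_rotl(s0 * 0x9E3779BBu, 5) * 5u;

    s1 ^= s0;
    rng->s[0] = demo_rotl(s0, 26) ^ s1 ^ (s1 << 9);
    rng->s[1] = demo_rotl(s1, 13);
    return result;
}
```

Unlike `rand_r`, this needs nothing from the platform beyond 32-bit unsigned arithmetic, which fits the commit's note that hash randomization now works everywhere.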
Nick Wellnhofer
6d7aaaa835 dict: Tune hash table growth
Introduce load factor as main trigger and increase MAX_HASH_LEN. This
should make growth behavior more predictable.

Raise size limit to INT_MAX. This avoids quadratic behavior with larger
tables.
2023-09-01 14:51:55 +02:00
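
A load-factor trigger of the kind described above can be sketched like this. The 7/8 threshold and all names are assumptions chosen for illustration; the commit message does not state the actual values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical threshold: grow once the table is more than 7/8 full. */
#define DEMO_MAX_FILL_NUM   7
#define DEMO_MAX_FILL_DENOM 8

/* Returns nonzero if a table with `size` slots holding `nb_elems`
 * entries exceeds the load factor. Dividing before multiplying
 * keeps the computation free of overflow. */
int demo_needs_grow(size_t nb_elems, size_t size) {
    assert(size >= DEMO_MAX_FILL_DENOM);
    return nb_elems > size / DEMO_MAX_FILL_DENOM * DEMO_MAX_FILL_NUM;
}
```

With 16 slots, the table grows when the 15th element arrives, regardless of how long any single bucket chain is, which is what makes the growth behavior predictable.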
Nick Wellnhofer
4b8f7cf05d hash: Fix integer overflow of nbElems 2023-09-01 14:43:08 +02:00
Nick Wellnhofer
bfd7d28698 xmllint: Fix more error messages 2023-08-29 21:16:34 +02:00
Nick Wellnhofer
373244bc66 xmllint: Fix error message when push parsing empty documents 2023-08-29 21:05:32 +02:00
Nick Wellnhofer
53050b1dd8 parser: More fixes to push parser error handling 2023-08-29 20:06:43 +02:00
Nick Wellnhofer
bbd918b2e7 parser: Fix detection of null bytes
Also suppress misleading extra errors.

Fixes #122.
2023-08-29 18:43:10 +02:00
Nick Wellnhofer
c6083a32d6 parser: Improve error handling in push parser
- Report errors earlier
- Align error messages with pull parser
2023-08-29 18:41:05 +02:00
Nick Wellnhofer
1edae30f82 parser: Don't check inputNr in xmlParseTryOrFinish
There's no apparent reason for this check. inputNr should always be 1
here.
2023-08-29 18:17:14 +02:00
Nick Wellnhofer
e48f2695fe parser: Remove push parser debugging code 2023-08-29 18:17:09 +02:00
Nick Wellnhofer
cde4499778 SAX2: Allow multiple top-level elements
When parsing with HTML_PARSE_NOIMPLIED, the result document can contain
multiple top-level elements. Rework xmlSAX2StartElement to simply add
the element as a child of ctxt->node or ctxt->myDoc.

Don't invoke xmlAddSibling for non-element parents. The context node
should always be an element node.

Fixes #584.
2023-08-27 16:35:23 +02:00
Nick Wellnhofer
d39f78069d tree: Fix copying of DTDs
- Don't create multiple DTD nodes.
- Fix UAF if malloc fails.
- Skip DTD nodes if tree module is disabled.

Fixes #583.
2023-08-23 20:43:14 +02:00
Nick Wellnhofer
4e4c89a4bc doc: Improve documentation of configuration options 2023-08-21 11:13:33 +02:00
Nick Wellnhofer
778cca386d legacy: Add stubs for disabled modules
When legacy support is requested, always enable stubs for FTP and
XPointer location modules which were removed from the standard
configuration. Going forward, the --with-legacy configuration option
should be used to provide maximum ABI compatibility.

Fixes #433.
2023-08-20 23:16:12 +02:00
Nick Wellnhofer
ed3bd05284 parser: Allow to set maximum amplification factor 2023-08-20 20:49:16 +02:00
Nick Wellnhofer
9d80a2b134 entities: Don't change doc when encoding entities
doc->encoding shouldn't be touched by xmlEncodeEntitiesInternal.
2023-08-17 12:47:14 +02:00
Nick Wellnhofer
f1c1f5c6b4 parser: Revert change to doc->encoding
Fixes #579.
2023-08-17 12:47:14 +02:00
Nick Wellnhofer
61b8e097b9 parser: Never use UTF-8 encoding handler 2023-08-16 19:50:36 +02:00
Nick Wellnhofer
507f11edf0 encoding: Remove debugging code 2023-08-16 19:50:36 +02:00
Nick Wellnhofer
138213acdf python: Fix tests on MinGW
Add the directory containing libxml2.dll with os.add_dll_directory to
make tests work on MinGW.

This has changed in Python 3.8 but for some reason, the issue only
turned up with Python 3.11 on MinGW. Contrary to documentation, copying
libxml2.dll into the directory containing the .pyd file doesn't work.
2023-08-15 12:55:35 +02:00
Nick Wellnhofer
e2ab48b9b5 malloc-fail: Fix unsigned integer overflow in xmlTextReaderPushData
Return immediately if xmlParserInputBufferRead fails.

Found by OSS-Fuzz, see #344.
2023-08-14 15:06:31 +02:00
Nick Wellnhofer
0d24fc0a47 html: Remove encoding hack in htmlCreateFileParserCtxt
Switch encoding directly instead of calling htmlCheckEncoding with faked
content.
2023-08-14 12:53:49 +02:00