IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
This commit implements robust detection of entity amplification attacks,
better known as the "billion laughs" attack.
We now limit the size of the document after substitution of entities to
10 times the size before expansion. This guarantees linear behavior by
definition. There already was a similar check before, but the accounting
of "sizeentities" (size of external entities) and "sizeentcopy" (size of
all copies created by entity references) wasn't accurate.
We also need saturation arithmetic since we're historically limited to
"unsigned long" which is 32-bit on many platforms.
A maximum of 10 MB of substitutions is always allowed. This should make
use cases like DITA work which have caused problems in the past.
The old checks based on the number of entities were removed. This is
accounted for by adding a fixed cost to each entity reference.
Entity amplification checks are now enabled even if XML_PARSE_HUGE is
set. This option is mainly used to allow larger text nodes. Most users
were unaware that it also disabled entity expansion checks.
Some of the limits might be adjusted later. If this change turns out to
affect legitimate use cases, we can add a separate parser option to
disable the checks.
Fixes#294.
Fixes#345.
Instead of abusing the LSB of the "checked" member, store the result
of testing for occurrence of '<' character in "flags".
Also use the flags in xmlParseStringEntityRef instead of rescanning
every time.
To check whether an entity was already parsed, the code previously
tested whether "checked" was non-zero or "children" was non-null. The
"children" check could be unreliable because an empty entity also
results in an empty (NULL) node list. Use a separate flag to make this
check more reliable.
This debug function was always unsafe and hard-coded pointer sizes to
32 bits. Instead of attempting a fix, remove it completely. These days,
tools like ASan are much better to debug memory issues.
Fixes#214.
This was an experimental and undocumented micro-optimization for
Windows which apparently required different calling conventions for
variable-argument functions, making it impossible to maintain without
domain knowledge.
This enables some tests with testcases in
- test/xsdtest
- test/relaxng/OASIS/spectest.xml
- test/relaxng/testsuite.xml
The XML Schema Test Suite will also be run it was downloaded, see
xstc/Makefile.am. Gitlab CI should be updated to fetch these files.
There are 10 expected errors in the XSD test suite. This seems to be the
case since at least version 2.9.0 from 2012.
As per https://peps.python.org/pep-0394/, the python binary can be one
of the following options:
- Python 2
- Python 3
- Not exist
All of the scripts in libxml2 use 'python', which may not exist.
As Python 2 reached EOL on the 1st January 2020, it's safe to move the
scripts to use python3 explicitly.
pkg-config has been around for a very long time now, so deprecate the
hand-written libxml.m4 fragment providing AM_PATH_XML2 and simply
change it to a wrapper around PKG_CHECK_MODULES.
The expected errors contain an relative path, but the messages from the
parser contain absolute paths. However, due to the tests not actually
failing if there was an error this wasn't noticed.
Instead of putting relative paths in the expected messages use format()
to embed the correct absolute path.
Also use os.path.join() consistently when constructing paths to ensure
uniformly formatted paths.
When shared libraries are disabled we can't build loadable modules
either, so the testModule test can't work as the testdso.la target
doesn't build a module.
The XML test case tarball isn't actually compressed: the published URL
is a .tar and fetches of the .tar.gz redirect silently to the .tar,
which is then passed to gzip which refuses to decompress uncompressed
data.
Fetch the .tar as that is the documented URL, and remove the
decompression.
Checking whether the context is close to the parent context by hardcoding
250 is not portable (I noticed tests were failing on Morello since the value
is 288 there due to pointers being 128 bits). Instead we should ensure
that the XML_VCTXT_USE_PCTXT flag is not set in cases where the user data
is not actually a parser context (or ideally add a separate field but that
would be an ABI break.
From what I can see in the source, the XML_VCTXT_USE_PCTXT is only set if
the userData field points to a valid context, and if this is not the case
the flag should be cleared when changing userData rather than relying on
the offset between the two. Looking at the history, I think
d7cb33cf44aa688f24215c9cd398c1a26f0d25ff fixed most of the need for this
workaround, but it looks like there are a few more locations that need
updating; This commit changes two more places to set/clear/copy the
XML_VCTXT_USE_PCTXT flag, so this heuristic should not be needed anymore.
I've also drop two = NULL assignment in xmllint since this is not needed
after a call to memset().
There was also an uninitialized vctxt.flags (and other fields) in
`xmlShellValidate()`, which I've fixed by adding a memset() call.
Creating more than one-past-the-end pointers is undefined behaviour in C
and while this code is unlikely to be miscompiled, I discovered that an
out-of-bounds pointer is being created using UBSan on a CHERI-enabled
system.
Adding an offset to a deallocated pointer and assuming that it can be
dereferenced is undefined behaviour. When running libxml2 on CHERI-enabled
systems such as Arm Morello this results in the creation of an out-of-bounds
pointer that cannot be dereferenced and therefore crashes at runtime.
The effect of this UB is not just limited to architectures such as CHERI,
incorrect relocation of pointers after realloc can in fact cause
FORTIFY_SOURCE errors with recent GCC:
https://developers.redhat.com/articles/2022/09/17/gccs-new-fortification-level
The Python tests call xmlMemUsed after xmlCleanupParser which doesn't
work with statically allocated mutexes. This is only used for debugging,
so a lock isn't necessary.