IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
The with statement creates a new variable. I thought it opens a block
where "e" is only valid in that block. But instead it runs the whole
thing, expecting an exception somewhere. Learning python....
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: David Mulder <dmulder@samba.org>
This happens when a path has an unknown reparse point in the middle
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: David Mulder <dmulder@samba.org>
Cast from 'uint32_t *' (aka 'unsigned int *') to 'size_t *' (aka
'unsigned long *') increases required alignment from 4 to 8
==10343==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffdc6784fc0 at pc 0x7f339f1ea500 bp 0x7ffdc6784ed0 sp 0x7ffdc6784ec8
WRITE of size 8 at 0x7ffdc6784fc0 thread T0
#0 0x7f339f1ea4ff in fd_load ../../lib/util/util_file.c:220
#1 0x7f339f1ea5a4 in file_load ../../lib/util/util_file.c:245
#2 0x56363209a596 in net_offlinejoin_requestodj ../../source3/utils/net_offlinejoin.c:267
#3 0x56363209a9d0 in net_offlinejoin ../../source3/utils/net_offlinejoin.c:74
#4 0x56363208f61c in net_run_function ../../source3/utils/net_util.c:453
#5 0x563631fe8a9f in main ../../source3/utils/net.c:1358
#6 0x7f339b22c5af in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#7 0x7f339b22c678 in __libc_start_main_impl ../csu/libc-start.c:381
#8 0x563631faf374 in _start ../sysdeps/x86_64/start.S:115
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15257
Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Volker Lendecke <vl@samba.org>
Autobuild-User(master): Andreas Schneider <asn@cryptomilk.org>
Autobuild-Date(master): Mon Dec 5 12:05:24 UTC 2022 on sn-devel-184
If Samba is built against the system libldb, use the system tools.
Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Volker Lendecke <vl@samba.org>
Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Mon Dec 5 09:36:40 UTC 2022 on sn-devel-184
We don't include source4/selftest/provisions/ in source tarballs!
Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Volker Lendecke <vl@samba.org>
Autobuild-User(master): Volker Lendecke <vl@samba.org>
Autobuild-Date(master): Mon Dec 5 08:22:29 UTC 2022 on sn-devel-184
lib/compression/tests/test_lzx_huffman.c: In function ‘test_lzxpress_huffman_overlong_matches’:
lib/compression/tests/test_lzx_huffman.c:1013:35: error: ‘j’ may be used uninitialized [-Werror=maybe-uninitialized]
1013 | assert_int_equal(score, i * j);
| ^
lib/compression/tests/test_lzx_huffman.c:979:19: note: ‘j’ was declared here
979 | size_t i, j;
| ^
lib/compression/tests/test_lzx_huffman.c: In function ‘test_lzxpress_huffman_overlong_matches_abc’:
lib/compression/tests/test_lzx_huffman.c:1059:39: error: ‘k’ may be used uninitialized [-Werror=maybe-uninitialized]
1059 | assert_int_equal(score, i * j * k);
| ^
lib/compression/tests/test_lzx_huffman.c:1020:22: note: ‘k’ was declared here
1020 | size_t i, j, k;
| ^
lib/compression/tests/test_lzx_huffman.c:1059:35: error: ‘j’ may be used uninitialized [-Werror=maybe-uninitialized]
1059 | assert_int_equal(score, i * j * k);
| ^
lib/compression/tests/test_lzx_huffman.c:1020:19: note: ‘j’ was declared here
1020 | size_t i, j, k;
| ^
Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Autobuild-User(master): Andreas Schneider <asn@cryptomilk.org>
Autobuild-Date(master): Sun Dec 4 09:12:30 UTC 2022 on sn-devel-184
These functions are now only called from check_chown in posix_acls.c
Signed-off-by: Christof Schmitt <cs@samba.org>
Reviewed-by: Volker Lendecke <vl@samba.org>
The same assignment is already done earlier, and nothing is changed in
between.
Signed-off-by: Christof Schmitt <cs@samba.org>
Reviewed-by: Volker Lendecke <vl@samba.org>
This uses the same hash table method as lzxpress_huffman, though the
code can't be directly reused as the sizes of the offsets is
different, and there is not a block processing step here.
This will worsen the compression ratio compared to the exhaustive
search we previously used, though we still perform better than
Windows. To put numbers on it, the test files used to compress to 0.91
of Windows' compression size, and now they compress to 0.96.
On the other hand this is many orders of magnitude faster. It is
difficult to say exactly how much faster -- while the testsuite time
has only improved 200-fold (from 7 minutes to 2 seconds), most of the
remaining 2 seconds is used in data generation and management, not
compression. OSSFuzz consistently finds new vectors that time out
after a minute; on these we'll see nearly an order of magnitude of
orders of magnitude inprovement.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Autobuild-User(master): Joseph Sutton <jsutton@samba.org>
Autobuild-Date(master): Fri Dec 2 00:00:04 UTC 2022 on sn-devel-184
This makes it easier to rework the encoding decision to depend on a
hash table match rather than the current exhaustive search.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
This will make it possible to move encoding operations into helper
functions, which will make it easier to restructure the code to use a
hash table for faster matching.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
These are based on (i.e. copied and pasted from) the LZ77 + Huffman
tests.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Everything that is in testdata/compression/lzxpress-huffman/ can also
be used for lzxpress plain tests, which is something we really need.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
We are going to change from a slow exact match algorithm to a fast
heuristic search that will not always get the same results as the
exhaustive search.
To be precise, a million zeros will compress to 112 rather than 93 bytes.
We don't insist on an exact size, because that is not an issue here.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Mainly so I can go
make bin/test_lzxpress_plain && bin/test_lzxpress_plain
valgrind bin/test_lzxpress_plain
rr bin/test_lzxpress_plain
rr replay
in a tight loop.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
st/summary is useless. If you'll find anything, it'll be in st/subunit.
However, in case *something* useful ever ends up there we still mention it.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
If compiled on Windows using Cygwin, MSYS2, or similar, this will output
compressed versions of files exactly as specified by MZ-XCA, if the
following conditions are met:
1. The file > 300 bytes.
2. The compressed file is smaller than the decompressed file.
Otherwise it returns the data unchanged. Without warning; that's just
how the API works.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Compression uses a 3 byte hash remember LZ77 matches in a 14-bit table.
This script runs the hash over all 16M combinations, then again over
all ASCII combinations, counting collisions to find hot-spots.
If you think you have a better hash, you are probably right, but you
should try it here -- alter h() -- before committing to it. This one is
literally the first one I thought of.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Huffman tree re-quantisation and perhaps other code paths are only
triggered by pathological data like this.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
This compresses some data, decompresses it, and asserts that the
result is identical to the original string.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
This differs from fuzz_lzxpress_huffman_round_trip (next commit) in
that the output buffer might be too small for the compressed data, in
which case we want to see an error and not a crash.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Most strings will not successfully decompress, which is OK. What we
care about of course is memory safety.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
With LZXHUFF_DEBUG_VERBOSE set, we measure the compression and
decompression rate relative to the decompressed size.
On reasonably long strings on my laptop, compiled with -O0, it turns
out to between 20 and 500 MB/s, both ways, depending on the complexity
of the string. Very short strings are of course dominated by overhead.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
If you need to see a Huffman tree (and sometimes you do), set
DEBUG_HUFFMAN_TREE to true at the top of lzxpress_huffman.c, and run:
make bin/test_lzx_huffman && bin/test_lzx_huffman
Actually, that will show you hundreds of trees, and you'll be glad of
that if you are ever trying to understand this.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Encoding without LZ77 matches is valid, and it is useful for isolating
bugs.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
This compresses files as described in MS-XCA 2.2, and as decompressed
by the decompressor in the previous commit.
As with the decompressor, there are two public functions -- one that
uses a talloc context, and one that uses pre-allocated memory. The
compressor requires a tightly bound amount of auxillary memory
(>220kB) in a few different buffers, which is all gathered together in
the public struct lzxhuff_compressor_mem. An instantiated but not
initialised copy of this struct is required by the non-talloc
function; it can be used over and over again.
Our compression speed is about the same as the decompression speed
(between 20 and 500 MB/s on this laptop, depending on the data), and
our compression ratio is very similar to that of Windows.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
This format is described in [MS-XCA] 2.1 and 2.2, with exegesis in
many posts on the cifs-protocol list[1].
The two public functions are:
ssize_t lzxpress_huffman_decompress(const uint8_t *input,
size_t input_size,
uint8_t *output,
size_t output_size);
uint8_t *lzxpress_huffman_decompress_talloc(TALLOC_CTX *mem_ctx,
const uint8_t *input_bytes,
size_t input_size,
size_t output_size);
In both cases the caller needs to know the *exact* decompressed size,
which is essential for decompression. The _talloc version allocates
the buffer for you, and uses the talloc context to allocate a 128k
working buffer. THe non-talloc function will allocate the working
buffer on the stack.
This compression format gives better compression for messages of
several kilobytes than the "plain" LXZPRESS compression, but is
probably a bit slower to decompress and is certainly worse for very
short messages, having a fixed 256 byte overhead for the first Huffman
table.
Experiments show decompression rates between 20 and 500 MB per second,
depending on the compression ratio and data size, on an i5-1135G7 with
no compiler optimisations.
This compression format is used in AD claims and in SMB, but that
doesn't happen with this commit.
I will not try to describe LZ77 or Huffman encoding here. Don't expect
an answer in MS-XCA either; instead read the code and/or Wikipedia.
[1] Much of that starts here:
https://lists.samba.org/archive/cifs-protocol/2022-October/
but there's more earlier, particularly in June/July 2020, when
Aurélien Aptel was working on an implementation that ended up in
Wireshark.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Pair-programmed-with: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Some of the decompressed files were found via fuzzing, some are public
domain texts, and some are designed to test one aspect or another of
the format. For example, some aspects of Huffman tree creation can
only be tested when there is an extreme imbalance in the frequency of
symbols.
See the README for what files are where.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
We are going to have all kinds of rubbish there.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
We are going to add more tests for lib/compression, and they can't all
be called "testsuite.c".
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>