lib/compression/tests/test_lzx_huffman.c: In function ‘test_lzxpress_huffman_overlong_matches’:
lib/compression/tests/test_lzx_huffman.c:1013:35: error: ‘j’ may be used uninitialized [-Werror=maybe-uninitialized]
 1013 |         assert_int_equal(score, i * j);
      |                                   ^
lib/compression/tests/test_lzx_huffman.c:979:19: note: ‘j’ was declared here
  979 |         size_t i, j;
      |                   ^
lib/compression/tests/test_lzx_huffman.c: In function ‘test_lzxpress_huffman_overlong_matches_abc’:
lib/compression/tests/test_lzx_huffman.c:1059:39: error: ‘k’ may be used uninitialized [-Werror=maybe-uninitialized]
 1059 |         assert_int_equal(score, i * j * k);
      |                                       ^
lib/compression/tests/test_lzx_huffman.c:1020:22: note: ‘k’ was declared here
 1020 |         size_t i, j, k;
      |                      ^
lib/compression/tests/test_lzx_huffman.c:1059:35: error: ‘j’ may be used uninitialized [-Werror=maybe-uninitialized]
 1059 |         assert_int_equal(score, i * j * k);
      |                                   ^
lib/compression/tests/test_lzx_huffman.c:1020:19: note: ‘j’ was declared here
 1020 |         size_t i, j, k;
      |                   ^
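The usual fix for this class of warning is to give the variable a
value at its declaration. A minimal, self-contained illustration of
the pattern (not the actual Samba change):

#include <stddef.h>

/* If n == 0 the inner loop never runs, so without the "= 0" the
 * final read of j would be flagged by -Werror=maybe-uninitialized.
 * i is always assigned by the outer for statement, so only j (and
 * k, in the three-variable case) needs initialising. */
static size_t demo(size_t n)
{
        size_t i, j = 0;
        size_t score = 0;

        for (i = 0; i < n; i++) {
                for (j = 0; j < n; j++) {
                        score++;
                }
        }
        return score + i * j;
}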
Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Autobuild-User(master): Andreas Schneider <asn@cryptomilk.org>
Autobuild-Date(master): Sun Dec 4 09:12:30 UTC 2022 on sn-devel-184
This uses the same hash table method as lzxpress_huffman, though the
code can't be directly reused as the sizes of the offsets are
different, and there is no block processing step here.
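In outline, the technique looks like the hedged sketch below (the
names, hash function, and table size are illustrative assumptions,
not the Samba code): a table indexed by a hash of the next three
bytes remembers the most recent position where those bytes occurred,
giving one O(1) candidate match instead of an exhaustive scan.

#include <stddef.h>
#include <stdint.h>

#define HASH_BITS 13            /* illustrative table size */
#define MIN_MATCH 3

/* hypothetical helper: hash 3 bytes into HASH_BITS bits */
static inline uint16_t hash3(const uint8_t *p)
{
        uint32_t x = p[0] | (p[1] << 8) | ((uint32_t)p[2] << 16);
        return (x * 2654435761u) >> (32 - HASH_BITS);
}

/* The caller initialises table[] to -1 and guarantees that at least
 * MIN_MATCH bytes remain at pos. Returns the match length, or 0. */
static size_t find_match(const uint8_t *data, size_t pos, size_t len,
                         int32_t *table, size_t *match_pos)
{
        size_t match_len = 0;
        uint16_t h = hash3(data + pos);
        int32_t candidate = table[h];

        if (candidate >= 0) {
                const uint8_t *a = data + candidate;
                const uint8_t *b = data + pos;

                while (pos + match_len < len &&
                       a[match_len] == b[match_len]) {
                        match_len++;
                }
                *match_pos = (size_t)candidate;
        }
        table[h] = (int32_t)pos;        /* remember this position */
        return (match_len >= MIN_MATCH) ? match_len : 0;
}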
This will worsen the compression ratio compared to the exhaustive
search we previously used, though we still perform better than
Windows. To put numbers on it, the test files used to compress to 0.91
of Windows' compression size, and now they compress to 0.96.
On the other hand this is many orders of magnitude faster. It is
difficult to say exactly how much faster -- while the testsuite time
has only improved 200-fold (from 7 minutes to 2 seconds), most of the
remaining 2 seconds is used in data generation and management, not
compression. OSSFuzz consistently finds new vectors that time out
after a minute; on these we'll see nearly an order of magnitude of
orders of magnitude improvement.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Autobuild-User(master): Joseph Sutton <jsutton@samba.org>
Autobuild-Date(master): Fri Dec 2 00:00:04 UTC 2022 on sn-devel-184
This makes it easier to rework the encoding decision to depend on a
hash table match rather than the current exhaustive search.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
This will make it possible to move encoding operations into helper
functions, which will make it easier to restructure the code to use a
hash table for faster matching.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
These are based on (i.e. copied and pasted from) the LZ77 + Huffman
tests.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Everything that is in testdata/compression/lzxpress-huffman/ can also
be used for lzxpress plain tests, which is something we really need.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
We are going to change from a slow exact match algorithm to a fast
heuristic search that will not always get the same results as the
exhaustive search.
To be precise, a million zeros will compress to 112 rather than 93 bytes.
We don't insist on an exact size, because that is not an issue here.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Mainly so I can go

    make bin/test_lzxpress_plain && bin/test_lzxpress_plain
    valgrind bin/test_lzxpress_plain
    rr bin/test_lzxpress_plain
    rr replay

in a tight loop.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
If compiled on Windows using Cygwin, MSYS2, or similar, this will output
compressed versions of files exactly as specified by MZ-XCA, if the
following conditions are met:
1. The file is larger than 300 bytes.
2. The compressed file is smaller than the decompressed file.
Otherwise it returns the data unchanged, without warning; that's just
how the API works.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Compression uses a 3-byte hash to remember LZ77 matches in a 14-bit table.
This script runs the hash over all 16M combinations, then again over
all ASCII combinations, counting collisions to find hot-spots.
If you think you have a better hash, you are probably right, but you
should try it here -- alter h() -- before committing to it. This one is
literally the first one I thought of.
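A C re-sketch of the idea (the hash below is an illustrative
stand-in, not the real h(), and the real script also does a second
pass restricted to ASCII):

#include <stdint.h>
#include <stdio.h>

#define TABLE_BITS 14

/* stand-in for the hash under test */
static uint16_t h(uint32_t three_bytes)
{
        return (three_bytes * 2654435761u) >> (32 - TABLE_BITS);
}

int main(void)
{
        static uint32_t counts[1 << TABLE_BITS];
        uint32_t worst = 0;
        uint32_t x;

        for (x = 0; x < (1u << 24); x++) {
                counts[h(x)]++;
        }
        for (x = 0; x < (1u << TABLE_BITS); x++) {
                if (counts[x] > worst) {
                        worst = counts[x];
                }
        }
        /* a perfectly uniform hash would put 1024 entries per slot */
        printf("worst slot: %u collisions\n", worst);
        return 0;
}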
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Huffman tree re-quantisation and perhaps other code paths are only
triggered by pathological data like this.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
This compresses some data, decompresses it, and asserts that the
result is identical to the original string.
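In outline, such a harness looks like the hedged sketch below (the
names my_compress()/my_decompress() stand in for the real prototypes
in lib/compression/lzxpress_huffman.h, and the buffer sizes are
arbitrary):

#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

/* assumed shapes, for illustration only */
ssize_t my_compress(const uint8_t *in, size_t in_len,
                    uint8_t *out, size_t out_len);
ssize_t my_decompress(const uint8_t *in, size_t in_len,
                      uint8_t *out, size_t out_len);

#define MAX_SIZE (1024 * 1024)

int LLVMFuzzerTestOneInput(const uint8_t *buf, size_t len)
{
        static uint8_t compressed[MAX_SIZE * 2];  /* room for growth */
        static uint8_t decompressed[MAX_SIZE];
        ssize_t clen, dlen;

        if (len > MAX_SIZE) {
                return 0;
        }
        clen = my_compress(buf, len, compressed, sizeof(compressed));
        assert(clen >= 0);              /* big enough buffer: no error */

        dlen = my_decompress(compressed, (size_t)clen,
                             decompressed, len);
        assert(dlen == (ssize_t)len);   /* exact size must round-trip */
        assert(memcmp(buf, decompressed, len) == 0);
        return 0;
}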
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
This differs from fuzz_lzxpress_huffman_round_trip (next commit) in
that the output buffer might be too small for the compressed data, in
which case we want to see an error and not a crash.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Most strings will not successfully decompress, which is OK. What we
care about of course is memory safety.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
With LZXHUFF_DEBUG_VERBOSE set, we measure the compression and
decompression rate relative to the decompressed size.
On reasonably long strings on my laptop, compiled with -O0, it turns
out to be between 20 and 500 MB/s, both ways, depending on the
complexity of the string. Very short strings are of course dominated
by overhead.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
If you need to see a Huffman tree (and sometimes you do), set
DEBUG_HUFFMAN_TREE to true at the top of lzxpress_huffman.c, and run:
    make bin/test_lzx_huffman && bin/test_lzx_huffman
Actually, that will show you hundreds of trees, and you'll be glad of
that if you are ever trying to understand this.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Encoding without LZ77 matches is valid, and it is useful for isolating
bugs.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
This compresses files as described in MS-XCA 2.2, and as decompressed
by the decompressor in the previous commit.
As with the decompressor, there are two public functions -- one that
uses a talloc context, and one that uses pre-allocated memory. The
compressor requires a tightly bound amount of auxiliary memory
(>220kB) in a few different buffers, which is all gathered together in
the public struct lzxhuff_compressor_mem. An instantiated but not
initialised copy of this struct is required by the non-talloc
function; it can be used over and over again.
Our compression speed is about the same as the decompression speed
(between 20 and 500 MB/s on this laptop, depending on the data), and
our compression ratio is very similar to that of Windows.
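The calling pattern comes out roughly like this hedged sketch (the
prototype is assumed from the description above; check
lib/compression/lzxpress_huffman.h for the real one):

/* the ~220kB of scratch space; plain memory, no init call needed,
 * reusable across any number of compress calls */
static struct lzxhuff_compressor_mem cmp_mem;

ssize_t compress_blob(const uint8_t *input, size_t input_size,
                      uint8_t *output, size_t output_size)
{
        /* assumed prototype, for illustration */
        return lzxpress_huffman_compress(&cmp_mem,
                                         input, input_size,
                                         output, output_size);
}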
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
This format is described in [MS-XCA] 2.1 and 2.2, with exegesis in
many posts on the cifs-protocol list[1].
The two public functions are:
ssize_t lzxpress_huffman_decompress(const uint8_t *input,
                                    size_t input_size,
                                    uint8_t *output,
                                    size_t output_size);

uint8_t *lzxpress_huffman_decompress_talloc(TALLOC_CTX *mem_ctx,
                                            const uint8_t *input_bytes,
                                            size_t input_size,
                                            size_t output_size);
In both cases the caller needs to know the *exact* decompressed size,
which is essential for decompression. The _talloc version allocates
the buffer for you, and uses the talloc context to allocate a 128k
working buffer. The non-talloc function will allocate the working
buffer on the stack.
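A hedged usage sketch of the _talloc variant (error handling
abbreviated; original_size must come from out-of-band metadata):

uint8_t *plain = lzxpress_huffman_decompress_talloc(mem_ctx,
                                                    compressed,
                                                    compressed_size,
                                                    original_size);
if (plain == NULL) {
        /* bad data, wrong size, or allocation failure */
}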
This compression format gives better compression for messages of
several kilobytes than the "plain" LZXPRESS compression, but is
probably a bit slower to decompress and is certainly worse for very
short messages, having a fixed 256 byte overhead for the first Huffman
table.
Experiments show decompression rates between 20 and 500 MB per second,
depending on the compression ratio and data size, on an i5-1135G7 with
no compiler optimisations.
This compression format is used in AD claims and in SMB, but that
doesn't happen with this commit.
I will not try to describe LZ77 or Huffman encoding here. Don't expect
an answer in MS-XCA either; instead read the code and/or Wikipedia.
[1] Much of that starts here:
https://lists.samba.org/archive/cifs-protocol/2022-October/
but there's more earlier, particularly in June/July 2020, when
Aurélien Aptel was working on an implementation that ended up in
Wireshark.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Pair-programmed-with: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
We are going to add more tests for lib/compression, and they can't all
be called "testsuite.c".
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Sometimes (e.g. in lzxpress Huffman encoding, and in some of our
tests: c.f. https://lists.samba.org/archive/samba-technical/2018-March/126010.html)
we want a stable sort algorithm (meaning one that retains the previous
order of items that compare equal).
The GNU libc qsort() is *usually* stable, in that it first tries to
use a mergesort but reverts to quicksort if the necessary allocations
fail. That has led Samba developers to unthinkingly assume qsort() is
stable, which is not the case on many platforms, and might not always
be on GNU/Linuxes either.
This adds four functions. stable_sort() sorts an array, and requires
an auxiliary working array of the same size. stable_sort_talloc()
takes a talloc context so it can create a working array and call
stable_sort(). stable_sort_r() takes an opaque context blob that gets
passed to the compare function, like qsort_r() and ldb_qsort(). And
stable_sort_talloc_r() rounds out the quartet.
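What stability buys, in a self-contained hedged sketch (the
declaration of stable_sort() here is assumed from the description
above; see lib/util/stable_sort.h for the real one):

#include <stddef.h>

struct rec {
        int key;
        int arrival;            /* original position, for checking */
};

static int cmp_key(const void *a, const void *b)
{
        const struct rec *ra = a;
        const struct rec *rb = b;
        return (ra->key > rb->key) - (ra->key < rb->key);
}

/* assumed shape: sort n items of size s, using aux as scratch */
void stable_sort(void *array, void *aux, size_t n, size_t s,
                 int (*cmp)(const void *, const void *));

static void example(void)
{
        struct rec recs[] = { {2, 0}, {1, 1}, {2, 2}, {1, 3} };
        struct rec aux[4];      /* working array of the same size */

        stable_sort(recs, aux, 4, sizeof(struct rec), cmp_key);
        /* guaranteed result: {1,1} {1,3} {2,0} {2,2} -- equal keys
         * keep their arrival order, which qsort() may not preserve */
}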
These are LGPL so that they can be used in ldb, which has problems
with unstable sort.
The tests are borrowed and extended from test_ldb_qsort.c.
When sorting non-trivial structs this is roughly as fast as GNU qsort,
but GNU qsort has optimisations for small items, using direct
assignments rather than memcpy where the size allows the item to be
cast as some kind of int.
Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
We need "struct iovec", which comes in via sys/uio.h, incuded by
system/filesys.h
Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
We can't unconditionally assume (as we did in
third_party/heimdal_build/wscript_configure) that Heimdal has this type,
since we may have an older system Heimdal that lacks it. We must also
check whether krb5_pac_get_buffer() is usable with krb5_const_pac, and
declare krb5_const_pac as a non-const typedef if not.
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
https://fedoraproject.org/wiki/Changes/PortingToModernC
We define True to true from stdbool.h, and the same for False. So we
don't have to do a cleanup now.
Signed-off-by: Andreas Schneider <asn@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
Autobuild-User(master): Jeremy Allison <jra@samba.org>
Autobuild-Date(master): Thu Oct 27 19:11:30 UTC 2022 on sn-devel-184
The result is not used; it is only part of the macro to gain
type-checking.
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
This allows this type to be used in Samba in the future for both
Kerberos implementations.
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
There were some reports where strace showed an LDAP server socket in
CLOSE_WAIT state, returning EAGAIN for writev over and over (after a
call to epoll() each time).
In the tstream_bsd code the problem happens when we have a pending
writev_send, while there's no readv_send pending. In that case
we still ask for TEVENT_FD_READ in order to notice connection errors
early, so we try to call writev even if the socket doesn't report TEVENT_FD_WRITE.
And there are situations where we do that over and over again.
It happens like this with a Linux kernel:
tcp_fin() has this:
        struct tcp_sock *tp = tcp_sk(sk);

        inet_csk_schedule_ack(sk);

        sk->sk_shutdown |= RCV_SHUTDOWN;
        sock_set_flag(sk, SOCK_DONE);

        switch (sk->sk_state) {
        case TCP_SYN_RECV:
        case TCP_ESTABLISHED:
                /* Move to CLOSE_WAIT */
                tcp_set_state(sk, TCP_CLOSE_WAIT);
                inet_csk_enter_pingpong_mode(sk);
                break;
It means RCV_SHUTDOWN gets set as well as TCP_CLOSE_WAIT, but
sk->sk_err is not changed to indicate an error.
tcp_sendmsg_locked has this:
        ...
        err = -EPIPE;
        if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
                goto do_error;

        while (msg_data_left(msg)) {
                int copy = 0;

                skb = tcp_write_queue_tail(sk);
                if (skb)
                        copy = size_goal - skb->len;

                if (copy <= 0 || !tcp_skb_can_collapse_to(skb)) {
                        bool first_skb;

new_segment:
                        if (!sk_stream_memory_free(sk))
                                goto wait_for_space;
        ...

wait_for_space:
                set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
                if (copied)
                        tcp_push(sk, flags & ~MSG_MORE, mss_now,
                                 TCP_NAGLE_PUSH, size_goal);

                err = sk_stream_wait_memory(sk, &timeo);
                if (err != 0)
                        goto do_error;
It means if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN)) doesn't
hit as we only have RCV_SHUTDOWN and sk_stream_wait_memory returns
-EAGAIN.
tcp_poll has this:
        if (sk->sk_shutdown & RCV_SHUTDOWN)
                mask |= EPOLLIN | EPOLLRDNORM | EPOLLRDHUP;
So we'll get EPOLLIN | EPOLLRDNORM | EPOLLRDHUP triggering
TEVENT_FD_READ and writev/sendmsg keeps getting EAGAIN.
So we need to always clear TEVENT_FD_READ if we don't have a readable
handler, in order to avoid burning CPU. But we turn it on again after
a timeout of 1 second in order to monitor the error state of the
connection.
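In tevent terms the pattern is roughly this hedged sketch
(tevent_fd_get_flags(), tevent_fd_set_flags(), tevent_add_timer()
and tevent_timeval_current_ofs() are real tevent APIs; the
surrounding structure is illustrative, not the tstream_bsd code):

#include <tevent.h>

/* one second later: ask for TEVENT_FD_READ again so that
 * connection errors are still noticed eventually */
static void reenable_read(struct tevent_context *ev,
                          struct tevent_timer *te,
                          struct timeval now,
                          void *private_data)
{
        struct tevent_fd *fde = private_data;
        tevent_fd_set_flags(fde,
                            tevent_fd_get_flags(fde) | TEVENT_FD_READ);
}

/* no readable handler: stop polling for read to avoid spinning,
 * but schedule a timer to re-arm the error check */
static void pause_reading(struct tevent_context *ev,
                          struct tevent_fd *fde)
{
        tevent_fd_set_flags(fde,
                            tevent_fd_get_flags(fde) & ~TEVENT_FD_READ);
        tevent_add_timer(ev, NULL, tevent_timeval_current_ofs(1, 0),
                         reenable_read, fde);
}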
And now that our tsocket_bsd_error() helper checks for POLLRDHUP,
we can check if the socket is in an error state before calling the
writable handler when TEVENT_FD_READ was reported.
Only on error will we call the writable handler, which will pick up
the error without calling writev().
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15202
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Ralph Boehme <slow@samba.org>
If we found that the connection is broken, there's no point
in trying to use it anymore, so just return the first error we detected.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15202
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Ralph Boehme <slow@samba.org>
This also returns an error if we got TCP_FIN from the peer,
which is only reported by an explicit POLLRDHUP check.
Also on FreeBSD getsockopt(fd, SOL_SOCKET, SO_ERROR) fetches
and resets the error, so a 2nd call no longer returns an error.
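The checks come down to something like this hedged sketch (standard
poll(2)/getsockopt(2) usage; the cached_error parameter reflects the
FreeBSD fetch-and-reset behaviour described above):

#define _GNU_SOURCE             /* for POLLRDHUP on Linux */
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>

/* Fetch the socket error once and remember it: on FreeBSD the
 * getsockopt() call itself clears SO_ERROR, so the first error we
 * see has to be kept. A peer TCP_FIN shows up only via POLLRDHUP. */
static int sock_error_once(int fd, int *cached_error)
{
        struct pollfd pfd = { .fd = fd, .events = POLLRDHUP };
        int err = 0;
        socklen_t len = sizeof(err);

        if (*cached_error != 0) {
                return *cached_error;
        }
        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == -1) {
                err = errno;
        }
        if (err == 0 && poll(&pfd, 1, 0) == 1 &&
            (pfd.revents & (POLLRDHUP | POLLHUP | POLLERR))) {
                err = EPIPE;    /* peer has shut down the connection */
        }
        *cached_error = err;
        return err;
}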
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15202
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Ralph Boehme <slow@samba.org>
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15202
Pair-Programmed-With: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Andrew Bartlett <abartlet@samba.org>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
Autobuild-User(master): Andrew Bartlett <abartlet@samba.org>
Autobuild-Date(master): Wed Oct 5 05:23:50 UTC 2022 on sn-devel-184
These tests are redeclared later and so are never used. Give them new
names so that they will be run again.
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
Getting an HMAC too long to fit our array is a programming error. It
should always be 64 bytes exactly.
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
This matches the other comparisons against krbtgt, kadmin, etc., which
are all case-sensitive.
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
The return codes of these functions are not often checked. Throwing an
exception ensures we won't continue blindly on if DN manipulation fails.
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Andrew Bartlett <abartlet@samba.org>
We presumably meant to clear this bit, rather than clearing all bits
other than it.
Signed-off-by: Joseph Sutton <josephsutton@catalyst.net.nz>
Reviewed-by: Jeremy Allison <jra@samba.org>
Autobuild-User(master): Jeremy Allison <jra@samba.org>
Autobuild-Date(master): Mon Oct 3 21:05:31 UTC 2022 on sn-devel-184