mirror of https://github.com/samba-team/samba.git synced 2025-01-05 09:18:06 +03:00
samba-mirror/testdata/compression/README
Douglas Bagnall ce7ea07d07 testdata: move compression examples to re-use with lzxpress plain
Everything that is in testdata/compression/lzxpress-huffman/ can also
be used for lzxpress plain tests, which is something we really need.

Signed-off-by: Douglas Bagnall <douglas.bagnall@catalyst.net.nz>
Reviewed-by: Joseph Sutton <josephsutton@catalyst.net.nz>
2022-12-01 22:56:40 +00:00


# Test vectors for MS-XCA [de-]compression
There are currently two supported variants of the Xpress Compression
Algorithm, "Plain LZ77" and "LZ77 + Huffman". For each we have two
directories of files compressed on Windows, corresponding to the two
compression levels that Windows offers.
The subdirectories are

  ./decompressed            - test files to compress, with .decomp extension.
  ./compressed-huffman      - LZ77+Huffman compressed, with .lzhuff extension.
  ./compressed-more-huffman - LZ77+Huffman compressed, with .lzhuff extension.
  ./compressed-plain        - Plain LZ77 compressed, with .lzplain extension.
  ./compressed-more-plain   - Plain LZ77 compressed, with .lzplain extension.

where the compressed-more-* directories hold the files that Windows put
more effort into compressing (largely in vain -- they are similar in
size). Windows probably does not use this more effortful compression
in network protocols, but these files must be decompressible.
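
The layout above can be sketched as a lookup from one decompressed
file to its four compressed counterparts. This is a minimal Python
sketch, not code from the Samba tree. It assumes that a compressed
file keeps the base name of its .decomp source and that only the
directory and extension change, which this README does not state. The
function name and the example file name are hypothetical.

```python
from pathlib import Path

# Directory name -> file extension, as listed in this README.
COMPRESSED_DIRS = {
    "compressed-huffman": ".lzhuff",
    "compressed-more-huffman": ".lzhuff",
    "compressed-plain": ".lzplain",
    "compressed-more-plain": ".lzplain",
}


def compressed_counterparts(decomp_path, base_dir):
    """Return {directory name: expected path} for one .decomp file.

    ASSUMPTION: the compressed file shares the stem of the decompressed
    file; only the directory and the extension differ.
    """
    stem = Path(decomp_path).stem
    return {
        dirname: Path(base_dir) / dirname / (stem + ext)
        for dirname, ext in COMPRESSED_DIRS.items()
    }


if __name__ == "__main__":
    # "example.decomp" is a made-up name used only for illustration.
    found = compressed_counterparts("decompressed/example.decomp",
                                    "testdata/compression")
    for dirname, path in found.items():
        print(dirname, "->", path)
```

A test harness could use such a mapping to skip any counterpart that
does not exist on disk rather than failing outright.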
The compressed files were made using the Windows Compression API,
which uses the same underlying code as MS-XCA, but which puts some
annoying hurdles in the way. In particular, it won't perform
LZ77+Huffman compression on any file smaller than 300 bytes. The
relationship between the two is covered in various messages in
https://lists.samba.org/archive/cifs-protocol/2022-October/
https://lists.samba.org/archive/cifs-protocol/2022-November/
To recreate these files or add more, use
lib/compression/tests/scripts/generate-windows-test-vectors.c under
Cygwin or MSYS2. This file is also in the decompressed directory.
Some of the decompressed files were found via fuzzing, some are designed
to test one aspect or another of the format, while others are public
domain texts.
These are used in compression and decompression tests.
- For decompression tests, we need the decompressed versions to
  compare against.

- For compression tests, we do not assert that the compressed file is
  identical to the Windows compressed file. Exact equality is not
  expected by MS-XCA, which leaves room for implementation tricks, but
  the size of the compressed file allows us to make ballpark
  assertions about expected compression ratios.
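
The two kinds of check described above might look roughly like the
following Python sketch. It is not the Samba test code. The function
names and the `slack` tolerance are hypothetical, and zlib is used
purely as a stand-in codec so the example runs without an MS-XCA
implementation.

```python
import zlib


def check_decompression(decompress, compressed, expected):
    """Decompressing a stored file must reproduce the .decomp bytes."""
    assert decompress(compressed) == expected


def check_compression(compress, decompress, original, windows_size,
                      slack=1.25):
    """Check a round trip and a ballpark size, not byte equality.

    The compressed bytes are never compared with the Windows file.
    ASSUMPTION: `slack` is an illustrative tolerance, not a value taken
    from the real tests.
    """
    ours = compress(original)
    assert decompress(ours) == original
    assert len(ours) <= windows_size * slack
    return len(ours)


if __name__ == "__main__":
    data = b"the quick brown fox jumps over the lazy dog\n" * 100
    # Stand-in for the Windows-compressed reference file.
    reference = zlib.compress(data, 1)
    check_decompression(zlib.decompress, reference, data)
    size = check_compression(lambda d: zlib.compress(d, 9),
                             zlib.decompress, data, len(reference))
    print("ours:", size, "reference:", len(reference))
```

Comparing sizes rather than bytes lets a compressor make different
match choices from Windows while still catching a large regression in
compression ratio.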