David Sterba fa33b70513 btrfs: add xxhash to fast checksum implementations
[ Upstream commit efcfcbc6a36195c42d98e0ee697baba36da94dc8 ]

The XXHASH implementation is currently CPU-only but is still fast enough
to be considered for synchronous checksumming, like non-generic crc32c.
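
As a rough illustration of the rule described above, the following
userspace sketch classifies a checksum implementation as fast or not. It
is not the kernel code: the enum, the function name csum_impl_is_fast()
and the driver-name strings are hypothetical and chosen only for this
example. The one rule taken from the text is that crc32c counts as fast
only when it is not the generic implementation, while xxhash always does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical checksum identifiers for this sketch (not the kernel's enum). */
enum csum_type {
	CSUM_CRC32C,
	CSUM_XXHASH,
	CSUM_SHA256,
	CSUM_BLAKE2B,
};

/*
 * Return true when the checksum is cheap enough to compute synchronously
 * in the caller's context. driver_name is the name reported by the
 * selected implementation (assumed to contain "generic" for the plain
 * software fallback).
 */
bool csum_impl_is_fast(enum csum_type type, const char *driver_name)
{
	switch (type) {
	case CSUM_CRC32C:
		/* Only a non-generic (accelerated) crc32c qualifies. */
		return driver_name != NULL &&
		       strstr(driver_name, "generic") == NULL;
	case CSUM_XXHASH:
		/* CPU-only, but fast enough regardless of the driver. */
		return true;
	default:
		/* Everything else is treated as slow in this sketch. */
		return false;
	}
}
```

With these assumptions, csum_impl_is_fast(CSUM_CRC32C, "crc32c-generic")
is false, while the same call with CSUM_XXHASH is true whatever the
driver name is.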

A userspace benchmark comparing it to various implementations (patched
hash-speedtest from btrfs-progs):

  Block size:     4096
  Iterations:     1000000
  Implementation: builtin
  Units:          CPU cycles

        NULL-NOP: cycles:     73384294, cycles/i       73
     NULL-MEMCPY: cycles:    228033868, cycles/i      228,    61664.320 MiB/s
      CRC32C-ref: cycles:  24758559416, cycles/i    24758,      567.950 MiB/s
       CRC32C-NI: cycles:   1194350470, cycles/i     1194,    11773.433 MiB/s
  CRC32C-ADLERSW: cycles:   6150186216, cycles/i     6150,     2286.372 MiB/s
  CRC32C-ADLERHW: cycles:    626979180, cycles/i      626,    22427.453 MiB/s
      CRC32C-PCL: cycles:    466746732, cycles/i      466,    30126.699 MiB/s
          XXHASH: cycles:    860656400, cycles/i      860,    16338.188 MiB/s

The comparison covers the purely software implementation (ref), the
current, outdated accelerated implementation using the crc32q instruction
(NI), optimized implementations by M. Adler
(https://stackoverflow.com/questions/17645167/implementing-sse-4-2s-crc32c-in-software/17646775#17646775),
and the best one, taken from the kernel, which uses the PCLMULQDQ
instruction (PCL).
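
The columns of the benchmark table are related by simple arithmetic,
which the sketch below reproduces. The function names are hypothetical
and this is not the hash-speedtest source. The counter frequency is not
stated anywhere in the output; a value of about 3.6 GHz is an assumption
inferred from the reported numbers, and integer truncation for cycles/i
is likewise inferred from the table.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Average cycles per iteration. Integer division truncates, which
 * matches the cycles/i column (e.g. 860656400 / 1000000 -> 860).
 */
uint64_t cycles_per_iter(uint64_t total_cycles, uint64_t iterations)
{
	assert(iterations > 0);
	return total_cycles / iterations;
}

/*
 * Throughput in MiB/s, assuming the cycle counter ticks at counter_hz.
 * counter_hz is an assumption of this sketch, not a benchmark output.
 */
double throughput_mib_s(uint64_t total_cycles, uint64_t iterations,
			uint64_t block_size, double counter_hz)
{
	assert(total_cycles > 0 && counter_hz > 0.0);

	double seconds = (double)total_cycles / counter_hz;
	double mib = (double)block_size * (double)iterations /
		     (1024.0 * 1024.0);

	return mib / seconds;
}
```

For the XXHASH row, throughput_mib_s(860656400, 1000000, 4096, 3.6e9)
gives roughly 16339 MiB/s, close to the reported 16338.188 MiB/s; the
small difference suggests the real counter ran slightly below 3.6 GHz.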

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-01 13:21:55 +01:00