IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
This patch is a CE-optimized assembly implementation for cmac/xcbc/cbcmac.
Benchmark on T-Head Yitian-710 2.75 GHz, the data comes from the 300 mode of
tcrypt, and compared the performance before and after this patch (the driver
used before this patch is XXXmac(sm4-ce)). The abscissas are blocks of
different lengths. The data is tabulated and the unit is Mb/s:
Before:
update-size | 16 64 256 1024 2048 4096 8192
---------------+--------------------------------------------------------
cmac(sm4-ce) | 293.33 403.69 503.76 527.78 531.10 535.46 535.81
xcbc(sm4-ce) | 292.83 402.50 504.02 529.08 529.87 536.55 538.24
cbcmac(sm4-ce) | 318.42 415.79 497.12 515.05 523.15 521.19 523.01
After:
update-size | 16 64 256 1024 2048 4096 8192
---------------+--------------------------------------------------------
cmac-sm4-ce | 371.99 675.28 903.56 971.65 980.57 990.40 991.04
xcbc-sm4-ce | 372.11 674.55 903.47 971.61 980.96 990.42 991.10
cbcmac-sm4-ce | 371.63 675.33 903.23 972.07 981.42 990.93 991.45
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch is a CE-optimized assembly implementation for XTS mode.
Benchmark on T-Head Yitian-710 2.75 GHz, the data comes from the 218 mode of
tcrypt, and compared the performance before and after this patch (the driver
used before this patch is xts(ecb-sm4-ce)). The abscissas are blocks of
different lengths. The data is tabulated and the unit is Mb/s:
Before:
xts(ecb-sm4-ce) | 16 64 128 256 1024 1420 4096
----------------+--------------------------------------------------------------
XTS enc | 117.17 430.56 732.92 1134.98 2007.03 2136.23 2347.20
XTS dec | 116.89 429.02 733.40 1132.96 2006.13 2130.50 2347.92
After:
xts-sm4-ce | 16 64 128 256 1024 1420 4096
----------------+--------------------------------------------------------------
XTS enc | 224.68 798.91 1248.08 1714.60 2413.73 2467.84 2612.62
XTS dec | 229.85 791.34 1237.79 1720.00 2413.30 2473.84 2611.95
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch is a CE-optimized assembly implementation for CTS-CBC mode.
Benchmark on T-Head Yitian-710 2.75 GHz, the data comes from the 218 mode of
tcrypt, and compared the performance before and after this patch (the driver
used before this patch is cts(cbc-sm4-ce)). The abscissas are blocks of
different lengths. The data is tabulated and the unit is Mb/s:
Before:
cts(cbc-sm4-ce) | 16 64 128 256 1024 1420 4096
----------------+--------------------------------------------------------------
CTS-CBC enc | 286.09 297.17 457.97 627.75 868.58 900.80 957.69
CTS-CBC dec | 286.67 285.63 538.35 947.08 2241.03 2577.32 3391.14
After:
cts-cbc-sm4-ce | 16 64 128 256 1024 1420 4096
----------------+--------------------------------------------------------------
CTS-CBC enc | 288.19 428.80 593.57 741.04 911.73 931.80 950.00
CTS-CBC dec | 292.22 468.99 838.23 1380.76 2741.17 3036.42 3409.62
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In the accelerated implementation of the SM4 algorithm using the Crypto
Extension instructions, there are some functions that can be reused in
the upcoming accelerated implementation of the GCM/CCM mode, and the
CBC/CFB encryption is reused in the optimized implementation of SVESM4.
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use a 128-bit swap mask and tbl instruction to simplify the implementation
for generating SM4 rkey_dec.
Also fixed the issue of not being wrapped by kernel_neon_begin/end() when
using the sm4_ce_expand_key() function.
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch does not add new features, but only refactors and simplifies the
implementation of the Crypto Extension acceleration of the SM4 algorithm:
Extract the macro optimized by SM4 Crypto Extension for reuse in the
subsequent optimization of CCM/GCM modes.
Encryption in CBC and CFB modes processes four blocks at a time instead of
one, allowing the ld1 instruction to load 64 bytes of data at a time, which
will reduces unnecessary memory accesses.
CBC/CFB/CTR makes full use of free registers to reduce redundant memory
accesses, and rearranges some instructions to improve out-of-order execution
capabilities.
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The subsequent patches of the series will have an implementation
of SM4-ECB/CBC/CFB/CTR accelerated by the CE instruction set, which
conflicts with the current module name. In order to keep the naming
rules of the AES algorithm consistent, the sm4-ce algorithm is
renamed to sm4-ce-cipher.
In addition, the speed of sm4-ce-cipher is better than that of SM4
NEON. By the way, the priority of the algorithm is adjusted to 300,
which is also to leave room for the priority of SM4 NEON.
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
SM4 library is abstracted from sm4-generic algorithm, sm4-ce can depend on
the SM4 library instead of sm4-generic, and some functions in sm4-generic
do not need to be exported.
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Replace all calls to may_use_simd() in the arm64 crypto code with
crypto_simd_usable(), in order to allow testing the no-SIMD code paths.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
ARMv8.2 specifies special instructions for the SM3 cryptographic hash
and the SM4 symmetric cipher. While it is unlikely that a core would
implement one and not the other, we should only use SM4 instructions
if the SM4 CPU feature bit is set, and we currently check the SM3
feature bit instead. So fix that.
Fixes: e99ce921c468 ("crypto: arm64 - add support for SM4...")
Cc: <stable@vger.kernel.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add support for the SM4 symmetric cipher implemented using the special
SM4 instructions introduced in ARM architecture revision 8.2.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>