7 Commits

Author SHA1 Message Date
Ard Biesheuvel
6ab39f9992 crypto: x86/aria - Use 16 byte alignment for GFNI constant vectors
The GFNI routines in the AVX version of the ARIA implementation now use
explicit VMOVDQA instructions to load the constant input vectors, which
means they must be 16 byte aligned. So ensure that this is the case, by
dropping the section split and the incorrect .align 8 directive, and
emitting the constants into the 16-byte aligned section instead.

Note that the AVX2 version of this code deviates from this pattern, and
does not require a similar fix, given that it loads these contants as
8-byte memory operands, for which AVX2 permits any alignment.

Cc: Taehee Yoo <ap420073@gmail.com>
Fixes: 8b84475318641c2b ("crypto: x86/aria-avx - Do not use avx2 instructions")
Reported-by: syzbot+a6abcf08bad8b18fd198@syzkaller.appspotmail.com
Tested-by: syzbot+a6abcf08bad8b18fd198@syzkaller.appspotmail.com
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-05-24 18:10:27 +08:00
Ard Biesheuvel
52fc482a12 crypto: x86/aria - Use RIP-relative addressing
Prefer RIP-relative addressing where possible, which removes the need
for boot time relocation fixups.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-04-20 18:20:04 +08:00
Taehee Yoo
8b84475318 crypto: x86/aria-avx - Do not use avx2 instructions
vpbroadcastb and vpbroadcastd are not AVX instructions.
But the aria-avx assembly code contains these instructions.
So, kernel panic will occur if the aria-avx works on AVX2 unsupported
CPU.

vbroadcastss, and vpshufb are used to avoid using vpbroadcastb in it.
Unfortunately, this change reduces performance by about 5%.
Also, vpbroadcastd is simply replaced by vmovdqa in it.

Fixes: ba3579e6e45c ("crypto: aria-avx - add AES-NI/AVX/x86_64/GFNI assembler implementation of aria cipher")
Reported-by: Herbert Xu <herbert@gondor.apana.org.au>
Reported-by: Erhard F. <erhard_f@mailbox.org>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-02-14 13:39:33 +08:00
Taehee Yoo
e3cf2f8794 crypto: x86/aria-avx - fix build failure with old binutils
The minimum version of binutils for kernel build is currently 2.23 and
it doesn't support GFNI.
So, it fails to build the aria-avx if the old binutils is used.
The code using GFNI is an optional part of aria-avx.
So, it disables GFNI part in it when the old binutils is used.
In order to check whether the using binutils is supporting GFNI or not,
AS_GFNI is added.

Fixes: ba3579e6e45c ("crypto: aria-avx - add AES-NI/AVX/x86_64/GFNI assembler implementation of aria cipher")
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-20 18:29:31 +08:00
Taehee Yoo
35344cf30f crypto: x86/aria - do not use magic number offsets of aria_ctx
aria-avx assembly code accesses members of aria_ctx with magic number
offset. If the shape of struct aria_ctx is changed carelessly,
aria-avx will not work.
So, we need to ensure accessing members of aria_ctx with correct
offset values, not with magic numbers.

It adds ARIA_CTX_enc_key, ARIA_CTX_dec_key, and ARIA_CTX_rounds in the
asm-offsets.c So, correct offset definitions will be generated.
aria-avx assembly code can access members of aria_ctx safely with
these definitions.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-01-06 17:15:47 +08:00
Eric Biggers
c67b553a4f crypto: x86/aria - fix crash with CFI enabled
aria_aesni_avx_encrypt_16way(), aria_aesni_avx_decrypt_16way(),
aria_aesni_avx_ctr_crypt_16way(), aria_aesni_avx_gfni_encrypt_16way(),
aria_aesni_avx_gfni_decrypt_16way(), and
aria_aesni_avx_gfni_ctr_crypt_16way() are called via indirect function
calls.  Therefore they need to use SYM_TYPED_FUNC_START instead of
SYM_FUNC_START to cause their type hashes to be emitted when the kernel
is built with CONFIG_CFI_CLANG=y.  Otherwise, the code crashes with a
CFI failure.

Fixes: ccace936eec7 ("x86: Add types to indirectly called assembly functions")
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Cc: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-11-25 17:39:19 +08:00
Taehee Yoo
ba3579e6e4 crypto: aria-avx - add AES-NI/AVX/x86_64/GFNI assembler implementation of aria cipher
The implementation is based on the 32-bit implementation of the aria.
Also, aria-avx process steps are the similar to the camellia-avx.
1. Byteslice(16way)
2. Add-round-key.
3. Sbox
4. Diffusion layer.

Except for s-box, all steps are the same as the aria-generic
implementation. s-box step is very similar to camellia and
sm4 implementation.

There are 2 implementations for s-box step.
One is to use AES-NI and affine transformation, which is the same as
Camellia, sm4, and others.
Another is to use GFNI.
GFNI implementation is faster than AES-NI implementation.
So, it uses GFNI implementation if the running CPU supports GFNI.

There are 4 s-boxes in the ARIA and the 2 s-boxes are the same as
AES's s-boxes.

To calculate the first sbox, it just uses the aesenclast and then
inverts shift_row.
No more process is needed for this job because the first s-box is
the same as the AES encryption s-box.

To calculate the second sbox(invert of s1), it just uses the aesdeclast
and then inverts shift_row.
No more process is needed for this job because the second s-box is
the same as the AES decryption s-box.

To calculate the third s-box, it uses the aesenclast,
then affine transformation, which is combined AES inverse affine and
ARIA S2.

To calculate the last s-box, it uses the aesdeclast,
then affine transformation, which is combined X2 and AES forward affine.

The optimized third and last s-box logic and GFNI s-box logic are
implemented by Jussi Kivilinna.

The aria-generic implementation is based on a 32-bit implementation,
not an 8-bit implementation. the aria-avx Diffusion Layer implementation
is based on aria-generic implementation because 8-bit implementation is
not fit for parallel implementation but 32-bit is enough to fit for this.

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-09-24 16:14:44 +08:00