IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
commit dec3d0b1071a0f3194e66a83d26ecf4aa8c5910e upstream.
The ->digest() method of crct10dif-pclmul reads the current CRC value
from the shash_desc context. But this value is uninitialized, causing
crypto_shash_digest() to compute the wrong result. Fix it.
Probably this wasn't noticed before because lib/crc-t10dif.c only uses
crypto_shash_update(), not crypto_shash_digest(). Likewise,
crypto_shash_digest() is not yet tested by the crypto self-tests because
those only test the ahash API which only uses shash init/update/final.
Fixes: 0b95a7f85718 ("crypto: crct10dif - Glue code to cast accelerated CRCT10DIF assembly as a crypto transform")
Cc: <stable@vger.kernel.org> # v3.11+
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 678cce4019d746da6c680c48ba9e6d417803e127 upstream.
The x86_64 implementation of Poly1305 produces the wrong result on some
inputs because poly1305_4block_avx2() incorrectly assumes that when
partially reducing the accumulator, the bits carried from limb 'd4' to
limb 'h0' fit in a 32-bit integer. This is true for poly1305-generic
which processes only one block at a time. However, it's not true for
the AVX2 implementation, which processes 4 blocks at a time and
therefore can produce intermediate limbs about 4x larger.
Fix it by making the relevant calculations use 64-bit arithmetic rather
than 32-bit. Note that most of the carries already used 64-bit
arithmetic, but the d4 -> h0 carry was different for some reason.
To be safe I also made the same change to the corresponding SSE2 code,
though that only operates on 1 or 2 blocks at a time. I don't think
it's really needed for poly1305_block_sse2(), but it doesn't hurt
because it's already x86_64 code. It *might* be needed for
poly1305_2block_sse2(), but overflows aren't easy to reproduce there.
This bug was originally detected by my patches that improve testmgr to
fuzz algorithms against their generic implementation. But also add a
test vector which reproduces it directly (in the AVX2 case).
Fixes: b1ccc8f4b631 ("crypto: poly1305 - Add a four block AVX2 variant for x86_64")
Fixes: c70f4abef07a ("crypto: poly1305 - Add a SSE2 SIMD variant for x86_64")
Cc: <stable@vger.kernel.org> # v4.3+
Cc: Martin Willi <martin@strongswan.org>
Cc: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Martin Willi <martin@strongswan.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
In chacha20-simd, clear the MAY_SLEEP flag in the blkcipher_desc to
prevent sleeping with preemption disabled, under kernel_fpu_begin().
This was fixed upstream incidentally by a large refactoring,
commit 9ae433bc79f9 ("crypto: chacha20 - convert generic and x86
versions to skcipher"). But syzkaller easily trips over this when
running on older kernels, as it's easily reachable via AF_ALG.
Therefore, this patch makes the minimal fix for older kernels.
Fixes: c9320b6dcb89 ("crypto: chacha20 - Add a SSSE3 SIMD variant for x86_64")
Cc: linux-crypto@vger.kernel.org
Cc: Martin Willi <martin@strongswan.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fdb2726f4e61c5e3abc052f547d5a5f6c0dc5504 upstream.
aes_ctrby8_avx-x86_64.S uses the C preprocessor for token pasting
of character sequences that are not valid preprocessor tokens.
While this is allowed when preprocessing assembler files it exposes
an incompatibilty between the clang and gcc preprocessors where
clang does not strip leading white space from macro parameters,
leading to the CONCAT(%xmm, i) macro expansion on line 96 resulting
in a token with a space character embedded in it.
While this could be resolved by deleting the offending space character,
the assembler is perfectly capable of doing the token pasting correctly
for itself so we can just get rid of the preprocessor macros.
Signed-off-by: Michael Davidson <md@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 02f39b2379fb81557ae864ec8f85421c0250c954 upstream.
The crypto code was checking both use_eager_fpu() and
defined(X86_FEATURE_EAGER_FPU). The latter was nonsensical, so
remove it. This will avoid breakage when we remove
X86_FEATURE_EAGER_FPU.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Rik van Riel <riel@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: pbonzini@redhat.com
Link: http://lkml.kernel.org/r/1475627678-20788-2-git-send-email-riel@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c207aee48037abca71c669cbec407b9891965c34 upstream.
In preparation for an objtool rewrite which will have broader checks,
whitelist functions and files which cause problems because they do
unusual things with the stack.
These whitelists serve as a TODO list for which functions and files
don't yet have undwarf unwinder coverage. Eventually most of the
whitelists can be removed in favor of manual CFI hint annotations or
objtool improvements.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/7f934a5d707a574bda33ea282e9478e627fb1829.1498659915.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8f461b1e02ed546fbd0f11611138da67fd85a30f upstream.
With ecb-cast5-avx, if a 128+ byte scatterlist element followed a
shorter one, then the algorithm accidentally encrypted/decrypted only 8
bytes instead of the expected 128 bytes. Fix it by setting the
encryption/decryption 'fn' correctly.
Fixes: c12ab20b162c ("crypto: cast5/avx - avoid using temporary stack buffers")
Cc: <stable@vger.kernel.org> # v3.8+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a208fa8f33031b9e0aba44c7d1b7e68eb0cbd29e upstream.
We need to consistently enforce that keyed hashes cannot be used without
setting the key. To do this we need a reliable way to determine whether
a given hash algorithm is keyed or not. AF_ALG currently does this by
checking for the presence of a ->setkey() method. However, this is
actually slightly broken because the CRC-32 algorithms implement
->setkey() but can also be used without a key. (The CRC-32 "key" is not
actually a cryptographic key but rather represents the initial state.
If not overridden, then a default initial state is used.)
Prepare to fix this by introducing a flag CRYPTO_ALG_OPTIONAL_KEY which
indicates that the algorithm has a ->setkey() method, but it is not
required to be called. Then set it on all the CRC-32 algorithms.
The same also applies to the Adler-32 implementation in Lustre.
Also, the cryptd and mcryptd templates have to pass through the flag
from their underlying algorithm.
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d8c7fe9f2a486a6e5f0d5229ca43807af5ab22c6 upstream.
Using %rbp as a temporary register breaks frame pointer convention and
breaks stack traces when unwinding from an interrupt in the crypto code.
In twofish-3way, we can't simply replace %rbp with another register
because there are none available. Instead, we use the stack to hold the
values that %rbp, %r11, and %r12 were holding previously. Each of these
values represents the half of the output from the previous Feistel round
that is being passed on unchanged to the following round. They are only
used once per round, when they are exchanged with %rax, %rbx, and %rcx.
As a result, we free up 3 registers (one per block) and can reassign
them so that %rbp is not used, and additionally %r14 and %r15 are not
used so they do not need to be saved/restored.
There may be a small overhead caused by replacing 'xchg REG, REG' with
the needed sequence 'mov MEM, REG; mov REG, MEM; mov REG, REG' once per
round. But, counterintuitively, when I tested "ctr-twofish-3way" on a
Haswell processor, the new version was actually about 2% faster.
(Perhaps 'xchg' is not as well optimized as plain moves.)
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eff84b379089cd8b4e83599639c1f5f6e34ef7bf upstream.
The SHA-512 multibuffer code keeps track of the number of blocks pending
in each lane. The minimum of these values is used to identify the next
lane that will be completed. Unused lanes are set to a large number
(0xFFFFFFFF) so that they don't affect this calculation.
However, it was forgotten to set the lengths to this value in the
initial state, where all lanes are unused. As a result it was possible
for sha512_mb_mgr_get_comp_job_avx2() to select an unused lane, causing
a NULL pointer dereference. Specifically this could happen in the case
where ->update() was passed fewer than SHA512_BLOCK_SIZE bytes of data,
so it then called sha_complete_job() without having actually submitted
any blocks to the multi-buffer code. This hit a NULL pointer
dereference if another task happened to have submitted blocks
concurrently to the same CPU and the flush timer had not yet expired.
Fix this by initializing sha512_mb_mgr->lens correctly.
As usual, this bug was found by syzkaller.
Fixes: 45691e2d9b18 ("crypto: sha512-mb - submit/flush routines for AVX2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a16e772e664b9a261424107784804cffc8894977 upstream.
Since Poly1305 requires a nonce per invocation, the Linux kernel
implementations of Poly1305 don't use the crypto API's keying mechanism
and instead expect the key and nonce as the first 32 bytes of the data.
But ->setkey() is still defined as a stub returning an error code. This
prevents Poly1305 from being used through AF_ALG and will also break it
completely once we start enforcing that all crypto API users (not just
AF_ALG) call ->setkey() if present.
Fix it by removing crypto_poly1305_setkey(), leaving ->setkey as NULL.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9c674e1e2f9e24fa4392167efe343749008338e0 upstream.
GCM can be invoked with a zero destination buffer. This is possible if
the AAD and the ciphertext have zero lengths and only the tag exists in
the source buffer (i.e. a source buffer cannot be zero). In this case,
the GCM cipher only performs the authentication and no decryption
operation.
When the destination buffer has zero length, it is possible that no page
is mapped to the SG pointing to the destination. In this case,
sg_page(req->dst) is an invalid access. Therefore, page accesses should
only be allowed if the req->dst->length is non-zero which is the
indicator that a page must exist.
This fixes a crash that can be triggered by user space via AF_ALG.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ecaaab5649781c5a0effdaf298a925063020500e upstream.
When asked to encrypt or decrypt 0 bytes, both the generic and x86
implementations of Salsa20 crash in blkcipher_walk_done(), either when
doing 'kfree(walk->buffer)' or 'free_page((unsigned long)walk->page)',
because walk->buffer and walk->page have not been initialized.
The bug is that Salsa20 is calling blkcipher_walk_done() even when
nothing is in 'walk.nbytes'. But blkcipher_walk_done() is only meant to
be called when a nonzero number of bytes have been provided.
The broken code is part of an optimization that tries to make only one
call to salsa20_encrypt_bytes() to process inputs that are not evenly
divisible by 64 bytes. To fix the bug, just remove this "optimization"
and use the blkcipher_walk API the same way all the other users do.
Reproducer:
#include <linux/if_alg.h>
#include <sys/socket.h>
#include <unistd.h>
int main()
{
int algfd, reqfd;
struct sockaddr_alg addr = {
.salg_type = "skcipher",
.salg_name = "salsa20",
};
char key[16] = { 0 };
algfd = socket(AF_ALG, SOCK_SEQPACKET, 0);
bind(algfd, (void *)&addr, sizeof(addr));
reqfd = accept(algfd, 0, 0);
setsockopt(algfd, SOL_ALG, ALG_SET_KEY, key, sizeof(key));
read(reqfd, key, sizeof(key));
}
Reported-by: syzbot <syzkaller@googlegroups.com>
Fixes: eb6f13eb9f81 ("[CRYPTO] salsa20_generic: Fix multi-page processing")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5dfeaac15f2b1abb5a53c9146041c7235eb9aa04 upstream.
struct sha256_ctx_mgr allocated in sha256_mb_mod_init() via kzalloc()
and later passed in sha256_mb_flusher_mgr_flush_avx2() function where
instructions vmovdqa used to access the struct. vmovdqa requires
16-bytes aligned argument, but nothing guarantees that struct
sha256_ctx_mgr will have that alignment. Unaligned vmovdqa will
generate GP fault.
Fix this by replacing vmovdqa with vmovdqu which doesn't have alignment
requirements.
Fixes: a377c6b1876e ("crypto: sha256-mb - submit/flush routines for AVX2")
Reported-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Tim Chen
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d041b557792c85677f17e08eee535eafbd6b9aa2 upstream.
struct sha1_ctx_mgr allocated in sha1_mb_mod_init() via kzalloc()
and later passed in sha1_mb_flusher_mgr_flush_avx2() function where
instructions vmovdqa used to access the struct. vmovdqa requires
16-bytes aligned argument, but nothing guarantees that struct
sha1_ctx_mgr will have that alignment. Unaligned vmovdqa will
generate GP fault.
Fix this by replacing vmovdqa with vmovdqu which doesn't have alignment
requirements.
Fixes: 2249cbb53ead ("crypto: sha-mb - SHA1 multibuffer submit and flush routines for AVX2")
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8861249c740fc4af9ddc5aee321eafefb960d7c6 upstream.
It was reported that the sha1 AVX2 function(sha1_transform_avx2) is
reading ahead beyond its intended data, and causing a crash if the next
block is beyond page boundary:
http://marc.info/?l=linux-crypto-vger&m=149373371023377
This patch makes sure that there is no overflow for any buffer length.
It passes the tests written by Jan Stancek that revealed this problem:
https://github.com/jstancek/sha1-avx2-crash
I have re-enabled sha1-avx2 by reverting commit
b82ce24426a4071da9529d726057e4e642948667
Fixes: b82ce24426a4 ("crypto: sha1-ssse3 - Disable avx2")
Originally-by: Ilya Albrekht <ilya.albrekht@intel.com>
Tested-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b82ce24426a4071da9529d726057e4e642948667 upstream.
It has been reported that sha1-avx2 can cause page faults by reading
beyond the end of the input. This patch disables it until it can be
fixed.
Fixes: 7c1da8d0d046 ("crypto: sha - SHA1 transform x86_64 AVX2")
Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The rfc4106 encrypy/decrypt helper functions cause an annoying
false-positive warning in allmodconfig if we turn on
-Wmaybe-uninitialized warnings again:
arch/x86/crypto/aesni-intel_glue.c: In function ‘helper_rfc4106_decrypt’:
include/linux/scatterlist.h:67:31: warning: ‘dst_sg_walk.sg’ may be used uninitialized in this function [-Wmaybe-uninitialized]
The problem seems to be that the compiler doesn't track the state of the
'one_entry_in_sg' variable across the kernel_fpu_begin/kernel_fpu_end
section.
This takes the easy way out by adding a bogus initialization, which
should be harmless enough to get the patch into v4.9 so we can turn on
this warning again by default without producing useless output. A
follow-up patch for v4.10 rearranges the code to make the warning go
away.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
1. fix ctx pointer
Use req_ctx which is the ctx for the next job that have
been completed in the lanes instead of the first
completed job rctx, whose completion could have been
called and released.
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
1. fix ctx pointer
Use req_ctx which is the ctx for the next job that have
been completed in the lanes instead of the first
completed job rctx, whose completion could have been
called and released.
2. fix digest copy
Use XMM register to copy another 16 bytes sha256 digest
instead of a regular register.
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
for condition comparison and cleanup multiline comment style
In sha*_ctx_mgr_submit, we currently use the | operator instead of ||
((ctx->partial_block_buffer_length) | (len < SHA1_BLOCK_SIZE))
Switching it to || and remove extraneous paranthesis to
adhere to coding style.
Also cleanup inconsistent multiline comment style.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Currently aesni uses an async ctr(aes) to derive the rfc4106
subkey, which was presumably copied over from the generic rfc4106
code. Over there it's done that way because we already have a
ctr(aes) spawn. But it is simply overkill for aesni since we
have to go get a ctr(aes) from scratch anyway.
This patch simplifies the subkey derivation by using a straight
aes cipher instead.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch introduces the assembly routines to do SHA512 computation on
buffers belonging to several jobs at once. The assembly routines are
optimized with AVX2 instructions that have 4 data lanes and using AVX2
registers.
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch introduces the data structures and prototypes of functions
needed for computing SHA512 hash using multi-buffer. Included are the
structures of the multi-buffer SHA512 job, job scheduler in C and x86
assembly.
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch introduces the routines used to submit and flush buffers
belonging to SHA512 crypto jobs to the SHA512 multibuffer algorithm.
It is implemented mostly in assembly optimized with AVX2 instructions.
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch introduces the multi-buffer job manager which is responsible
for submitting scatter-gather buffers from several SHA512 jobs to the
multi-buffer algorithm. It also contains the flush routine that's called
by the crypto daemon to complete the job when no new jobs arrive before
the deadline of maximum latency of a SHA512 crypto job.
The SHA512 multi-buffer crypto algorithm is defined and initialized in this
patch.
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Until now, there was only support for the SHA1 multibuffer algorithm.
Hence, there was just one sha-mb folder. Now, with the introduction of
the SHA256 multi-buffer algorithm , it is logical to name the existing
folder as sha1-mb.
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch introduces the assembly routines to do SHA256 computation
on buffers belonging to several jobs at once. The assembly routines
are optimized with AVX2 instructions that have 8 data lanes and using
AVX2 registers.
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch introduces the data structures and prototypes of
functions needed for computing SHA256 hash using multi-buffer.
Included are the structures of the multi-buffer SHA256 job,
job scheduler in C and x86 assembly.
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch introduces the routines used to submit and flush buffers
belonging to SHA256 crypto jobs to the SHA256 multibuffer algorithm. It
is implemented mostly in assembly optimized with AVX2 instructions.
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch introduces the multi-buffer job manager which is responsible for
submitting scatter-gather buffers from several SHA256 jobs to the
multi-buffer algorithm. It also contains the flush routine to that's
called by the crypto daemon to complete the job when no new jobs arrive
before the deadline of maximum latency of a SHA256 crypto job.
The SHA256 multi-buffer crypto algorithm is defined and initialized in
this patch.
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Herbert wants the sha1-mb algorithm to have an async implementation:
https://lkml.org/lkml/2016/4/5/286.
Currently, sha1-mb uses an async interface for the outer algorithm
and a sync interface for the inner algorithm. This patch introduces
a async interface for even the inner algorithm.
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch fixes an old bug where requests can be reordered because
some are processed by cryptd while others are processed directly
in softirq context.
The fix is to always postpone to cryptd if there are currently
requests outstanding from the same tfm.
This patch also removes the redundant use of cryptd in the async
init function as init never touches the FPU.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch fixes an old bug where gcm requests can be reordered
because some are processed by cryptd while others are processed
directly in softirq context.
The fix is to always postpone to cryptd if there are currently
requests outstanding from the same tfm.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
On 16-byte requests the optimised version is actually slower than
the generic code, so we should simply use that instead.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Cheers,
Currently there are several checkpatch warnings in the sha1_mb.c file:
'WARNING: line over 80 characters' in the sha1_mb.c file. Also, the
syntax of some multi-line comments are not correct. This patch fixes
these issues.
Signed-off-by: Megha Dey <megha.dey@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Add the MODULE_ALIAS for the cra_driver_name of the different ciphers to
allow an automated loading if a driver name is used.
Signed-off-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pull crypto fix from Herbert Xu:
"Fix a regression that causes sha-mb to crash"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: sha1-mb - make sha1_x8_avx2() conform to C function ABI
Megha Dey reported a kernel panic in crypto code. The problem is that
sha1_x8_avx2() clobbers registers r12-r15 without saving and restoring
them.
Before commit aec4d0e301f1 ("x86/asm/crypto: Simplify stack usage in
sha-mb functions"), those registers were saved and restored by the
callers of the function. I removed them with that commit because I
didn't realize sha1_x8_avx2() clobbered them.
Fix the potential undefined behavior associated with clobbering the
registers and make the behavior less surprising by changing the
registers to be callee saved/restored to conform with the C function
call ABI.
Also, rdx (aka RSP_SAVE) doesn't need to be saved: I verified that none
of the callers rely on it being saved, and it's not a callee-saved
register in the C ABI.
Fixes: aec4d0e301f1 ("x86/asm/crypto: Simplify stack usage in sha-mb functions")
Cc: stable@vger.kernel.org # 4.6
Reported-by: Megha Dey <megha.dey@linux.intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Four instances of incorrect usage of non-static "inline" crept up
in arch/x86, all trivial; cleaning them up:
EVT_TO_HPET_DEV() - made static, it is only used in kernel/hpet.c
Debug version of check_iommu_entries() is an __init function.
Non-debug dummy empty version of it is declared "inline" instead -
which doesn't help to eliminate it (the caller is in a different unit,
inlining doesn't happen).
Switch to non-inlined __init function, which does eliminate it
(by discarding it as part of __init section).
crypto/sha-mb/sha1_mb.c: looks like they just forgot to add "static"
to their two internal inlines, which emitted two unused functions into
vmlinux.
text data bss dec hex filename
95903394 20860288 35991552 152755234 91adc22 vmlinux_before
95903266 20860288 35991552 152755106 91adba2 vmlinux
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/1460739626-12179-1-git-send-email-dvlasenk@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
In sha_complete_job, incorrect mcryptd_hash_request_ctx pointer is used
when check and complete other jobs. If the memory of first completed req
is freed, while still completing other jobs in the func, kernel will
crash since NULL pointer is assigned to RIP.
Cc: <stable@vger.kernel.org>
Signed-off-by: Xiaodong Liu <xiaodong.liu@intel.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>