IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Based on lessons learnt from optimizing the 32-bit version of this driver,
we can simplify the arm64 version considerably, by reordering the final
two stores when the last block is not a multiple of 64 bytes. This removes
the need to use permutation instructions to calculate the elements that are
clobbered by the final overlapping store, given that the store of the
penultimate block now follows it, and that one carries the correct values
for those elements already.
While at it, simplify the overlapping loads as well, by calculating the
address of the final overlapping load upfront, and switching to this
address for every load that would otherwise extend past the end of the
source buffer.
There is no impact on performance, but the resulting code is substantially
smaller and easier to follow.
Cc: Eric Biggers <ebiggers@google.com>
Cc: "Jason A . Donenfeld" <Jason@zx2c4.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>