IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
The commit 5054e778fc ("dm crypt: allocate compound pages if
possible") changed dm-crypt to use compound pages to improve
performance. Unfortunately, there was an oversight: the allocation of
compound pages was not accounted at all. Normal pages are accounted in
a percpu counter cc->n_allocated_pages and dm-crypt is limited to
allocate at most 2% of memory. Because compound pages were not
accounted at all, dm-crypt could allocate memory over the 2% limit.
Fix this by adding the accounting of compound pages, so that memory
consumption of dm-crypt is properly limited.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Fixes: 5054e778fc ("dm crypt: allocate compound pages if possible")
Cc: stable@vger.kernel.org # v6.5+
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Simplify crypt_iv_tcw_whitening() by using crypto_shash_digest() instead
of an init+update+final sequence. This should also improve performance.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Previously, the bip's bi_size has been set before an integrity pages
were added. If a problem occurs in the process of adding pages for
bip, the bi_size mismatch problem must be dealt with.
When the page is successfully added to bvec, the bi_size is updated.
The parts affected by the change were also contained in this commit.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jinyoung Choi <j-young.choi@samsung.com>
Tested-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20230803024956epcms2p38186a17392706650c582d38ef3dbcd32@epcms2p3
Signed-off-by: Jens Axboe <axboe@kernel.dk>
API:
- Add linear akcipher/sig API.
- Add tfm cloning (hmac, cmac).
- Add statesize to crypto_ahash.
Algorithms:
- Allow only odd e and restrict value in FIPS mode for RSA.
- Replace LFSR with SHA3-256 in jitter.
- Add interface for gathering of raw entropy in jitter.
Drivers:
- Fix race on data_avail and actual data in hwrng/virtio.
- Add hash and HMAC support in starfive.
- Add RSA algo support in starfive.
- Add support for PCI device 0x156E in ccp.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmSdECcACgkQxycdCkmx
i6dW3g//a4DR6aaqYF8pU4svAzO56a0Plx3DVHUiJ4ygRB7xOzrQqXjCren6wY2a
LFuetwxebAhIAPsC79vI+3j8VAIlU9cNVqOxBIJHGY7wFO4m1AjqBjlealzqLrth
+nEIeUibqLeRw7imOO4adzSsKuSQgyU5rPtKWfrGqqI3RhuMgfWroCtmJ82jmq5l
uMZgB+aGGkzyXztxubHRPeJ3nOFEzo95SscpJ43lOjMcURRBhEa+20jXDhUGwpI7
9ycFV31AW+tfkIprAcliiIzZuwIbzlCkte6AxjAVsN100T/wh9JS1Y+uf1P0oZ9y
AUQQKyc8/QpSkzHZPTncat5P6zta28r8Q5neCvEEEGGuOE8Oc6kb0Os+RE5ANMU4
2A/zrKGOMIWeEWwXGc51xT3gxyl/Rn5wLw1pW7Lm4d5osGT9jiVXx/g66hKLpagJ
jegI6CqgvUajkRNi7JPVnSAauu0Ay8O6pU37/8gLOXNGVZBqONpRimk9qB05LNSF
QYzM2sgYv1tQEmjnG8jLhF5Z8brnqYTv2TZwBX43W10EDQNqUYUDff9Flean5xCb
+2mxJc81rgtUffnMXyYvQwKLhVKoLpeLR6Ts455S5aP06WAfoyEJyYTA/LHG24GX
H2HdS9g5y/K15k9yygMWaXgAx7O7MjM9gEa2VQakhnByj/eQM0s=
=rOLu
-----END PGP SIGNATURE-----
Merge tag 'v6.5-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
"API:
- Add linear akcipher/sig API
- Add tfm cloning (hmac, cmac)
- Add statesize to crypto_ahash
Algorithms:
- Allow only odd e and restrict value in FIPS mode for RSA
- Replace LFSR with SHA3-256 in jitter
- Add interface for gathering of raw entropy in jitter
Drivers:
- Fix race on data_avail and actual data in hwrng/virtio
- Add hash and HMAC support in starfive
- Add RSA algo support in starfive
- Add support for PCI device 0x156E in ccp"
* tag 'v6.5-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (85 commits)
crypto: akcipher - Do not copy dst if it is NULL
crypto: sig - Fix verify call
crypto: akcipher - Set request tfm on sync path
crypto: sm2 - Provide sm2_compute_z_digest when sm2 is disabled
hwrng: imx-rngc - switch to DEFINE_SIMPLE_DEV_PM_OPS
hwrng: st - keep clock enabled while hwrng is registered
hwrng: st - support compile-testing
hwrng: imx-rngc - fix the timeout for init and self check
KEYS: asymmetric: Use new crypto interface without scatterlists
KEYS: asymmetric: Move sm2 code into x509_public_key
KEYS: Add forward declaration in asymmetric-parser.h
crypto: sig - Add interface for sign/verify
crypto: akcipher - Add sync interface without SG lists
crypto: cipher - On clone do crypto_mod_get()
crypto: api - Add __crypto_alloc_tfmgfp
crypto: api - Remove crypto_init_ops()
crypto: rsa - allow only odd e and restrict value in FIPS mode
crypto: geniv - Split geniv out of AEAD Kconfig option
crypto: algboss - Add missing dependency on RNG2
crypto: starfive - Add RSA algo support
...
- Fix DM crypt target's crypt_ctr_cipher_new return value on invalid
AEAD cipher.
- Fix DM flakey testing target's write bio corruption feature to
corrupt the data of a cloned bio instead of the original.
- Add random_read_corrupt and random_write_corrupt features to DM
flakey target.
- Fix ABBA deadlock in DM thin metadata by resetting associated bufio
client rather than destroying and recreating it.
- A couple other small DM thinp cleanups.
- Update DM core to support disabling block core IO stats accounting
and optimize away code that isn't needed if stats are disabled.
- Other small DM core cleanups.
- Improve DM integrity target to not require so much memory on 32 bit
systems. Also only allocate the recalculate buffer as needed (and
increasingly reduce its size on allocation failure).
- Update DM integrity to use %*ph for printing hexdump of a small
buffer. Also update DM integrity documentation.
- Various DM core ioctl interface hardening. Now more careful about
alignment of structures and processing of input passed to the kernel
from userspace. Also disallow the creation of DM devices named
"control", "." or ".."
- Eliminate GFP_NOIO workarounds for __vmalloc and kvmalloc in DM
core's ioctl and bufio code.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAmScyjAACgkQxSPxCi2d
A1pilwgAucNIAB6uN4ke4WZrMVxSFntUkqDTkCs2Ycw+W4Tf1Mtrj/4WeBzFdJaA
oZMK04LUGaFFXn+halsCDzB354yT9C7V/KfXW8pCM1c9BRz4e8272i2HSN4WwD5n
BU4gVaOV5BwxynfF3Z5siRraad1AwmdoRGGsqzVRESAKaObXU//1tnO42UhxRVhn
nzFqhIm0xRcLAd8xIBlMsZQGIloicdDP9wZdWzTEDspQiwR2dFRmH9bUF8OmsS+h
KwhtDty7aZO+4gJ1ccBImijzQCmbAo7dmFhDfoLXaA5Jt6UwTXMeBHm4aUPMnvQe
NVXoRZJodDemwM642Q/Tx1SpsX6QmA==
=4R7u
-----END PGP SIGNATURE-----
Merge tag 'for-6.5/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper updates from Mike Snitzer:
- Update DM crypt to allocate compound pages if possible
- Fix DM crypt target's crypt_ctr_cipher_new return value on invalid
AEAD cipher
- Fix DM flakey testing target's write bio corruption feature to
corrupt the data of a cloned bio instead of the original
- Add random_read_corrupt and random_write_corrupt features to DM
flakey target
- Fix ABBA deadlock in DM thin metadata by resetting associated bufio
client rather than destroying and recreating it
- A couple other small DM thinp cleanups
- Update DM core to support disabling block core IO stats accounting
and optimize away code that isn't needed if stats are disabled
- Other small DM core cleanups
- Improve DM integrity target to not require so much memory on 32 bit
systems. Also only allocate the recalculate buffer as needed (and
increasingly reduce its size on allocation failure)
- Update DM integrity to use %*ph for printing hexdump of a small
buffer. Also update DM integrity documentation
- Various DM core ioctl interface hardening. Now more careful about
alignment of structures and processing of input passed to the kernel
from userspace.
Also disallow the creation of DM devices named "control", "." or ".."
- Eliminate GFP_NOIO workarounds for __vmalloc and kvmalloc in DM
core's ioctl and bufio code
* tag 'for-6.5/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (28 commits)
dm: get rid of GFP_NOIO workarounds for __vmalloc and kvmalloc
dm integrity: scale down the recalculate buffer if memory allocation fails
dm integrity: only allocate recalculate buffer when needed
dm integrity: reduce vmalloc space footprint on 32-bit architectures
dm ioctl: Refuse to create device named "." or ".."
dm ioctl: Refuse to create device named "control"
dm ioctl: Avoid double-fetch of version
dm ioctl: structs and parameter strings must not overlap
dm ioctl: Avoid pointer arithmetic overflow
dm ioctl: Check dm_target_spec is sufficiently aligned
Documentation: dm-integrity: Document an example of how the tunables relate.
Documentation: dm-integrity: Document default values.
Documentation: dm-integrity: Document the meaning of "buffer".
Documentation: dm-integrity: Fix minor grammatical error.
dm integrity: Use %*ph for printing hexdump of a small buffer
dm thin: disable discards for thin-pool if no_discard_passdown
dm: remove stale/redundant dm_internal_{suspend,resume} prototypes in dm.h
dm: skip dm-stats work in alloc_io() unless needed
dm: avoid needless dm_io access if all IO accounting is disabled
dm: support turning off block-core's io stats accounting
...
- Yosry has also eliminated cgroup's atomic rstat flushing.
- Nhat Pham adds the new cachestat() syscall. It provides userspace
with the ability to query pagecache status - a similar concept to
mincore() but more powerful and with improved usability.
- Mel Gorman provides more optimizations for compaction, reducing the
prevalence of page rescanning.
- Lorenzo Stoakes has done some maintanance work on the get_user_pages()
interface.
- Liam Howlett continues with cleanups and maintenance work to the maple
tree code. Peng Zhang also does some work on maple tree.
- Johannes Weiner has done some cleanup work on the compaction code.
- David Hildenbrand has contributed additional selftests for
get_user_pages().
- Thomas Gleixner has contributed some maintenance and optimization work
for the vmalloc code.
- Baolin Wang has provided some compaction cleanups,
- SeongJae Park continues maintenance work on the DAMON code.
- Huang Ying has done some maintenance on the swap code's usage of
device refcounting.
- Christoph Hellwig has some cleanups for the filemap/directio code.
- Ryan Roberts provides two patch series which yield some
rationalization of the kernel's access to pte entries - use the provided
APIs rather than open-coding accesses.
- Lorenzo Stoakes has some fixes to the interaction between pagecache
and directio access to file mappings.
- John Hubbard has a series of fixes to the MM selftesting code.
- ZhangPeng continues the folio conversion campaign.
- Hugh Dickins has been working on the pagetable handling code, mainly
with a view to reducing the load on the mmap_lock.
- Catalin Marinas has reduced the arm64 kmalloc() minimum alignment from
128 to 8.
- Domenico Cerasuolo has improved the zswap reclaim mechanism by
reorganizing the LRU management.
- Matthew Wilcox provides some fixups to make gfs2 work better with the
buffer_head code.
- Vishal Moola also has done some folio conversion work.
- Matthew Wilcox has removed the remnants of the pagevec code - their
functionality is migrated over to struct folio_batch.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZJejewAKCRDdBJ7gKXxA
joggAPwKMfT9lvDBEUnJagY7dbDPky1cSYZdJKxxM2cApGa42gEA6Cl8HRAWqSOh
J0qXCzqaaN8+BuEyLGDVPaXur9KirwY=
=B7yQ
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull mm updates from Andrew Morton:
- Yosry Ahmed brought back some cgroup v1 stats in OOM logs
- Yosry has also eliminated cgroup's atomic rstat flushing
- Nhat Pham adds the new cachestat() syscall. It provides userspace
with the ability to query pagecache status - a similar concept to
mincore() but more powerful and with improved usability
- Mel Gorman provides more optimizations for compaction, reducing the
prevalence of page rescanning
- Lorenzo Stoakes has done some maintanance work on the
get_user_pages() interface
- Liam Howlett continues with cleanups and maintenance work to the
maple tree code. Peng Zhang also does some work on maple tree
- Johannes Weiner has done some cleanup work on the compaction code
- David Hildenbrand has contributed additional selftests for
get_user_pages()
- Thomas Gleixner has contributed some maintenance and optimization
work for the vmalloc code
- Baolin Wang has provided some compaction cleanups,
- SeongJae Park continues maintenance work on the DAMON code
- Huang Ying has done some maintenance on the swap code's usage of
device refcounting
- Christoph Hellwig has some cleanups for the filemap/directio code
- Ryan Roberts provides two patch series which yield some
rationalization of the kernel's access to pte entries - use the
provided APIs rather than open-coding accesses
- Lorenzo Stoakes has some fixes to the interaction between pagecache
and directio access to file mappings
- John Hubbard has a series of fixes to the MM selftesting code
- ZhangPeng continues the folio conversion campaign
- Hugh Dickins has been working on the pagetable handling code, mainly
with a view to reducing the load on the mmap_lock
- Catalin Marinas has reduced the arm64 kmalloc() minimum alignment
from 128 to 8
- Domenico Cerasuolo has improved the zswap reclaim mechanism by
reorganizing the LRU management
- Matthew Wilcox provides some fixups to make gfs2 work better with the
buffer_head code
- Vishal Moola also has done some folio conversion work
- Matthew Wilcox has removed the remnants of the pagevec code - their
functionality is migrated over to struct folio_batch
* tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (380 commits)
mm/hugetlb: remove hugetlb_set_page_subpool()
mm: nommu: correct the range of mmap_sem_read_lock in task_mem()
hugetlb: revert use of page_cache_next_miss()
Revert "page cache: fix page_cache_next/prev_miss off by one"
mm/vmscan: fix root proactive reclaim unthrottling unbalanced node
mm: memcg: rename and document global_reclaim()
mm: kill [add|del]_page_to_lru_list()
mm: compaction: convert to use a folio in isolate_migratepages_block()
mm: zswap: fix double invalidate with exclusive loads
mm: remove unnecessary pagevec includes
mm: remove references to pagevec
mm: rename invalidate_mapping_pagevec to mapping_try_invalidate
mm: remove struct pagevec
net: convert sunrpc from pagevec to folio_batch
i915: convert i915_gpu_error to use a folio_batch
pagevec: rename fbatch_count()
mm: remove check_move_unevictable_pages()
drm: convert drm_gem_put_pages() to use a folio_batch
i915: convert shmem_sg_free_table() to use a folio_batch
scatterlist: add sg_set_folio()
...
If the user specifies invalid AEAD cipher, dm-crypt should return the
error returned from crypt_ctr_auth_spec, not -ENOMEM.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
It was reported that allocating pages for the write buffer in dm-crypt
causes measurable overhead [1].
Change dm-crypt to allocate compound pages if they are available. If
not, fall back to the mempool.
[1] https://listman.redhat.com/archives/dm-devel/2023-February/053284.html
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
MAX_CIPHER_BLOCKSIZE is an internal implementation detail and should
not be relied on by users of the Crypto API.
Instead of storing the IV on the stack, allocate it together with
the crypto request.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Reviewed-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Eliminate duplicate boilerplate code for simple modules that contain
a single DM target driver without any additional setup code.
Add a new module_dm() macro, which replaces the module_init() and
module_exit() with template functions that call dm_register_target()
and dm_unregister_target() respectively.
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
Simplifies each DM target's init method by making dm_register_target()
responsible for its error reporting (on behalf of targets).
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
target flag (initially added to allow swap to dm-crypt) to throttle
the amount of outstanding swap bios.
- Fix DM crypt soft lockup warnings by calling cond_resched() from the
cpu intensive loop in dmcrypt_write().
- Fix DM crypt to not access an uninitialized tasklet. This fix allows
for consistent handling of IO completion, by _not_ needlessly punting
to a workqueue when tasklets are not needed.
- Fix DM core's alloc_dev() initialization for DM stats to check for
and propagate alloc_percpu() failure.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAmQd2p0ACgkQxSPxCi2d
A1qvXAf/WCMNXRbFhO35QqukBqS7sUOMfWl1hIEdABRu+3Ul1KHBWzXVYWuWgebw
kr79V3LZG63cLvhreCy64X/0tXLZa0c0AGWZI6rJ/QAozSCs9R8BqOrJnB5GT1o9
/lvmOL31MloMnIKArWseIQViNM97gEHmFpuj0saqitcvNTjjipzxq/wOyhmDQwnE
8rxJpKSHBJXs9X/VyM9FTWxtijTQw3c8wxJJo7eV6TTuLyrErm46tyI1cBQ4vDoa
ogMVWVrf51uTsqL6DqGenDc+kO7CH5lipIJij1bTtKgs3aBNlaiZQC1nPkMST9Ue
hpH61ixAg+bsWi4/xLFafCl6QAGMlA==
=71ya
-----END PGP SIGNATURE-----
Merge tag 'for-6.3/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mike Snitzer:
- Fix DM thin to work as a swap device by using 'limit_swap_bios' DM
target flag (initially added to allow swap to dm-crypt) to throttle
the amount of outstanding swap bios.
- Fix DM crypt soft lockup warnings by calling cond_resched() from the
cpu intensive loop in dmcrypt_write().
- Fix DM crypt to not access an uninitialized tasklet. This fix allows
for consistent handling of IO completion, by _not_ needlessly punting
to a workqueue when tasklets are not needed.
- Fix DM core's alloc_dev() initialization for DM stats to check for
and propagate alloc_percpu() failure.
* tag 'for-6.3/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm stats: check for and propagate alloc_percpu failure
dm crypt: avoid accessing uninitialized tasklet
dm crypt: add cond_resched() to dmcrypt_write()
dm thin: fix deadlock when swapping to thin device
When neither "no_read_workqueue" nor "no_write_workqueue" are enabled,
tasklet_trylock() in crypt_dec_pending() may still return false due to
an uninitialized state, and dm-crypt will unnecessarily do io completion
in io_queue workqueue instead of current context.
Fix this by adding an 'in_tasklet' flag to dm_crypt_io struct and
initialize it to false in crypt_io_init(). Set this flag to true in
kcryptd_queue_crypt() before calling tasklet_schedule(). If set
crypt_dec_pending() will punt io completion to a workqueue.
This also nicely avoids the tasklet_trylock/unlock hack when tasklets
aren't in use.
Fixes: 8e14f61015 ("dm crypt: do not call bio_endio() from the dm-crypt tasklet")
Cc: stable@vger.kernel.org
Reported-by: Hou Tao <houtao1@huawei.com>
Suggested-by: Ignat Korchagin <ignat@cloudflare.com>
Reviewed-by: Ignat Korchagin <ignat@cloudflare.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
slab BUG will occur when kmem_cache_destroy() is called.
- Improve 2 of DM's shrinker names to reflect their use.
- Fix the DM flakey target to not corrupt the zero page. Fix dm-flakey
on 32-bit hughmem systems by using bvec_kmap_local instead of
page_address. Also, fix logic used when imposing the
"corrupt_bio_byte" feature.
- Stop using WQ_UNBOUND for DM verity target's verify_wq because it
causes significant Android latencies on ARM64 (and doesn't show real
benefit on other architectures).
- Add negative check to catch simple case of a DM table referencing
itself. More complex scenarios that use intermediate devices to
self-reference still need to be avoided/handled in userspace.
- Fix DM core's resize to only send one uevent instead of two. This
fixes a race with udev, that if udev wins, will cause udev to miss
uevents (which caused premature unmount attempts by systemd).
- Add cond_resched() to workqueue functions in DM core, dn-thin and
dm-cache so that their loops aren't the cause of unintended cpu
scheduling fairness issues.
- Fix all of DM's checkpatch errors and warnings (famous last words).
Various other small cleanups.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAmPzrP4ACgkQxSPxCi2d
A1quGQgArlqtlYTl3ese9Kxdpq5fta69v77IooF2gp7PJgRzQ624L7gTFaWZE38v
9ib5FRgTe84Nm+H/x0TAJKgoWOhwen24w2G5KMXKOhIOJgXV6xBK0gXV7cQajr6e
RPml8hL6e/1K1IbmGrPn1Mpg6tOlSUM273z8pL+E6IkzIFdU/pay3WN6fcjC5vsM
a3y739KCeo2/fMTCSX5B4owSvwTm1rX/wF4QwdqhgcaHhEqddFmcvmHAn/p7kHxb
WbAT58A5jP5SaRyWv1MLCb8pzOivI8WFxFw4l2Fs/opYTG9jLrmmTejJndWVEE1Q
PFcjFv/L5sRhXGRfH8dqNEbhX9Lubw==
=2o1v
-----END PGP SIGNATURE-----
Merge tag 'for-6.3/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper updates from Mike Snitzer:
- Fix DM cache target to free background tracker work items, otherwise
slab BUG will occur when kmem_cache_destroy() is called.
- Improve 2 of DM's shrinker names to reflect their use.
- Fix the DM flakey target to not corrupt the zero page. Fix dm-flakey
on 32-bit hughmem systems by using bvec_kmap_local instead of
page_address. Also, fix logic used when imposing the
"corrupt_bio_byte" feature.
- Stop using WQ_UNBOUND for DM verity target's verify_wq because it
causes significant Android latencies on ARM64 (and doesn't show real
benefit on other architectures).
- Add negative check to catch simple case of a DM table referencing
itself. More complex scenarios that use intermediate devices to
self-reference still need to be avoided/handled in userspace.
- Fix DM core's resize to only send one uevent instead of two. This
fixes a race with udev, that if udev wins, will cause udev to miss
uevents (which caused premature unmount attempts by systemd).
- Add cond_resched() to workqueue functions in DM core, dn-thin and
dm-cache so that their loops aren't the cause of unintended cpu
scheduling fairness issues.
- Fix all of DM's checkpatch errors and warnings (famous last words).
Various other small cleanups.
* tag 'for-6.3/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (62 commits)
dm: remove unnecessary (void*) conversion in event_callback()
dm ioctl: remove unnecessary check when using dm_get_mdptr()
dm ioctl: assert _hash_lock is held in __hash_remove
dm cache: add cond_resched() to various workqueue loops
dm thin: add cond_resched() to various workqueue loops
dm: add cond_resched() to dm_wq_requeue_work()
dm: add cond_resched() to dm_wq_work()
dm sysfs: make kobj_type structure constant
dm: update targets using system workqueues to use a local workqueue
dm: remove flush_scheduled_work() during local_exit()
dm clone: prefer kvmalloc_array()
dm: declare variables static when sensible
dm: fix suspect indent whitespace
dm ioctl: prefer strscpy() instead of strlcpy()
dm: avoid void function return statements
dm integrity: change macros min/max() -> min_t/max_t where appropriate
dm: fix use of sizeof() macro
dm: avoid 'do {} while(0)' loop in single statement macros
dm log: avoid multiple line dereference
dm log: avoid trailing semicolon in macro
...
'GPL-2.0-only' is used instead of 'GPL-2.0' because SPDX has
deprecated its use.
Suggested-by: John Wiele <jwiele@redhat.com>
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
This patch removes the temporary scaffolding now that the comletion
function signature has been converted.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
This patch adds temporary scaffolding so that the Crypto API
completion function can take a void * instead of crypto_async_request.
Once affected users have been converted this can be removed.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Use strchr() instead of strpbrk() when there is only 1 element in the set
of characters to look for.
This potentially saves a few cycles, but gcc does already account for
optimizing this pattern thanks to it's fold_builtin_strpbrk().
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
The device mapper dm-crypt target is using scnprintf("%02x", cc->key[i]) to
report the current key to userspace. However, this is not a constant-time
operation and it may leak information about the key via timing, via cache
access patterns or via the branch predictor.
Change dm-crypt's key printing to use "%c" instead of "%02x". Also
introduce hex2asc() that carefully avoids any branching or memory
accesses when converting a number in the range 0 ... 15 to an ascii
character.
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Tested-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
This series consists of the usual driver updates (qla2xxx, pm8001,
libsas, smartpqi, scsi_debug, lpfc, iscsi, mpi3mr) plus minor updates
and bug fixes. The high blast radius core update is the removal of
write same, which affects block and several non-SCSI devices. The
other big change, which is more local, is the removal of the SCSI
pointer.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYjzDQyYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishQMYAQDEWUGV
6U0+736AHVtOfiMNfiRN79B1HfXVoHvemnPcTwD/UlndwFfy/3GGOtoZmqEpc73J
Ec1HDuUCE18H1H2QAh0=
=/Ty9
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This series consists of the usual driver updates (qla2xxx, pm8001,
libsas, smartpqi, scsi_debug, lpfc, iscsi, mpi3mr) plus minor updates
and bug fixes.
The high blast radius core update is the removal of write same, which
affects block and several non-SCSI devices. The other big change,
which is more local, is the removal of the SCSI pointer"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (281 commits)
scsi: scsi_ioctl: Drop needless assignment in sg_io()
scsi: bsg: Drop needless assignment in scsi_bsg_sg_io_fn()
scsi: lpfc: Copyright updates for 14.2.0.0 patches
scsi: lpfc: Update lpfc version to 14.2.0.0
scsi: lpfc: SLI path split: Refactor BSG paths
scsi: lpfc: SLI path split: Refactor Abort paths
scsi: lpfc: SLI path split: Refactor SCSI paths
scsi: lpfc: SLI path split: Refactor CT paths
scsi: lpfc: SLI path split: Refactor misc ELS paths
scsi: lpfc: SLI path split: Refactor VMID paths
scsi: lpfc: SLI path split: Refactor FDISC paths
scsi: lpfc: SLI path split: Refactor LS_RJT paths
scsi: lpfc: SLI path split: Refactor LS_ACC paths
scsi: lpfc: SLI path split: Refactor the RSCN/SCR/RDF/EDC/FARPR paths
scsi: lpfc: SLI path split: Refactor PLOGI/PRLI/ADISC/LOGO paths
scsi: lpfc: SLI path split: Refactor base ELS paths and the FLOGI path
scsi: lpfc: SLI path split: Introduce lpfc_prep_wqe
scsi: lpfc: SLI path split: Refactor fast and slow paths to native SLI4
scsi: lpfc: SLI path split: Refactor lpfc_iocbq
scsi: lpfc: Use kcalloc()
...
accounting with focus on fixing wildly inaccurate IO stats for
dm-crypt (and other DM targets that defer bio submission in their
own workqueues). End result is proper IO accounting, made possible
by targets being updated to use the new dm_submit_bio_remap()
interface.
- Add hipri bio polling support (REQ_POLLED) to bio-based DM.
- Reduce dm_io and dm_target_io structs so that a single dm_io (which
contains dm_target_io and first clone bio) weighs in at 256 bytes.
For reference the bio struct is 128 bytes.
- Various other small cleanups, fixes or improvements in DM core and
targets.
- Update MAINTAINERS with my kernel.org email address to allow
distinction between my "upstream" and "Red" Hats.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAmI7ghIACgkQxSPxCi2d
A1q04wgAzaRu186WjfCO8MK0uyv3S52Rw1EsgYealAqoPwQJ9KkW2icvjtwRL+fJ
1+w6qE/Da6QdwXj9lGtp1XIXJFipNJSw3PSaE/tV2cXiBemZlzJ5vR6F6dfeYKmV
/sGas46H2l+aD4Xr7unUmcN/AYrNIFtnucClY3+DlJFPesXQQc9a/XmL9RX9MrN4
MS9wLkh/5QSG3zReEct/4GVmNSJAjFfLkkeFHtLN82jvvDmnszRT5+aJ06WkXeOz
OZmQfOPnJv5MnFUz9DOaRb/fTCoyxzxLnNM5Lt3jyFPk9Jf8Qz9TJ2rgskxsE83u
UsCD/Y/QAdDcrRVB5SS6+yx4AS6uSA==
=cinj
-----END PGP SIGNATURE-----
Merge tag 'for-5.18/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper updates from Mike Snitzer:
- Significant refactoring and fixing of how DM core does bio-based IO
accounting with focus on fixing wildly inaccurate IO stats for
dm-crypt (and other DM targets that defer bio submission in their own
workqueues). End result is proper IO accounting, made possible by
targets being updated to use the new dm_submit_bio_remap() interface.
- Add hipri bio polling support (REQ_POLLED) to bio-based DM.
- Reduce dm_io and dm_target_io structs so that a single dm_io (which
contains dm_target_io and first clone bio) weighs in at 256 bytes.
For reference the bio struct is 128 bytes.
- Various other small cleanups, fixes or improvements in DM core and
targets.
- Update MAINTAINERS with my kernel.org email address to allow
distinction between my "upstream" and "Red" Hats.
* tag 'for-5.18/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (46 commits)
dm: consolidate spinlocks in dm_io struct
dm: reduce size of dm_io and dm_target_io structs
dm: switch dm_target_io booleans over to proper flags
dm: switch dm_io booleans over to proper flags
dm: update email address in MAINTAINERS
dm: return void from __send_empty_flush
dm: factor out dm_io_complete
dm cache: use dm_submit_bio_remap
dm: simplify dm_sumbit_bio_remap interface
dm thin: use dm_submit_bio_remap
dm: add WARN_ON_ONCE to dm_submit_bio_remap
dm: support bio polling
block: add ->poll_bio to block_device_operations
dm mpath: use DMINFO instead of printk with KERN_INFO
dm: stop using bdevname
dm-zoned: remove the ->name field in struct dmz_dev
dm: remove unnecessary local variables in __bind
dm: requeue IO if mapping table not yet available
dm io: remove stale comment block for dm_io()
dm thin metadata: remove unused dm_thin_remove_block and __remove
...
Remove the from_wq argument from dm_sumbit_bio_remap(). Eliminates the
need for dm_sumbit_bio_remap() callers to know whether they are
calling for a workqueue or from the original dm_submit_bio().
Add map_task to dm_io struct, record the map_task in alloc_io and
clear it after all target ->map() calls have completed. Update
dm_sumbit_bio_remap to check if 'current' matches io->map_task rather
than rely on passed 'from_rq' argument.
This change really simplifies the chore of porting each DM target to
using dm_sumbit_bio_remap() because there is no longer the risk of
programming error by not completely knowing all the different contexts
a particular method that calls dm_sumbit_bio_remap() might be used in.
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Use the %pg format specifier to save on stack consuption and code size.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20220304180105.409765-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There are no more end-users of REQ_OP_WRITE_SAME left, so we can start
deleting it.
Link: https://lore.kernel.org/r/20220209082828.2629273-7-hch@lst.de
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Explicitly convert unsigned int in the right of the conditional
expression to int to match the left side operand and the return type,
fixing the following compiler warning:
drivers/md/dm-crypt.c:2593:43: warning: signed and unsigned
type in conditional expression [-Wsign-compare]
Fixes: c538f6ec9f ("dm crypt: add ability to use keys from the kernel key retention service")
Signed-off-by: Aashish Sharma <shraash@google.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Care was taken to support kcryptd_io_read being called from crypt_map
or workqueue. Use of an intermediate CRYPT_MAP_READ_GFP gfp_t
(defined as GFP_NOWAIT) should protect from maintenance burden if that
flag were to change for some reason.
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Pass a block_device to bio_clone_fast and __bio_clone_fast and give
the functions more suitable names.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Link: https://lore.kernel.org/r/20220202160109.108149-14-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pass the block_device and operation that we plan to use this bio for to
bio_alloc_bioset to optimize the assigment. NULL/0 can be passed, both
for the passthrough case on a raw request_queue and to temporarily avoid
refactoring some nasty code.
Also move the gfp_mask argument after the nr_vecs argument for a much
more logical calling convention matching what most of the kernel does.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220124091107.642561-16-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Just open code it next to the bio allocations, which saves a few lines
of code, prepares for future changes and allows to remove the duplicate
bi_opf assignment for the bio_clone_fast case in kcryptd_io_read.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220124091107.642561-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Remove handling of NULL returns from sleeping bio_alloc calls given that
those can't fail.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220124091107.642561-6-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
subsystem. Also enhance both the integrity and crypt targets to emit
events to via dm-audit.
- Various other simple code improvements and cleanups.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAmGJlFkACgkQxSPxCi2d
A1pqwwf/YZ6kNKRQaKF1mbkkHOxa/ULf7qIhi/R0epwJu4j1RGsCACS34EqzLc4c
x15h6flCNj1IBVAqTvMUETYTjTLtyrcfD0yBRWYw2RL0ksHMHyMvd1r/7aE64+pj
EeZk9Xzcx3Gsq9GOzKfYA2AX0PrypkKSjgHK7hgv+Jh5heqkFcnMXSl3l7BQ6vbr
ue9joPSI7+6eVFMDn32KxyHzfm6zZo1nmKZ6tQBBHD1D9yBqWTAhXiyXhRA+BOYH
Tg5wE1fvZ/htyZNEc1cMRArzLF6q9pEU4r8j472N6IcJbhIJzSu0V60zVvexNWG3
fJSIWqlta1KFK8SQttmDmfFnJiFcyw==
=t097
-----END PGP SIGNATURE-----
Merge tag 'for-5.16/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper updates from Mike Snitzer:
- Add DM core support for emitting audit events through the audit
subsystem. Also enhance both the integrity and crypt targets to emit
events to via dm-audit.
- Various other simple code improvements and cleanups.
* tag 'for-5.16/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
dm table: log table creation error code
dm: make workqueue names device-specific
dm writecache: Make use of the helper macro kthread_run()
dm crypt: Make use of the helper macro kthread_run()
dm verity: use bvec_kmap_local in verity_for_bv_block
dm log writes: use memcpy_from_bvec in log_writes_map
dm integrity: use bvec_kmap_local in __journal_read_write
dm integrity: use bvec_kmap_local in integrity_metadata
dm: add add_disk() error handling
dm: Remove redundant flush_workqueue() calls
dm crypt: log aead integrity violations to audit subsystem
dm integrity: log audit events for dm-integrity target
dm: introduce audit event module for device mapper
Replace kthread_create/wake_up_process() with kthread_run()
to simplify the code.
Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Since dm-crypt target can be stacked on dm-integrity targets to
provide authenticated encryption, integrity violations are recognized
here during aead computation. We use the dm-audit submodule to
signal those events to user space, too.
The construction and destruction of crypt device mappings are also
logged as audit events.
Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Split the integrity/metadata handling definitions out into a new header.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20210920123328.1399408-17-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
On systems with many cores using dm-crypt, heavy spinlock contention in
percpu_counter_compare() can be observed when the page allocation limit
for a given device is reached or close to be reached. This is due
to percpu_counter_compare() taking a spinlock to compute an exact
result on potentially many CPUs at the same time.
Switch to non-exact comparison of allocated and allowed pages by using
the value returned by percpu_counter_read_positive() to avoid taking
the percpu_counter spinlock.
This may over/under estimate the actual number of allocated pages by at
most (batch-1) * num_online_cpus().
Currently, batch is bounded by 32. The system on which this issue was
first observed has 256 CPUs and 512GB of RAM. With a 4k page size, this
change may over/under estimate by 31MB. With ~10G (2%) allowed dm-crypt
allocations, this seems an acceptable error. Certainly preferred over
running into the spinlock contention.
This behavior was reproduced on an EC2 c5.24xlarge instance with 96 CPUs
and 192GB RAM as follows, but can be provoked on systems with less CPUs
as well.
* Disable swap
* Tune vm settings to promote regular writeback
$ echo 50 > /proc/sys/vm/dirty_expire_centisecs
$ echo 25 > /proc/sys/vm/dirty_writeback_centisecs
$ echo $((128 * 1024 * 1024)) > /proc/sys/vm/dirty_background_bytes
* Create 8 dmcrypt devices based on files on a tmpfs
* Create and mount an ext4 filesystem on each crypt devices
* Run stress-ng --hdd 8 within one of above filesystems
Total %system usage collected from sysstat goes to ~35%. Write throughput
on the underlying loop device is ~2GB/s. perf profiling an individual
kworker kcryptd thread shows the following profile, indicating spinlock
contention in percpu_counter_compare():
99.98% 0.00% kworker/u193:46 [kernel.kallsyms] [k] ret_from_fork
|
--ret_from_fork
kthread
worker_thread
|
--99.92%--process_one_work
|
|--80.52%--kcryptd_crypt
| |
| |--62.58%--mempool_alloc
| | |
| | --62.24%--crypt_page_alloc
| | |
| | --61.51%--__percpu_counter_compare
| | |
| | --61.34%--__percpu_counter_sum
| | |
| | |--58.68%--_raw_spin_lock_irqsave
| | | |
| | | --58.30%--native_queued_spin_lock_slowpath
| | |
| | --0.69%--cpumask_next
| | |
| | --0.51%--_find_next_bit
| |
| |--10.61%--crypt_convert
| | |
| | |--6.05%--xts_crypt
...
After applying this patch and running the same test, %system usage is
lowered to ~7% and write throughput on the loop device increases
to ~2.7GB/s. perf report shows mempool_alloc() as ~8% rather than ~62%
in the profile and not hitting the percpu_counter() spinlock anymore.
|--8.15%--mempool_alloc
| |
| |--3.93%--crypt_page_alloc
| | |
| | --3.75%--__alloc_pages
| | |
| | --3.62%--get_page_from_freelist
| | |
| | --3.22%--rmqueue_bulk
| | |
| | --2.59%--_raw_spin_lock
| | |
| | --2.57%--native_queued_spin_lock_slowpath
| |
| --3.05%--_raw_spin_lock_irqsave
| |
| --2.49%--native_queued_spin_lock_slowpath
Suggested-by: DJ Gregor <dj@corelight.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Arne Welzel <arne.welzel@corelight.com>
Fixes: 5059353df8 ("dm crypt: limit the number of allocated pages")
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>