IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
- Fix revoke processing at unmount and on read-only remount.
- Refuse reading in inodes with an impossible indirect block height.
- Various minor cleanups.
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAmRH00EUHGFncnVlbmJh
QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTpfSw//a0IG/6tVfPwlCPd1qPKN/JbxHQVa
3JtRJXr5+RJ3vIKmJb+AFfEiPHRKgCk28d9hMbW8j3ieKSy/dPJe8ARIZmu8phpK
dOSiue87v4Pz6Tz2qpb6KgzUENohsqjTy0mEAEWxxZb7mWxUqj7t057hiEp/9NX6
jXcfoNInMxnf2HX3d11LoHXYOX37xvEjyhbZ4OmPn+xzmYQcfhuO96CTrIOe7lKe
U8dbFzDwtoUxYljb16do0OLva/YxaNdH33yCzrixVi9ki1aCIH6A2paznlUCzgF1
wzOhzV3kbz5e4flov34X1Jnd5RVkBmzMMM5bkcJzNg6JOSGL0bLg0PGYiXASAfje
KmsNVvq0yMWLLQIr2gMcEVI1IVDxkNKJruizbFfUs/SWHI55eDUZtI5jjA0v0cCw
5HRWialgCNdC9Wd/4HW+DfpztwxR5+j52TCtqgyRGQcGvkNtwGMeGgh9/sreuHhr
7Lod7LpBVmEzlXD97VIcJUcU2vhXtGA25pKXFHCilXh/MdA2UpmBWkVUkNVS/awr
ZTzoi/WkUIWLLVHxHruLOoJBWit8Mhnuxh1rlY0WD3wDEbnwCZKX8ziTYIsVvkWc
d/xjjafrolC6g1m/RoW3rXO/MLXaE0rRy4nb3MLhwsqaAQyZE6puwo40Kj/xNQqf
68kLauI2QFNpFm8=
=l1E9
-----END PGP SIGNATURE-----
Merge tag 'gfs2-v6.3-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 updates from Andreas Gruenbacher:
- Fix revoke processing at unmount and on read-only remount
- Refuse reading in inodes with an impossible indirect block height
- Various minor cleanups
* tag 'gfs2-v6.3-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: gfs2_ail_empty_gl no log flush on error
gfs2: Issue message when revokes cannot be written
gfs2: Perform second log flush in gfs2_make_fs_ro
gfs2: return errors from gfs2_ail_empty_gl
gfs2: Move variable assignment behind a null pointer check in inode_go_dump
gfs2: Use gfs2_holder_initialized for jindex
gfs2: Eliminate gfs2_trim_blocks
gfs2: Fix inode height consistency check
gfs2: Remove ghs[] from gfs2_unlink
gfs2: Remove ghs[] from gfs2_link
gfs2: Remove duplicate i_nlink check from gfs2_link()
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmRHC3gACgkQxWXV+ddt
WDvI/A//ZzREEE0wNexbuidoTacDVXVJ6LBb2K1eP+HUKfsmd6GYWQDJ9x/ExpKb
T1ehLibCYWLeYxEREFbjXI3x9G8mrvLzvzsqXs/MzJPkmEF1igPddFztidBwvLQH
ey/Bh+cra2bpVhRhkX0Cf09/q/YWp17/d14ZxxW60PMfyhx8RWXejXhHkulOPVv8
+3FL8E0kc2Zjx9ioUwOy/i18LR6YzsCNVXoHzUZuWyWM4A7NG2TZR6FhuLSjlWSZ
3RAnROwr+8i5nR0xchcyYaVMO2LMbqH6mBtHnXCtxCr+4pFrfrvKym+CQco/Xriz
v1y/xDc23XeYXLCVhb0beJ6uRcjaM9+gvDF1oVBSJEv6V7sQr/tEGo/8QRehfEfT
FTro7Lf89R1GOa1IBSkv/T5S25d9LlIID3/g7PbcUBtXNKvLAjDAGTH9bzL4HS5x
/MKwN80GvaGs1KyEfUndbVPIpAwNFDYZPHM7nw1x+JTkIBcHgfjRyAMAC9jrJd0D
730W04c+0nXZtQGtKKsxc3U8y4ewzSJAKx9t7Vgo7+1P6dSRnzvJee3x/5kXV9Yn
MhxxzYDfIN9EcWbASdSm11gY5WZdG3an609pO7nc1T2K4Tuo0SPs4xOR7c3xuZrY
MN5z3QFWyI2ustUuTG+nsd5J81j76DEmj5ymWQfG3SBplTneDM0=
=Jt7p
-----END PGP SIGNATURE-----
Merge tag 'for-6.4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"Mostly core changes and cleanups, some notable fixes and two
performance improvements in directory logging.
The IO path cleanups are removing or refactoring old code, scrub main
loop has been completely rewritten also refactoring old code.
There are some changes to non-btrfs code, mostly trivial, the cgroup
punt bio logic is only moved from generic code.
Performance improvements:
- improve logging changes in a directory during one transaction,
avoid iterating over items and reduce lock contention (fsync time
4x lower)
- when logging directory entries during one transaction, reduce
locking of subvolume trees by checking tree-log instead
(improvement in throughput and latency for concurrent access to a
subvolume)
Notable fixes:
- dev-replace:
- properly honor read mode when requested to avoid reading from
source device
- target device won't be used for eventual read repair, this is
unreliable for NODATASUM files
- when there are unpaired (and unrepairable) metadata during
replace, exit early with error and don't try to finish whole
operation
- scrub ioctl properly rejects unknown flags
- fix global block reserve calculations
- fix partial direct io write when there's a page fault in the
middle, iomap will try to continue with partial request but the
btrfs part did not match that, this can lead to zeros written
instead of data
Core changes:
- io path:
- continued cleanups and refactoring around bio handling
- extent io submit path simplifications and cleanups
- flush write path simplifications and cleanups
- rework logic of passing sync mode of bio, with further cleanups
- rewrite scrub code flow, restructure how the stripes are enumerated
and verified in a more unified way
- allow to set lower threshold for block group reclaim in debug mode
to aid zoned mode testing
- remove obsolete time-based delayed ref throttling logic when
truncating items
- DREW locks are not using percpu variables anymore
- more warning fixes (-Wmaybe-uninitialized)
- u64 division simplifications
- error handling improvements
Non-btrfs code changes:
- push cgroup punt bio logic to btrfs code (there was no other user
of that), the functionality can be now selected separately by
BLK_CGROUP_PUNT_BIO
- crc32c_impl removed after removing last uses in btrfs code
- add btrfs_assertfail() to objtool table"
* tag 'for-6.4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (147 commits)
btrfs: mark btrfs_assertfail() __noreturn
btrfs: fix uninitialized variable warnings
btrfs: use log root when iterating over index keys when logging directory
btrfs: avoid iterating over all indexes when logging directory
btrfs: dev-replace: error out if we have unrepaired metadata error during
btrfs: remove pointless loop at btrfs_get_next_valid_item()
btrfs: scrub: reject unsupported scrub flags
btrfs: reinterpret async discard iops_limit=0 as no delay
btrfs: set default discard iops_limit to 1000
btrfs: remove unused raid56 functions which were dedicated for scrub
btrfs: scrub: remove scrub_bio structure
btrfs: scrub: remove scrub_block and scrub_sector structures
btrfs: scrub: remove the old scrub recheck code
btrfs: scrub: remove the old writeback infrastructure
btrfs: scrub: remove scrub_parity structure
btrfs: scrub: use scrub_stripe to implement RAID56 P/Q scrub
btrfs: scrub: switch scrub_simple_mirror() to scrub_stripe infrastructure
btrfs: scrub: introduce helper to queue a stripe for scrub
btrfs: scrub: introduce error reporting functionality for scrub_stripe
btrfs: scrub: introduce a writeback helper for scrub_stripe
...
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmRHq24ACgkQnJ2qBz9k
QNkowggAuyZ313GrAUNDH2Cvu3PE804LpW6k27VxtIxmgtMSZvuOED0fC99Xt+3y
SnQrJKfw4HVKIDT2sx3ECaiKa+v1UBMG+KmaZ5u5qLjxfjY8YHy6TbHMwY/NV2Aj
jGgJYN+5E9AJqStzuu4C9fUzgOgvi4IzgNX4hp0WbESVk6jbDqW2KGNAv6IRM9ku
qDg54aL8W50uDMgUJysqe180SlDgeYGPBg1OfA7xrzMaBpEK3z5+v52ZlpGNguDP
MZMGwZgFLCP2R5Z9+zqhyK9UuIHUDNE05y20+Hsm1AxlI0iVVt+CahQ9T/6P9OBT
QQ+vnepQR6ORtFBlPCVgyiTQgsO3Tw==
=+MkS
-----END PGP SIGNATURE-----
Merge tag 'fs_for_v6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull ext2, reiserfs, udf, and quota updates from Jan Kara:
"A couple of small fixes and cleanups for ext2, udf, reiserfs, and
quota.
The biggest change is making CONFIG_PRINT_QUOTA_WARNING depend on
BROKEN with an outlook for removing it completely in an year or so"
* tag 'fs_for_v6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
quota: mark PRINT_QUOTA_WARNING as BROKEN
quota: update Kconfig comment
reiserfs: remove unused iter variable
quota: Use register_sysctl_init() for registering fs_dqstats_table
reiserfs: remove unused sched_count variable
ext2: remove redundant assignment to pointer end
quota: make dquot_set_dqinfo return errors from ->write_info
quota: fixup *_write_file_info() to return proper error code
quota: simplify two-level sysctl registration for fs_dqstats_table
udf: use wrapper i_blocksize() in udf_discard_prealloc()
udf: Use folios in udf_adinicb_writepage()
ext2: Check block size validity during mount
ext2: Correct maximum ext2 filesystem block size
* The data=journal writepath has been significantly cleaned up and
simplified, and reduces a large number of data=journal special cases
by Jan Kara.
* Ojaswin Muhoo has replaced linked list used to track extents that
have been used for inode preallocation with a red-black tree in the
multi-block allocator. This improves performance for workloads
which do a large number of random allocating writes.
* Thanks to Kemeng Shi for a lot of cleanup and bug fixes in the
multi-block allocator.
* Matthew wilcox has converted the code paths for reading and writing
ext4 pages to use folios.
* Jason Yan has continued to factor out ext4_fill_super() into smaller
functions for improve ease of maintenance and comprehension.
* Josh Triplett has created an uapi header for ext4 userspace API's.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAmRHS3IACgkQ8vlZVpUN
gaNN7AgAnFiWfk4UqKpBsUL5iQKJgf2K4tjlNXgPd6ghNns0IdFEyeWSHhr6KLv/
SQeoMMyiWaUcTvZs9DokD8U/9M1ELPUiE9W5c9GxJjM86SXp8BlLYSZTiRoNHzGJ
noQpvikj4qTRviK0rA3q5ICTP2eh1ECHMFJy2wcsZQgwnBelUejQHsTGtOwSvFWF
8wMdfuVtAFDZJjzOxzVKfHP22R5HVRWlAU7P1d97qKjBj4Se3+QchI+zdcIrmU9A
tTmCXj57NpTDyLjS9dIDmLygtTv93lOzOmZS8glw0BFonPcd3ObI4RHVxR+V9xu1
lN13YYgBrK6yfApn9L5XL/31PuLfbg==
=VLBx
-----END PGP SIGNATURE-----
Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
"There are a number of major cleanups in ext4 this cycle:
- The data=journal writepath has been significantly cleaned up and
simplified, and reduces a large number of data=journal special
cases by Jan Kara.
- Ojaswin Muhoo has replaced linked list used to track extents that
have been used for inode preallocation with a red-black tree in the
multi-block allocator. This improves performance for workloads
which do a large number of random allocating writes.
- Thanks to Kemeng Shi for a lot of cleanup and bug fixes in the
multi-block allocator.
- Matthew wilcox has converted the code paths for reading and writing
ext4 pages to use folios.
- Jason Yan has continued to factor out ext4_fill_super() into
smaller functions for improve ease of maintenance and
comprehension.
- Josh Triplett has created an uapi header for ext4 userspace API's"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (105 commits)
ext4: Add a uapi header for ext4 userspace APIs
ext4: remove useless conditional branch code
ext4: remove unneeded check of nr_to_submit
ext4: move dax and encrypt checking into ext4_check_feature_compatibility()
ext4: factor out ext4_block_group_meta_init()
ext4: move s_reserved_gdt_blocks and addressable checking into ext4_check_geometry()
ext4: rename two functions with 'check'
ext4: factor out ext4_flex_groups_free()
ext4: use ext4_group_desc_free() in ext4_put_super() to save some duplicated code
ext4: factor out ext4_percpu_param_init() and ext4_percpu_param_destroy()
ext4: factor out ext4_hash_info_init()
Revert "ext4: Fix warnings when freezing filesystem with journaled data"
ext4: Update comment in mpage_prepare_extent_to_map()
ext4: Simplify handling of journalled data in ext4_bmap()
ext4: Drop special handling of journalled data from ext4_quota_on()
ext4: Drop special handling of journalled data from ext4_evict_inode()
ext4: Fix special handling of journalled data from extent zeroing
ext4: Drop special handling of journalled data from extent shifting operations
ext4: Drop special handling of journalled data from ext4_sync_file()
ext4: Commit transaction before writing back pages in data=journal mode
...
Several cleanups and fixes for fs/verity/, including a couple minor
fixes to the changes in 6.3 that added support for Merkle tree block
sizes less than the page size.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCZEdweBQcZWJpZ2dlcnNA
Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK8EDAP91ondqa2EZ9C87NsBby+lPdo+wkekh
VQ4EMxrg1yhUXQD/fvOp+NeSpnqW8BG3Q78KI7Rgvrpr6CtBF4Lfg4jsWQo=
=EEFA
-----END PGP SIGNATURE-----
Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux
Pull fsverity updates from Eric Biggers:
"Several cleanups and fixes for fs/verity/, including a couple minor
fixes to the changes in 6.3 that added support for Merkle tree block
sizes less than the page size"
* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux:
fsverity: reject FS_IOC_ENABLE_VERITY on mode 3 fds
fsverity: explicitly check for buffer overflow in build_merkle_tree()
fsverity: use WARN_ON_ONCE instead of WARN_ON
fs-verity: simplify sysctls with register_sysctl()
fs/buffer.c: use b_folio for fsverity work
A few cleanups for fs/crypto/, and another patch to prepare for the
upcoming CephFS encryption support.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCZEdv1xQcZWJpZ2dlcnNA
Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK/jyAPwNyGlOQGuiFwsu8Mjtppkx4NUHK83K
gp0N/aT2So6C4wD+IlqDVXiTinnmKOoMs/orBsML4Ub/8etuZZDGu4FipgY=
=X+dx
-----END PGP SIGNATURE-----
Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux
Pull fscrypt updates from Eric Biggers:
"A few cleanups for fs/crypto/, and another patch to prepare for the
upcoming CephFS encryption support"
* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux:
fscrypt: optimize fscrypt_initialize()
fscrypt: use WARN_ON_ONCE instead of WARN_ON
fscrypt: new helper function - fscrypt_prepare_lookup_partial()
fs/buffer.c: use b_folio for fscrypt work
When queueing a dispose list to the appropriate "freeme" lists, it
pointlessly queues the objects one at a time to an intermediate list.
Remove a few helpers and just open code a list_move to make it more
clear and efficient. Better document the resulting functions with
kerneldoc comments.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
There have been several bugs over the years where the NFSD splice
actor has attempted to write outside the rq_pages array.
This is a "should never happen" condition, but if for some reason
the pipe splice actor should attempt to walk past the end of
rq_pages, it needs to terminate the READ operation to prevent
corruption of the pointer addresses in the fields just beyond the
array.
A server crash is thus prevented. Since the code is not behaving,
the READ operation returns -EIO to the client. None of the READ
payload data can be trusted if the splice actor isn't operating as
expected.
Suggested-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
The get_expiry() function currently returns a timestamp, and uses the
special return value of 0 to indicate an error.
Unfortunately this causes a problem when 0 is the correct return value.
On a system with no RTC it is possible that the boot time will be seen
to be "3". When exportfs probes to see if a particular filesystem
supports NFS export it tries to cache information with an expiry time of
"3". The intention is for this to be "long in the past". Even with no
RTC it will not be far in the future (at most a second or two) so this
is harmless.
But if the boot time happens to have been calculated to be "3", then
get_expiry will fail incorrectly as it converts the number to "seconds
since bootime" - 0.
To avoid this problem we change get_expiry() to report the error quite
separately from the expiry time. The error is now the return value.
The expiry time is reported through a by-reference parameter.
Reported-by: Jerry Zhang <jerry@skydio.com>
Tested-by: Jerry Zhang <jerry@skydio.com>
Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
lockd needs to be able to hash filehandles for tracepoints. Move the
nfs_fhandle_hash() helper to a common nfs include file.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Currently lockd just dequeues the block and ignores it if the client
sends a GRANT_RES with a status of nlm_lck_denied. That status is an
indicator that the client has rejected the lock, so the right thing to
do is to unlock the lock we were trying to grant.
Reported-by: Yongcheng Yang <yoyang@redhat.com>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2063818
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
After the wait for a grant is done (for whatever reason), nlmclnt_block
updates the status of the nlm_rqst with the status of the block. At the
point it does this, however, the block is still queued its status could
change at any time.
This is particularly a problem when the waiting task is signaled during
the wait. We can end up giving up on the lock just before the GRANTED_MSG
callback comes in, and accept it even though the lock request gets back
an error, leaving a dangling lock on the server.
Since the nlm_wait never lives beyond the end of nlmclnt_lock, put it on
the stack and add functions to allow us to enqueue and dequeue the
block. Enqueue it just before the lock/wait loop, and dequeue it
just after we exit the loop instead of waiting until the end of
the function. Also, scrape the status at the time that we dequeue it to
ensure that it's final.
Reported-by: Yongcheng Yang <yoyang@redhat.com>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2063818
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
The next patch needs struct nlm_wait in fs/lockd/clntproc.c, so move
the definition to a shared header file. As an added clean-up, drop
the unused b_reclaim field.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
It's easily possible for the server to have an outstanding lock when we
go to shut down. When that happens, we often get a warning like this in
the kernel log:
lockd: couldn't shutdown host module for net f0000000!
This is because the shutdown procedures skip removing any hosts that
still have outstanding resources (locks). Eventually, things seem to get
cleaned up anyway, but the log message is unsettling, and server
shutdown doesn't seem to be working the way it was intended.
Ensure that we tear down any resources held on behalf of a client when
tearing one down for server shutdown.
Reported-by: Yongcheng Yang <yoyang@redhat.com>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2063818
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
While we were converting the nfs4_file hashtable to use the kernel's
resizable hashtable data structure, Neil Brown observed that the
list variant (rhltable) would be better for managing nfsd_file items
as well. The nfsd_file hash table will contain multiple entries for
the same inode -- these should be kept together on a list. And, it
could be possible for exotic or malicious client behavior to cause
the hash table to resize itself on every insertion.
A nice simplification is that rhltable_lookup() can return a list
that contains only nfsd_file items that match a given inode, which
enables us to eliminate specialized hash table helper functions and
use the default functions provided by the rhashtable implementation).
Since we are now storing nfsd_file items for the same inode on a
single list, that effectively reduces the number of hash entries
that have to be tracked in the hash table. The mininum bucket count
is therefore lowered.
Light testing with fstests generic/531 show no regressions.
Suggested-by: Neil Brown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
On most filesystems, there is no reason to delay reaping an nfsd_file
just because its underlying inode is still under writeback. nfsd just
relies on client activity or the local flusher threads to do writeback.
The main exception is NFS, which flushes all of its dirty data on last
close. Add a new EXPORT_OP_FLUSH_ON_CLOSE flag to allow filesystems to
signal that they do this, and only skip closing files under writeback on
such filesystems.
Also, remove a redundant NULL file pointer check in
nfsd_file_check_writeback, and clean up nfs's export op flag
definitions.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Acked-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
The last thing that filp_close does is an fput, so don't bother taking
and putting the extra reference.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
David Howells mentioned that he found this bit of code confusing, so
sprinkle in some comments to clarify.
Reported-by: David Howells <dhowells@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
An error from break_lease is non-fatal, so we needn't destroy the
nfsd_file in that case. Just put the reference like we normally would
and return the error.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
test_bit returns bool, so we can just compare the result of that to the
key->gc value without the "!!".
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Since v4 files are expected to be long-lived, there's little value in
closing them out of the cache when there is conflicting access.
Change the comparator to also match the gc value in the key. Change both
of the current users of that key to set the gc value in the key to
"true".
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Pipes themselves do not hold the the pipe lock across IO, and hence are
safe for RWF_NOWAIT/IOCB_NOWAIT usage. The "contract" for NOWAIT is
really "should not do IO under this lock", not strictly that we cannot
block or that the below code is in any way atomic. Pipes fulfil that
criteria.
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In preparation for pipes setting FMODE_NOWAIT on pipes to indicate that
RWF_NOWAIT/IOCB_NOWAIT is fully supported, have splice and vmsplice
clear that file flag. Splice holds the pipe lock around IO and cannot
easily be refactored to avoid that, as splice and pipes are inherently
tied together.
By clearing FMODE_NOWAIT if splice is being used on a pipe, other users
of the pipe will know that the pipe is no longer safe for RWF_NOWAIT
and friends.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEe7vIQRWZI0iWSE3xu+CwddJFiJoFAmRCSGEACgkQu+CwddJF
iJpA2wgAkwMP++Znd8JU3iQ4N53lv18euNuEMLTOY+jk7zXHvsRX8KyzLmsohUKO
SSGVi1Om785AidOsJhARJawW7AWYuJ5l7ri+FyskTwrTUcMC4UZ/IT2tB22lRsXi
0f3lgbdArZbj7aq7AVO9N7bh9rgVUHa/RHIwXzMp0sc9nekne9t+FFv7tyRnr7cc
SMp/FdMZqbt9pVf0Uwud1BpdgER7QqQaSfaxITL7D2oJTePRZVWiXerrr4hMcQl1
s6kgUgKdlaYmIx2N8eP1Nmp7undtwHo1C8dLLWKGCEuEAaXIxtXUtaUWFFmBDzH9
Fv6qswNFcfwiLNPsY+xi9iA+vlGKAg==
=T0EM
-----END PGP SIGNATURE-----
Merge tag 'slab-for-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab updates from Vlastimil Babka:
"The main change is naturally the SLOB removal. Since its deprecation
in 6.2 I've seen no complaints so hopefully SLUB_(TINY) works well for
everyone and we can proceed.
Besides the code cleanup, the main immediate benefit will be allowing
kfree() family of function to work on kmem_cache_alloc() objects,
which was incompatible with SLOB. This includes kfree_rcu() which had
no kmem_cache_free_rcu() counterpart yet and now it shouldn't be
necessary anymore.
Besides that, there are several small code and comment improvements
from Thomas, Thorsten and Vernon"
* tag 'slab-for-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
mm/slab: document kfree() as allowed for kmem_cache_alloc() objects
mm/slob: remove slob.c
mm/slab: remove CONFIG_SLOB code from slab common code
mm, pagemap: remove SLOB and SLQB from comments and documentation
mm, page_flags: remove PG_slob_free
mm/slob: remove CONFIG_SLOB
mm/slub: fix help comment of SLUB_DEBUG
mm: slub: make kobj_type structure constant
slab: Adjust comment after refactoring of gfp.h
These are various cleanups, fixing a number of uapi header files to no
longer reference CONFIG_* symbols, and one patch that introduces the
new CONFIG_HAS_IOPORT symbol for architectures that provide working
inb()/outb() macros, as a preparation for adding driver dependencies
on those in the following release.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmRG8IkACgkQYKtH/8kJ
Uid15Q/9E/neIIEqEk6IvtyhUicrJiIZUM0rGoYtWXiz75ggk6Kx9+3I+j8zIQ/E
kf2TzAG7q9Md7nfTDFLr4FSr0IcNDj+VG4nYxUyDHdKGcARO+g9Kpdvscxip3lgU
Rw5w74Gyd30u4iUKGS39OYuxcCgl9LaFjMA9Gh402Oiaoh+OYLmgQS9h/goUD5KN
Nd+AoFvkdbnHl0/SpxthLRyL5rFEATBmAY7apYViPyMvfjS3gfDJwXJR9jkKgi6X
Qs4t8Op8BA3h84dCuo6VcFqgAJs2Wiq3nyTSUnkF8NxJ2RFTpeiVgfsLOzXHeDgz
SKDB4Lp14o3mlyZyj00MWq1uMJRRetUgNiVb6iHOoKQ/E4demBdh+mhIFRybjM5B
XNTWFcg9PWFCMa4W9jnLfZBc881X4+7T+qUF8I0W/1AbRJUmyGj8HO6jLceC4yGD
UYLn5oFPM6OWXHp6DqJrCr9Yw8h6fuviQZFEbl/ARlgVGt+J4KbYweJYk8DzfX6t
PZIj8LskOqyIpRuC2oDA1PHxkaJ1/z+N5oRBHq1uicSh4fxY5HW7HnyzgF08+R3k
cf+fjAhC3TfGusHkBwQKQJvpxrxZjPuvYXDZ0GxTvNKJRB8eMeiTm1n41E5oTVwQ
swSblSCjZj/fMVVPXLcjxEW4SBNWRxa9Lz3tIPXb3RheU10Lfy8=
=H3k4
-----END PGP SIGNATURE-----
Merge tag 'asm-generic-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic updates from Arnd Bergmann:
"These are various cleanups, fixing a number of uapi header files to no
longer reference CONFIG_* symbols, and one patch that introduces the
new CONFIG_HAS_IOPORT symbol for architectures that provide working
inb()/outb() macros, as a preparation for adding driver dependencies
on those in the following release"
* tag 'asm-generic-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
Kconfig: introduce HAS_IOPORT option and select it as necessary
scripts: Update the CONFIG_* ignore list in headers_install.sh
pktcdvd: Remove CONFIG_CDROM_PKTCDVD_WCACHE from uapi header
Move bp_type_idx to include/linux/hw_breakpoint.h
Move ep_take_care_of_epollwakeup() to fs/eventpoll.c
Move COMPAT_ATM_ADDPARTY to net/atm/svc.c
When inotify_freeing_mark() races with inotify_handle_inode_event() it
can happen that inotify_handle_inode_event() sees that i_mark->wd got
already reset to -1 and reports this value to userspace which can
confuse the inotify listener. Avoid the problem by validating that wd is
sensible (and pretend the mark got removed before the event got
generated otherwise).
CC: stable@vger.kernel.org
Fixes: 7e790dd5fc93 ("inotify: fix error paths in inotify_update_watch")
Message-Id: <20230424163219.9250-1-jack@suse.cz>
Reported-by: syzbot+4a06d4373fd52f0b2f9c@syzkaller.appspotmail.com
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Before this patch, function gfs2_ail_empty_gl called gfs2_log_flush even
in cases where it encountered an error. It should probably skip the log
flush and leave the file system in an inconsistent state, letting a
subsequent withdraw force the journal to be replayed to reestablish
metadata consistency.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Before this patch, function gfs2_ail_empty_gl would silently return an
error to the caller. This would get silently set into sd_log_error which
would cause a withdraw, but there was no indication why the file system
was withdrawn. This patch adds a fs_err to log the appropriate error
message.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Before this patch, function gfs2_make_fs_ro called gfs2_log_flush once to
finalize the log. However, if there's dirty metadata, log flushes tend
to sync the metadata and formulate revokes. Before this patch, those
revokes may not be written out to the journal immediately, which meant
unresolved glocks could still have revokes in their ail lists. When the
glock worker runs, it tries to transition the glock, but the unresolved
revokes in the ail still need to be written, so it tries to start a
transaction. It's impossible to start a transaction because at that
point, the SDF_JOURNAL_LIVE flag has been cleared by gfs2_make_fs_ro.
That causes the glock worker to fail, unable to write the revokes. The
calling sequence looked something like this:
gfs2_make_fs_ro
gfs2_log_flush - with GFS2_LOG_HEAD_FLUSH_SHUTDOWN flag set
if (flags & GFS2_LOG_HEAD_FLUSH_SHUTDOWN)
clear_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags);
...meanwhile...
glock_work_func
do_xmote
rgrp_go_sync (or possibly inode_go_sync)
...
gfs2_ail_empty_gl
__gfs2_trans_begin
if (unlikely(!test_bit(SDF_JOURNAL_LIVE, &sdp->sd_flags))) {
...
return -EROFS;
The previous patch in the series ("gfs2: return errors from
gfs2_ail_empty_gl") now causes the transaction error to no longer be
ignored, so it causes a warning from MOST of the xfstests:
WARNING: CPU: 11 PID: X at fs/gfs2/super.c:603 gfs2_put_super [gfs2]
which corresponds to:
WARN_ON(gfs2_withdrawing(sdp));
The withdraw was triggered silently from do_xmote by:
if (unlikely(sdp->sd_log_error && !gfs2_withdrawn(sdp)))
gfs2_withdraw_delayed(sdp);
This patch adds a second log_flush to gfs2_make_fs_ro: one to sync the
data and one to sync any outstanding revokes and finalize the journal.
Note that both of these log flushes need to be "special," in other
words, not GFS2_LOG_HEAD_FLUSH_NORMAL.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Before this patch, function gfs2_ail_empty_gl did not return errors it
encountered from __gfs2_trans_begin. Those errors usually came from the
fact that the file system was made read-only, often due to unmount
(but theoretically could be due to -o remount,ro), which prevented
the transaction from starting.
The inability to start a transaction prevented its revokes from being
properly written to the journal for glocks during unmount (and
transition to ro).
That meant glocks could be unlocked without the metadata properly
revoked in the journal. So other nodes could grab the glock thinking
that their lvb values were correct, but in fact corresponded to the
glock without its revokes properly synced. That presented as lvb
mismatch errors.
This patch allows gfs2_ail_empty_gl to return the error properly to
the caller.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCZEYC7AAKCRBZ7Krx/gZQ
66NCAP9khf7D5Clz5BsUjlsYoXpBPDaSGhVnpAR8mRQ4/Y8eRQD/dyBVt0FuU72Y
j1w/foMeP3195Bfdp7UwwZLSU+hbxgw=
=M/uq
-----END PGP SIGNATURE-----
Merge tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs pile from Al Viro.
Random minor cleanups.
* tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: Fix description of vfs_tmpfile()
sysv: switch to put_and_unmap_page()
fs/sysv: Don't round down address for kunmap_flush_on_unmap()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCZEYDPAAKCRBZ7Krx/gZQ
65qvAQC60ak71gAqOSktR93GXYrQ5Q8hbpug7t3lzDQE9huGqgEAl0Zvr8d5ir3j
y0X5U5Yl6bcUSQDd4VY76C+53yIKZQs=
=rKXE
-----END PGP SIGNATURE-----
Merge tag 'pull-old-dio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull legacy dio cleanup from Al Viro.
* tag 'pull-old-dio' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
__blockdev_direct_IO(): get rid of submit_io callback
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCZEYC0gAKCRBZ7Krx/gZQ
6+N5AQCERtd5I2RLm7URX0kurwOthe7o+DX4Lj7y/mcjZV2N4gEAu9VrgHBIuLev
+KuGGY0VnKwCtcgGcGGNBrfrRjyKugs=
=gDk0
-----END PGP SIGNATURE-----
Merge tag 'pull-write-one-page' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs write_one_page removal from Al Viro:
"write_one_page series"
* tag 'pull-write-one-page' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
mm,jfs: move write_one_page/folio_write_one to jfs
ocfs2: don't use write_one_page in ocfs2_duplicate_clusters_by_page
ufs: don't flush page immediately for DIRSYNC directories
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCZEYCQAAKCRBZ7Krx/gZQ
64FdAQDZ2hTDyZEWPt486dWYPYpiKyaGFXSXDGo7wgP0fiwxXQEA/mROKb6JqYw6
27mZ9A7qluT8r3AfTTQ0D+Yse/dr4AM=
=GA9W
-----END PGP SIGNATURE-----
Merge tag 'pull-fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fget updates from Al Viro:
"fget() to fdget() conversions"
* tag 'pull-fd' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fuse_dev_ioctl(): switch to fdget()
cgroup_get_from_fd(): switch to fdget_raw()
bpf: switch to fdget_raw()
build_mount_idmapped(): switch to fdget()
kill the last remaining user of proc_ns_fget()
SVM-SEV: convert the rest of fget() uses to fdget() in there
convert sgx_set_attribute() to fdget()/fdput()
convert setns(2) to fdget()/fdput()
- Add sub-page block size support for uncompressed files;
- Support flattened block device for multi-blob images to be attached
into virtual machines (including cloud servers) and bare metals;
- Support long xattr name prefixes to optimize images with common
xattr namespaces (e.g. files with overlayfs xattrs) use cases;
- Various minor cleanups & fixes.
-----BEGIN PGP SIGNATURE-----
iIcEABYIAC8WIQThPAmQN9sSA0DVxtI5NzHcH7XmBAUCZETCNREceGlhbmdAa2Vy
bmVsLm9yZwAKCRA5NzHcH7XmBJCMAP9VkAPycbbqa6qWUASdyh/HGyuLJTHSfmsJ
zO4y6hBgOwD9GXg55sY8ycvcOx9ayaUt5V5f9zhs4wdGcoPhj5fWzgA=
=nUva
-----END PGP SIGNATURE-----
Merge tag 'erofs-for-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs updates from Gao Xiang:
"In this cycle, sub-page block support for uncompressed files is
available. It's mainly used to enable original signing ('golden')
4k-block images on arm64 with 16/64k pages. In addition, end users
could also use this feature to build a manifest to directly refer to
golden tar data.
Besides, long xattr name prefix support is also introduced in this
cycle to avoid too many xattrs with the same prefix (e.g. overlayfs
xattrs). It's useful for erofs + overlayfs combination (like Composefs
model): the image size is reduced by ~14% and runtime performance is
also slightly improved.
Others are random fixes and cleanups as usual.
Summary:
- Add sub-page block size support for uncompressed files
- Support flattened block device for multi-blob images to be attached
into virtual machines (including cloud servers) and bare metals
- Support long xattr name prefixes to optimize images with common
xattr namespaces (e.g. files with overlayfs xattrs) use cases
- Various minor cleanups & fixes"
* tag 'erofs-for-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: cleanup i_format-related stuffs
erofs: sunset erofs_dbg()
erofs: fix potential overflow calculating xattr_isize
erofs: get rid of z_erofs_fill_inode()
erofs: enable long extended attribute name prefixes
erofs: handle long xattr name prefixes properly
erofs: add helpers to load long xattr name prefixes
erofs: introduce on-disk format for long xattr name prefixes
erofs: move packed inode out of the compression part
erofs: keep meta inode into erofs_buf
erofs: initialize packed inode after root inode is assigned
erofs: stop parsing non-compact HEAD index if clusterofs is invalid
erofs: don't warn ztailpacking feature anymore
erofs: simplify erofs_xattr_generic_get()
erofs: rename init_inode_xattrs with erofs_ prefix
erofs: move several xattr helpers into xattr.c
erofs: tidy up EROFS on-disk naming
erofs: support flattened block device for multi-blob images
erofs: set block size to the on-disk block size
erofs: avoid hardcoded blocksize for subpage block support
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZEEn8AAKCRCRxhvAZXjc
okJoAQC1bjJp4SQw7VQhTuyv0Ak67PACwKPNUPQyHcqMV5s5DAD/fcnMjq7+UieH
TEk/zRBGWWI8m0wb51MMO+VVM2GeXwI=
=EKj/
-----END PGP SIGNATURE-----
Merge tag 'v6.4/vfs.open' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs open fixlet from Christian Brauner:
"EINVAL ist keinmal: This contains the changes to make O_DIRECTORY when
specified together with O_CREAT an invalid request.
The wider background is that a regression report about the behavior of
O_DIRECTORY | O_CREAT was sent to fsdevel about a behavior that was
changed multiple years and LTS releases earlier during v5.7
development.
This has also been covered in
https://lwn.net/Articles/926782/
which provides an excellent summary of the discussion"
* tag 'v6.4/vfs.open' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
open: return EINVAL for O_DIRECTORY | O_CREAT
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZEJxiQAKCRCRxhvAZXjc
oj4jAP4pt/SuHdZQzv9k2j+UA4hpDmwt7UUWhLxtycQrt9oEoQD/U9hCwnYY9Ksj
5TSZiK/OGyoslmF1zfAC+1R2crgr3A0=
=PpBS
-----END PGP SIGNATURE-----
Merge tag 'v6.4/vfs.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull misc vfs updates from Christian Brauner:
"This contains a pile of various smaller fixes. Most of them aren't
very interesting so this just highlights things worth mentioning:
- Various filesystems contained the same little helper to convert
from the mode of a dentry to the DT_* type of that dentry.
They have now all been switched to rely on the generic
fs_umode_to_dtype() helper. All custom helpers are removed (Jeff)
- Fsnotify now reports ACCESS and MODIFY events for splice
(Chung-Chiang Cheng)
- After converting timerfd a long time ago to rely on
wait_event_interruptible_*() apis, convert eventfd as well. This
removes the complex open-coded wait code (Wen Yang)
- Simplify sysctl registration for devpts, avoiding the declaration
of two tables. Instead, just use a prefixed path with
register_sysctl() (Luis)
- The setattr_should_drop_sgid() helper is now exported so NFS can
use it. By switching NFS to this helper an NFS setgid inheritance
bug is fixed (me)"
* tag 'v6.4/vfs.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs: hfsplus: remove WARN_ON() from hfsplus_cat_{read,write}_inode()
pnode: pass mountpoint directly
eventfd: use wait_event_interruptible_locked_irq() helper
splice: report related fsnotify events
fs: consolidate duplicate dt_type helpers
nfs: use vfs setgid helper
Update relatime comments to include equality
fs/buffer: Remove redundant assignment to err
fs_context: drop the unused lsm_flags member
fs/namespace: fnic: Switch to use %ptTd
Documentation: update idmappings.rst
devpts: simplify two-level sysctl registration for pty_kern_table
eventpoll: align comment with nested epoll limitation
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZEEhwgAKCRCRxhvAZXjc
otwgAQDXHnKiPm/d76lITXbxdUNCtvZz+ig26EbOrD+vEszzIQEA81dru0QbCNCt
ctoZdcsmtKbt2VaYQF1CDOhlnNg5VQM=
=pER1
-----END PGP SIGNATURE-----
Merge tag 'v6.4/vfs.acl' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull acl updates from Christian Brauner:
"After finishing the introduction of the new posix acl api last cycle
the generic POSIX ACL xattr handlers are still around in the
filesystems xattr handlers for two reasons:
(1) Because a few filesystems rely on the ->list() method of the
generic POSIX ACL xattr handlers in their ->listxattr() inode
operation.
(2) POSIX ACLs are only available if IOP_XATTR is raised. The
IOP_XATTR flag is raised in inode_init_always() based on whether
the sb->s_xattr pointer is non-NULL. IOW, the registered xattr
handlers of the filesystem are used to raise IOP_XATTR. Removing
the generic POSIX ACL xattr handlers from all filesystems would
risk regressing filesystems that only implement POSIX ACL support
and no other xattrs (nfs3 comes to mind).
This contains the work to decouple POSIX ACLs from the IOP_XATTR flag
as they don't depend on xattr handlers anymore. So it's now possible
to remove the generic POSIX ACL xattr handlers from the sb->s_xattr
list of all filesystems. This is a crucial step as the generic POSIX
ACL xattr handlers aren't used for POSIX ACLs anymore and POSIX ACLs
don't depend on the xattr infrastructure anymore.
Adressing problem (1) will require more long-term work. It would be
best to get rid of the ->list() method of xattr handlers completely at
some point.
For erofs, ext{2,4}, f2fs, jffs2, ocfs2, and reiserfs the nop POSIX
ACL xattr handler is kept around so they can continue to use
array-based xattr handler indexing.
This update does simplify the ->listxattr() implementation of all
these filesystems however"
* tag 'v6.4/vfs.acl' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
acl: don't depend on IOP_XATTR
ovl: check for ->listxattr() support
reiserfs: rework priv inode handling
fs: rename generic posix acl handlers
reiserfs: rework ->listxattr() implementation
fs: simplify ->listxattr() implementation
fs: drop unused posix acl handlers
xattr: remove unused argument
xattr: add listxattr helper
xattr: simplify listxattr helpers
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZEEt8gAKCRCRxhvAZXjc
oppuAQDu9kwAQWAl0KzlpjQkrEDAEuyHRy6SCpo1kPPD5f3rigD+INZb3fi2QXmK
ZL/c6XtII9ah/8i2zfzAgH9Q2ZZu0gk=
=xcAX
-----END PGP SIGNATURE-----
Merge tag 'v6.4/pidfd.file' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull pidfd updates from Christian Brauner:
"This adds a new pidfd_prepare() helper which allows the caller to
reserve a pidfd number and allocates a new pidfd file that stashes the
provided struct pid.
It should be avoided installing a file descriptor into a task's file
descriptor table just to close it again via close_fd() in case an
error occurs. The fd has been visible to userspace and might already
be in use. Instead, a file descriptor should be reserved but not
installed into the caller's file descriptor table.
If another failure path is hit then the reserved file descriptor and
file can just be put without any userspace visible side-effects. And
if all failure paths are cleared the file descriptor and file can be
installed into the task's file descriptor table.
This helper is now used in all places that open coded this
functionality before. For example, this is currently done during
copy_process() and fanotify used pidfd_create(), which returns a pidfd
that has already been made visibile in the caller's file descriptor
table, but then closed it using close_fd().
In one of the next merge windows there is also new functionality
coming to unix domain sockets that will have to rely on
pidfd_prepare()"
* tag 'v6.4/pidfd.file' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
fanotify: use pidfd_prepare()
fork: use pidfd_prepare()
pid: add pidfd_prepare()
still a fair amount going on, including:
- Reorganizing the architecture-specific documentation under
Documentation/arch. This makes the structure match the source directory
and helps to clean up the mess that is the top-level Documentation
directory a bit. This work creates the new directory and moves x86 and
most of the less-active architectures there. The current plan is to move
the rest of the architectures in 6.5, with the patches going through the
appropriate subsystem trees.
- Some more Spanish translations and maintenance of the Italian
translation.
- A new "Kernel contribution maturity model" document from Ted.
- A new tutorial on quickly building a trimmed kernel from Thorsten.
Plus the usual set of updates and fixes.
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmRGze0PHGNvcmJldEBs
d24ubmV0AAoJEBdDWhNsDH5Y/VsH/RyWqinorRVFZmHqRJMRhR0j7hE2pAgK5prE
dGXYVtHHNQ+25thNaqhZTOLYFbSX6ii2NG7sLRXmyOTGIZrhUCFFXCHkuq4ZUypR
gJpMUiKQVT4dhln3gIZ0k09NSr60gz8UTcq895N9UFpUdY1SCDhbCcLc4uXTRajq
NrdgFaHWRkPb+gBRbXOExYm75DmCC6Ny5AyGo2rXfItV//ETjWIJVQpJhlxKrpMZ
3LgpdYSLhEFFnFGnXJ+EAPJ7gXDi2Tg5DuPbkvJyFOTouF3j4h8lSS9l+refMljN
xNRessv+boge/JAQidS6u8F2m2ESSqSxisv/0irgtKIMJwXaoX4=
=1//8
-----END PGP SIGNATURE-----
Merge tag 'docs-6.4' of git://git.lwn.net/linux
Pull documentation updates from Jonathan Corbet:
"Commit volume in documentation is relatively low this time, but there
is still a fair amount going on, including:
- Reorganize the architecture-specific documentation under
Documentation/arch
This makes the structure match the source directory and helps to
clean up the mess that is the top-level Documentation directory a
bit. This work creates the new directory and moves x86 and most of
the less-active architectures there.
The current plan is to move the rest of the architectures in 6.5,
with the patches going through the appropriate subsystem trees.
- Some more Spanish translations and maintenance of the Italian
translation
- A new "Kernel contribution maturity model" document from Ted
- A new tutorial on quickly building a trimmed kernel from Thorsten
Plus the usual set of updates and fixes"
* tag 'docs-6.4' of git://git.lwn.net/linux: (47 commits)
media: Adjust column width for pdfdocs
media: Fix building pdfdocs
docs: clk: add documentation to log which clocks have been disabled
docs: trace: Fix typo in ftrace.rst
Documentation/process: always CC responsible lists
docs: kmemleak: adjust to config renaming
ELF: document some de-facto PT_* ABI quirks
Documentation: arm: remove stih415/stih416 related entries
docs: turn off "smart quotes" in the HTML build
Documentation: firmware: Clarify firmware path usage
docs/mm: Physical Memory: Fix grammar
Documentation: Add document for false sharing
dma-api-howto: typo fix
docs: move m68k architecture documentation under Documentation/arch/
docs: move parisc documentation under Documentation/arch/
docs: move ia64 architecture docs under Documentation/arch/
docs: Move arc architecture docs under Documentation/arch/
docs: move nios2 documentation under Documentation/arch/
docs: move openrisc documentation under Documentation/arch/
docs: move superh documentation under Documentation/arch/
...
o MAINTAINERS files additions and changes.
o Fix hotplug warning in nohz code.
o Tick dependency changes by Zqiang.
o Lazy-RCU shrinker fixes by Zqiang.
o rcu-tasks stall reporting improvements by Neeraj.
o Initial changes for renaming of k[v]free_rcu() to its new k[v]free_rcu_mightsleep()
name for robustness.
o Documentation Updates:
o Significant changes to srcu_struct size.
o Deadlock detection for srcu_read_lock() vs synchronize_srcu() from Boqun.
o rcutorture and rcu-related tool, which are targeted for v6.4 from Boqun's tree.
o Other misc changes.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEcoCIrlGe4gjE06JJqA4nf2o45hAFAmQuBnIACgkQqA4nf2o4
5hACVRAAoXu7/gfh5Pjw9O4E4pCdPJKsZZVYrcrVGrq6NAxRn6M1SgurAdC5grj2
96x0waoGaiO82V0H5iJMcKdAVu67x9R8WaQ1JoxN75Efn8h9W4TguB87TV1gk0xS
eZ18b/CyEaM5mNb80DFFF4FLohy5737p/kNTMqXQdUyR1BsDl16iRMgjiBiFhNUx
yPo8Y2kC2U2OTbldZgaE7s9bQO3xxEcifx93sGWsAex/gx54FYNisiwSlCOSgOE+
XkYo/OKk8Xvr82tLVX8XQVEPCMJ+rxea8T5zSs8/alvsPq7gA8wW3y6fsoa3vUU/
+Gd+W+Q/OsONIDtp8rQAY1qsD0ScDpaR8052RSH0zTa7pj8HsQgE5PjZ+cJW0SEi
cKN+Oe8+ETqKald+xZ6PDf58O212VLrru3RpQWrOQcJ7fmKmfT4REK0RcbLgg4qT
CBgOo6eg+ub4pxq2y11LZJBNTv1/S7xAEzFE0kArew64KB2gyVud0VJRZVAJnEfe
93QQVDFrwK2bhgWQZ6J6IbTvGeQW0L93IibuaU6jhZPR283VtUIIvM7vrOylN7Fq
4jsae0T7YGYfKUhgTpm7rCnm8A/D3Ni8MY0sKYYgDSyKmZUsnpI5wpx1xke4lwwV
ErrY46RCFa+k8wscc6iWfB4cGXyyFHyu+wtyg0KpFn5JAzcfz4A=
=Rgbj
-----END PGP SIGNATURE-----
Merge tag 'rcu.6.4.april5.2023.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux
Pull RCU updates from Joel Fernandes:
- Updates and additions to MAINTAINERS files, with Boqun being added to
the RCU entry and Zqiang being added as an RCU reviewer.
I have also transitioned from reviewer to maintainer; however, Paul
will be taking over sending RCU pull-requests for the next merge
window.
- Resolution of hotplug warning in nohz code, achieved by fixing
cpu_is_hotpluggable() through interaction with the nohz subsystem.
Tick dependency modifications by Zqiang, focusing on fixing usage of
the TICK_DEP_BIT_RCU_EXP bitmask.
- Avoid needless calls to the rcu-lazy shrinker for CONFIG_RCU_LAZY=n
kernels, fixed by Zqiang.
- Improvements to rcu-tasks stall reporting by Neeraj.
- Initial renaming of k[v]free_rcu() to k[v]free_rcu_mightsleep() for
increased robustness, affecting several components like mac802154,
drbd, vmw_vmci, tracing, and more.
A report by Eric Dumazet showed that the API could be unknowingly
used in an atomic context, so we'd rather make sure they know what
they're asking for by being explicit:
https://lore.kernel.org/all/20221202052847.2623997-1-edumazet@google.com/
- Documentation updates, including corrections to spelling,
clarifications in comments, and improvements to the srcu_size_state
comments.
- Better srcu_struct cache locality for readers, by adjusting the size
of srcu_struct in support of SRCU usage by Christoph Hellwig.
- Teach lockdep to detect deadlocks between srcu_read_lock() vs
synchronize_srcu() contributed by Boqun.
Previously lockdep could not detect such deadlocks, now it can.
- Integration of rcutorture and rcu-related tools, targeted for v6.4
from Boqun's tree, featuring new SRCU deadlock scenarios, test_nmis
module parameter, and more
- Miscellaneous changes, various code cleanups and comment improvements
* tag 'rcu.6.4.april5.2023.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jfern/linux: (71 commits)
checkpatch: Error out if deprecated RCU API used
mac802154: Rename kfree_rcu() to kvfree_rcu_mightsleep()
rcuscale: Rename kfree_rcu() to kfree_rcu_mightsleep()
ext4/super: Rename kfree_rcu() to kfree_rcu_mightsleep()
net/mlx5: Rename kfree_rcu() to kfree_rcu_mightsleep()
net/sysctl: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
lib/test_vmalloc.c: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
tracing: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
misc: vmw_vmci: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
drbd: Rename kvfree_rcu() to kvfree_rcu_mightsleep()
rcu: Protect rcu_print_task_exp_stall() ->exp_tasks access
rcu: Avoid stack overflow due to __rcu_irq_enter_check_tick() being kprobe-ed
rcu-tasks: Report stalls during synchronize_srcu() in rcu_tasks_postscan()
rcu: Permit start_poll_synchronize_rcu_expedited() to be invoked early
rcu: Remove never-set needwake assignment from rcu_report_qs_rdp()
rcu: Register rcu-lazy shrinker only for CONFIG_RCU_LAZY=y kernels
rcu: Fix missing TICK_DEP_MASK_RCU_EXP dependency check
rcu: Fix set/clear TICK_DEP_BIT_RCU_EXP bitmask race
rcu/trace: use strscpy() to instead of strncpy()
tick/nohz: Fix cpu_is_hotpluggable() by checking with nohz subsystem
...
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEES0KozwfymdVUl37v6iDy2pc3iXMFAmRBolwUHHBhdWxAcGF1
bC1tb29yZS5jb20ACgkQ6iDy2pc3iXMy/w//YOB9EJ7hpAGouq0Il+SyWdLQP1Bw
dOaJ5Xs0zDUQJsloqLpkk83aKocHXnl2jIE0mYVhfX2tdd2odKv/qKFcSPCBx1pf
STRHsDBkNfi9wldAWZ6y92WZk9l0lqwdP/sJ4TMsrJLEnkeOBwcwAA4zzPRVu+dN
aJQkSCj/5hF7r7/BvpfO+78O2h3dC42L6SepHrjnc/btSZ4qW4dPMJfTD7zT6r5Y
tVRD/IZ+f7cakKulnWvOIXNGR45CTdE6TiPd9mxkbA2I86wvEec6jLIYtpPoEmtU
+vENXjKDAX+Af3DyIC0rZECBFoAjLR0Myi75i74Haug0nxPyPqcjDKKYpfKwYxT0
CH1LHx4rHUbUvXz4tbLuEiNEb5ZX+P5Rpklev8aijvQ/3iVjdzkg74a4QDZcHi8K
1V/uKSBcC6De3789KmwEYIQu35cXqbT5TscuK4Hf8fdHcPZGRvjps12JSkuRhrIQ
B5vJ4AZ3O5CWXO9u/n9czssnQ0WHSFFy1/OEpsVgXLpYMwP4xIr0q+C3n1Efnxnp
HjoqE1N8bgsV4hYzwZwX3z490Vo4V3S6cpYp40UoeiJ0bJup5WuBselOSnZozyLQ
hxxNHXFY8QtwQ0Ik4rTHfttwa28DE6qF+zh6mJDdgdbLfmlBGn3EaW9cwJrCiQ6X
pZ6R6SdwFdyj7Uk=
=JtiD
-----END PGP SIGNATURE-----
Merge tag 'lsm-pr-20230420' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Pull lsm updates from Paul Moore:
- Move the LSM hook comment blocks into security/security.c
For many years the LSM hook comment blocks were located in a very odd
place, include/linux/lsm_hooks.h, where they lived on their own,
disconnected from both the function prototypes and definitions.
In keeping with current kernel conventions, this moves all of these
comment blocks to the top of the function definitions, transforming
them into the kdoc format in the process. This should make it much
easier to maintain these comments, which are the main source of LSM
hook documentation.
For the most part the comment contents were left as-is, although some
glaring errors were corrected. Expect additional edits in the future
as we slowly update and correct the comment blocks.
This is the bulk of the diffstat.
- Introduce LSM_ORDER_LAST
Similar to how LSM_ORDER_FIRST is used to specify LSMs which should
be ordered before "normal" LSMs, the LSM_ORDER_LAST is used to
specify LSMs which should be ordered after "normal" LSMs.
This is one of the prerequisites for transitioning IMA/EVM to a
proper LSM.
- Remove the security_old_inode_init_security() hook
The security_old_inode_init_security() LSM hook only allows for a
single xattr which is problematic both for LSM stacking and the
IMA/EVM-as-a-LSM effort. This finishes the conversion over to the
security_inode_init_security() hook and removes the single-xattr LSM
hook.
- Fix a reiserfs problem with security xattrs
During the security_old_inode_init_security() removal work it became
clear that reiserfs wasn't handling security xattrs properly so we
fixed it.
* tag 'lsm-pr-20230420' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: (32 commits)
reiserfs: Add security prefix to xattr name in reiserfs_security_write()
security: Remove security_old_inode_init_security()
ocfs2: Switch to security_inode_init_security()
reiserfs: Switch to security_inode_init_security()
security: Remove integrity from the LSM list in Kconfig
Revert "integrity: double check iint_cache was initialized"
security: Introduce LSM_ORDER_LAST and set it for the integrity LSM
device_cgroup: Fix typo in devcgroup_css_alloc description
lsm: fix a badly named parameter in security_get_getsecurity()
lsm: fix doc warnings in the LSM hook comments
lsm: styling fixes to security/security.c
lsm: move the remaining LSM hook comments to security/security.c
lsm: move the io_uring hook comments to security/security.c
lsm: move the perf hook comments to security/security.c
lsm: move the bpf hook comments to security/security.c
lsm: move the audit hook comments to security/security.c
lsm: move the binder hook comments to security/security.c
lsm: move the sysv hook comments to security/security.c
lsm: move the key hook comments to security/security.c
lsm: move the xfrm hook comments to security/security.c
...
This comment make no sense and is in the wrong place, so let's
remove it.
Signed-off-by: Qi Han <hanqi@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
When a node block is missing for atomic write block replacement, we need
to allocate it in advance of the replacement.
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Need to use cow inode data content instead of the one in the original
inode, when we try to write the already updated atomic write files.
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>