IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
In commit f61b9bb3f838 ("fs: add a new SB_I_NOUMASK flag") we added a
new SB_I_NOUMASK flag that is used by filesystems like NFS to indicate
that umask stripping is never supposed to be done in the vfs independent
of whether or not POSIX ACLs are supported.
Overlayfs falls into the same category as it raises SB_POSIXACL
unconditionally to defer umask application to the upper filesystem.
Now that we have SB_I_NOUMASK use that and make SB_POSIXACL properly
conditional on whether or not the kernel does have support for it. This
will enable use to turn IS_POSIXACL() into nop on kernels that don't
have POSIX ACL support avoding bugs from missed umask stripping.
Link: https://lore.kernel.org/r/20231012-einband-uferpromenade-80541a047a1f@brauner
Acked-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
A backing file struct stores two path's, one "real" path that is referring
to f_inode and one "fake" path, which should be displayed to users in
/proc/<pid>/maps.
There is a lot more potential code that needs to know the "real" path, then
code that needs to know the "fake" path.
Instead of code having to request the "real" path with file_real_path(),
store the "real" path in f_path and require code that needs to know the
"fake" path request it with file_user_path().
Replace the file_real_path() helper with a simple const accessor f_path().
After this change, file_dentry() is not expected to observe any files
with overlayfs f_path and real f_inode, so the call to ->d_real() should
not be needed. Leave the ->d_real() call for now and add an assertion
in ovl_d_real() to catch if we made wrong assumptions.
Suggested-by: Miklos Szeredi <miklos@szeredi.hu>
Link: https://lore.kernel.org/r/CAJfpegtt48eXhhjDFA1ojcHPNKj3Go6joryCPtEFAKpocyBsnw@mail.gmail.com/
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Link: https://lore.kernel.org/r/20231009153712.1566422-4-amir73il@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
Overlayfs uses backing files with "fake" overlayfs f_path and "real"
underlying f_inode, in order to use underlying inode aops for mapped
files and to display the overlayfs path in /proc/<pid>/maps.
In preparation for storing the overlayfs "fake" path instead of the
underlying "real" path in struct backing_file, define a noop helper
file_user_path() that returns f_path for now.
Use the new helper in procfs and kernel logs whenever a path of a
mapped file is displayed to users.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Link: https://lore.kernel.org/r/20231009153712.1566422-3-amir73il@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
A writeable mapped backing file can perform writes to the real inode.
Therefore, the real path mount must be kept writable so long as the
writable map exists.
This may not be strictly needed for ovelrayfs private upper mount,
but it is correct to take the mnt_writers count in the vfs helper.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Link: https://lore.kernel.org/r/20231009153712.1566422-2-amir73il@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
So happens it already was not doing it, but there is no need to "hope"
as indicated in the comment.
No changes in generated assembly.
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Link: https://lore.kernel.org/r/20231004111916.728135-3-mjguzik@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
Backing files as used by overlayfs are never installed into file
descriptor tables and are explicitly documented as such. They aren't
subject to rcu access conditions like regular files are.
Their lifetime is bound to the lifetime of the overlayfs file, i.e.,
they're stashed in ovl_file->private_data and go away otherwise. If
they're set as vma->vm_file - which is their main purpose - then they're
subject to regular refcount rules and vma->vm_file can't be installed
into an fdtable after having been set. All in all I don't see any need
for rcu delay here. So free it directly.
This all hinges on such hybrid beasts to never actually be installed
into fdtables which - as mentioned before - is not allowed. So add an
explicit WARN_ON_ONCE() so we catch any case where someone is suddenly
trying to install one of those things into a file descriptor table so we
can have a nice long chat with them.
Link: https://lore.kernel.org/r/20231005-sakralbau-wappnen-f5c31755ed70@brauner
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
In recent discussions around some performance improvements in the file
handling area we discussed switching the file cache to rely on
SLAB_TYPESAFE_BY_RCU which allows us to get rid of call_rcu() based
freeing for files completely. This is a pretty sensitive change overall
but it might actually be worth doing.
The main downside is the subtlety. The other one is that we should
really wait for Jann's patch to land that enables KASAN to handle
SLAB_TYPESAFE_BY_RCU UAFs. Currently it doesn't but a patch for this
exists.
With SLAB_TYPESAFE_BY_RCU objects may be freed and reused multiple times
which requires a few changes. So it isn't sufficient anymore to just
acquire a reference to the file in question under rcu using
atomic_long_inc_not_zero() since the file might have already been
recycled and someone else might have bumped the reference.
In other words, callers might see reference count bumps from newer
users. For this reason it is necessary to verify that the pointer is the
same before and after the reference count increment. This pattern can be
seen in get_file_rcu() and __files_get_rcu().
In addition, it isn't possible to access or check fields in struct file
without first aqcuiring a reference on it. Not doing that was always
very dodgy and it was only usable for non-pointer data in struct file.
With SLAB_TYPESAFE_BY_RCU it is necessary that callers first acquire a
reference under rcu or they must hold the files_lock of the fdtable.
Failing to do either one of this is a bug.
Thanks to Jann for pointing out that we need to ensure memory ordering
between reallocations and pointer check by ensuring that all subsequent
loads have a dependency on the second load in get_file_rcu() and
providing a fixup that was folded into this patch.
Cc: Jann Horn <jannh@google.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Failed opens (mostly ENOENT) legitimately happen a lot, for example here
are stats from stracing kernel build for few seconds (strace -fc make):
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ------------------
0.76 0.076233 5 15040 3688 openat
(this is tons of header files tried in different paths)
In the common case of there being nothing to close (only the file object
to free) there is a lot of overhead which can be avoided.
This is most notably delegation of freeing to task_work, which comes
with an enormous cost (see 021a160abf62 ("fs: use __fput_sync in
close(2)" for an example).
Benchmarked with will-it-scale with a custom testcase based on
tests/open1.c, stuffed into tests/openneg.c:
[snip]
while (1) {
int fd = open("/tmp/nonexistent", O_RDONLY);
assert(fd == -1);
(*iterations)++;
}
[/snip]
Sapphire Rapids, openneg_processes -t 1 (ops/s):
before: 1950013
after: 2914973 (+49%)
file refcount is checked as a safety belt against buggy consumers with
an atomic cmpxchg. Technically it is not necessary, but it happens to
not be measurable due to several other atomics which immediately follow.
Optmizing them away to make this atomic into a problem is left as an
exercise for the reader.
v2:
- unexport fput_badopen and move to fs/internal.h
- handle the refcount with cmpxchg, adjust commentary accordingly
- tweak the commit message
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Link: https://lore.kernel.org/r/20230926162228.68666-1-mjguzik@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
Because 'inode' is being initialised before checking if 'dentry' is negative
it looks like an extra iput() on 'inode' may happen since the ihold() is
done only if the dentry is *not* negative. In reality this doesn't happen
because d_is_negative() is never true if ->d_inode is NULL. This patch only
makes the code easier to understand, as I was initially mislead by it.
Signed-off-by: Luís Henriques <lhenriques@suse.de>
Link: https://lore.kernel.org/r/20230928152341.303-1-lhenriques@suse.de
Signed-off-by: Christian Brauner <brauner@kernel.org>
If there is no watch_queue, holding the pipe mutex is enough to
prevent concurrent writes, and we can avoid the spinlock.
O_NOTIFICATION_QUEUE is an exotic and rarely used feature, and of all
the pipes that exist at any given time, only very few actually have a
watch_queue, therefore it appears worthwile to optimize the common
case.
This patch does not optimize pipe_resize_ring() where the spinlocks
could be avoided as well; that does not seem like a worthwile
optimization because this function is not called often.
Related commits:
- commit 8df441294dd3 ("pipe: Check for ring full inside of the
spinlock in pipe_write()")
- commit b667b8673443 ("pipe: Advance tail pointer inside of wait
spinlock in pipe_read()")
- commit 189b0ddc2451 ("pipe: Fix missing lock in pipe_resize_ring()")
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Message-Id: <20230921075755.1378787-4-max.kellermann@ionos.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
This reverts commit 8df441294dd3 ("pipe: Check for ring full inside of
the spinlock in pipe_write()") which was obsoleted by commit
c73be61cede ("pipe: Add general notification queue support") because
now pipe_write() fails early with -EXDEV if there is a watch_queue.
Without a watch_queue, no notifications can be posted to the pipe and
mutex protection is enough, as can be seen in splice_pipe_to_pipe()
which does not use the spinlock either.
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Message-Id: <20230921075755.1378787-3-max.kellermann@ionos.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
This declutters the code by reducing the number of #ifdefs and makes
the watch_queue checks simpler. This has no runtime effect; the
machine code is identical.
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Message-Id: <20230921075755.1378787-2-max.kellermann@ionos.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
SB_POSIXACL must be set when a filesystem supports POSIX ACLs, but NFSv4
also sets this flag to prevent the VFS from applying the umask on
newly-created files. NFSv4 doesn't support POSIX ACLs however, which
causes confusion when other subsystems try to test for them.
Add a new SB_I_NOUMASK flag that allows filesystems to opt-in to umask
stripping without advertising support for POSIX ACLs. Set the new flag
on NFSv4 instead of SB_POSIXACL.
Also, move mode_strip_umask to namei.h and convert init_mknod and
init_mkdir to use it.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Message-Id: <20230911-acl-fix-v3-1-b25315333f6c@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Overlayfs is going to use those to get write access on the upper mount
during entire copy up without taking freeze protection on upper sb for
the entire copy up.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Message-Id: <20230908132900.2983519-3-amir73il@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Before exporting these helpers to modules, make their names more
meaningful.
The names mnt_{get,put)_write_access*() were chosen, because they rhyme
with the inode {get,put)_write_access() helpers, which have a very close
meaning for the inode object.
Suggested-by: Christian Brauner <brauner@kernel.org>
Link: https://lore.kernel.org/r/20230817-anfechtbar-ruhelosigkeit-8c6cca8443fc@brauner/
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Message-Id: <20230908132900.2983519-2-amir73il@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmT8qQwACgkQiiy9cAdy
T1Hjzgv/dCmqowHN9fI1gP/SpkbyZt0/GrlhpRMELQ2THAYuJmWQ9i/vR7VFMMv7
C7vfx3nYTQfQAcsshA+bfOkABr02A9HD5HuBEhWldEFQD/qIPrPMyU53DJ6m4QOb
81/8ox6XJMDNHWJUPJVfhAyHoKBxYyNcqqkITnBcSpqENbm7IZ8UhUJe1l75nkBi
IYR28e4Aci5uGPuMB7NAuKmJM0IKi1ipMy4MCXkSD/vQZStz76mhzx200Wm6ObuX
jqPzIb32simsVY7q/Hbkxt2MPDUbsn6kBhOSe8oJVxEgrMy56pZCOByPjc0ZFzE6
SziZ0JanrfjlGdBg5qjmcUamEMV44Oi1QQ6PpkHRA2zDyXMpv8ulwy/3Z40O7zbb
WBBar0HsJE5osCL0jiwLbNorctEnmtSKWcyXOI924Bli4eesE/ojq2u4H4KrlXzA
gDqPwAhhw/PrbuEmgRy9maYE8KwutZh2HCRvxgFB1NgbcNLvKgkQrly4QP29hKQy
+HViqKAe
=6QhY
-----END PGP SIGNATURE-----
Merge tag '6.6-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:
- six smb3 client fixes including ones to allow controlling smb3
directory caching timeout and limits, and one debugging improvement
- one fix for nls Kconfig (don't need to expose NLS_UCS2_UTILS option)
- one minor spnego registry update
* tag '6.6-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
spnego: add missing OID to oid registry
smb3: fix minor typo in SMB2_GLOBAL_CAP_LARGE_MTU
cifs: update internal module version number for cifs.ko
smb3: allow controlling maximum number of cached directories
smb3: add trace point for queryfs (statfs)
nls: Hide new NLS_UCS2_UTILS
smb3: allow controlling length of time directory entries are cached with dir leases
smb: propagate error code of extract_sharename()
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmT7eEwACgkQiiy9cAdy
T1EY1Qv+OIgraDJ23wl4FX17dEQLKqcVH5wCWhU9IEq9oIhKSMngDE4kAKu/unEJ
GOKy9QfY2eshPWHCoZjh9lI1Xg776nC3GeAtekOFHXX9FXRZfWP0+Sp6HRNk1TxD
iTrRUDGnGA20QF7tgxNo5EJb2yvxBmEmhiSTwlFUuV+oOjvbQw0eB2EheEEzhPJF
7qrlvC0GvvgUOrVVg6YEMCNFb9bvrns06Rwrl3Hq86gaauXIg4pBIF/AHiwh7jCo
+upXo1vgfWp29XMwX79LqCOx9u96q0i5AZonXTN2W/AxsgsWA6RQnGyZmL+Q/2/4
w4jN76NaLspoVE8grbzcSJq8b3Rlx2HUlzBaZgqAbuB8zx/59RXf559Aewt+Y9Pt
anGsTTpp6a9nc9YYlAGSmaA97y5fnHO47sHpKHVhR9vKbQgvnqGzvmqgNUeRbDzb
wgUjOma6SPfZL6JNqNb+SrktI410MjS11/+mP4NlTrVVomYyKZXslcvpnMuoiHMH
U4RpYB84
=9Rd7
-----END PGP SIGNATURE-----
Merge tag '6.6-rc-ksmbd' of git://git.samba.org/ksmbd
Pull smb server update from Steve French:
"After two years, many fixes and much testing, ksmbd is no longer
experimental"
* tag '6.6-rc-ksmbd' of git://git.samba.org/ksmbd:
ksmbd: remove experimental warning
There was a minor typo in the define for SMB2_GLOBAL_CAP_LARGE_MTU
0X00000004 instead of 0x00000004
make it consistent
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Recently we moved most cleanup from ntfs_put_super() into
ntfs3_kill_sb() as part of a bigger cleanup. This accidently also moved
dropping inode references stashed in ntfs3's sb->s_fs_info from
@sb->put_super() to @sb->kill_sb(). But generic_shutdown_super()
verifies that there are no busy inodes past sb->put_super(). Fix this
and disentangle dropping inode references from freeing @sb->s_fs_info.
Fixes: a4f64a300a29 ("ntfs3: free the sbi in ->kill_sb") # mainline only
Reported-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mateusz reports that glibc turns 'fstat()' calls into 'fstatat()', and
that seems to have been going on for quite a long time due to glibc
having tried to simplify its stat logic into just one point.
This turns out to cause completely unnecessary overhead, where we then
go off and allocate the kernel side pathname, and actually look up the
empty path. Sure, our path lookup is quite optimized, but it still
causes a fair bit of allocation overhead and a couple of completely
unnecessary rounds of lockref accesses etc.
This is all hopefully getting fixed in user space, and there is a patch
floating around for just having glibc use the native fstat() system
call. But even with the current situation we can at least improve on
things by catching the situation and short-circuiting it.
Note that this is still measurably slower than just a plain 'fstat()',
since just checking that the filename is actually empty is somewhat
expensive due to inevitable user space access overhead from the kernel
(ie verifying pointers, and SMAP on x86). But it's still quite a bit
faster than actually looking up the path for real.
To quote numers from Mateusz:
"Sapphire Rapids, will-it-scale, ops/s
stock fstat 5088199
patched fstat 7625244 (+49%)
real fstat 8540383 (+67% / +12%)"
where that 'stock fstat' is the glibc translation of fstat into
fstatat() with an empty path, the 'patched fstat' is with this short
circuiting of the path lookup, and the 'real fstat' is the actual native
fstat() system call with none of this overhead.
Link: https://lore.kernel.org/lkml/20230903204858.lv7i3kqvw6eamhgz@f/
Reported-by: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allow adjusting the maximum number of cached directories per share
(defaults to 16) via mount parm "max_cached_dirs"
Signed-off-by: Steve French <stfrench@microsoft.com>
In debugging a recent performance problem with statfs, it would have
been helpful to be able to trace the smb3 query fs info request
more narrowly. Add a trace point "smb3_qfs_done"
Which displays:
stat-68950 [008] ..... 1472.360598: smb3_qfs_done: xid=14 sid=0xaa9765e4 tid=0x95a76f54 unc_name=\\localhost\test rc=0
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
fscrypt support to CephFS! The list of things which don't work with
encryption should be fairly short, mostly around the edges: fallocate
(not supported well in CephFS to begin with), copy_file_range (requires
re-encryption), non-default striping patterns.
This was a multi-year effort principally by Jeff Layton with assistance
from Xiubo Li, Luís Henriques and others, including several dependant
changes in the MDS, netfs helper library and fscrypt framework itself.
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAmT4pl4THGlkcnlvbW92
QGdtYWlsLmNvbQAKCRBKf944AhHzi5kzB/4sMgzZyUa3T1vA/G2pPvEkyy1qDxsW
y+o4dDMWA9twcrBVpNuGd54wbXpmO/LAekHEdorjayH+f0zf10MsnP1ePz9WB3NG
jr7RRujb+Gpd2OFYJXGSEbd3faTg8M2kpGCCrVe7SFNoyu8z9NwFItwWMog5aBjX
ODGQrq+kA4ARA6xIqwzF5gP0zr+baT9rWhQdm7Xo9itWdosnbyDLJx1dpEfLuqBX
te3SmifDzedn3Gw73hdNo/+ybw0kHARoK+RmXCTsoDDQw+JsoO9KxZF5Q8QcDELq
2woPNp0Hl+Dm4MkzGnPxv56Qj8ZDViS59syXC0CfGRmu4nzF1Rw+0qn5
=/WlE
-----END PGP SIGNATURE-----
Merge tag 'ceph-for-6.6-rc1' of https://github.com/ceph/ceph-client
Pull ceph updates from Ilya Dryomov:
"Mixed with some fixes and cleanups, this brings in reasonably complete
fscrypt support to CephFS! The list of things which don't work with
encryption should be fairly short, mostly around the edges: fallocate
(not supported well in CephFS to begin with), copy_file_range
(requires re-encryption), non-default striping patterns.
This was a multi-year effort principally by Jeff Layton with
assistance from Xiubo Li, Luís Henriques and others, including several
dependant changes in the MDS, netfs helper library and fscrypt
framework itself"
* tag 'ceph-for-6.6-rc1' of https://github.com/ceph/ceph-client: (53 commits)
ceph: make num_fwd and num_retry to __u32
ceph: make members in struct ceph_mds_request_args_ext a union
rbd: use list_for_each_entry() helper
libceph: do not include crypto/algapi.h
ceph: switch ceph_lookup/atomic_open() to use new fscrypt helper
ceph: fix updating i_truncate_pagecache_size for fscrypt
ceph: wait for OSD requests' callbacks to finish when unmounting
ceph: drop messages from MDS when unmounting
ceph: update documentation regarding snapshot naming limitations
ceph: prevent snapshot creation in encrypted locked directories
ceph: add support for encrypted snapshot names
ceph: invalidate pages when doing direct/sync writes
ceph: plumb in decryption during reads
ceph: add encryption support to writepage and writepages
ceph: add read/modify/write to ceph_sync_write
ceph: align data in pages in ceph_sync_write
ceph: don't use special DIO path for encrypted inodes
ceph: add truncate size handling support for fscrypt
ceph: add object version support for sync read
libceph: allow ceph_osdc_new_request to accept a multi-op read
...
- Fix a glock state (non-)transition bug when a dlm request times out
and is canceled, and we have locking requests that can now be granted
immediately.
- Various fixes and cleanups in how the logd and quotad daemons are
woken up and terminated.
- Fix several bugs in the quota data reference counting and shrinking.
Free quota data objects synchronously in put_super() instead of
letting call_rcu() run wild.
- Make sure not to deallocate quota data during a withdraw; rather, defer
quota data deallocation to put_super(). Withdraws can happen in
contexts in which callers on the stack are holding quota data references.
- Many minor quota fixes and cleanups by Bob.
- Update the the mailing list address for gfs2 and dlm. (It's the same
list for both and we are moving it to gfs2@lists.linux.dev.)
- Various other minor cleanups.
-----BEGIN PGP SIGNATURE-----
iQJIBAABCAAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAmT3T7UUHGFncnVlbmJh
QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTqxhw/+IWp+4cY4htNkTRG7xkheTeQ+5whG
NU40mp7Hj+WY5GoHqsk676q1pBkVAq5mNN1kt9S/oC6lLHrdu1HLpdIkgFow2nAC
nDqlEqx9/Da9Q4H/+K442usO90S4o1MmOXOE9xcGcvJLqK4FLDOVDXbUWa43OXrK
4HxgjgGSNPF4itD+U0o/V4f19jQ+4cNwmo6hGLyhsYillaUHiIXJezQlH7XycByM
qGJqlG6odJ56wE38NG8Bt9Lj+91PsLLqO1TJxSYyzpf0h9QGQ2ySvu6/esTXwxWO
XRuT4db7yjyAUhJoJMw+YU77xWQTz0/jriIDS7VqzvR1ns3GPaWdtb31TdUTBG4H
IvBA8ep3oxHtcYFoPzCLBXgOIDej6KjAgS3RSv51yLeaZRHFUBc21fTSXbcDTIUs
gkusZlRNQ9ANdBCVyf8hZxbE54HnaBJ8dKMZtynOXJEHs0EtGV8YKCNIpkFLxOvE
vZkKcRsmVtuZ9fVhX1iH7dYmcsCMPI8RNo47k7hHk2EG8dU+eqyPSbi4QCmErNFf
DlqX+fIuiDtOkbmWcrb2qdphn6j6bMLhDaOMJGIBOmgOPi+AU9dNAfmtu1cG4u1b
2TFyUISayiwqHJQgguzvDed15fxexYdgoLB7O9t9TMbCENxisguNa5TsAN6ZkiLQ
0hY6h80xSR2kCPU=
=EonA
-----END PGP SIGNATURE-----
Merge tag 'gfs2-v6.5-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2
Pull gfs2 updates from Andreas Gruenbacher:
- Fix a glock state (non-)transition bug when a dlm request times out
and is canceled, and we have locking requests that can now be granted
immediately
- Various fixes and cleanups in how the logd and quotad daemons are
woken up and terminated
- Fix several bugs in the quota data reference counting and shrinking.
Free quota data objects synchronously in put_super() instead of
letting call_rcu() run wild
- Make sure not to deallocate quota data during a withdraw; rather,
defer quota data deallocation to put_super(). Withdraws can happen in
contexts in which callers on the stack are holding quota data
references
- Many minor quota fixes and cleanups by Bob
- Update the the mailing list address for gfs2 and dlm. (It's the same
list for both and we are moving it to gfs2@lists.linux.dev)
- Various other minor cleanups
* tag 'gfs2-v6.5-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: (51 commits)
MAINTAINERS: Update dlm mailing list
MAINTAINERS: Update gfs2 mailing list
gfs2: change qd_slot_count to qd_slot_ref
gfs2: check for no eligible quota changes
gfs2: Remove useless assignment
gfs2: simplify slot_get
gfs2: Simplify qd2offset
gfs2: introduce qd_bh_get_or_undo
gfs2: Remove quota allocation info from quota file
gfs2: use constant for array size
gfs2: Set qd_sync_gen in do_sync
gfs2: Remove useless err set
gfs2: Small gfs2_quota_lock cleanup
gfs2: move qdsb_put and reduce redundancy
gfs2: improvements to sysfs status
gfs2: Don't try to sync non-changes
gfs2: Simplify function need_sync
gfs2: remove unneeded pg_oflow variable
gfs2: remove unneeded variable done
gfs2: pass sdp to gfs2_write_buf_to_page
...
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCZPYlzAAKCRDh3BK/laaZ
PEcxAP4suFAlonGntKJ5ltR+7ZN+WYdiraQ+5c6ISBFc+pFXgQD7B0xhztV4umSF
III+pbD6lE5gP5u7+Kw/pOnTI42yTQ8=
=aPjn
-----END PGP SIGNATURE-----
Merge tag 'fuse-update-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse updates from Miklos Szeredi:
- Revert non-waiting FLUSH due to a regression
- Fix a lookup counter leak in readdirplus
- Add an option to allow shared mmaps in no-cache mode
- Add btime support and statx intrastructure to the protocol
- Invalidate positive/negative dentry on failed create/delete
* tag 'fuse-update-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: conditionally fill kstat in fuse_do_statx()
fuse: invalidate dentry on EEXIST creates or ENOENT deletes
fuse: cache btime
fuse: implement statx
fuse: add ATTR_TIMEOUT macro
fuse: add STATX request
fuse: handle empty request_mask in statx
fuse: write back dirty pages before direct write in direct_io_relax mode
fuse: add a new fuse init flag to relax restrictions in no cache mode
fuse: invalidate page cache pages before direct write
fuse: nlookup missing decrement in fuse_direntplus_link
Revert "fuse: in fuse_flush only wait if someone wants the return code"
- Also a number of singleton patches, mainly cleanups and leftovers.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZPZGXwAKCRDdBJ7gKXxA
jkjpAP9F0t5xy3JGs8Iew47Yqva+fvvrZdUSx3aHIZ/C3HyaJwEAi7DwzqludyHi
851+qSdyX3bWnDEuejuNeMykh2QF1wo=
=pw9A
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2023-09-04-14-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more MM updates from Andrew Morton:
- Stefan Roesch has added ksm statistics to /proc/pid/smaps
- Also a number of singleton patches, mainly cleanups and leftovers
* tag 'mm-stable-2023-09-04-14-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm/kmemleak: move up cond_resched() call in page scanning loop
mm: page_alloc: remove stale CMA guard code
MAINTAINERS: add rmap.h to mm entry
rmap: remove anon_vma_link() nommu stub
proc/ksm: add ksm stats to /proc/pid/smaps
mm/hwpoison: rename hwp_walk* to hwpoison_walk*
mm: memory-failure: add PageOffline() check
Variable qd_slot_count is a reference count, not a count of slots. This
patch renames it to qd_slot_ref to make that more clear.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Before this patch, function gfs2_quota_sync would always allocate a page
full of memory and increment its quota sync generation number. This
happened even when the system was completely idle or if no blocks were
allocated or quota changes made. This patch adds function qd_changed
to determine if any changes have been made that qualify for a
quota sync. If not, it avoids the memory allocation and bumping the
generation number, along with all the additional work it would do.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
This assignment is unnecessary because if error was not already 0, it
would have branched to an error label already.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Simplify function slot_get and get rid of the goto that jumps into the
middle of an else branch.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
This is a minor cleanup of function qd2offset.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
This patch is an attempt to force some consistency in quota sync
processing. Two functions (qd_fish and gfs2_quota_unlock) called
qd_check_sync, after which they both called bh_get, and if that failed,
they took the same steps to undo the actions of qd_check_sync.
This patch introduces a new function, qd_bh_get_or_undo, which performs
the same steps, reducing code redundancy.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Function do_sync called gfs2_qa_get and put for quota allocation data.
But the inode in question is the system master quota file, which is
never subject to quotas. Therefore, a qa structure should be unnecessary
and if anything accesses it, it's probably a bug.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Function gfs2_quota_unlock declared an array of 4 qd elements. We have a
constant for that, we should be using it.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Func do_sync was called in two places: gfs2_quota_unlock and
gfs2_quota_sync. In gfs2_quota_sync it updated qd_sync_gen to the latest
superblock sync gen, if do_sync was successful. In gfs2_quota_unlock it
didn't update the value. That can only lead to extra work, for example,
if the value is synced by gfs2_quota_unlock but still has the old value.
This patch moves the setting of qd_sync_gen inside do_sync so we are
guaranteed consistency.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Function gfs2_adjust_quota set variable err, then set it again to a
different value. This patch removes the redundant set.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
No need to set error = 0 since it's set further down.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
This patch looks more invasive than it is. It simply moves function
qdsb_put before qd_unlock, then changes qd_unlock to call it rather than
open coding it. Again, this reduces redundancy.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
This patch adds some new fields to the gfs2 status file in sysfs to aid
in debugging.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Function need_sync is supposed to determine if a qd element needs to be
synced. If the "change" (qd_change) is zero, it does not need to be
synced because there's literally no change in the value. Before this
patch need_sync returned false if value < 0. That should be <= 0.
This patch changes the check to <=.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
This patch simplifies function need_sync by eliminating a variable in
favor of just returning the appropriate value as soon as we know it.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Function gfs2_write_disk_quota checks if its write overflows onto
another page, and if so, does a second write. Before this patch it kept
two variables for this, but only one is needed. This patch simplifies
it by eliminating pg_oflow.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Function gfs2_write_buf_to_page uses variable done to exit its loop, but
it's unnecessary if we just code an infinite loop and exit when we need.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
This patch passes the superblock pointer to gfs2_write_buf_to_page so it
becomes more apparent it's dealing with the system quota file.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Like the previous patch, we now pass the superblock pointer to function
gfs2_write_disk_quota. This makes the code more understandable, since it
only operates on the quota inode.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Before this change function gfs2_adjust_quota's first parameter was an
gfs2_inode pointer. But it always pointed to the quota inode. Here we
switch that to pass the superblock pointer, sdp, so it is easier to read
the code and understand that it's only dealing with the quota inode.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Since patch 845802b112ee function gfs2_write_buf_to_page checks if the
target inode is jdata or ordered. This function only operates on the
system quota file, which is always jdata, so the check for jdata is
useless. This patch removes it.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
This patch adds a new mount option quota=quiet which is the same as
quota=on but it suppresses gfs2 quota error messages.
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>