linux

iv/linux

Author	SHA1	Message	Date
Colin Ian King	10150232dc	NFSD: Initialize pointer ni with NULL and not plain integer 0 [ Upstream commit 8e70bf27fd20cc17e87150327a640e546bfbee64 ] Pointer ni is being initialized with plain integer zero. Fix this by initializing with NULL. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:53 +02:00
NeilBrown	767727a7e0	NFSD: simplify struct nfsfh [ Upstream commit d8b26071e65e80a348602b939e333242f989221b ] Most of the fields in 'struct knfsd_fh' are 2 levels deep (a union and a struct) and are accessed using macros like: #define fh_FOO fh_base.fh_new.fb_FOO This patch makes the union and struct anonymous, so that "fh_FOO" can be a name directly within 'struct knfsd_fh' and the #defines aren't needed. The file handle as a whole is sometimes accessed as "fh_base" or "fh_base.fh_pad", neither of which are particularly helpful names. As the struct holding the filehandle is now anonymous, we cannot use the name of that, so we union it with 'fh_raw' and use that where the raw filehandle is needed. fh_raw also ensure the structure is large enough for the largest possible filehandle. fh_raw is a 'char' array, removing any need to cast it for memcpy etc. SVCFH_fmt() is simplified using the "%ph" printk format. This changes the appearance of filehandles in dprintk() debugging, making them a little more precise. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:53 +02:00
NeilBrown	ed0815c8b7	NFSD: drop support for ancient filehandles [ Upstream commit c645a883df34ee10b884ec921e850def54b7f461 ] Filehandles not in the "new" or "version 1" format have not been handed out for new mounts since Linux 2.4 which was released 20 years ago. I think it is safe to say that no such file handles are still in use, and that we can drop support for them. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:53 +02:00
NeilBrown	f829bb3a06	NFSD: move filehandle format declarations out of "uapi". [ Upstream commit ef5825e3cf0d0af657f5fb4dd86d750ed42fee0a ] A small part of the declaration concerning filehandle format are currently in the "uapi" include directory: include/uapi/linux/nfsd/nfsfh.h There is a lot more to the filehandle format, including "enum fid_type" and "enum nfsd_fsid" which are not exported via "uapi". This small part of the filehandle definition is of minimal use outside of the kernel, and I can find no evidence that an other code is using it. Certainly nfs-utils and wireshark (The most likely candidates) do not use these declarations. So move it out of "uapi" by copying the content from include/uapi/linux/nfsd/nfsfh.h into fs/nfsd/nfsfh.h A few unnecessary "#include" directives are not copied, and neither is the #define of fh_auth, which is annotated as being for userspace only. The copyright claims in the uapi file are identical to those in the nfsd file, so there is no need to copy those. The "__u32" style integer types are only needed in "uapi". In kernel-only code we can use the more familiar "u32" style. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:53 +02:00
Chuck Lever	af12deba53	NFSD: Optimize DRC bucket pruning [ Upstream commit 8847ecc9274a14114385d1cb4030326baa0766eb ] DRC bucket pruning is done by nfsd_cache_lookup(), which is part of every NFSv2 and NFSv3 dispatch (ie, it's done while the client is waiting). I added a trace_printk() in prune_bucket() to see just how long it takes to prune. Here are two ends of the spectrum: prune_bucket: Scanned 1 and freed 0 in 90 ns, 62 entries remaining prune_bucket: Scanned 2 and freed 1 in 716 ns, 63 entries remaining ... prune_bucket: Scanned 75 and freed 74 in 34149 ns, 1 entries remaining Pruning latency is noticeable on fast transports with fast storage. By noticeable, I mean that the latency measured here in the worst case is the same order of magnitude as the round trip time for cached server operations. We could do something like moving expired entries to an expired list and then free them later instead of freeing them right in prune_bucket(). But simply limiting the number of entries that can be pruned by a lookup is simple and retains more entries in the cache, making the DRC somewhat more effective. Comparison with a 70/30 fio 8KB 12 thread direct I/O test: Before: write: IOPS=61.6k, BW=481MiB/s (505MB/s)(14.1GiB/30001msec); 0 zone resets WRITE: 1848726 ops (30%) avg bytes sent per op: 8340 avg bytes received per op: 136 backlog wait: 0.635158 RTT: 0.128525 total execute time: 0.827242 (milliseconds) After: write: IOPS=63.0k, BW=492MiB/s (516MB/s)(14.4GiB/30001msec); 0 zone resets WRITE: 1891144 ops (30%) avg bytes sent per op: 8340 avg bytes received per op: 136 backlog wait: 0.616114 RTT: 0.126842 total execute time: 0.805348 (milliseconds) Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:53 +02:00
Chuck Lever	61116d723e	NFS: Move NFS protocol display macros to global header [ Upstream commit 8791545eda52e8f3bc48e3cd902e38bf4ba4c9de ] Refactor: surface useful show_ macros so they can be shared between the client and server trace code. Additional clean up: - Housekeeping: ensure the correct #include files are pulled in and add proper TRACE_DEFINE_ENUM where they are missing - Use a consistent naming scheme for the helpers - Store values to be displayed symbolically as unsigned long, as that is the type that the __print_yada() functions take Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:52 +02:00
Chuck Lever	53af55ba9e	NFS: Move generic FS show macros to global header [ Upstream commit 9d2d48bbbdabf7b2f029369c4f926d133c1d47ad ] Refactor: Surface useful show_ macros for use by other trace subsystems. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:52 +02:00
Chuck Lever	f78ab849df	SUNRPC: Tracepoints should display tk_pid and cl_clid as a fixed-size field [ Upstream commit b4776a341ec05e809d21e98db5ed49dbdc81d5d8 ] For certain special cases, RPC-related tracepoints record a -1 as the task ID or the client ID. It's ugly for a trace event to display 4 billion in these cases. To help keep SUNRPC tracepoints consistent, create a macro that defines the print format specifiers for tk_pid and cl_clid. At some point in the future we might try tk_pid with a wider range of values than 0..64K so this makes it easier to make that change. RPC tracepoints now look like this: <...>-1276 [009] 149.720358: rpc_clnt_new: client=00000005 peer=[192.168.2.55]:20049 program=nfs server=klimt.ib <...>-1342 [004] 149.921234: rpc_xdr_recvfrom: task:0000001a@00000005 head=[0xff1242d9ab6dc01c,144] page=0 tail=[(nil),0] len=144 <...>-1342 [004] 149.921235: xprt_release_cong: task:0000001a@00000005 snd_task:ffffffff cong=256 cwnd=16384 <...>-1342 [004] 149.921235: xprt_put_cong: task:0000001a@00000005 snd_task:ffffffff cong=0 cwnd=16384 Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:52 +02:00
Chuck Lever	20bbdbe31c	NFS: Remove unnecessary TRACE_DEFINE_ENUM()s [ Upstream commit 8e09650f5ec68858f4b8b67cdef9e2ece9b208f3 ] Clean up: TRACE_DEFINE_ENUM is unnecessary because the target symbols are all C macros, not enums. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:52 +02:00
Gabriel Krisman Bertazi	b394c80e2c	ext4: Send notifications on error [ Upstream commit 9a089b21f79b47eed240d4da7ea0d049de7c9b4d ] Send a FS_ERROR message via fsnotify to a userspace monitoring tool whenever a ext4 error condition is triggered. This follows the existing error conditions in ext4, so it is hooked to the ext4_error* functions. Link: https://lore.kernel.org/r/20211025192746.66445-30-krisman@collabora.com Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Acked-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:52 +02:00
Gabriel Krisman Bertazi	0e825bc84d	fanotify: Allow users to request FAN_FS_ERROR events [ Upstream commit 9709bd548f11a092d124698118013f66e1740f9b ] Wire up the FAN_FS_ERROR event in the fanotify_mark syscall, allowing user space to request the monitoring of FAN_FS_ERROR events. These events are limited to filesystem marks, so check it is the case in the syscall handler. Link: https://lore.kernel.org/r/20211025192746.66445-29-krisman@collabora.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:52 +02:00
Gabriel Krisman Bertazi	c7c013dff4	fanotify: Emit generic error info for error event [ Upstream commit 130a3c742107acff985541c28360c8b40203559c ] The error info is a record sent to users on FAN_FS_ERROR events documenting the type of error. It also carries an error count, documenting how many errors were observed since the last reporting. Link: https://lore.kernel.org/r/20211025192746.66445-28-krisman@collabora.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:52 +02:00
Gabriel Krisman Bertazi	92eb45210b	fanotify: Report fid info for file related file system errors [ Upstream commit 936d6a38be39177495af38497bf8da1c6128fa1b ] Plumb the pieces to add a FID report to error records. Since all error event memory must be pre-allocated, we pre-allocate the maximum file handle size possible, such that it should always fit. For errors that don't expose a file handle, report it with an invalid FID. Internally we use zero-length FILEID_ROOT file handle for passing the information (which we report as zero-length FILEID_INVALID file handle to userspace) so we update the handle reporting code to deal with this case correctly. Link: https://lore.kernel.org/r/20211025192746.66445-27-krisman@collabora.com Link: https://lore.kernel.org/r/20211025192746.66445-25-krisman@collabora.com Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> [Folded two patches into 2 to make series bisectable] Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:52 +02:00
Gabriel Krisman Bertazi	c8b6b09f3f	fanotify: WARN_ON against too large file handles [ Upstream commit 572c28f27a269f88e2d8d7b6b1507f114d637337 ] struct fanotify_error_event, at least, is preallocated and isn't able to to handle arbitrarily large file handles. Future-proof the code by complaining loudly if a handle larger than MAX_HANDLE_SZ is ever found. Link: https://lore.kernel.org/r/20211025192746.66445-26-krisman@collabora.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:51 +02:00
Gabriel Krisman Bertazi	ebd89f7d07	fanotify: Add helpers to decide whether to report FID/DFID [ Upstream commit 4bd5a5c8e6e5cd964e9738e6ef87f6c2cb453edf ] Now that there is an event that reports FID records even for a zeroed file handle, wrap the logic that deides whether to issue the records into helper functions. This shouldn't have any impact on the code, but simplifies further patches. Link: https://lore.kernel.org/r/20211025192746.66445-24-krisman@collabora.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:51 +02:00
Gabriel Krisman Bertazi	a83c3ed2e2	fanotify: Wrap object_fh inline space in a creator macro [ Upstream commit 2c5069433a3adc01ff9c5673567961bb7f138074 ] fanotify_error_event would duplicate this sequence of declarations that already exist elsewhere with a slight different size. Create a helper macro to avoid code duplication. Link: https://lore.kernel.org/r/20211025192746.66445-23-krisman@collabora.com Suggested-by: Jan Kara <jack@suse.cz> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:51 +02:00
Gabriel Krisman Bertazi	7aafd61567	fanotify: Support merging of error events [ Upstream commit 8a6ae64132fd27a944faed7bc38484827609eb76 ] Error events (FAN_FS_ERROR) against the same file system can be merged by simply iterating the error count. The hash is taken from the fsid, without considering the FH. This means that only the first error object is reported. Link: https://lore.kernel.org/r/20211025192746.66445-22-krisman@collabora.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:51 +02:00
Gabriel Krisman Bertazi	c44d807143	fanotify: Support enqueueing of error events [ Upstream commit 83e9acbe13dc1b767f91b5c1350f7a65689b26f6 ] Once an error event is triggered, enqueue it in the notification group, similarly to what is done for other events. FAN_FS_ERROR is not handled specially, since the memory is now handled by a preallocated mempool. For now, make the event unhashed. A future patch implements merging of this kind of event. Link: https://lore.kernel.org/r/20211025192746.66445-21-krisman@collabora.com Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:51 +02:00
Gabriel Krisman Bertazi	9258a9908b	fanotify: Pre-allocate pool of error events [ Upstream commit 734a1a5eccc5f7473002b0669f788e135f1f64aa ] Pre-allocate slots for file system errors to have greater chances of succeeding, since error events can happen in GFP_NOFS context. This patch introduces a group-wide mempool of error events, shared by all FAN_FS_ERROR marks in this group. Link: https://lore.kernel.org/r/20211025192746.66445-20-krisman@collabora.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:51 +02:00
Gabriel Krisman Bertazi	11280c7181	fanotify: Reserve UAPI bits for FAN_FS_ERROR [ Upstream commit 8d11a4f43ef4679be0908026907a7613b33d7127 ] FAN_FS_ERROR allows reporting of event type FS_ERROR to userspace, which is a mechanism to report file system wide problems via fanotify. This commit preallocate userspace visible bits to match the FS_ERROR event. Link: https://lore.kernel.org/r/20211025192746.66445-19-krisman@collabora.com Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:51 +02:00
Gabriel Krisman Bertazi	965daffd0d	fanotify: Require fid_mode for any non-fd event [ Upstream commit 4fe595cf1c80e7a5af4d00c4da29def64aff57a2 ] Like inode events, FAN_FS_ERROR will require fid mode. Therefore, convert the verification during fanotify_mark(2) to require fid for any non-fd event. This means fid_mode will not only be required for inode events, but for any event that doesn't provide a descriptor. Link: https://lore.kernel.org/r/20211025192746.66445-17-krisman@collabora.com Suggested-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:51 +02:00
Gabriel Krisman Bertazi	1dad39ebf4	fanotify: Encode empty file handle when no inode is provided [ Upstream commit 272531ac619b374ab474e989eb387162fded553f ] Instead of failing, encode an invalid file handle in fanotify_encode_fh if no inode is provided. This bogus file handle will be reported by FAN_FS_ERROR for non-inode errors. Link: https://lore.kernel.org/r/20211025192746.66445-16-krisman@collabora.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:51 +02:00
Gabriel Krisman Bertazi	5f34524dde	fanotify: Allow file handle encoding for unhashed events [ Upstream commit 74fe4734897a2da2ae2a665a5e622cd490d36eaf ] Allow passing a NULL hash to fanotify_encode_fh and avoid calculating the hash if not needed. Link: https://lore.kernel.org/r/20211025192746.66445-15-krisman@collabora.com Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:50 +02:00
Gabriel Krisman Bertazi	900fe4d704	fanotify: Support null inode event in fanotify_dfid_inode [ Upstream commit 12f47bf0f0990933d95d021d13d31bda010648fd ] FAN_FS_ERROR doesn't support DFID, but this function is still called for every event. The problem is that it is not capable of handling null inodes, which now can happen in case of superblock error events. For this case, just returning dir will be enough. Link: https://lore.kernel.org/r/20211025192746.66445-14-krisman@collabora.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:50 +02:00
Gabriel Krisman Bertazi	20bb62edd6	fsnotify: Pass group argument to free_event [ Upstream commit 330ae77d2a5b0af32c0f29e139bf28ec8591de59 ] For group-wide mempool backed events, like FS_ERROR, the free_event callback will need to reference the group's mempool to free the memory. Wire that argument into the current callers. Link: https://lore.kernel.org/r/20211025192746.66445-13-krisman@collabora.com Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:50 +02:00
Gabriel Krisman Bertazi	84f904f031	fsnotify: Protect fsnotify_handle_inode_event from no-inode events [ Upstream commit 24dca90590509a7a6cbe0650100c90c5b8a3468a ] FAN_FS_ERROR allows events without inodes - i.e. for file system-wide errors. Even though fsnotify_handle_inode_event is not currently used by fanotify, this patch protects other backends from cases where neither inode or dir are provided. Also document the constraints of the interface (inode and dir cannot be both NULL). Link: https://lore.kernel.org/r/20211025192746.66445-12-krisman@collabora.com Suggested-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:50 +02:00
Gabriel Krisman Bertazi	4b6b5edcc7	fsnotify: Retrieve super block from the data field [ Upstream commit 29335033c574a15334015d8c4e36862cff3d3384 ] Some file system events (i.e. FS_ERROR) might not be associated with an inode or directory. For these, we can retrieve the super block from the data field. But, since the super_block is available in the data field on every event type, simplify the code to always retrieve it from there, through a new helper. Link: https://lore.kernel.org/r/20211025192746.66445-11-krisman@collabora.com Suggested-by: Jan Kara <jack@suse.cz> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:50 +02:00
Gabriel Krisman Bertazi	8eb16abe89	fsnotify: Add wrapper around fsnotify_add_event [ Upstream commit 1ad03c3a326a86e259389592117252c851873395 ] fsnotify_add_event is growing in number of parameters, which in most case are just passed a NULL pointer. So, split out a new fsnotify_insert_event function to clean things up for users who don't need an insert hook. Link: https://lore.kernel.org/r/20211025192746.66445-10-krisman@collabora.com Suggested-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:50 +02:00
Gabriel Krisman Bertazi	e58d543a8c	fsnotify: Add helper to detect overflow_event [ Upstream commit 808967a0a4d2f4ce6a2005c5692fffbecaf018c1 ] Similarly to fanotify_is_perm_event and friends, provide a helper predicate to say whether a mask is of an overflow event. Link: https://lore.kernel.org/r/20211025192746.66445-9-krisman@collabora.com Suggested-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:50 +02:00
Gabriel Krisman Bertazi	3e63bff7ac	inotify: Don't force FS_IN_IGNORED [ Upstream commit e0462f91d24756916fded4313d508e0fc52f39c9 ] According to Amir: "FS_IN_IGNORED is completely internal to inotify and there is no need to set it in i_fsnotify_mask at all, so if we remove the bit from the output of inotify_arg_to_mask() no functionality will change and we will be able to overload the event bit for FS_ERROR." This is done in preparation to overload FS_ERROR with the notification mechanism in fanotify. Link: https://lore.kernel.org/r/20211025192746.66445-8-krisman@collabora.com Suggested-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:50 +02:00
Gabriel Krisman Bertazi	dbeb08836b	fanotify: Split fsid check from other fid mode checks [ Upstream commit 8299212cbdb01a5867e230e961f82e5c02a6de34 ] FAN_FS_ERROR will require fsid, but not necessarily require the filesystem to expose a file handle. Split those checks into different functions, so they can be used separately when setting up an event. While there, update a comment about tmpfs having 0 fsid, which is no longer true. Link: https://lore.kernel.org/r/20211025192746.66445-7-krisman@collabora.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:50 +02:00
Gabriel Krisman Bertazi	cce35a4878	fanotify: Fold event size calculation to its own function [ Upstream commit b9928e80dda84b349ba8de01780b9bef2fc36ffa ] Every time this function is invoked, it is immediately added to FAN_EVENT_METADATA_LEN, since there is no need to just calculate the length of info records. This minor clean up folds the rest of the calculation into the function, which now operates in terms of events, returning the size of the entire event, including metadata. Link: https://lore.kernel.org/r/20211025192746.66445-6-krisman@collabora.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:49 +02:00
Gabriel Krisman Bertazi	8c7986c2bc	fsnotify: Don't insert unmergeable events in hashtable [ Upstream commit cc53b55f697fe5aa98bdbfdfe67c6401da242155 ] Some events, like the overflow event, are not mergeable, so they are not hashed. But, when failing inside fsnotify_add_event for lack of space, fsnotify_add_event() still calls the insert hook, which adds the overflow event to the merge list. Add a check to prevent any kind of unmergeable event to be inserted in the hashtable. Fixes: 94e00d28a680 ("fsnotify: use hash table for faster events merge") Link: https://lore.kernel.org/r/20211025192746.66445-5-krisman@collabora.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-10 16:18:49 +02:00
Bart Van Assche	a71cba0778	fs/aio: Check IOCB_AIO_RW before the struct aio_kiocb conversion commit 961ebd120565cb60cebe21cb634fbc456022db4a upstream. The first kiocb_set_cancel_fn() argument may point at a struct kiocb that is not embedded inside struct aio_kiocb. With the current code, depending on the compiler, the req->ki_ctx read happens either before the IOCB_AIO_RW test or after that test. Move the req->ki_ctx read such that it is guaranteed that the IOCB_AIO_RW test happens first. Reported-by: Eric Biggers <ebiggers@kernel.org> Cc: Benjamin LaHaise <ben@communityfibre.ca> Cc: Eric Biggers <ebiggers@google.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Avi Kivity <avi@scylladb.com> Cc: Sandeep Dhavale <dhavale@google.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: stable@vger.kernel.org Fixes: b820de741ae4 ("fs/aio: Restrict kiocb_set_cancel_fn() to I/O submitted via libaio") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240304235715.3790858-1-bvanassche@acm.org Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-04-10 16:18:46 +02:00
Steven Rostedt (Google)	81b9ddd44d	NFSD: Fix nfsd_clid_class use of __string_len() macro [ Upstream commit 9388a2aa453321bcf1ad2603959debea9e6ab6d4 ] I'm working on restructuring the __string* macros so that it doesn't need to recalculate the string twice. That is, it will save it off when processing __string() and the __assign_str() will not need to do the work again as it currently does. Currently __string_len(item, src, len) doesn't actually use "src", but my changes will require src to be correct as that is where the __assign_str() will get its value from. The event class nfsd_clid_class has: __string_len(name, name, clp->cl_name.len) But the second "name" does not exist and causes my changes to fail to build. That second parameter should be: clp->cl_name.data. Link: https://lore.kernel.org/linux-trace-kernel/20240222122828.3d8d213c@gandalf.local.home Cc: Neil Brown <neilb@suse.de> Cc: Olga Kornievskaia <kolga@netapp.com> Cc: Dai Ngo <Dai.Ngo@oracle.com> Cc: Tom Talpey <tom@talpey.com> Cc: stable@vger.kernel.org Fixes: d27b74a8675ca ("NFSD: Use new __string_len C macros for nfsd_clid_class") Acked-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-10 16:18:43 +02:00
Ryusuke Konishi	ca581d237f	nilfs2: prevent kernel bug at submit_bh_wbc() [ Upstream commit 269cdf353b5bdd15f1a079671b0f889113865f20 ] Fix a bug where nilfs_get_block() returns a successful status when searching and inserting the specified block both fail inconsistently. If this inconsistent behavior is not due to a previously fixed bug, then an unexpected race is occurring, so return a temporary error -EAGAIN instead. This prevents callers such as __block_write_begin_int() from requesting a read into a buffer that is not mapped, which would cause the BUG_ON check for the BH_Mapped flag in submit_bh_wbc() to fail. Link: https://lkml.kernel.org/r/20240313105827.5296-3-konishi.ryusuke@gmail.com Fixes: 1f5abe7e7dbc ("nilfs2: replace BUG_ON and BUG calls triggerable from ioctl") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-10 16:18:43 +02:00
Ryusuke Konishi	2e2619ff5d	nilfs2: fix failure to detect DAT corruption in btree and direct mappings [ Upstream commit f2f26b4a84a0ef41791bd2d70861c8eac748f4ba ] Patch series "nilfs2: fix kernel bug at submit_bh_wbc()". This resolves a kernel BUG reported by syzbot. Since there are two flaws involved, I've made each one a separate patch. The first patch alone resolves the syzbot-reported bug, but I think both fixes should be sent to stable, so I've tagged them as such. This patch (of 2): Syzbot has reported a kernel bug in submit_bh_wbc() when writing file data to a nilfs2 file system whose metadata is corrupted. There are two flaws involved in this issue. The first flaw is that when nilfs_get_block() locates a data block using btree or direct mapping, if the disk address translation routine nilfs_dat_translate() fails with internal code -ENOENT due to DAT metadata corruption, it can be passed back to nilfs_get_block(). This causes nilfs_get_block() to misidentify an existing block as non-existent, causing both data block lookup and insertion to fail inconsistently. The second flaw is that nilfs_get_block() returns a successful status in this inconsistent state. This causes the caller __block_write_begin_int() or others to request a read even though the buffer is not mapped, resulting in a BUG_ON check for the BH_Mapped flag in submit_bh_wbc() failing. This fixes the first issue by changing the return value to code -EINVAL when a conversion using DAT fails with code -ENOENT, avoiding the conflicting condition that leads to the kernel bug described above. Here, code -EINVAL indicates that metadata corruption was detected during the block lookup, which will be properly handled as a file system error and converted to -EIO when passing through the nilfs2 bmap layer. Link: https://lkml.kernel.org/r/20240313105827.5296-1-konishi.ryusuke@gmail.com Link: https://lkml.kernel.org/r/20240313105827.5296-2-konishi.ryusuke@gmail.com Fixes: c3a7abf06ce7 ("nilfs2: support contiguous lookup of blocks") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+cfed5b56649bddf80d6e@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=cfed5b56649bddf80d6e Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-10 16:18:42 +02:00
Marios Makassikis	34cd4d5de1	ksmbd: retrieve number of blocks using vfs_getattr in set_file_allocation_info [ Upstream commit 34cd86b6632718b7df3999d96f51e63de41c5e4f ] Use vfs_getattr() to retrieve stat information, rather than make assumptions about how a filesystem fills inode structs. Cc: stable@vger.kernel.org Signed-off-by: Marios Makassikis <mmakassikis@freebox.fr> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-10 16:18:41 +02:00
Josef Bacik	80d24b308b	nfs: fix UAF in direct writes [ Upstream commit 17f46b803d4f23c66cacce81db35fef3adb8f2af ] In production we have been hitting the following warning consistently ------------[ cut here ]------------ refcount_t: underflow; use-after-free. WARNING: CPU: 17 PID: 1800359 at lib/refcount.c:28 refcount_warn_saturate+0x9c/0xe0 Workqueue: nfsiod nfs_direct_write_schedule_work [nfs] RIP: 0010:refcount_warn_saturate+0x9c/0xe0 PKRU: 55555554 Call Trace: <TASK> ? __warn+0x9f/0x130 ? refcount_warn_saturate+0x9c/0xe0 ? report_bug+0xcc/0x150 ? handle_bug+0x3d/0x70 ? exc_invalid_op+0x16/0x40 ? asm_exc_invalid_op+0x16/0x20 ? refcount_warn_saturate+0x9c/0xe0 nfs_direct_write_schedule_work+0x237/0x250 [nfs] process_one_work+0x12f/0x4a0 worker_thread+0x14e/0x3b0 ? ZSTD_getCParams_internal+0x220/0x220 kthread+0xdc/0x120 ? __btf_name_valid+0xa0/0xa0 ret_from_fork+0x1f/0x30 This is because we're completing the nfs_direct_request twice in a row. The source of this is when we have our commit requests to submit, we process them and send them off, and then in the completion path for the commit requests we have if (nfs_commit_end(cinfo.mds)) nfs_direct_write_complete(dreq); However since we're submitting asynchronous requests we sometimes have one that completes before we submit the next one, so we end up calling complete on the nfs_direct_request twice. The only other place we use nfs_generic_commit_list() is in __nfs_commit_inode, which wraps this call in a nfs_commit_begin(); nfs_commit_end(); Which is a common pattern for this style of completion handling, one that is also repeated in the direct code with get_dreq()/put_dreq() calls around where we process events as well as in the completion paths. Fix this by using the same pattern for the commit requests. Before with my 200 node rocksdb stress running this warning would pop every 10ish minutes. With my patch the stress test has been running for several hours without popping. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-10 16:18:40 +02:00
Maximilian Heyne	239c669edb	ext4: fix corruption during on-line resize [ Upstream commit a6b3bfe176e8a5b05ec4447404e412c2a3fc92cc ] We observed a corruption during on-line resize of a file system that is larger than 16 TiB with 4k block size. With having more then 2^32 blocks resize_inode is turned off by default by mke2fs. The issue can be reproduced on a smaller file system for convenience by explicitly turning off resize_inode. An on-line resize across an 8 GiB boundary (the size of a meta block group in this setup) then leads to a corruption: dev=/dev/<some_dev> # should be >= 16 GiB mkdir -p /corruption /sbin/mke2fs -t ext4 -b 4096 -O ^resize_inode $dev $((2 * 221 - 215)) mount -t ext4 $dev /corruption dd if=/dev/zero bs=4096 of=/corruption/test count=$((2221 - 42*15)) sha1sum /corruption/test # 79d2658b39dcfd77274e435b0934028adafaab11 /corruption/test /sbin/resize2fs $dev $((22*21)) # drop page cache to force reload the block from disk echo 1 > /proc/sys/vm/drop_caches sha1sum /corruption/test # 3c2abc63cbf1a94c9e6977e0fbd72cd832c4d5c3 /corruption/test 2^21 = 2^152^6 equals 8 GiB whereof 2^15 is the number of blocks per block group and 2^6 are the number of block groups that make a meta block group. The last checksum might be different depending on how the file is laid out across the physical blocks. The actual corruption occurs at physical block 63*2^15 = 2064384 which would be the location of the backup of the meta block group's block descriptor. During the on-line resize the file system will be converted to meta_bg starting at s_first_meta_bg which is 2 in the example - meaning all block groups after 16 GiB. However, in ext4_flex_group_add we might add block groups that are not part of the first meta block group yet. In the reproducer we achieved this by substracting the size of a whole block group from the point where the meta block group would start. This must be considered when updating the backup block group descriptors to follow the non-meta_bg layout. The fix is to add a test whether the group to add is already part of the meta block group or not. Fixes: 01f795f9e0d67 ("ext4: add online resizing support for meta_bg and 64-bit file systems") Cc: <stable@vger.kernel.org> Signed-off-by: Maximilian Heyne <mheyne@amazon.de> Tested-by: Srivathsa Dara <srivathsa.d.dara@oracle.com> Reviewed-by: Srivathsa Dara <srivathsa.d.dara@oracle.com> Link: https://lore.kernel.org/r/20240215155009.94493-1-mheyne@amazon.de Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-10 16:18:40 +02:00
Filipe Manana	fa3762b45e	btrfs: fix off-by-one chunk length calculation at contains_pending_extent() [ Upstream commit ae6bd7f9b46a29af52ebfac25d395757e2031d0d ] At contains_pending_extent() the value of the end offset of a chunk we found in the device's allocation state io tree is inclusive, so when we calculate the length we pass to the in_range() macro, we must sum 1 to the expression "physical_end - physical_offset". In practice the wrong calculation should be harmless as chunks sizes are never 1 byte and we should never have 1 byte ranges of unallocated space. Nevertheless fix the wrong calculation. Reported-by: Alex Lyakas <alex.lyakas@zadara.com> Link: https://lore.kernel.org/linux-btrfs/CAOcd+r30e-f4R-5x-S7sV22RJPe7+pgwherA6xqN2_qe7o4XTg@mail.gmail.com/ Fixes: 1c11b63eff2a ("btrfs: replace pending/pinned chunks lists with io tree") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-10 16:18:38 +02:00
Miklos Szeredi	6eac5030fc	fuse: don't unhash root [ Upstream commit b1fe686a765e6c0d71811d825b5a1585a202b777 ] The root inode is assumed to be always hashed. Do not unhash the root inode even if it is marked BAD. Fixes: 5d069dbe8aaf ("fuse: fix bad inode") Cc: <stable@vger.kernel.org> # v5.11 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-10 16:18:38 +02:00
Miklos Szeredi	773783681d	fuse: fix root lookup with nonzero generation [ Upstream commit 68ca1b49e430f6534d0774a94147a823e3b8b26e ] The root inode has a fixed nodeid and generation (1, 0). Prior to the commit 15db16837a35 ("fuse: fix illegal access to inode with reused nodeid") generation number on lookup was ignored. After this commit lookup with the wrong generation number resulted in the inode being unhashed. This is correct for non-root inodes, but replacing the root inode is wrong and results in weird behavior. Fix by reverting to the old behavior if ignoring the generation for the root inode, but issuing a warning in dmesg. Reported-by: Antonio SJ Musumeci <trapexit@spawn.link> Closes: https://lore.kernel.org/all/CAOQ4uxhek5ytdN8Yz2tNEOg5ea4NkBb4nk0FGPjPk_9nz-VG3g@mail.gmail.com/ Fixes: 15db16837a35 ("fuse: fix illegal access to inode with reused nodeid") Cc: <stable@vger.kernel.org> # v5.14 Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-10 16:18:38 +02:00
Matthew Wilcox (Oracle)	f19b1023a3	ubifs: Set page uptodate in the correct place [ Upstream commit 723012cab779eee8228376754e22c6594229bf8f ] Page cache reads are lockless, so setting the freshly allocated page uptodate before we've overwritten it with the data it's supposed to have in it will allow a simultaneous reader to see old data. Move the call to SetPageUptodate into ubifs_write_end(), which is after we copied the new data into the page. Fixes: 1e51764a3c2a ("UBIFS: add new flash file system") Cc: stable@vger.kernel.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-10 16:18:35 +02:00
Jan Kara	b7fb63e807	fat: fix uninitialized field in nostale filehandles [ Upstream commit fde2497d2bc3a063d8af88b258dbadc86bd7b57c ] When fat_encode_fh_nostale() encodes file handle without a parent it stores only first 10 bytes of the file handle. However the length of the file handle must be a multiple of 4 so the file handle is actually 12 bytes long and the last two bytes remain uninitialized. This is not great at we potentially leak uninitialized information with the handle to userspace. Properly initialize the full handle length. Link: https://lkml.kernel.org/r/20240205122626.13701-1-jack@suse.cz Reported-by: syzbot+3ce5dea5b1539ff36769@syzkaller.appspotmail.com Fixes: ea3983ace6b7 ("fat: restructure export_operations") Signed-off-by: Jan Kara <jack@suse.cz> Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Amir Goldstein <amir73il@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-10 16:18:35 +02:00
Baokun Li	447753bdd6	ext4: correct best extent lstart adjustment logic [ Upstream commit 4fbf8bc733d14bceb16dda46a3f5e19c6a9621c5 ] When yangerkun review commit 93cdf49f6eca ("ext4: Fix best extent lstart adjustment logic in ext4_mb_new_inode_pa()"), it was found that the best extent did not completely cover the original request after adjusting the best extent lstart in ext4_mb_new_inode_pa() as follows: original request: 2/10(8) normalized request: 0/64(64) best extent: 0/9(9) When we check if best ex can be kept at start of goal, ac_o_ex.fe_logical is 2 less than the adjusted best extent logical end 9, so we think the adjustment is done. But obviously 0/9(9) doesn't cover 2/10(8), so we should determine here if the original request logical end is less than or equal to the adjusted best extent logical end. In addition, add a comment stating when adjusted best_ex will not cover the original request, and remove the duplicate assertion because adjusting lstart makes no change to b_ex.fe_len. Link: https://lore.kernel.org/r/3630fa7f-b432-7afd-5f79-781bc3b2c5ea@huawei.com Fixes: 93cdf49f6eca ("ext4: Fix best extent lstart adjustment logic in ext4_mb_new_inode_pa()") Cc: <stable@kernel.org> Signed-off-by: yangerkun <yangerkun@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Ojaswin Mujoo <ojaswin@linux.ibm.com> Link: https://lore.kernel.org/r/20240201141845.1879253-1-libaokun1@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-10 16:18:35 +02:00
David Howells	a6ffae61ad	afs: Revert "afs: Hide silly-rename files from userspace" [ Upstream commit 0aec3847d044273733285dcff90afda89ad461d2 ] This reverts commit 57e9d49c54528c49b8bffe6d99d782ea051ea534. This undoes the hiding of .__afsXXXX silly-rename files. The problem with hiding them is that rm can't then manually delete them. This also reverts commit 5f7a07646655fb4108da527565dcdc80124b14c4 ("afs: Fix endless loop in directory parsing") as that's a bugfix for the above. Fixes: 57e9d49c5452 ("afs: Hide silly-rename files from userspace") Reported-by: Markus Suvanto <markus.suvanto@gmail.com> Link: https://lists.infradead.org/pipermail/linux-afs/2024-February/008102.html Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/3085695.1710328121@warthog.procyon.org.uk Reviewed-by: Jeffrey E Altman <jaltman@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-26 18:21:34 -04:00
Xiuhong Wang	fa3ac8b1a2	f2fs: compress: fix reserve_cblocks counting error when out of space [ Upstream commit 2f6d721e14b69d6e1251f69fa238b48e8374e25f ] When a file only needs one direct_node, performing the following operations will cause the file to be unrepairable: unisoc # ./f2fs_io compress test.apk unisoc #df -h \| grep dm-48 /dev/block/dm-48 112G 112G 1.2M 100% /data unisoc # ./f2fs_io release_cblocks test.apk 924 unisoc # df -h \| grep dm-48 /dev/block/dm-48 112G 112G 4.8M 100% /data unisoc # dd if=/dev/random of=file4 bs=1M count=3 3145728 bytes (3.0 M) copied, 0.025 s, 120 M/s unisoc # df -h \| grep dm-48 /dev/block/dm-48 112G 112G 1.8M 100% /data unisoc # ./f2fs_io reserve_cblocks test.apk F2FS_IOC_RESERVE_COMPRESS_BLOCKS failed: No space left on device adb reboot unisoc # df -h \| grep dm-48 /dev/block/dm-48 112G 112G 11M 100% /data unisoc # ./f2fs_io reserve_cblocks test.apk 0 This is because the file has only one direct_node. After returning to -ENOSPC, reserved_blocks += ret will not be executed. As a result, the reserved_blocks at this time is still 0, which is not the real number of reserved blocks. Therefore, fsck cannot be set to repair the file. After this patch, the fsck flag will be set to fix this problem. unisoc # df -h \| grep dm-48 /dev/block/dm-48 112G 112G 1.8M 100% /data unisoc # ./f2fs_io reserve_cblocks test.apk F2FS_IOC_RESERVE_COMPRESS_BLOCKS failed: No space left on device adb reboot then fsck will be executed unisoc # df -h \| grep dm-48 /dev/block/dm-48 112G 112G 11M 100% /data unisoc # ./f2fs_io reserve_cblocks test.apk 924 Fixes: c75488fb4d82 ("f2fs: introduce F2FS_IOC_RESERVE_COMPRESS_BLOCKS") Signed-off-by: Xiuhong Wang <xiuhong.wang@unisoc.com> Signed-off-by: Zhiguo Niu <zhiguo.niu@unisoc.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-26 18:21:33 -04:00
Christophe JAILLET	6ca2ea698d	NFS: Fix an off by one in root_nfs_cat() [ Upstream commit 698ad1a538da0b6bf969cfee630b4e3a026afb87 ] The intent is to check if 'dest' is truncated or not. So, >= should be used instead of >, because strlcat() returns the length of 'dest' and 'src' excluding the trailing NULL. Fixes: 56463e50d1fc ("NFS: Use super.c for NFSROOT mount option parsing") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-26 18:21:33 -04:00
Jorge Mora	446f55d0d4	NFSv4.2: fix listxattr maximum XDR buffer size [ Upstream commit bcac8bff90a6ee1629f90669cdb9d28fb86049b0 ] Switch order of operations to avoid creating a short XDR buffer: e.g., buflen = 12, old xdrlen = 12, new xdrlen = 20. Having a short XDR buffer leads to lxa_maxcount be a few bytes less than what is needed to retrieve the whole list when using a buflen as returned by a call with size = 0: buflen = listxattr(path, NULL, 0); buf = malloc(buflen); buflen = listxattr(path, buf, buflen); For a file with one attribute (name = '123456'), the first call with size = 0 will return buflen = 12 ('user.123456\x00'). The second call with size = 12, sends LISTXATTRS with lxa_maxcount = 12 + 8 (cookie) + 4 (array count) = 24. The XDR buffer needs 8 (cookie) + 4 (array count) + 4 (name count) + 6 (name len) + 2 (padding) + 4 (eof) = 28 which is 4 bytes shorter than the lxa_maxcount provided in the call. Fixes: 04a5da690e8f ("NFSv4.2: define limits and sizes for user xattr handling") Signed-off-by: Jorge Mora <mora@netapp.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-26 18:21:33 -04:00

... 6 7 8 9 10 ...

75509 Commits