Commit Graph

91674 Commits

Author SHA1 Message Date
Linus Torvalds
0eb03c7e8e tracefs/eventfs fixes and updates for v6.10:
Bug fixes:
 
 - The eventfs directories need to have unique inode numbers. Make sure that
   they do not get the default file inode number.
 
 - Update the inode uid and gid fields on remount.
   When a remount happens where a uid and/or gid is specified, all the tracefs
   files and directories should get the specified uid and/or gid. But this
   can be sporadic when some uids were assigned already. There's already
   a list of inodes that are allocated. Just update their uid and gid fields
   at the time of remount.
 
 - Update the eventfs_inodes on remount from the top level "events" descriptor.
   There was a bug where not all the eventfs files or directories where
   getting updated on remount. One fix was to clear the SAVED_UID/GID
   flags from the inode list during the iteration of the inodes during
   the remount. But because the eventfs inodes can be freed when the last
   referenced is released, not all the eventfs_inodes were being updated.
   This lead to the ownership selftest to fail if it was run a second
   time (the first time would leave eventfs_inodes with no corresponding
   tracefs_inode).
 
   Instead, for eventfs_inodes, only process the "events" eventfs_inode
   from the list iteration, as it is guaranteed to have a tracefs_inode
   (it's never freed while the "events" directory exists). As it has
   a list of its children, and the children have a list of their children,
   just iterate all the eventfs_inodes from the "events" descriptor and
   it is guaranteed to get all of them.
 
 - Clear the EVENT_INODE flag from the tracefs_drop_inode() callback.
   Currently the EVENTFS_INODE FLAG is cleared in the tracefs_d_iput()
   callback. But this is the wrong location. The iput() callback is
   called when the last reference to the dentry inode is hit. There could
   be a case where two dentry's have the same inode, and the flag will
   be cleared prematurely. The flag needs to be cleared when the last
   reference of the inode is dropped and that happens in the inode's
   drop_inode() callback handler.
 
 Clean ups:
 
 - Consolidate the creation of a tracefs_inode for an eventfs_inode
   A tracefs_inode is created for both files and directories of the
   eventfs system. It is open coded. Instead, consolidate it into a
   single eventfs_get_inode() function call.
 
 - Remove the eventfs getattr and permission callbacks.
   The permissions for the eventfs files and directories are updated
   when the inodes are created, on remount, and when the user sets
   them (via setattr). The inodes hold the current permissions so
   there is no need to have custom getattr or permissions callbacks
   as they will more likely cause them to be incorrect. The inode's
   permissions are updated when they should be updated. Remove the
   getattr and permissions inode callbacks.
 
 - Do not update eventfs_inode attributes on creation of inodes.
   The eventfs_inodes attribute field is used to store the permissions
   of the directories and files for when their corresponding inodes
   are freed and are created again. But when the creation of the inodes
   happen, the eventfs_inode attributes are recalculated. The
   recalculation should only happen when the permissions change for
   a given file or directory. Currently, the attribute changes are
   just being set to their current files so this is not a bug, but
   it's unnecessary and error prone. Stop doing that.
 
 - The events directory inode is created once when the events directory
   is created and deleted when it is deleted. It is now updated on
   remount and when the user changes the permissions. There's no need
   to use the eventfs_inode of the events directory to store the
   events directory permissions. But using it to store the default
   permissions for the files within the directory that have not been
   updated by the user can simplify the code.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZk+0ohQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qtWOAQCSdEsWYiNcFBqvKp1kSI+dH1sKfur3
 CAoe1trzDEdv/gEAsFkophR9OBzO193in4ZQYNKdEDfeaicEaDctzLxlkwY=
 =9zqq
 -----END PGP SIGNATURE-----

Merge tag 'trace-tracefs-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracefs/eventfs updates from Steven Rostedt:
 "Bug fixes:

   - The eventfs directories need to have unique inode numbers. Make
     sure that they do not get the default file inode number.

   - Update the inode uid and gid fields on remount.

     When a remount happens where a uid and/or gid is specified, all the
     tracefs files and directories should get the specified uid and/or
     gid. But this can be sporadic when some uids were assigned already.
     There's already a list of inodes that are allocated. Just update
     their uid and gid fields at the time of remount.

   - Update the eventfs_inodes on remount from the top level "events"
     descriptor.

     There was a bug where not all the eventfs files or directories
     where getting updated on remount. One fix was to clear the
     SAVED_UID/GID flags from the inode list during the iteration of the
     inodes during the remount. But because the eventfs inodes can be
     freed when the last referenced is released, not all the
     eventfs_inodes were being updated. This lead to the ownership
     selftest to fail if it was run a second time (the first time would
     leave eventfs_inodes with no corresponding tracefs_inode).

     Instead, for eventfs_inodes, only process the "events"
     eventfs_inode from the list iteration, as it is guaranteed to have
     a tracefs_inode (it's never freed while the "events" directory
     exists). As it has a list of its children, and the children have a
     list of their children, just iterate all the eventfs_inodes from
     the "events" descriptor and it is guaranteed to get all of them.

   - Clear the EVENT_INODE flag from the tracefs_drop_inode() callback.

     Currently the EVENTFS_INODE FLAG is cleared in the tracefs_d_iput()
     callback. But this is the wrong location. The iput() callback is
     called when the last reference to the dentry inode is hit. There
     could be a case where two dentry's have the same inode, and the
     flag will be cleared prematurely. The flag needs to be cleared when
     the last reference of the inode is dropped and that happens in the
     inode's drop_inode() callback handler.

  Cleanups:

   - Consolidate the creation of a tracefs_inode for an eventfs_inode

     A tracefs_inode is created for both files and directories of the
     eventfs system. It is open coded. Instead, consolidate it into a
     single eventfs_get_inode() function call.

   - Remove the eventfs getattr and permission callbacks.

     The permissions for the eventfs files and directories are updated
     when the inodes are created, on remount, and when the user sets
     them (via setattr). The inodes hold the current permissions so
     there is no need to have custom getattr or permissions callbacks as
     they will more likely cause them to be incorrect. The inode's
     permissions are updated when they should be updated. Remove the
     getattr and permissions inode callbacks.

   - Do not update eventfs_inode attributes on creation of inodes.

     The eventfs_inodes attribute field is used to store the permissions
     of the directories and files for when their corresponding inodes
     are freed and are created again. But when the creation of the
     inodes happen, the eventfs_inode attributes are recalculated. The
     recalculation should only happen when the permissions change for a
     given file or directory. Currently, the attribute changes are just
     being set to their current files so this is not a bug, but it's
     unnecessary and error prone. Stop doing that.

   - The events directory inode is created once when the events
     directory is created and deleted when it is deleted. It is now
     updated on remount and when the user changes the permissions.
     There's no need to use the eventfs_inode of the events directory to
     store the events directory permissions. But using it to store the
     default permissions for the files within the directory that have
     not been updated by the user can simplify the code"

* tag 'trace-tracefs-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  eventfs: Do not use attributes for events directory
  eventfs: Cleanup permissions in creation of inodes
  eventfs: Remove getattr and permission callbacks
  eventfs: Consolidate the eventfs_inode update in eventfs_get_inode()
  tracefs: Clear EVENT_INODE flag in tracefs_drop_inode()
  eventfs: Update all the eventfs_inodes from the events descriptor
  tracefs: Update inode permissions on remount
  eventfs: Keep the directories from having the same inode number as files
2024-05-24 08:27:34 -07:00
Linus Torvalds
6d69b6c12f NFS client updates for Linux 6.10
Highlights include:
 
 Stable fixes:
 - nfs: fix undefined behavior in nfs_block_bits()
 - NFSv4.2: Fix READ_PLUS when server doesn't support OP_READ_PLUS
 
 Bugfixes:
 - Fix mixing of the lock/nolock and local_lock mount options
 - NFSv4: Fixup smatch warning for ambiguous return
 - NFSv3: Fix remount when using the legacy binary mount api
 - SUNRPC: Fix the handling of expired RPCSEC_GSS contexts
 - SUNRPC: fix the NFSACL RPC retries when soft mounts are enabled
 - rpcrdma: fix handling for RDMA_CM_EVENT_DEVICE_REMOVAL
 
 Features and cleanups:
 - NFSv3: Use the atomic_open API to fix open(O_CREAT|O_TRUNC)
 - pNFS/filelayout: S layout segment range in LAYOUTGET
 - pNFS: rework pnfs_generic_pg_check_layout to check IO range
 - NFSv2: Turn off enabling of NFS v2 by default
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESQctxSBg8JpV8KqEZwvnipYKAPIFAmZPpYMACgkQZwvnipYK
 APITOw//acjE9YTZcST9kgkf2bfwuHFcdxvMZAr4MV0YsfqMesU2MYmaK/5YMLyo
 iNCHjLmlfE2iLAUqvFtakc1F3guACJqqFfMdnMHa1MwPznrL3yNNClGnBamovbPd
 XK2MBgpQBXb+xLxqH0A2TtOK2ofk0CFzb3x9eaziox8omBM2j3v6ZARsDHYehuhM
 Hig8IxW/kZ7kx5jxqSVktrgW3gDKqIuLssF6fJVINzh45jHC5QO98cuSwetx6Mi1
 Aw04HbOE6B66ORrzC1wyGN3PwOkTW2kgAiyB6UNNt+Hnvr0RD5TEqf3s3mzmhP9N
 7LJ3H1Okxdcpn0G/bR4LBUg26r5BWxhfPiTYG/l9vAQk65yt2LO1kFzXbECBEfaG
 ctGG7/7mMLVPs05kIFYm5S0cIYW2dYNuE20JY50LMaCIopjThdfruQj3yR4xibSt
 bHrAbG9wW4qg/cgx860t5h7nbZnD5OOYIqKOCDRNrUfP7P0mK/tD49HggLjDo47M
 vIMlYS3bTNSF7uEPTrv6bFr8XOD1I3BVXDQwGaJMZ8zyhkUIQtKO70+i4xM1E/Wl
 Jw5Z6NpM8saDD449ZqX4IRUPDAhvz4v00QqD3Tqr4MHEc5sWi898S7XcJgL3bEai
 QMJmBkAK8aDAP/suPw8VQc9wqplFNlB+QEh87p2WO+yRoEucn+A=
 =HMSC
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-6.10-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client updates from Trond Myklebust:
 "Stable fixes:
   - nfs: fix undefined behavior in nfs_block_bits()
   - NFSv4.2: Fix READ_PLUS when server doesn't support OP_READ_PLUS

  Bugfixes:
   - Fix mixing of the lock/nolock and local_lock mount options
   - NFSv4: Fixup smatch warning for ambiguous return
   - NFSv3: Fix remount when using the legacy binary mount api
   - SUNRPC: Fix the handling of expired RPCSEC_GSS contexts
   - SUNRPC: fix the NFSACL RPC retries when soft mounts are enabled
   - rpcrdma: fix handling for RDMA_CM_EVENT_DEVICE_REMOVAL

  Features and cleanups:
   - NFSv3: Use the atomic_open API to fix open(O_CREAT|O_TRUNC)
   - pNFS/filelayout: S layout segment range in LAYOUTGET
   - pNFS: rework pnfs_generic_pg_check_layout to check IO range
   - NFSv2: Turn off enabling of NFS v2 by default"

* tag 'nfs-for-6.10-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  nfs: fix undefined behavior in nfs_block_bits()
  pNFS: rework pnfs_generic_pg_check_layout to check IO range
  pNFS/filelayout: check layout segment range
  pNFS/filelayout: fixup pNfs allocation modes
  rpcrdma: fix handling for RDMA_CM_EVENT_DEVICE_REMOVAL
  NFS: Don't enable NFS v2 by default
  NFS: Fix READ_PLUS when server doesn't support OP_READ_PLUS
  sunrpc: fix NFSACL RPC retry on soft mount
  SUNRPC: fix handling expired GSS context
  nfs: keep server info for remounts
  NFSv4: Fixup smatch warning for ambiguous return
  NFS: make sure lock/nolock overriding local_lock mount option
  NFS: add atomic_open for NFSv3 to handle O_TRUNC correctly.
  pNFS/filelayout: Specify the layout segment range in LAYOUTGET
  pNFS/filelayout: Remove the whole file layout requirement
2024-05-23 13:51:09 -07:00
Linus Torvalds
d6a326d694 tracing: Remove second argument of __assign_str()
The __assign_str() macro logic of the TRACE_EVENT() macro was optimized so
 that it no longer needs the second argument. The __assign_str() is always
 matched with __string() field that takes a field name and the source for
 that field:
 
   __string(field, source)
 
 The TRACE_EVENT() macro logic will save off the source value and then use
 that value to copy into the ring buffer via the __assign_str(). Before
 commit c1fa617cae ("tracing: Rework __assign_str() and __string() to not
 duplicate getting the string"), the __assign_str() needed the second
 argument which would perform the same logic as the __string() source
 parameter did. Not only would this add overhead, but it was error prone as
 if the __assign_str() source produced something different, it may not have
 allocated enough for the string in the ring buffer (as the __string()
 source was used to determine how much to allocate)
 
 Now that the __assign_str() just uses the same string that was used in
 __string() it no longer needs the source parameter. It can now be removed.
 -----BEGIN PGP SIGNATURE-----
 
 iIkEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZk9RMBQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qur+AP9jbSYaGhzZdJ7a3HGA8M4l6JNju8nC
 GcX1JpJT4z1qvgD3RkoNvP87etDAUAqmbVhVWnUHCY/vTqr9uB/gqmG6Ag==
 =Y+6f
 -----END PGP SIGNATURE-----

Merge tag 'trace-assign-str-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing cleanup from Steven Rostedt:
 "Remove second argument of __assign_str()

  The __assign_str() macro logic of the TRACE_EVENT() macro was
  optimized so that it no longer needs the second argument. The
  __assign_str() is always matched with __string() field that takes a
  field name and the source for that field:

    __string(field, source)

  The TRACE_EVENT() macro logic will save off the source value and then
  use that value to copy into the ring buffer via the __assign_str().

  Before commit c1fa617cae ("tracing: Rework __assign_str() and
  __string() to not duplicate getting the string"), the __assign_str()
  needed the second argument which would perform the same logic as the
  __string() source parameter did. Not only would this add overhead, but
  it was error prone as if the __assign_str() source produced something
  different, it may not have allocated enough for the string in the ring
  buffer (as the __string() source was used to determine how much to
  allocate)

  Now that the __assign_str() just uses the same string that was used in
  __string() it no longer needs the source parameter. It can now be
  removed"

* tag 'trace-assign-str-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing/treewide: Remove second parameter of __assign_str()
2024-05-23 12:28:01 -07:00
Linus Torvalds
2ef32ad224 virtio: features, fixes, cleanups
Several new features here:
 
 - virtio-net is finally supported in vduse.
 
 - Virtio (balloon and mem) interaction with suspend is improved
 
 - vhost-scsi now handles signals better/faster.
 
 Fixes, cleanups all over the place.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmZN570PHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRp2JUH/1K3fZOHymop6Y5Z3USFS7YdlF+dniedY/vg
 TKyWERkXOlxq1d9DVxC0mN7tk72DweuWI0YJjLXofrEW1VuW29ecSbyFXxpeWJls
 b7ErffxDAFRas5jkMCngD8TuFnbEegU0mGP5kbiHpEndBydQ2hH99Gg0x7swW+cE
 xsvU5zonCCLwLGIP2DrVrn9qGOHtV6o8eZfVKDVXfvicn3lFBkUSxlwEYsO9RMup
 aKxV4FT2Pb1yBicwBK4TH1oeEXqEGy1YLEn+kAHRbgoC/5L0/LaiqrkzwzwwOIPj
 uPGkacf8CIbX0qZo5EzD8kvfcYL1xhU3eT9WBmpp2ZwD+4bINd4=
 =nax1
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio updates from Michael Tsirkin:
 "Several new features here:

   - virtio-net is finally supported in vduse

   - virtio (balloon and mem) interaction with suspend is improved

   - vhost-scsi now handles signals better/faster

  And fixes, cleanups all over the place"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (48 commits)
  virtio-pci: Check if is_avq is NULL
  virtio: delete vq in vp_find_vqs_msix() when request_irq() fails
  MAINTAINERS: add Eugenio Pérez as reviewer
  vhost-vdpa: Remove usage of the deprecated ida_simple_xx() API
  vp_vdpa: don't allocate unused msix vectors
  sound: virtio: drop owner assignment
  fuse: virtio: drop owner assignment
  scsi: virtio: drop owner assignment
  rpmsg: virtio: drop owner assignment
  nvdimm: virtio_pmem: drop owner assignment
  wifi: mac80211_hwsim: drop owner assignment
  vsock/virtio: drop owner assignment
  net: 9p: virtio: drop owner assignment
  net: virtio: drop owner assignment
  net: caif: virtio: drop owner assignment
  misc: nsm: drop owner assignment
  iommu: virtio: drop owner assignment
  drm/virtio: drop owner assignment
  gpio: virtio: drop owner assignment
  firmware: arm_scmi: virtio: drop owner assignment
  ...
2024-05-23 12:04:36 -07:00
Steven Rostedt (Google)
2dd00ac1d3 eventfs: Do not use attributes for events directory
The top "events" directory has a static inode (it's created when it is and
removed when the directory is removed). There's no need to use the events
ei->attr to determine its permissions. But it is used for saving the
permissions of the "events" directory for when it is created, as that is
needed for the default permissions for the files and directories
underneath it.

For example:

 # cd /sys/kernel/tracing
 # mkdir instances/foo
 # chown 1001 instances/foo/events

The files under instances/foo/events should still have the same owner as
instances/foo (which the instances/foo/events ei->attr will hold), but the
events directory now has owner 1001.

Link: https://lore.kernel.org/lkml/20240522165032.104981011@goodmis.org

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-05-23 09:31:50 -04:00
Steven Rostedt (Google)
6e3d7c903c eventfs: Cleanup permissions in creation of inodes
The permissions being set during the creation of the inodes was updating
eventfs_inode attributes as well. Those attributes should only be touched
by the setattr or remount operations, not during the creation of inodes.
The eventfs_inode attributes should only be used to set the inodes and
should not be modified during the inode creation.

Simplify the code and fix the situation by:

 1) Removing the eventfs_find_events() and doing a simple lookup for
    the events descriptor in eventfs_get_inode()

 2) Remove update_events_attr() as the attributes should only be used
    to update the inode and should not be modified here.

 3) Add update_inode_attr() that uses the attributes to determine what
    the inode permissions should be.

 4) As the parent_inode of the eventfs_root_inode structure is no longer
    needed, remove it.

Now on creation, the inode gets the proper permissions without causing
side effects to the ei->attr field.

Link: https://lore.kernel.org/lkml/20240522165031.944088388@goodmis.org

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-05-23 09:31:50 -04:00
Steven Rostedt (Google)
37cd0d1266 eventfs: Remove getattr and permission callbacks
Now that inodes have their permissions updated on remount, the only other
places to update the inode permissions are when they are created and in
the setattr callback. The getattr and permission callbacks are not needed
as the inodes should already be set at their proper settings.

Remove the callbacks, as it not only simplifies the code, but also allows
more flexibility to fix the inconsistencies with various corner cases
(like changing the permission of an instance directory).

Link: https://lore.kernel.org/lkml/20240522165031.782066021@goodmis.org

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-05-23 09:27:25 -04:00
Steven Rostedt (Google)
625acf9d5e eventfs: Consolidate the eventfs_inode update in eventfs_get_inode()
To simplify the code, create a eventfs_get_inode() that is used when an
eventfs file or directory is created. Have the internal tracefs_inode
updated the appropriate flags in this function and update the inode's
mode as well.

Link: https://lore.kernel.org/lkml/20240522165031.624864160@goodmis.org

Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-05-23 09:27:25 -04:00
Steven Rostedt (Google)
0bcfd9aa4d tracefs: Clear EVENT_INODE flag in tracefs_drop_inode()
When the inode is being dropped from the dentry, the TRACEFS_EVENT_INODE
flag needs to be cleared to prevent a remount from calling
eventfs_remount() on the tracefs_inode private data. There's a race
between the inode is dropped (and the dentry freed) to where the inode is
actually freed. If a remount happens between the two, the eventfs_inode
could be accessed after it is freed (only the dentry keeps a ref count on
it).

Currently the TRACEFS_EVENT_INODE flag is cleared from the dentry iput()
function. But this is incorrect, as it is possible that the inode has
another reference to it. The flag should only be cleared when the inode is
really being dropped and has no more references. That happens in the
drop_inode callback of the inode, as that gets called when the last
reference of the inode is released.

Remove the tracefs_d_iput() function and move its logic to the more
appropriate tracefs_drop_inode() callback function.

Link: https://lore.kernel.org/linux-trace-kernel/20240523051539.908205106@goodmis.org

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Fixes: baa23a8d43 ("tracefs: Reset permissions on remount if permissions are options")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-05-23 09:26:24 -04:00
Steven Rostedt (Google)
340f0c7067 eventfs: Update all the eventfs_inodes from the events descriptor
The change to update the permissions of the eventfs_inode had the
misconception that using the tracefs_inode would find all the
eventfs_inodes that have been updated and reset them on remount.
The problem with this approach is that the eventfs_inodes are freed when
they are no longer used (basically the reason the eventfs system exists).
When they are freed, the updated eventfs_inodes are not reset on a remount
because their tracefs_inodes have been freed.

Instead, since the events directory eventfs_inode always has a
tracefs_inode pointing to it (it is not freed when finished), and the
events directory has a link to all its children, have the
eventfs_remount() function only operate on the events eventfs_inode and
have it descend into its children updating their uid and gids.

Link: https://lore.kernel.org/all/CAK7LNARXgaWw3kH9JgrnH4vK6fr8LDkNKf3wq8NhMWJrVwJyVQ@mail.gmail.com/
Link: https://lore.kernel.org/linux-trace-kernel/20240523051539.754424703@goodmis.org

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Fixes: baa23a8d43 ("tracefs: Reset permissions on remount if permissions are options")
Reported-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-05-23 09:26:23 -04:00
Steven Rostedt (Google)
27c0464843 tracefs: Update inode permissions on remount
When a remount happens, if a gid or uid is specified update the inodes to
have the same gid and uid. This will allow the simplification of the
permissions logic for the dynamically created files and directories.

Link: https://lore.kernel.org/linux-trace-kernel/20240523051539.592429986@goodmis.org

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Fixes: baa23a8d43 ("tracefs: Reset permissions on remount if permissions are options")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-05-23 09:26:23 -04:00
Steven Rostedt (Google)
8898e7f288 eventfs: Keep the directories from having the same inode number as files
The directories require unique inode numbers but all the eventfs files
have the same inode number. Prevent the directories from having the same
inode numbers as the files as that can confuse some tooling.

Link: https://lore.kernel.org/linux-trace-kernel/20240523051539.428826685@goodmis.org

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Fixes: 834bf76add ("eventfs: Save directory inodes in the eventfs_inode structure")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-05-23 09:26:23 -04:00
Linus Torvalds
c760b3725e - A series ("kbuild: enable more warnings by default") from Arnd
Bergmann which enables a number of additional build-time warnings.  We
   fixed all the fallout which we could find, there may still be a few
   stragglers.
 
 - Samuel Holland has developed the series "Unified cross-architecture
   kernel-mode FPU API".  This does a lot of consolidation of
   per-architecture kernel-mode FPU usage and enables the use of newer AMD
   GPUs on RISC-V.
 
 - Tao Su has fixed some selftests build warnings in the series
   "Selftests: Fix compilation warnings due to missing _GNU_SOURCE
   definition".
 
 - This pull also includes a nilfs2 fixup from Ryusuke Konishi.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZk6OSAAKCRDdBJ7gKXxA
 jpTGAP9hQaZ+g7CO38hKQAtEI8rwcZJtvUAP84pZEGMjYMGLxQD/S8z1o7UHx61j
 DUbnunbOkU/UcPx3Fs/gp4KcJARMEgs=
 =EPi9
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2024-05-22-17-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more non-mm updates from Andrew Morton:

 - A series ("kbuild: enable more warnings by default") from Arnd
   Bergmann which enables a number of additional build-time warnings. We
   fixed all the fallout which we could find, there may still be a few
   stragglers.

 - Samuel Holland has developed the series "Unified cross-architecture
   kernel-mode FPU API". This does a lot of consolidation of
   per-architecture kernel-mode FPU usage and enables the use of newer
   AMD GPUs on RISC-V.

 - Tao Su has fixed some selftests build warnings in the series
   "Selftests: Fix compilation warnings due to missing _GNU_SOURCE
   definition".

 - This pull also includes a nilfs2 fixup from Ryusuke Konishi.

* tag 'mm-nonmm-stable-2024-05-22-17-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (23 commits)
  nilfs2: make block erasure safe in nilfs_finish_roll_forward()
  selftests/harness: use 1024 in place of LINE_MAX
  Revert "selftests/harness: remove use of LINE_MAX"
  selftests/fpu: allow building on other architectures
  selftests/fpu: move FP code to a separate translation unit
  drm/amd/display: use ARCH_HAS_KERNEL_FPU_SUPPORT
  drm/amd/display: only use hard-float, not altivec on powerpc
  riscv: add support for kernel-mode FPU
  x86: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  powerpc: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  LoongArch: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  lib/raid6: use CC_FLAGS_FPU for NEON CFLAGS
  arm64: crypto: use CC_FLAGS_FPU for NEON CFLAGS
  arm64: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  ARM: crypto: use CC_FLAGS_FPU for NEON CFLAGS
  ARM: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  arch: add ARCH_HAS_KERNEL_FPU_SUPPORT
  x86/fpu: fix asm/fpu/types.h include guard
  kbuild: enable -Wcast-function-type-strict unconditionally
  kbuild: enable -Wformat-truncation on clang
  ...
2024-05-22 18:59:29 -07:00
Steven Rostedt (Google)
2c92ca849f tracing/treewide: Remove second parameter of __assign_str()
With the rework of how the __string() handles dynamic strings where it
saves off the source string in field in the helper structure[1], the
assignment of that value to the trace event field is stored in the helper
value and does not need to be passed in again.

This means that with:

  __string(field, mystring)

Which use to be assigned with __assign_str(field, mystring), no longer
needs the second parameter and it is unused. With this, __assign_str()
will now only get a single parameter.

There's over 700 users of __assign_str() and because coccinelle does not
handle the TRACE_EVENT() macro I ended up using the following sed script:

  git grep -l __assign_str | while read a ; do
      sed -e 's/\(__assign_str([^,]*[^ ,]\) *,[^;]*/\1)/' $a > /tmp/test-file;
      mv /tmp/test-file $a;
  done

I then searched for __assign_str() that did not end with ';' as those
were multi line assignments that the sed script above would fail to catch.

Note, the same updates will need to be done for:

  __assign_str_len()
  __assign_rel_str()
  __assign_rel_str_len()

I tested this with both an allmodconfig and an allyesconfig (build only for both).

[1] https://lore.kernel.org/linux-trace-kernel/20240222211442.634192653@goodmis.org/

Link: https://lore.kernel.org/linux-trace-kernel/20240516133454.681ba6a0@rorschach.local.home

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Christian König <christian.koenig@amd.com> for the amdgpu parts.
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #for
Acked-by: Rafael J. Wysocki <rafael@kernel.org> # for thermal
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Darrick J. Wong <djwong@kernel.org>	# xfs
Tested-by: Guenter Roeck <linux@roeck-us.net>
2024-05-22 20:14:47 -04:00
Linus Torvalds
d90be6e4aa Driver core changes for 6.10-rc1
Here is the small set of driver core and kernfs changes for 6.10-rc1.
 
 Nothing major here at all, just a small set of changes for some driver
 core apis, and minor fixups.  Included in here are:
   - sysfs_bin_attr_simple_read() helper added and used
   - device_show_string() helper added and used
 All usages of these were acked by the various maintainers.  Also in here
 are:
   - kernfs minor cleanup
   - removed unused functions
   - typo fix in documentation
   - pay attention to sysfs_create_link() failures in module.c finally.
 
 All of these have been in linux-next for a very long time with no
 reported problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZk3+hQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylfTwCfUyHWkDZuZ7ehdtjzfmcd4EKZBK8An3AAV99G
 ox8PXMxuFTaUEdT/69FQ
 =2sEo
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the small set of driver core and kernfs changes for 6.10-rc1.

  Nothing major here at all, just a small set of changes for some driver
  core apis, and minor fixups. Included in here are:

   - sysfs_bin_attr_simple_read() helper added and used

   - device_show_string() helper added and used

  All usages of these were acked by the various maintainers. Also in
  here are:

   - kernfs minor cleanup

   - removed unused functions

   - typo fix in documentation

   - pay attention to sysfs_create_link() failures in module.c finally

  All of these have been in linux-next for a very long time with no
  reported problems"

* tag 'driver-core-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  device property: Fix a typo in the description of device_get_child_node_count()
  kernfs: mount: Remove unnecessary ‘NULL’ values from knparent
  scsi: Use device_show_string() helper for sysfs attributes
  platform/x86: Use device_show_string() helper for sysfs attributes
  perf: Use device_show_string() helper for sysfs attributes
  IB/qib: Use device_show_string() helper for sysfs attributes
  hwmon: Use device_show_string() helper for sysfs attributes
  driver core: Add device_show_string() helper for sysfs attributes
  treewide: Use sysfs_bin_attr_simple_read() helper
  sysfs: Add sysfs_bin_attr_simple_read() helper
  module: don't ignore sysfs_create_link() failures
  driver core: Remove unused platform_notify, platform_notify_remove
2024-05-22 12:13:40 -07:00
Linus Torvalds
0e22bedd75 overlayfs update for 6.10
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCZk3w7AAKCRDh3BK/laaZ
 PFlfAQD7puZ3BowTZ6cTCJ0Zg6U9wNMszlYHl3WyDnYDicaiVgD9HTqlv1pbNoFh
 e5hyXI4vgaMZkYfpET0zBhVkirSKEg4=
 =1vRI
 -----END PGP SIGNATURE-----

Merge tag 'ovl-update-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs

Pull overlayfs updates from Miklos Szeredi:

 - Add tmpfile support

 - Clean up include

* tag 'ovl-update-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
  ovl: remove duplicate included header
  ovl: remove upper umask handling from ovl_create_upper()
  ovl: implement tmpfile
2024-05-22 09:23:18 -07:00
Linus Torvalds
4f2d34b65b fuse update for 6.10
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCZk231AAKCRDh3BK/laaZ
 PKarAP9yWQ6HfqiWdMlW5gO9FLpm0Cbey6DAs1cisAdw86uLbwEAqpYKPXIcOaHX
 IIzCu5R5tbwNHUxADtzznC9r9o/ZHQY=
 =KiAB
 -----END PGP SIGNATURE-----

Merge tag 'fuse-update-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse

Pull fuse updates from Miklos Szeredi:

 - Add fs-verity support (Richard Fung)

 - Add multi-queue support to virtio-fs (Peter-Jan Gootzen)

 - Fix a bug in NOTIFY_RESEND handling (Hou Tao)

 - page -> folio cleanup (Matthew Wilcox)

* tag 'fuse-update-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  virtio-fs: add multi-queue support
  virtio-fs: limit number of request queues
  fuse: clear FR_SENT when re-adding requests into pending list
  fuse: set FR_PENDING atomically in fuse_resend()
  fuse: Add initial support for fs-verity
  fuse: Convert fuse_readpages_end() to use folio_end_read()
2024-05-22 09:18:51 -07:00
Yafang Shao
681ce86235 vfs: Delete the associated dentry when deleting a file
Our applications, built on Elasticsearch[0], frequently create and
delete files.  These applications operate within containers, some with a
memory limit exceeding 100GB.  Over prolonged periods, the accumulation
of negative dentries within these containers can amount to tens of
gigabytes.

Upon container exit, directories are deleted.  However, due to the
numerous associated dentries, this process can be time-consuming.  Our
users have expressed frustration with this prolonged exit duration,
which constitutes our first issue.

Simultaneously, other processes may attempt to access the parent
directory of the Elasticsearch directories.  Since the task responsible
for deleting the dentries holds the inode lock, processes attempting
directory lookup experience significant delays.  This issue, our second
problem, is easily demonstrated:

  - Task 1 generates negative dentries:
  $ pwd
  ~/test
  $ mkdir es && cd es/ && ./create_and_delete_files.sh

  [ After generating tens of GB dentries ]

  $ cd ~/test && rm -rf es

  [ It will take a long duration to finish ]

  - Task 2 attempts to lookup the 'test/' directory
  $ pwd
  ~/test
  $ ls

  The 'ls' command in Task 2 experiences prolonged execution as Task 1
  is deleting the dentries.

We've devised a solution to address both issues by deleting associated
dentry when removing a file.  Interestingly, we've noted that a similar
patch was proposed years ago[1], although it was rejected citing the
absence of tangible issues caused by negative dentries.  Given our
current challenges, we're resubmitting the proposal.  All relevant
stakeholders from previous discussions have been included for reference.

Some alternative solutions are also under discussion[2][3], such as
shrinking child dentries outside of the parent inode lock or even
asynchronously shrinking child dentries.  However, given the
straightforward nature of the current solution, I believe this approach
is still necessary.

[ NOTE! This is a pretty fundamental change in how we deal with
  unlinking dentries, and it doesn't change the fact that you can have
  lots of negative dentries from just doing negative lookups.

  But the kernel test robot is at least initially happy with this from a
  performance angle, so I'm applying this ASAP just to get more testing
  and as a "known fix for an issue people hit in real life".

  Put another way: we should still look at the alternatives, and this
  patch may get reverted if somebody finds a performance regression on
  some other load.       - Linus ]

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://github.com/elastic/elasticsearch [0]
Link: https://patchwork.kernel.org/project/linux-fsdevel/patch/1502099673-31620-1-git-send-email-wangkai86@huawei.com [1]
Link: https://lore.kernel.org/linux-fsdevel/20240511200240.6354-2-torvalds@linux-foundation.org/ [2]
Link: https://lore.kernel.org/linux-fsdevel/CAHk-=wjEMf8Du4UFzxuToGDnF3yLaMcrYeyNAaH1NJWa6fwcNQ@mail.gmail.com/ [3]
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Waiman Long <longman@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Wangkai <wangkai86@huawei.com>
Cc: Colin Walters <walters@verbum.org>
Tested-by: kernel test robot <oliver.sang@intel.com>
Link: https://lore.kernel.org/all/202405221518.ecea2810-oliver.sang@intel.com/
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-05-22 08:49:13 -07:00
Krzysztof Kozlowski
bc21020f61 fuse: virtio: drop owner assignment
virtio core already sets the .owner, so driver does not need to.

Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>

Message-Id: <20240331-module-owner-virtio-v2-24-98f04bfaf46a@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2024-05-22 08:31:18 -04:00
Mike Christie
240a1853b4 kernel: Remove signal hacks for vhost_tasks
This removes the signal/coredump hacks added for vhost_tasks in:

Commit f9010dbdce ("fork, vhost: Use CLONE_THREAD to fix freezer/ps regression")

When that patch was added vhost_tasks did not handle SIGKILL and would
try to ignore/clear the signal and continue on until the device's close
function was called. In the previous patches vhost_tasks and the vhost
drivers were converted to support SIGKILL by cleaning themselves up and
exiting. The hacks are no longer needed so this removes them.

Signed-off-by: Mike Christie <michael.christie@oracle.com>
Message-Id: <20240316004707.45557-10-michael.christie@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-05-22 08:31:15 -04:00
Linus Torvalds
b6394d6f71 Assorted commits that had missed the last merge window...
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCZkzp/gAKCRBZ7Krx/gZQ
 63KFAQCsKv3XdcF+2BO+QuwPvR6eAvDxFjrFEcQFyyOXgFVLaAD/UMM0HcEFWxBb
 PCPvyKVP22wF9PbodkrKJn8DRdtRZwM=
 =jvWv
 -----END PGP SIGNATURE-----

Merge tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull misc vfs updates from Al Viro:
 "Assorted commits that had missed the last merge window..."

* tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  remove call_{read,write}_iter() functions
  do_dentry_open(): kill inode argument
  kernel_file_open(): get rid of inode argument
  get_file_rcu(): no need to check for NULL separately
  fd_is_open(): move to fs/file.c
  close_on_exec(): pass files_struct instead of fdtable
2024-05-21 13:11:44 -07:00
Linus Torvalds
38da32ee70 bd_inode series
Replacement of bdev->bd_inode with sane(r) set of primitives.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCZkwjlgAKCRBZ7Krx/gZQ
 66OmAP9nhZLASn/iM2+979I6O0GW+vid+uLh48uW3d+LbsmVIgD9GYpR+cuLQ/xj
 mJESWfYKOVSpFFSrqlzKg9PQlU/GFgs=
 =6LRp
 -----END PGP SIGNATURE-----

Merge tag 'pull-bd_inode-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull bdev bd_inode updates from Al Viro:
 "Replacement of bdev->bd_inode with sane(r) set of primitives by me and
  Yu Kuai"

* tag 'pull-bd_inode-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  RIP ->bd_inode
  dasd_format(): killing the last remaining user of ->bd_inode
  nilfs_attach_log_writer(): use ->bd_mapping->host instead of ->bd_inode
  block/bdev.c: use the knowledge of inode/bdev coallocation
  gfs2: more obvious initializations of mapping->host
  fs/buffer.c: massage the remaining users of ->bd_inode to ->bd_mapping
  blk_ioctl_{discard,zeroout}(): we only want ->bd_inode->i_mapping here...
  grow_dev_folio(): we only want ->bd_inode->i_mapping there
  use ->bd_mapping instead of ->bd_inode->i_mapping
  block_device: add a pointer to struct address_space (page cache of bdev)
  missing helpers: bdev_unhash(), bdev_drop()
  block: move two helpers into bdev.c
  block2mtd: prevent direct access of bd_inode
  dm-vdo: use bdev_nr_bytes(bdev) instead of i_size_read(bdev->bd_inode)
  blkdev_write_iter(): saner way to get inode and bdev
  bcachefs: remove dead function bdev_sectors()
  ext4: remove block_device_ejected()
  erofs_buf: store address_space instead of inode
  erofs: switch erofs_bread() to passing offset instead of block number
2024-05-21 09:51:42 -07:00
Linus Torvalds
5ad8b6ad9a getting rid of bogus set_blocksize() uses, switching it
to struct file * and verifying that caller has device
 opened exclusively.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCZkwkfQAKCRBZ7Krx/gZQ
 62C3AQDW5vuXNx2+KDPma5YStjFpPLC0xtSyAS5D3YANjtyRFgD/TOcCarq7rvBt
 KubxHVFsfW+eu6ASeaoMRB83w5OIzwk=
 =Liix
 -----END PGP SIGNATURE-----

Merge tag 'pull-set_blocksize' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull vfs blocksize updates from Al Viro:
 "This gets rid of bogus set_blocksize() uses, switches it over
  to be based on a 'struct file *' and verifies that the caller
  has the device opened exclusively"

* tag 'pull-set_blocksize' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  make set_blocksize() fail unless block device is opened exclusive
  set_blocksize(): switch to passing struct file *
  btrfs_get_bdev_and_sb(): call set_blocksize() only for exclusive opens
  swsusp: don't bother with setting block size
  zram: don't bother with reopening - just use O_EXCL for open
  swapon(2): open swap with O_EXCL
  swapon(2)/swapoff(2): don't bother with block size
  pktcdvd: sort set_blocksize() calls out
  bcache_register(): don't bother with set_blocksize()
2024-05-21 08:34:51 -07:00
Linus Torvalds
db3d841ac9 fs/pidfs: make 'lsof' happy with our inode changes
pidfs started using much saner inodes in commit b28ddcc32d ("pidfs:
convert to path_from_stashed() helper"), but that exposed the fact that
lsof had some knowledge of just how odd our old anon_inode usage was.

For example, legacy anon_inodes hadn't even initialized the inode type
in the inode mode, so everything had a type of zero.

So sane tools like 'stat' would report these files as "weird file", but
'lsof' instead used that (together with the name of the link in proc) to
notice that it's an anonymous inode, and used it to detect pidfd files.

Let's keep our internal new sane inode model, but mask the file type
bits at 'stat()' time in the getattr() function we already have, and by
making the dentry name match what lsof expects too.

This keeps our internal models sane, but should make user space see the
same old odd behavior.

Reported-by: Jiri Slaby <jirislaby@kernel.org>
Link: https://lore.kernel.org/all/a15b1050-4b52-4740-a122-a4d055c17f11@kernel.org/
Link: https://github.com/lsof-org/lsof/issues/317
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Seth Forshee <sforshee@kernel.org>
Cc: Tycho Andersen <tycho@tycho.pizza>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-05-21 08:08:00 -07:00
Sergey Shtylyov
3c0a2e0b0a nfs: fix undefined behavior in nfs_block_bits()
Shifting *signed int* typed constant 1 left by 31 bits causes undefined
behavior. Specify the correct *unsigned long* type by using 1UL instead.

Found by Linux Verification Center (linuxtesting.org) with the Svace static
analysis tool.

Cc: stable@vger.kernel.org
Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2024-05-21 08:34:15 -04:00
Olga Kornievskaia
a01b077a87 pNFS: rework pnfs_generic_pg_check_layout to check IO range
All callers of pnfs_generic_pg_check_layout() also want to do a call to
check that the layout's range covers the IO range. Merge the functionality
of the pnfs_generic_pg_check_range() into that of
pnfs_generic_pg_check_layout().

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2024-05-21 08:34:15 -04:00
Olga Kornievskaia
523412b904 pNFS/filelayout: check layout segment range
Before doing the IO, check that we have the layout covering the range of
IO.

Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2024-05-21 08:34:15 -04:00
Olga Kornievskaia
3ebcb24646 pNFS/filelayout: fixup pNfs allocation modes
Change left over allocation flags.

Fixes: a245832aaa ("pNFS/files: Ensure pNFS allocation modes are consistent with nfsiod")
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2024-05-21 08:34:15 -04:00
Linus Torvalds
72ece20127 f2fs update for 6.10-rc1
In this round, we've tried to address some performance issues on zoned storage
 such as direct IO and write_hints. In addition, we've migrated some IO paths
 using folio. Meanwhile, there are multiple bug fixes in the compression paths,
 sanity check conditions, and error handlers.
 
 Enhancement:
  - allow direct io of pinned files for zoned storage
  - assign the write hint per stream by default
  - convert read paths and test_writeback to folio
  - avoid allocating WARM_DATA segment for direct IO
 
 Bug fix:
  - fix false alarm on invalid block address
  - fix to add missing iput() in gc_data_segment()
  - fix to release node block count in error path of f2fs_new_node_page()
  - compress: don't allow unaligned truncation on released compress inode
  - compress: fix to cover {reserve,release}_compress_blocks() w/ cp_rwsem lock
  - compress: fix error path of inc_valid_block_count()
  - compress: fix to update i_compr_blocks correctly
  - fix block migration when section is not aligned to pow2
  - don't trigger OPU on pinfile for direct IO
  - fix to do sanity check on i_xattr_nid in sanity_check_inode()
  - write missing last sum blk of file pinning section
  - clear writeback when compression failed
  - fix to adjust appropirate defragment pg_end
 
 As usual, there are several minor code clean-ups, and fixes to manage missing
 corner cases in the error paths.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE00UqedjCtOrGVvQiQBSofoJIUNIFAmZLpYcACgkQQBSofoJI
 UNJQTw/+NaY7a1EgkMUpBAzxrJMKHcuBtyG42QKqgk6new0XejQGjPHojL2nPrw/
 t5G9TsbZbkHNMuhAkkTZMH+DFg92QYhByJlq79fxzya0XyGH4OaY1i4u67FLu0Qz
 PS/UKRkEI2B9lH+bGwa//XNMDSnzcao46bNi1SFbCNPGzU1cS35uOy/YgAdFlqTM
 WKJmM/AcNir4xtL30tBCVU//0OTtzT8+5YFVyPTeFR4WACsF6eTJAre9938xw1Ef
 p6ed6Wl2GYehqgFrAdAF07veZ1hVDSRAAB/1Mu1WKnNp57VBRjJW3DFDyApf+fIe
 2KJIDJd9/ece3dycuiZP/LXPV0sODqOI1/5s9RbFVq/QAhTSME5xq8hNXTejdl28
 PV6M2tKcTKMRpykppQg/K/N9PaO5Q6oFz0xlrOsrGoAhT1YnZfJi/DmzCZCCwYxW
 jyZor/r+849yDDdjhB94ZaByvj5S3OVqgsaunnbMBcGy+DDe0rUMXvRzVK4gTcCF
 lSTSp895BggWXLyPuXVNTjC4GIbzVbEDaHILPicfbqi0h5OCXG8YybKHiRs+ss6z
 ZrKJQxSVVvhjyHTVcBhb/Nc1s7Fm7DkX+KjV9GV3gwzB+AlVIgPlwyMTc2fZp3ST
 dUbmBR5+g4UUz2v4v4ZStAGy9eUFktO89u/roet8/74ppklj73E=
 =3mwj
 -----END PGP SIGNATURE-----

Merge tag 'f2fs-for-6.10.rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
 "In this round, we've tried to address some performance issues on zoned
  storage such as direct IO and write_hints. In addition, we've migrated
  some IO paths using folio. Meanwhile, there are multiple bug fixes in
  the compression paths, sanity check conditions, and error handlers.

  Enhancements:
   - allow direct io of pinned files for zoned storage
   - assign the write hint per stream by default
   - convert read paths and test_writeback to folio
   - avoid allocating WARM_DATA segment for direct IO

  Bug fixes:
   - fix false alarm on invalid block address
   - fix to add missing iput() in gc_data_segment()
   - fix to release node block count in error path of
     f2fs_new_node_page()
   - compress:
       - don't allow unaligned truncation on released compress inode
       - cover {reserve,release}_compress_blocks() w/ cp_rwsem lock
       - fix error path of inc_valid_block_count()
       - fix to update i_compr_blocks correctly
   - fix block migration when section is not aligned to pow2
   - don't trigger OPU on pinfile for direct IO
   - fix to do sanity check on i_xattr_nid in sanity_check_inode()
   - write missing last sum blk of file pinning section
   - clear writeback when compression failed
   - fix to adjust appropirate defragment pg_end

  As usual, there are several minor code clean-ups, and fixes to manage
  missing corner cases in the error paths"

* tag 'f2fs-for-6.10.rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (50 commits)
  f2fs: initialize last_block_in_bio variable
  f2fs: Add inline to f2fs_build_fault_attr() stub
  f2fs: fix some ambiguous comments
  f2fs: fix to add missing iput() in gc_data_segment()
  f2fs: allow dirty sections with zero valid block for checkpoint disabled
  f2fs: compress: don't allow unaligned truncation on released compress inode
  f2fs: fix to release node block count in error path of f2fs_new_node_page()
  f2fs: compress: fix to cover {reserve,release}_compress_blocks() w/ cp_rwsem lock
  f2fs: compress: fix error path of inc_valid_block_count()
  f2fs: compress: fix typo in f2fs_reserve_compress_blocks()
  f2fs: compress: fix to update i_compr_blocks correctly
  f2fs: check validation of fault attrs in f2fs_build_fault_attr()
  f2fs: fix to limit gc_pin_file_threshold
  f2fs: remove unused GC_FAILURE_PIN
  f2fs: use f2fs_{err,info}_ratelimited() for cleanup
  f2fs: fix block migration when section is not aligned to pow2
  f2fs: zone: fix to don't trigger OPU on pinfile for direct IO
  f2fs: fix to do sanity check on i_xattr_nid in sanity_check_inode()
  f2fs: fix to avoid allocating WARM_DATA segment for direct IO
  f2fs: remove redundant parameter in is_next_segment_free()
  ...
2024-05-20 13:23:43 -07:00
Linus Torvalds
119d1b8a5d New code for 6.10:
* Introduce Parent Pointer extended attribute for inodes.
 
   * Online Repair
     - Implement atomic file content exchanges i.e. exchange ranges of bytes
       between two files atomically.
     - Create temporary files to repair file-based metadata. This uses atomic
       file content exchange facility to swap file fork mappings between the
       temporary file and the metadata inode.
 
     - Allow callers of directory/xattr code to set an explicit owner number to
       be written into the header fields of any new blocks that are created.
       This is required to avoid walking every block of the new structure and
       modify their ownership during online repair.
     - Repair
       - Extended attributes
       - Inode unlinked state
       - Directories
       - Symbolic links
       - AGI's unlinked inode list.
       - Parent pointers.
     - Move Orphan files to lost and found directory.
     - Fixes for Inode repair functionality.
     - Introduce a new sub-AG FITRIM implementation to reduce the duration for
       which the AGF lock is held.
     - Updates for the design documentation.
     - Use Parent Pointers to assist in checking directories, parent pointers,
       extended attributes, and link counts.
 
   * Bring back delalloc support for realtime devices which have an extent size
     that is equal to filesystem's block size.
 
   * Improve performance of log incompat feature handling.
 
   * Fixes
     - Prevent userspace from reading invalid file data due to incorrect.
       updation of file size when performing a non-atomic clone operation.
     - Minor fixes to online repair.
     - Fix confusing return values from xfs_bmapi_write().
     - Fix an out of bounds access due to incorrect h_size during log recovery.
     - Defer upgrading the extent counters in xfs_reflink_end_cow_extent() until
       we know we are going to modify the extent mapping.
     - Remove racy access to if_bytes check in xfs_reflink_end_cow_extent().
     - Fix sparse warnings.
 
   * Cleanups
     - Hold inode locks on all files involved in a rename until the completion
       of the operation. This is in preparation for the parent pointers patchset
       where parent pointers are applied in a separate chained update from the
       actual directory update.
     - Compile out v4 support when disabled.
     - Cleanup xfs_extent_busy_clear().
     - Remove unused flags and fields from struct xfs_da_args.
     - Remove definitions of unused functions.
     - Improve extended attribute validation.
     - Add higher level directory operations helpers to remove duplication of
       code.
     - Cleanup quota (un)reservation interfaces.
 
 Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQjMC4mbgVeU7MxEIYH7y4RirJu9AUCZjZC0wAKCRAH7y4RirJu
 9HsCAPoCQvmPefDv56aMb5JEQNpv9dPz2Djj14hqLytQs5P/twD+LF5NhJgQNDUo
 Lwnb0tmkAhmG9Y4CCiN1FwSj1rq59gE=
 =2hXB
 -----END PGP SIGNATURE-----

Merge tag 'xfs-6.10-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs updates from Chandan Babu:
 "Online repair feature continues to be expanded. Also, we now support
  delayed allocation for realtime devices which have an extent size that
  is equal to filesystem's block size.

  New code:

   - Introduce Parent Pointer extended attribute for inodes

   - Bring back delalloc support for realtime devices which have an
     extent size that is equal to filesystem's block size

   - Improve performance of log incompat feature handling

  Online Repair:

   - Implement atomic file content exchanges i.e. exchange ranges of
     bytes between two files atomically

   - Create temporary files to repair file-based metadata. This uses
     atomic file content exchange facility to swap file fork mappings
     between the temporary file and the metadata inode

   - Allow callers of directory/xattr code to set an explicit owner
     number to be written into the header fields of any new blocks that
     are created. This is required to avoid walking every block of the
     new structure and modify their ownership during online repair

   - Repair more data structures:
       - Extended attributes
       - Inode unlinked state
       - Directories
       - Symbolic links
       - AGI's unlinked inode list
       - Parent pointers

   - Move Orphan files to lost and found directory

   - Fixes for Inode repair functionality

   - Introduce a new sub-AG FITRIM implementation to reduce the duration
     for which the AGF lock is held

   - Updates for the design documentation

   - Use Parent Pointers to assist in checking directories, parent
     pointers, extended attributes, and link counts

  Fixes:

   - Prevent userspace from reading invalid file data due to incorrect.
     updation of file size when performing a non-atomic clone operation

   - Minor fixes to online repair

   - Fix confusing return values from xfs_bmapi_write()

   - Fix an out of bounds access due to incorrect h_size during log
     recovery

   - Defer upgrading the extent counters in xfs_reflink_end_cow_extent()
     until we know we are going to modify the extent mapping

   - Remove racy access to if_bytes check in
     xfs_reflink_end_cow_extent()

   - Fix sparse warnings

  Cleanups:

   - Hold inode locks on all files involved in a rename until the
     completion of the operation. This is in preparation for the parent
     pointers patchset where parent pointers are applied in a separate
     chained update from the actual directory update

   - Compile out v4 support when disabled

   - Cleanup xfs_extent_busy_clear()

   - Remove unused flags and fields from struct xfs_da_args

   - Remove definitions of unused functions

   - Improve extended attribute validation

   - Add higher level directory operations helpers to remove duplication
     of code

   - Cleanup quota (un)reservation interfaces"

* tag 'xfs-6.10-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (221 commits)
  xfs: simplify iext overflow checking and upgrade
  xfs: remove a racy if_bytes check in xfs_reflink_end_cow_extent
  xfs: upgrade the extent counters in xfs_reflink_end_cow_extent later
  xfs: xfs_quota_unreserve_blkres can't fail
  xfs: consolidate the xfs_quota_reserve_blkres definitions
  xfs: clean up buffer allocation in xlog_do_recovery_pass
  xfs: fix log recovery buffer allocation for the legacy h_size fixup
  xfs: widen flags argument to the xfs_iflags_* helpers
  xfs: minor cleanups of xfs_attr3_rmt_blocks
  xfs: create a helper to compute the blockcount of a max sized remote value
  xfs: turn XFS_ATTR3_RMT_BUF_SPACE into a function
  xfs: use unsigned ints for non-negative quantities in xfs_attr_remote.c
  xfs: do not allocate the entire delalloc extent in xfs_bmapi_write
  xfs: fix xfs_bmap_add_extent_delay_real for partial conversions
  xfs: remove the xfs_iext_peek_prev_extent call in xfs_bmapi_allocate
  xfs: pass the actual offset and len to allocate to xfs_bmapi_allocate
  xfs: don't open code XFS_FILBLKS_MIN in xfs_bmapi_write
  xfs: lift a xfs_valid_startblock into xfs_bmapi_allocate
  xfs: remove the unusued tmp_logflags variable in xfs_bmapi_allocate
  xfs: fix error returns from xfs_bmapi_write
  ...
2024-05-20 12:55:12 -07:00
Linus Torvalds
bb6b206216 \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmZLMxUACgkQnJ2qBz9k
 QNnnXAgA1MfUPm6b4AE3y3EEWUSTQd2THQSGUg/ZLeJW3zE5nNaW74BPYKTSTreY
 Bmx0QVrgzJX9EJe2UmFsnPiPEzznn1RIDdPwCQEZpjEt+/iX3/+5+z+aDK57mWpJ
 5Lxzv/Ji1iNlYzdDti8MIc9edj923A7JpQQQ7Hz6ldmuc5EHXLrcTTzytPIDU5cp
 6RKY7W0sntslTjKvVkZaEqugPtQpe064Sq7a0jtjLz2r+tyxDXakWNGrdQIYDTLR
 UGwsFFZL5JbnR12oCMKJ3Vh3YjA5gmyxlbHCghDtGkXrSS1mseumJYsRFazE7y+8
 Fp1fbObLe1z22N742r1aNE8vtXLGUw==
 =nUBh
 -----END PGP SIGNATURE-----

Merge tag 'fs_for_v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull isofs, udf, quota, ext2, and reiserfs updates from Jan Kara:

 - convert isofs to the new mount API

 - cleanup isofs Makefile

 - udf conversion to folios

 - some other small udf cleanups and fixes

 - ext2 cleanups

 - removal of reiserfs .writepage method

 - update reiserfs README file

* tag 'fs_for_v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  isofs: Use *-y instead of *-objs in Makefile
  ext2: Remove LEGACY_DIRECT_IO dependency
  isofs: Remove calls to set/clear the error flag
  ext2: Remove call to folio_set_error()
  udf: Use a folio in udf_write_end()
  udf: Convert udf_page_mkwrite() to use a folio
  udf: Convert udf_symlink_getattr() to use a folio
  udf: Convert udf_adinicb_readpage() to udf_adinicb_read_folio()
  udf: Convert udf_expand_file_adinicb() to use a folio
  udf: Convert udf_write_begin() to use a folio
  udf: Convert udf_symlink_filler() to use a folio
  reiserfs: Trim some README bits
  quota: fix to propagate error of mark_dquot_dirty() to caller
  reiserfs: Convert to writepages
  udf: udftime: prevent overflow in udf_disk_stamp_to_time()
  ext2: set FMODE_CAN_ODIRECT instead of a dummy direct_IO method
  udf: replace deprecated strncpy/strcpy with strscpy
  udf: Remove second semicolon
  isofs: convert isofs to use the new mount API
  fs: quota: use group allocation of per-cpu counters API
2024-05-20 12:49:25 -07:00
Linus Torvalds
d0e71e23ec Revert "fanotify: remove unneeded sub-zero check for unsigned value"
This reverts commit e659522446.

These kinds of patches are only making the code worse.

Compilers don't care about the unnecessary check, but removing it makes
the code less obvious to a human.  The declaration of 'len' is more than
80 lines earlier, so a human won't easily see that 'len' is of an
unsigned type, so to a human the range check that checks against zero is
much more explicit and obvious.

Any tool that complains about a range check like this just because the
variable is unsigned is actively detrimental, and should be ignored.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-05-20 12:43:58 -07:00
Linus Torvalds
5af9d1cf39 \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmZLJS0ACgkQnJ2qBz9k
 QNmFlggAlIg5oDZfOhJur6h3Icldrl2DsnKer0CAP7TFK+GfkFTEb25paoydBEu4
 Y0VzZ3n3EqhmsJ8P515k1UPPPXlqqZwSRWGAek0FDhQCXhqEYxiWwf9U343hJNBS
 rya4Rnwc1pxqmJU2hrY5R5kEbugUFAIL+qNXzhhLpWonYiy/ya7P5n/qz5F5HJH2
 FufRRaPHcHFfk1u0+PvFrk019AS9C6Y3bkcUGtbpdwmFsuN3D4HKuLEkr1+C9Apb
 NmkoAwCiSobQhAxGDr6Szqu6r1VCuM+n/O9fqLknnL9u0jm95AmGdIMOdQ/ofx6d
 xn3mfRp8gUbPD8PubHhQsMjCmSjGwg==
 =kwWW
 -----END PGP SIGNATURE-----

Merge tag 'fsnotify_for_v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull fsnotify updates from Jan Kara:

 - reduce overhead of fsnotify infrastructure when no permission events
   are in use

 - a few small cleanups

* tag 'fsnotify_for_v6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fsnotify: fix UAF from FS_ERROR event on a shutting down filesystem
  fsnotify: optimize the case of no permission event watchers
  fsnotify: use an enum for group priority constants
  fsnotify: move s_fsnotify_connectors into fsnotify_sb_info
  fsnotify: lazy attach fsnotify_sb_info state to sb
  fsnotify: create helper fsnotify_update_sb_watchers()
  fsnotify: pass object pointer and type to fsnotify mark helpers
  fanotify: merge two checks regarding add of ignore mark
  fsnotify: create a wrapper fsnotify_find_inode_mark()
  fsnotify: create helpers to get sb and connp from object
  fsnotify: rename fsnotify_{get,put}_sb_connectors()
  fsnotify: Avoid -Wflex-array-member-not-at-end warning
  fanotify: remove unneeded sub-zero check for unsigned value
2024-05-20 12:31:43 -07:00
Anna Schumaker
d1404e46ae NFS: Don't enable NFS v2 by default
This came up during one of the Bake-a-thon discussions. NFS v2 support
was dropped from nfs-utils/mount.nfs in December 2021. Let's turn it
off by default in the kernel too, since this means there isn't a way
to mount and test it.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Reviewed-by: Jeffrey Layton <jlayton@kernel.org>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2024-05-20 11:37:15 -04:00
Anna Schumaker
f06d1b10cb NFS: Fix READ_PLUS when server doesn't support OP_READ_PLUS
Olga showed me a case where the client was sending multiple READ_PLUS
calls to the server in parallel, and the server replied
NFS4ERR_OPNOTSUPP to each. The client would fall back to READ for the
first reply, but fail to retry the other calls.

I fix this by removing the test for NFS_CAP_READ_PLUS in
nfs4_read_plus_not_supported(). This allows us to reschedule any
READ_PLUS call that has a NFS4ERR_OPNOTSUPP return value, even after the
capability has been cleared.

Reported-by: Olga Kornievskaia <kolga@netapp.com>
Fixes: c567552612 ("NFS: Add READ_PLUS data segment support")
Cc: stable@vger.kernel.org # v5.10+
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2024-05-20 11:37:15 -04:00
Martin Kaiser
b322bf9e98 nfs: keep server info for remounts
With newer kernels that use fs_context for nfs mounts, remounts fail with
-EINVAL.

$ mount -t nfs -o nolock 10.0.0.1:/tmp/test /mnt/test/
$ mount -t nfs -o remount /mnt/test/
mount: mounting 10.0.0.1:/tmp/test on /mnt/test failed: Invalid argument

For remounts, the nfs server address and port are populated by
nfs_init_fs_context and later overwritten with 0x00 bytes by
nfs23_parse_monolithic. The remount then fails as the server address is
invalid.

Fix this by not overwriting nfs server info in nfs23_parse_monolithic if
we're doing a remount.

Fixes: f2aedb713c ("NFS: Add fs_context support.")
Signed-off-by: Martin Kaiser <martin@kaiser.cx>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2024-05-20 11:37:15 -04:00
Benjamin Coddington
37ffe06537 NFSv4: Fixup smatch warning for ambiguous return
Dan Carpenter reports smatch warning for nfs4_try_migration() when a memory
allocation failure results in a zero return value.  In this case, a
transient allocation failure error will likely be retried the next time the
server responds with NFS4ERR_MOVED.

We can fixup the smatch warning with a small refactor: attempt all three
allocations before testing and returning on a failure.

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Fixes: c3ed222745 ("NFSv4: Fix free of uninitialized nfs4_label on referral lookup.")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2024-05-20 11:12:59 -04:00
Chen Hanxiao
bf95f82e6a NFS: make sure lock/nolock overriding local_lock mount option
Currently, mount option lock/nolock and local_lock option
may override NFS_MOUNT_LOCAL_FLOCK NFS_MOUNT_LOCAL_FCNTL flags
when passing in different order:

mount -o vers=3,local_lock=all,lock:
	local_lock=none

mount -o vers=3,lock,local_lock=all:
	local_lock=all

This patch will let lock/nolock override local_lock option
as nfs(5) suggested.

Signed-off-by: Chen Hanxiao <chenhx.fnst@fujitsu.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2024-05-20 11:10:27 -04:00
NeilBrown
7c6c5249f0 NFS: add atomic_open for NFSv3 to handle O_TRUNC correctly.
With two clients, each with NFSv3 mounts of the same directory, the sequence:

   client1            client2
  ls -l afile
                      echo hello there > afile
  echo HELLO > afile
  cat afile

will show
   HELLO
   there

because the O_TRUNC requested in the final 'echo' doesn't take effect.
This is because the "Negative dentry, just create a file" section in
lookup_open() assumes that the file *does* get created since the dentry
was negative, so it sets FMODE_CREATED, and this causes do_open() to
clear O_TRUNC and so the file doesn't get truncated.

Even mounting with -o lookupcache=none does not help as
nfs_neg_need_reval() always returns false if LOOKUP_CREATE is set.

This patch fixes the problem by providing an atomic_open inode operation
for NFSv3 (and v2).  The code is largely the code from the branch in
lookup_open() when atomic_open is not provided.  The significant change
is that the O_TRUNC flag is passed a new nfs_do_create() which add
'trunc' handling to nfs_create().

With this change we also optimise away an unnecessary LOOKUP before the
file is created.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2024-05-20 11:09:20 -04:00
Anna Schumaker
464b424fb0 pNFS/filelayout: Specify the layout segment range in LAYOUTGET
Move from only requesting full file layout segments to requesting layout
segments that match our I/O size. This means the server is still free to
return a full file layout if it wants, but partial layouts will no
longer cause an error.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2024-05-20 11:06:18 -04:00
Anna Schumaker
9c75576e3b pNFS/filelayout: Remove the whole file layout requirement
Layout segments have been supported in pNFS for years, so remove the
requirement that the server always sends whole file layouts.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
2024-05-20 11:06:04 -04:00
Ryusuke Konishi
db3e24a02e nilfs2: make block erasure safe in nilfs_finish_roll_forward()
The implementation of writing a zero-fill block in
nilfs_finish_roll_forward() is not safe.  The buffer is being cleared
without acquiring a lock or setting the uptodate flag, so theoretically,
between the time the buffer's data is cleared and the time it is written
back to the block device using sync_dirty_buffer(), that zero data can be
undone by concurrent block device reads.

Since this buffer points to a location that has been read from disk once,
the uptodate flag will most likely remain, but since it was obtained with
__getblk(), that is not guaranteed.  In other words, this is exceptional,
and this function itself is not normally called (only once when mounting
after a specific pattern of unclean shutdown), so it is highly unlikely
that this will actually cause a problem.

Anyway, eliminate this potential race issue by protecting the clearing of
buffer data with a buffer lock and setting the buffer's uptodate flag
within the protected section.

Link: https://lkml.kernel.org/r/20240511002942.9608-1-konishi.ryusuke@gmail.com
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-19 14:36:21 -07:00
Linus Torvalds
eb6a9339ef Mainly singleton patches, documented in their respective changelogs.
Notable series include:
 
 - Some maintenance and performance work for ocfs2 in Heming Zhao's
   series "improve write IO performance when fragmentation is high".
 
 - Some ocfs2 bugfixes from Su Yue in the series "ocfs2 bugs fixes
   exposed by fstests".
 
 - kfifo header rework from Andy Shevchenko in the series "kfifo: Clean
   up kfifo.h".
 
 - GDB script fixes from Florian Rommel in the series "scripts/gdb: Fixes
   for $lx_current and $lx_per_cpu".
 
 - After much discussion, a coding-style update from Barry Song
   explaining one reason why inline functions are preferred over macros.
   The series is "codingstyle: avoid unused parameters for a function-like
   macro".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZkpLYQAKCRDdBJ7gKXxA
 jo9NAQDctSD3TMXqxqCHLaEpCaYTYzi6TGAVHjgkqGzOt7tYjAD/ZIzgcmRwthjP
 R7SSiSgZ7UnP9JRn16DQILmFeaoG1gs=
 =lYhr
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2024-05-19-11-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-mm updates from Andrew Morton:
 "Mainly singleton patches, documented in their respective changelogs.
  Notable series include:

   - Some maintenance and performance work for ocfs2 in Heming Zhao's
     series "improve write IO performance when fragmentation is high".

   - Some ocfs2 bugfixes from Su Yue in the series "ocfs2 bugs fixes
     exposed by fstests".

   - kfifo header rework from Andy Shevchenko in the series "kfifo:
     Clean up kfifo.h".

   - GDB script fixes from Florian Rommel in the series "scripts/gdb:
     Fixes for $lx_current and $lx_per_cpu".

   - After much discussion, a coding-style update from Barry Song
     explaining one reason why inline functions are preferred over
     macros. The series is "codingstyle: avoid unused parameters for a
     function-like macro""

* tag 'mm-nonmm-stable-2024-05-19-11-56' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (62 commits)
  fs/proc: fix softlockup in __read_vmcore
  nilfs2: convert BUG_ON() in nilfs_finish_roll_forward() to WARN_ON()
  scripts: checkpatch: check unused parameters for function-like macro
  Documentation: coding-style: ask function-like macros to evaluate parameters
  nilfs2: use __field_struct() for a bitwise field
  selftests/kcmp: remove unused open mode
  nilfs2: remove calls to folio_set_error() and folio_clear_error()
  kernel/watchdog_perf.c: tidy up kerneldoc
  watchdog: allow nmi watchdog to use raw perf event
  watchdog: handle comma separated nmi_watchdog command line
  nilfs2: make superblock data array index computation sparse friendly
  squashfs: remove calls to set the folio error flag
  squashfs: convert squashfs_symlink_read_folio to use folio APIs
  scripts/gdb: fix detection of current CPU in KGDB
  scripts/gdb: make get_thread_info accept pointers
  scripts/gdb: fix parameter handling in $lx_per_cpu
  scripts/gdb: fix failing KGDB detection during probe
  kfifo: don't use "proxy" headers
  media: stih-cec: add missing io.h
  media: rc: add missing io.h
  ...
2024-05-19 14:02:03 -07:00
Linus Torvalds
16dbfae867 bcachefs changes for 6.10-rc1
- More safety fixes, primarily found by syzbot
 
 - Run the upgrade/downgrade paths in nochnages mode. Nochanges mode is
   primarily for testing fsck/recovery in dry run mode, so it shouldn't
   change anything besides disabling writes and holding dirty metadata in
   memory.
 
   The idea here was to reduce the amount of activity if we can't write
   anything out, so that bringing up a filesystem in "super ro" mode
   would be more lilkely to work for data recovery - but norecovery is
   the correct option for this.
 
 - btree_trans->locked; we now track whether a btree_trans has any btree
   nodes locked, and this is used for improved assertions related to
   trans_unlock() and trans_relock(). We'll also be using it for
   improving how we work with lockdep in the future: we don't want
   lockdep to be tracking individual btree node locks because we take too
   many for lockdep to track, and it's not necessary since we have a
   cycle detector.
 
 - Trigger improvements that are prep work for online fsck
 
 - BTREE_TRIGGER_check_repair; this regularizes how we do some repair
   work for extents that goes with running triggers in fsck, and fixes
   some subtle issues with transaction restarts there.
 
 - bch2_snapshot_equiv() has now been ripped out of fsck.c; snapshot
   equivalence classes are for when snapshot deletion leaves behind
   redundant snapshot nodes, but snapshot deletion now cleans this up
   right away, so the abstraction doesn't need to leak.
 
 - Improvements to how we resume writing to the journal in recovery. The
   code for picking the new place to write when reading the journal is
   greatly simplified and we also store the position in the superblock
   for when we don't read the journal; this means that we preserve more
   of the journal for list_journal debugging.
 
 - Improvements to sysfs btree_cache and btree_node_cache, for debugging
   memory reclaim.
 
 - We now detect when we've blocked for 10 seconds on the allocator in
   the write path and dump some useful info.
 
 - Safety fixes for devices references: this is a big series that changes
   almost all device lookups to properly check if the device exists and
   take a reference to it.
 
   Previously we assumed that if a bkey exists that references a device
   then the device must exist, and this was enforced in .invalid methods,
   but this was incorrect because it meant device removal relied on
   accounting being correct to not leave keys pointing to invalid
   devices, and that's not something we can assume.
 
   Getting the "pointer to invalid device" checks out of our .invalid()
   methods fixes some long standing device removal bugs; the only
   outstanding bug with device removal now is a race between the discard
   path and deleting alloc info, which should be easily fixed.
 
 - The allocator now prefers not to expand the new
   member_info.btree_allocated bitmap, meaning if repair ever requires
   scanning for btree nodes (because of a corrupt interior nodes) we
   won't have to scan the whole device(s).
 
 - New coding style document, which among other things talks about the
   correct usage of assertions
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmZKJQgACgkQE6szbY3K
 bnZETg//SU9H0OHnBSMB/cteF6PKo9QR+dhT+3n+gWTxl0o/egbGTqwbzVqGtd2f
 J6II1BsDk8VoTOb/gFfLRShlmJfnj2jpRThU265faR/7LQYeSaqndDPkjOpTayAD
 Nj/DJyiSUTL753rZh3yUhOpOIHf7iapH6wuaZCPfhdfk+yvZNW8iz07JHjHLKRp8
 I2cFH0r6kN916NdRkt9oDCz68WouT8eWTqwcKra04XsLEZjNJHxLpKMq4M8UdPc7
 YynJPVt+aP8+VduGIq6pV8Co3afCP2oUywo11JpRmvLsw4tex/59wxOYtpMfgn6k
 4H+9WqiBwkbmnLDrfFHWRameS6F/7+GRAOVuz9nkmfk61UPU15gLjSRffqZ6u2YC
 7vbrXgebId/sZXtBpQd83RMMX52BnEJah0upNJ54IsSqfDYkU9lwl6CEyYpcX1hf
 YNBGBTbspZztc3AB13b3ow421FMhaySUg0FDmntMR9O8Z6/BXk7Ykc7b8DPEfrFs
 W6JY7q+ARBxr+EgFcV74fvMCf7NJTAhyv80AKryo7NFU2JZOyyaTxcTGSnolX4Mi
 lyHiOgicmOX+vy3vbC1dZoDcmIDJ4Uc0vixYcpKiZqxlR8XJ+wpevC50TEhxrcW+
 ZO4SloQvgyjI34xu/gZgjRYb3BhXK3x+ougVFpRG8V8zQ/+ccWg=
 =MKrF
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-05-19' of https://evilpiepirate.org/git/bcachefs

Pull bcachefs updates from Kent Overstreet:

 - More safety fixes, primarily found by syzbot

 - Run the upgrade/downgrade paths in nochnages mode. Nochanges mode is
   primarily for testing fsck/recovery in dry run mode, so it shouldn't
   change anything besides disabling writes and holding dirty metadata
   in memory.

   The idea here was to reduce the amount of activity if we can't write
   anything out, so that bringing up a filesystem in "super ro" mode
   would be more lilkely to work for data recovery - but norecovery is
   the correct option for this.

 - btree_trans->locked; we now track whether a btree_trans has any btree
   nodes locked, and this is used for improved assertions related to
   trans_unlock() and trans_relock(). We'll also be using it for
   improving how we work with lockdep in the future: we don't want
   lockdep to be tracking individual btree node locks because we take
   too many for lockdep to track, and it's not necessary since we have a
   cycle detector.

 - Trigger improvements that are prep work for online fsck

 - BTREE_TRIGGER_check_repair; this regularizes how we do some repair
   work for extents that goes with running triggers in fsck, and fixes
   some subtle issues with transaction restarts there.

 - bch2_snapshot_equiv() has now been ripped out of fsck.c; snapshot
   equivalence classes are for when snapshot deletion leaves behind
   redundant snapshot nodes, but snapshot deletion now cleans this up
   right away, so the abstraction doesn't need to leak.

 - Improvements to how we resume writing to the journal in recovery. The
   code for picking the new place to write when reading the journal is
   greatly simplified and we also store the position in the superblock
   for when we don't read the journal; this means that we preserve more
   of the journal for list_journal debugging.

 - Improvements to sysfs btree_cache and btree_node_cache, for debugging
   memory reclaim.

 - We now detect when we've blocked for 10 seconds on the allocator in
   the write path and dump some useful info.

 - Safety fixes for devices references: this is a big series that
   changes almost all device lookups to properly check if the device
   exists and take a reference to it.

   Previously we assumed that if a bkey exists that references a device
   then the device must exist, and this was enforced in .invalid
   methods, but this was incorrect because it meant device removal
   relied on accounting being correct to not leave keys pointing to
   invalid devices, and that's not something we can assume.

   Getting the "pointer to invalid device" checks out of our .invalid()
   methods fixes some long standing device removal bugs; the only
   outstanding bug with device removal now is a race between the discard
   path and deleting alloc info, which should be easily fixed.

 - The allocator now prefers not to expand the new
   member_info.btree_allocated bitmap, meaning if repair ever requires
   scanning for btree nodes (because of a corrupt interior nodes) we
   won't have to scan the whole device(s).

 - New coding style document, which among other things talks about the
   correct usage of assertions

* tag 'bcachefs-2024-05-19' of https://evilpiepirate.org/git/bcachefs: (155 commits)
  bcachefs: add no_invalid_checks flag
  bcachefs: add counters for failed shrinker reclaim
  bcachefs: Fix sb_field_downgrade validation
  bcachefs: Plumb bch_validate_flags to sb_field_ops.validate()
  bcachefs: s/bkey_invalid_flags/bch_validate_flags
  bcachefs: fsync() should not return -EROFS
  bcachefs: Invalid devices are now checked for by fsck, not .invalid methods
  bcachefs: kill bch2_dev_bkey_exists() in bch2_check_fix_ptrs()
  bcachefs: kill bch2_dev_bkey_exists() in bch2_read_endio()
  bcachefs: bch2_dev_get_ioref() checks for device not present
  bcachefs: bch2_dev_get_ioref2(); io_read.c
  bcachefs: bch2_dev_get_ioref2(); debug.c
  bcachefs: bch2_dev_get_ioref2(); journal_io.c
  bcachefs: bch2_dev_get_ioref2(); io_write.c
  bcachefs: bch2_dev_get_ioref2(); btree_io.c
  bcachefs: bch2_dev_get_ioref2(); backpointers.c
  bcachefs: bch2_dev_get_ioref2(); alloc_background.c
  bcachefs: for_each_bset() declares loop iter
  bcachefs: Move BCACHEFS_STATFS_MAGIC value to UAPI magic.h
  bcachefs: Improve sysfs internal/btree_cache
  ...
2024-05-19 13:45:48 -07:00
Linus Torvalds
61307b7be4 The usual shower of singleton fixes and minor series all over MM,
documented (hopefully adequately) in the respective changelogs.  Notable
 series include:
 
 - Lucas Stach has provided some page-mapping
   cleanup/consolidation/maintainability work in the series "mm/treewide:
   Remove pXd_huge() API".
 
 - In the series "Allow migrate on protnone reference with
   MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
   MPOL_PREFERRED_MANY mode, yielding almost doubled performance in one
   test.
 
 - In their series "Memory allocation profiling" Kent Overstreet and
   Suren Baghdasaryan have contributed a means of determining (via
   /proc/allocinfo) whereabouts in the kernel memory is being allocated:
   number of calls and amount of memory.
 
 - Matthew Wilcox has provided the series "Various significant MM
   patches" which does a number of rather unrelated things, but in largely
   similar code sites.
 
 - In his series "mm: page_alloc: freelist migratetype hygiene" Johannes
   Weiner has fixed the page allocator's handling of migratetype requests,
   with resulting improvements in compaction efficiency.
 
 - In the series "make the hugetlb migration strategy consistent" Baolin
   Wang has fixed a hugetlb migration issue, which should improve hugetlb
   allocation reliability.
 
 - Liu Shixin has hit an I/O meltdown caused by readahead in a
   memory-tight memcg.  Addressed in the series "Fix I/O high when memory
   almost met memcg limit".
 
 - In the series "mm/filemap: optimize folio adding and splitting" Kairui
   Song has optimized pagecache insertion, yielding ~10% performance
   improvement in one test.
 
 - Baoquan He has cleaned up and consolidated the early zone
   initialization code in the series "mm/mm_init.c: refactor
   free_area_init_core()".
 
 - Baoquan has also redone some MM initializatio code in the series
   "mm/init: minor clean up and improvement".
 
 - MM helper cleanups from Christoph Hellwig in his series "remove
   follow_pfn".
 
 - More cleanups from Matthew Wilcox in the series "Various page->flags
   cleanups".
 
 - Vlastimil Babka has contributed maintainability improvements in the
   series "memcg_kmem hooks refactoring".
 
 - More folio conversions and cleanups in Matthew Wilcox's series
 
 	"Convert huge_zero_page to huge_zero_folio"
 	"khugepaged folio conversions"
 	"Remove page_idle and page_young wrappers"
 	"Use folio APIs in procfs"
 	"Clean up __folio_put()"
 	"Some cleanups for memory-failure"
 	"Remove page_mapping()"
 	"More folio compat code removal"
 
 - David Hildenbrand chipped in with "fs/proc/task_mmu: convert hugetlb
   functions to work on folis".
 
 - Code consolidation and cleanup work related to GUP's handling of
   hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".
 
 - Rick Edgecombe has developed some fixes to stack guard gaps in the
   series "Cover a guard gap corner case".
 
 - Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the series
   "mm/ksm: fix ksm exec support for prctl".
 
 - Baolin Wang has implemented NUMA balancing for multi-size THPs.  This
   is a simple first-cut implementation for now.  The series is "support
   multi-size THP numa balancing".
 
 - Cleanups to vma handling helper functions from Matthew Wilcox in the
   series "Unify vma_address and vma_pgoff_address".
 
 - Some selftests maintenance work from Dev Jain in the series
   "selftests/mm: mremap_test: Optimizations and style fixes".
 
 - Improvements to the swapping of multi-size THPs from Ryan Roberts in
   the series "Swap-out mTHP without splitting".
 
 - Kefeng Wang has significantly optimized the handling of arm64's
   permission page faults in the series
 
 	"arch/mm/fault: accelerate pagefault when badaccess"
 	"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"
 
 - GUP cleanups from David Hildenbrand in "mm/gup: consistently call it
   GUP-fast".
 
 - hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault path to
   use struct vm_fault".
 
 - selftests build fixes from John Hubbard in the series "Fix
   selftests/mm build without requiring "make headers"".
 
 - Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
   series "Improved Memory Tier Creation for CPUless NUMA Nodes".  Fixes
   the initialization code so that migration between different memory types
   works as intended.
 
 - David Hildenbrand has improved follow_pte() and fixed an errant driver
   in the series "mm: follow_pte() improvements and acrn follow_pte()
   fixes".
 
 - David also did some cleanup work on large folio mapcounts in his
   series "mm: mapcount for large folios + page_mapcount() cleanups".
 
 - Folio conversions in KSM in Alex Shi's series "transfer page to folio
   in KSM".
 
 - Barry Song has added some sysfs stats for monitoring multi-size THP's
   in the series "mm: add per-order mTHP alloc and swpout counters".
 
 - Some zswap cleanups from Yosry Ahmed in the series "zswap same-filled
   and limit checking cleanups".
 
 - Matthew Wilcox has been looking at buffer_head code and found the
   documentation to be lacking.  The series is "Improve buffer head
   documentation".
 
 - Multi-size THPs get more work, this time from Lance Yang.  His series
   "mm/madvise: enhance lazyfreeing with mTHP in madvise_free" optimizes
   the freeing of these things.
 
 - Kemeng Shi has added more userspace-visible writeback instrumentation
   in the series "Improve visibility of writeback".
 
 - Kemeng Shi then sent some maintenance work on top in the series "Fix
   and cleanups to page-writeback".
 
 - Matthew Wilcox reduces mmap_lock traffic in the anon vma code in the
   series "Improve anon_vma scalability for anon VMAs".  Intel's test bot
   reported an improbable 3x improvement in one test.
 
 - SeongJae Park adds some DAMON feature work in the series
 
 	"mm/damon: add a DAMOS filter type for page granularity access recheck"
 	"selftests/damon: add DAMOS quota goal test"
 
 - Also some maintenance work in the series
 
 	"mm/damon/paddr: simplify page level access re-check for pageout"
 	"mm/damon: misc fixes and improvements"
 
 - David Hildenbrand has disabled some known-to-fail selftests ni the
   series "selftests: mm: cow: flag vmsplice() hugetlb tests as XFAIL".
 
 - memcg metadata storage optimizations from Shakeel Butt in "memcg:
   reduce memory consumption by memcg stats".
 
 - DAX fixes and maintenance work from Vishal Verma in the series
   "dax/bus.c: Fixups for dax-bus locking".
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZkgQYwAKCRDdBJ7gKXxA
 jrdKAP9WVJdpEcXxpoub/vVE0UWGtffr8foifi9bCwrQrGh5mgEAx7Yf0+d/oBZB
 nvA4E0DcPrUAFy144FNM0NTCb7u9vAw=
 =V3R/
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull mm updates from Andrew Morton:
 "The usual shower of singleton fixes and minor series all over MM,
  documented (hopefully adequately) in the respective changelogs.
  Notable series include:

   - Lucas Stach has provided some page-mapping cleanup/consolidation/
     maintainability work in the series "mm/treewide: Remove pXd_huge()
     API".

   - In the series "Allow migrate on protnone reference with
     MPOL_PREFERRED_MANY policy", Donet Tom has optimized mempolicy's
     MPOL_PREFERRED_MANY mode, yielding almost doubled performance in
     one test.

   - In their series "Memory allocation profiling" Kent Overstreet and
     Suren Baghdasaryan have contributed a means of determining (via
     /proc/allocinfo) whereabouts in the kernel memory is being
     allocated: number of calls and amount of memory.

   - Matthew Wilcox has provided the series "Various significant MM
     patches" which does a number of rather unrelated things, but in
     largely similar code sites.

   - In his series "mm: page_alloc: freelist migratetype hygiene"
     Johannes Weiner has fixed the page allocator's handling of
     migratetype requests, with resulting improvements in compaction
     efficiency.

   - In the series "make the hugetlb migration strategy consistent"
     Baolin Wang has fixed a hugetlb migration issue, which should
     improve hugetlb allocation reliability.

   - Liu Shixin has hit an I/O meltdown caused by readahead in a
     memory-tight memcg. Addressed in the series "Fix I/O high when
     memory almost met memcg limit".

   - In the series "mm/filemap: optimize folio adding and splitting"
     Kairui Song has optimized pagecache insertion, yielding ~10%
     performance improvement in one test.

   - Baoquan He has cleaned up and consolidated the early zone
     initialization code in the series "mm/mm_init.c: refactor
     free_area_init_core()".

   - Baoquan has also redone some MM initializatio code in the series
     "mm/init: minor clean up and improvement".

   - MM helper cleanups from Christoph Hellwig in his series "remove
     follow_pfn".

   - More cleanups from Matthew Wilcox in the series "Various
     page->flags cleanups".

   - Vlastimil Babka has contributed maintainability improvements in the
     series "memcg_kmem hooks refactoring".

   - More folio conversions and cleanups in Matthew Wilcox's series:
	"Convert huge_zero_page to huge_zero_folio"
	"khugepaged folio conversions"
	"Remove page_idle and page_young wrappers"
	"Use folio APIs in procfs"
	"Clean up __folio_put()"
	"Some cleanups for memory-failure"
	"Remove page_mapping()"
	"More folio compat code removal"

   - David Hildenbrand chipped in with "fs/proc/task_mmu: convert
     hugetlb functions to work on folis".

   - Code consolidation and cleanup work related to GUP's handling of
     hugetlbs in Peter Xu's series "mm/gup: Unify hugetlb, part 2".

   - Rick Edgecombe has developed some fixes to stack guard gaps in the
     series "Cover a guard gap corner case".

   - Jinjiang Tu has fixed KSM's behaviour after a fork+exec in the
     series "mm/ksm: fix ksm exec support for prctl".

   - Baolin Wang has implemented NUMA balancing for multi-size THPs.
     This is a simple first-cut implementation for now. The series is
     "support multi-size THP numa balancing".

   - Cleanups to vma handling helper functions from Matthew Wilcox in
     the series "Unify vma_address and vma_pgoff_address".

   - Some selftests maintenance work from Dev Jain in the series
     "selftests/mm: mremap_test: Optimizations and style fixes".

   - Improvements to the swapping of multi-size THPs from Ryan Roberts
     in the series "Swap-out mTHP without splitting".

   - Kefeng Wang has significantly optimized the handling of arm64's
     permission page faults in the series
	"arch/mm/fault: accelerate pagefault when badaccess"
	"mm: remove arch's private VM_FAULT_BADMAP/BADACCESS"

   - GUP cleanups from David Hildenbrand in "mm/gup: consistently call
     it GUP-fast".

   - hugetlb fault code cleanups from Vishal Moola in "Hugetlb fault
     path to use struct vm_fault".

   - selftests build fixes from John Hubbard in the series "Fix
     selftests/mm build without requiring "make headers"".

   - Memory tiering fixes/improvements from Ho-Ren (Jack) Chuang in the
     series "Improved Memory Tier Creation for CPUless NUMA Nodes".
     Fixes the initialization code so that migration between different
     memory types works as intended.

   - David Hildenbrand has improved follow_pte() and fixed an errant
     driver in the series "mm: follow_pte() improvements and acrn
     follow_pte() fixes".

   - David also did some cleanup work on large folio mapcounts in his
     series "mm: mapcount for large folios + page_mapcount() cleanups".

   - Folio conversions in KSM in Alex Shi's series "transfer page to
     folio in KSM".

   - Barry Song has added some sysfs stats for monitoring multi-size
     THP's in the series "mm: add per-order mTHP alloc and swpout
     counters".

   - Some zswap cleanups from Yosry Ahmed in the series "zswap
     same-filled and limit checking cleanups".

   - Matthew Wilcox has been looking at buffer_head code and found the
     documentation to be lacking. The series is "Improve buffer head
     documentation".

   - Multi-size THPs get more work, this time from Lance Yang. His
     series "mm/madvise: enhance lazyfreeing with mTHP in madvise_free"
     optimizes the freeing of these things.

   - Kemeng Shi has added more userspace-visible writeback
     instrumentation in the series "Improve visibility of writeback".

   - Kemeng Shi then sent some maintenance work on top in the series
     "Fix and cleanups to page-writeback".

   - Matthew Wilcox reduces mmap_lock traffic in the anon vma code in
     the series "Improve anon_vma scalability for anon VMAs". Intel's
     test bot reported an improbable 3x improvement in one test.

   - SeongJae Park adds some DAMON feature work in the series
	"mm/damon: add a DAMOS filter type for page granularity access recheck"
	"selftests/damon: add DAMOS quota goal test"

   - Also some maintenance work in the series
	"mm/damon/paddr: simplify page level access re-check for pageout"
	"mm/damon: misc fixes and improvements"

   - David Hildenbrand has disabled some known-to-fail selftests ni the
     series "selftests: mm: cow: flag vmsplice() hugetlb tests as
     XFAIL".

   - memcg metadata storage optimizations from Shakeel Butt in "memcg:
     reduce memory consumption by memcg stats".

   - DAX fixes and maintenance work from Vishal Verma in the series
     "dax/bus.c: Fixups for dax-bus locking""

* tag 'mm-stable-2024-05-17-19-19' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (426 commits)
  memcg, oom: cleanup unused memcg_oom_gfp_mask and memcg_oom_order
  selftests/mm: hugetlb_madv_vs_map: avoid test skipping by querying hugepage size at runtime
  mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_wp
  mm/hugetlb: add missing VM_FAULT_SET_HINDEX in hugetlb_fault
  selftests: cgroup: add tests to verify the zswap writeback path
  mm: memcg: make alloc_mem_cgroup_per_node_info() return bool
  mm/damon/core: fix return value from damos_wmark_metric_value
  mm: do not update memcg stats for NR_{FILE/SHMEM}_PMDMAPPED
  selftests: cgroup: remove redundant enabling of memory controller
  Docs/mm/damon/maintainer-profile: allow posting patches based on damon/next tree
  Docs/mm/damon/maintainer-profile: change the maintainer's timezone from PST to PT
  Docs/mm/damon/design: use a list for supported filters
  Docs/admin-guide/mm/damon/usage: fix wrong schemes effective quota update command
  Docs/admin-guide/mm/damon/usage: fix wrong example of DAMOS filter matching sysfs file
  selftests/damon: classify tests for functionalities and regressions
  selftests/damon/_damon_sysfs: use 'is' instead of '==' for 'None'
  selftests/damon/_damon_sysfs: find sysfs mount point from /proc/mounts
  selftests/damon/_damon_sysfs: check errors from nr_schemes file reads
  mm/damon/core: initialize ->esz_bp from damos_quota_init_priv()
  selftests/damon: add a test for DAMOS quota goal
  ...
2024-05-19 09:21:03 -07:00
Linus Torvalds
0450d2083b an important fix to address recent netfs regression (data corruption)
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmZIewwACgkQiiy9cAdy
 T1H5VwwAokoVoYvrlPQO4CNwMlXmzlTLYVi74qLubbRcKmltMzzHRHRH7JZtu9ao
 K1GRvVMMUIJoTG6tGJg9B3x0hmosyZ4gzVywc87bXyM7vpOCsvOQqA0xmWxN/Hxx
 AuiPi4Ttjy6TiWxVWDb77BOnGGGnqVgmjGS2LS/29RetOz2NLYCuDooo6S2qyVWn
 d8URgHzJbAcuBQVbhkPXge32I2ojHSpPktEcqi6MK2Em9HItMVNv3HK1NIYYo5kf
 T2D8G62xWFjQHgHGgEh9CNQrWocg0CLux/oPA/zVs1p2XRvJv41EO/x09cieFBPU
 BQgbORLo4e45YyN5OOClqCVwAU3wrWU4ADs1Z8SXid8Jc4rqgjlUYt47T4hYKVVW
 p6Airoz2SEz5d+/THQ0aIlYBBGFj2Qd22ufJdUi9NbnH/gGchuJzYAi7XJc6i0tn
 6N9bK8Tul4p353B+0j+zajrnH2sGWTAUv6Ir+cqRhP2/ma8ZUDofcaid5OPf/js1
 JxTnRB6z
 =UJ5I
 -----END PGP SIGNATURE-----

Merge tag '6.10-rc-smb-fix' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client fix from Steve French:
 "An important fix to address recent netfs regression (data corruption)"

* tag '6.10-rc-smb-fix' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: fix data corruption in read after invalidate
2024-05-18 14:19:47 -07:00
Linus Torvalds
7991c92f4c Ext4 patches for the 6.10-rc1 merge window:
- more folio conversion patches
  - add support for FS_IOC_GETFSSYSFSPATH
  - mballoc cleaups and add more kunit tests
  - sysfs cleanups and bug fixes
  - miscellaneous bug fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAmZIMBAACgkQ8vlZVpUN
 gaNvhQf9GAdBxpCLc3fc9mW8oP+okAqQ2etpz7Up5PRjX62P8o89QOXBUHSAZxat
 qOpKu5NaUBdz5mfdg/ptbCRdbsLxQTY670nSYhmseOCHZR/cw4jX1f+FUMj0VoFm
 tu/TR285W6A+i7zb1xOgyUsqN8jbQdm4ASmhVjV67oTLs+A6I8loL0wotlAl+K0U
 g8twZbnNfUaB0jrNyhEzr59bTFUgFMjt8Jv9aH3Oi4rjXGzmS5/xqPCK5Lhl+nCW
 gxIfRphwKlw9+c9osLYRrtRFrexFsQMCGmz2z9F4m7SplHI3A/SVaKSHaFeW/jQ0
 gXP/S91zale6tSeu14gZLY2JqwvI0g==
 =XA7v
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:

 - more folio conversion patches

 - add support for FS_IOC_GETFSSYSFSPATH

 - mballoc cleaups and add more kunit tests

 - sysfs cleanups and bug fixes

 - miscellaneous bug fixes and cleanups

* tag 'ext4_for_linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (40 commits)
  ext4: fix error pointer dereference in ext4_mb_load_buddy_gfp()
  jbd2: add prefix 'jbd2' for 'shrink_type'
  jbd2: use shrink_type type instead of bool type for __jbd2_journal_clean_checkpoint_list()
  ext4: fix uninitialized ratelimit_state->lock access in __ext4_fill_super()
  ext4: remove calls to to set/clear the folio error flag
  ext4: propagate errors from ext4_sb_bread() in ext4_xattr_block_cache_find()
  ext4: fix mb_cache_entry's e_refcnt leak in ext4_xattr_block_cache_find()
  jbd2: remove redundant assignement to variable err
  ext4: remove the redundant folio_wait_stable()
  ext4: fix potential unnitialized variable
  ext4: convert ac_buddy_page to ac_buddy_folio
  ext4: convert ac_bitmap_page to ac_bitmap_folio
  ext4: convert ext4_mb_init_cache() to take a folio
  ext4: convert bd_buddy_page to bd_buddy_folio
  ext4: convert bd_bitmap_page to bd_bitmap_folio
  ext4: open coding repeated check in next_linear_group
  ext4: use correct criteria name instead stale integer number in comment
  ext4: call ext4_mb_mark_free_simple to free continuous bits in found chunk
  ext4: add test_mb_mark_used_cost to estimate cost of mb_mark_used
  ext4: keep "prefetch_grp" and "nr" consistent
  ...
2024-05-18 14:11:54 -07:00
Linus Torvalds
61ea647ed1 NFSD 6.10 Release Notes
This is a light release containing mostly optimizations, code clean-
 ups, and minor bug fixes. This development cycle has focused on non-
 upstream kernel work:
 
 1. Continuing to build upstream CI for NFSD, based on kdevops
 2. Backporting NFSD filecache-related fixes to selected LTS kernels
 
 One notable new feature in v6.10 NFSD is the addition of a new
 netlink protocol dedicated to configuring NFSD. A new user space
 tool, nfsdctl, is to be added to nfs-utils. Lots more to come here.
 
 As always I am very grateful to NFSD contributors, reviewers,
 testers, and bug reporters who participated during this cycle.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmZHdB8ACgkQM2qzM29m
 f5cSMhAApukZZQCSR9lcppVrv48vsTKHFup4qFG5upOtdHR8yuSI4HOfSb5F9gsO
 fHJABFFtvlsTWVFotwUY7ljjtin00PK0bfn9kZnekvcZ1A5Yoly8SJmxK+jjnmAw
 jaXT7XzGYWShYiRkIXb3NYE9uiC1VZOYURYTVbMwklg3jbsyp2M7ylnRKIqUO2Qt
 bin6tdqnDx2H4Hou9k4csMX4sZJlXQZjQxzxhWuL1XrjEMlXREnklfppLzIlnJJt
 eHFxTRhwPdcJ9CbGVsae7GNQeGUdgq7P/AIFuHWIruvxaknY7ZOp2Z/xnxaifeU+
 O2Psh/9G7zmqFkeH01QwItita8rUdBwgTv0r7QPw8/lCd0xMieqFynNGtTGwWv0Q
 1DC8RssM3axeHHfpTgXtkqfwFvKIyE6xKrvTCBZ8Pd8hsrWzbYI4d/oTe8rwXLZ6
 sMD5wgsfagl6fd6G+4/9adFniOgpUi2xHmqJ5yyALyzUDeHiiqsOmxM2Rb0FN5YR
 ixlNj7s9lmYbbMwQshNRhV/fOPQRvKvicHAyKO7Yko/seDf8NxwQfPX6M2j2esUG
 Ld8lW1hGpBDWpF1YnA6AsC+Jr12+A4c2Lg95155R9Svumk6Fv/4MIftiWpO8qf/g
 d66Q35eGr3BSSypP9KFEa7aegZdcJAlUpLhsd0Wj2rbei7gh0kU=
 =tpVD
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd updates from Chuck Lever:
 "This is a light release containing mostly optimizations, code clean-
  ups, and minor bug fixes. This development cycle has focused on non-
  upstream kernel work:

   1. Continuing to build upstream CI for NFSD, based on kdevops

   2. Backporting NFSD filecache-related fixes to selected LTS kernels

  One notable new feature in v6.10 NFSD is the addition of a new netlink
  protocol dedicated to configuring NFSD. A new user space tool,
  nfsdctl, is to be added to nfs-utils. Lots more to come here.

  As always I am very grateful to NFSD contributors, reviewers, testers,
  and bug reporters who participated during this cycle"

* tag 'nfsd-6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (29 commits)
  NFSD: Force all NFSv4.2 COPY requests to be synchronous
  SUNRPC: Fix gss_free_in_token_pages()
  NFS/knfsd: Remove the invalid NFS error 'NFSERR_OPNOTSUPP'
  knfsd: LOOKUP can return an illegal error value
  nfsd: set security label during create operations
  NFSD: Add COPY status code to OFFLOAD_STATUS response
  NFSD: Record status of async copy operation in struct nfsd4_copy
  SUNRPC: Remove comment for sp_lock
  NFSD: add listener-{set,get} netlink command
  SUNRPC: add a new svc_find_listener helper
  SUNRPC: introduce svc_xprt_create_from_sa utility routine
  NFSD: add write_version to netlink command
  NFSD: convert write_threads to netlink command
  NFSD: allow callers to pass in scope string to nfsd_svc
  NFSD: move nfsd_mutex handling into nfsd_svc callers
  lockd: host: Remove unnecessary statements'host = NULL;'
  nfsd: don't create nfsv4recoverydir in nfsdfs when not used.
  nfsd: optimise recalculate_deny_mode() for a common case
  nfsd: add tracepoint in mark_client_expired_locked
  nfsd: new tracepoint for check_slot_seqid
  ...
2024-05-18 14:04:20 -07:00
Linus Torvalds
ff9a79307f Kbuild updates for v6.10
- Avoid 'constexpr', which is a keyword in C23
 
  - Allow 'dtbs_check' and 'dt_compatible_check' run independently of
    'dt_binding_check'
 
  - Fix weak references to avoid GOT entries in position-independent
    code generation
 
  - Convert the last use of 'optional' property in arch/sh/Kconfig
 
  - Remove support for the 'optional' property in Kconfig
 
  - Remove support for Clang's ThinLTO caching, which does not work with
    the .incbin directive
 
  - Change the semantics of $(src) so it always points to the source
    directory, which fixes Makefile inconsistencies between upstream and
    downstream
 
  - Fix 'make tar-pkg' for RISC-V to produce a consistent package
 
  - Provide reasonable default coverage for objtool, sanitizers, and
    profilers
 
  - Remove redundant OBJECT_FILES_NON_STANDARD, KASAN_SANITIZE, etc.
 
  - Remove the last use of tristate choice in drivers/rapidio/Kconfig
 
  - Various cleanups and fixes in Kconfig
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmZFlGcVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsG8voQALC8NtFpduWVfLRj2Qg6Ll/xf1vX
 2igcTJEOFHkeqXLGoT8dTDKLEipUBUvKyguPq66CGwVTe2g6zy/nUSXeVtFrUsIa
 msLTi8FqhqUo5lodNvGMRf8qqmuqcvnXoiQwIocF92jtsFy14bhiFY+n4HfcFNjj
 GOKwqBZYQUwY/VVb090efc7RfS9c7uwABJSBelSoxg3AGZriwjGy7Pw5aSKGgVYi
 inqL1eR6qwPP6z7CgQWM99soP+zwybFZmnQrsD9SniRBI4rtAat8Ih5jQFaSUFUQ
 lk2w0NQBRFN88/uR2IJ2GWuIlQ74WeJ+QnCqVuQ59tV5zw90wqSmLzngfPD057Dv
 JjNuhk0UyXVtpIg3lRtd4810ppNSTe33b9OM4O2H846W/crju5oDRNDHcflUXcwm
 Rmn5ho1rb5QVzDVejJbgwidnUInSgJ9PZcvXQ/RJVZPhpgsBzAY9pQexG1G3hviw
 y9UDrt6KP6bF9tHjmolmtdIes9Pj0c4dN6/Rdj4HS4hIQ/GDar0tnwvOvtfUctNL
 orJlBsA6GeMmDVXKkR0ytOCWRYqWWbyt8g70RVKQJfuHX7/hGyAQPaQ2/u4mQhC2
 aevYfbNJMj0VDfGz81HDBKFtkc5n+Ite8l157dHEl2LEabkOkRdNVcn7SNbOvZmd
 ZCSnZ31h7woGfNho
 =D5B/
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Avoid 'constexpr', which is a keyword in C23

 - Allow 'dtbs_check' and 'dt_compatible_check' run independently of
   'dt_binding_check'

 - Fix weak references to avoid GOT entries in position-independent code
   generation

 - Convert the last use of 'optional' property in arch/sh/Kconfig

 - Remove support for the 'optional' property in Kconfig

 - Remove support for Clang's ThinLTO caching, which does not work with
   the .incbin directive

 - Change the semantics of $(src) so it always points to the source
   directory, which fixes Makefile inconsistencies between upstream and
   downstream

 - Fix 'make tar-pkg' for RISC-V to produce a consistent package

 - Provide reasonable default coverage for objtool, sanitizers, and
   profilers

 - Remove redundant OBJECT_FILES_NON_STANDARD, KASAN_SANITIZE, etc.

 - Remove the last use of tristate choice in drivers/rapidio/Kconfig

 - Various cleanups and fixes in Kconfig

* tag 'kbuild-v6.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (46 commits)
  kconfig: use sym_get_choice_menu() in sym_check_prop()
  rapidio: remove choice for enumeration
  kconfig: lxdialog: remove initialization with A_NORMAL
  kconfig: m/nconf: merge two item_add_str() calls
  kconfig: m/nconf: remove dead code to display value of bool choice
  kconfig: m/nconf: remove dead code to display children of choice members
  kconfig: gconf: show checkbox for choice correctly
  kbuild: use GCOV_PROFILE and KCSAN_SANITIZE in scripts/Makefile.modfinal
  Makefile: remove redundant tool coverage variables
  kbuild: provide reasonable defaults for tool coverage
  modules: Drop the .export_symbol section from the final modules
  kconfig: use menu_list_for_each_sym() in sym_check_choice_deps()
  kconfig: use sym_get_choice_menu() in conf_write_defconfig()
  kconfig: add sym_get_choice_menu() helper
  kconfig: turn defaults and additional prompt for choice members into error
  kconfig: turn missing prompt for choice members into error
  kconfig: turn conf_choice() into void function
  kconfig: use linked list in sym_set_changed()
  kconfig: gconf: use MENU_CHANGED instead of SYMBOL_CHANGED
  kconfig: gconf: remove debug code
  ...
2024-05-18 12:39:20 -07:00
Linus Torvalds
2fc0e7892c Landlock updates for v6.10-rc1
-----BEGIN PGP SIGNATURE-----
 
 iIYEABYKAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCZkYGahAcbWljQGRpZ2lr
 b2QubmV0AAoJEOXj0OiMgvbSaxUBAL1USJI2DUo+hJTDHqXdVZGEE6VfYnmEhIur
 NZfsti4UAQCabAej7+NHzh1jAnnKMJFbcyl4LRy99anJGlEWq9ExDA==
 =fViz
 -----END PGP SIGNATURE-----

Merge tag 'landlock-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux

Pull landlock updates from Mickaël Salaün:
 "This brings ioctl control to Landlock, contributed by Günther Noack.
  This also adds him as a Landlock reviewer, and fixes an issue in the
  sample"

* tag 'landlock-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
  MAINTAINERS: Add Günther Noack as Landlock reviewer
  fs/ioctl: Add a comment to keep the logic in sync with LSM policies
  MAINTAINERS: Notify Landlock maintainers about changes to fs/ioctl.c
  landlock: Document IOCTL support
  samples/landlock: Add support for LANDLOCK_ACCESS_FS_IOCTL_DEV
  selftests/landlock: Exhaustive test for the IOCTL allow-list
  selftests/landlock: Check IOCTL restrictions for named UNIX domain sockets
  selftests/landlock: Test IOCTLs on named pipes
  selftests/landlock: Test ioctl(2) and ftruncate(2) with open(O_PATH)
  selftests/landlock: Test IOCTL with memfds
  selftests/landlock: Test IOCTL support
  landlock: Add IOCTL access right for character and block devices
  samples/landlock: Fix incorrect free in populate_ruleset_net
2024-05-18 10:48:07 -07:00