79896 Commits

Author SHA1 Message Date
Linus Torvalds
149c51f876 for-6.2-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmOSLtIACgkQxWXV+ddt
 WDvpQA//dQ3Wosz5puFNiZvoSUn/BnYJueZHjwF0bWY8OYINkF1PvDenu/WotyFz
 Ozf4Yl4Afxncz+FjDnOtlpr6KsSU5NqdGM3NrY0eNsxd2t1KrTsN0LgkA4m24p8b
 YsYp7pygbMm7c+h0X4uFpebY4lABkEPCBXnI//ktsls0xG5sOvGfZA3rdUP0bou2
 JTn6hk+s0cLTNoTiOCGNHRJbeTzHLR0viZj/E4LCJfCeJvAmOLZamUjqe9sBNYAg
 YtsrZTpUIL3JgmRi5B6jG4fHSXOnE14mKmRIR3xPME6J6eoYyNOeuSh1oNmJEuoE
 B7nD5We+x5+isjXNw/V5CQrs7FF09UbdpbNb9NF5CYQWv40OCeefuai1opGtBUxX
 dvbfmf1blYpWW/wfFOKQwMOsl8kZIZYx68FW2OBUNglB6yRpX/3QgFSGb8kPCr83
 DW2ttqwkpSNPMKk92I/owIc4BRvZ+LMR/PimEHB/Sa2apZA2/L+7RGwoaaei1QNX
 1tJxHWeJFLDZ+YRxjO1eKqhWdGQPn1kkq8LoXLi3tGaNF4kYQfhWOSM3WRowvx1q
 f99XRgA8JQnqZS83zqRIspWlpFK0CFdvzG1Zlqx+eoxERfeaMNA2fHxv1YCyFV4+
 TiXgsnCo+PIBwlvL/HjUWZgYE9+AD+NN5vyoE2UDYff4AgBFTE8=
 =Nqg9
 -----END PGP SIGNATURE-----

Merge tag 'for-6.2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
 "This round there are a lot of cleanups and moved code so the diffstat
  looks huge, otherwise there are some nice performance improvements and
  an update to raid56 reliability.

  User visible features:

   - raid56 reliability vs performance trade off:
      - fix destructive RMW for raid5 data (raid6 still needs work): do
        full checksum verification for all data during RMW cycle, this
        should prevent rewriting potentially corrupted data without
        notice
      - stripes are cached in memory which should reduce the performance
        impact but still can hurt some workloads
      - checksums are verified after repair again
      - this is the last option without introducing additional features
        (write intent bitmap, journal, another tree), the extra checksum
        read/verification was supposed to be avoided by the original
        implementation exactly for performance reasons but that caused
        all the reliability problems

   - discard=async by default for devices that support it

   - implement emergency flush reserve to avoid almost all unnecessary
     transaction aborts due to ENOSPC in cases where there are too many
     delayed refs or delayed allocation

   - skip block group synchronization if there's no change in used
     bytes, can reduce transaction commit count for some workloads

  Performance improvements:

   - fiemap and lseek:
      - overall speedup due to skipping unnecessary or duplicate
        searches (-40% run time)
      - cache some data structures and sharedness of extents (-30% run
        time)

   - send:
      - faster backref resolution when finding clones
      - cached leaf to root mapping for faster backref walking
      - improved clone/sharing detection
      - overall run time improvements (-70%)

  Core:

   - module initialization converted to a table of function pointers run
     in a sequence

   - preparation for fscrypt, extend passing file names across calls,
     dir item can store encryption status

   - raid56 updates:
      - more accurate error tracking of sectors within stripe
      - simplify recovery path and remove dedicated endio worker kthread
      - simplify scrub call paths
      - refactoring to support the extra data checksum verification
        during RMW cycle

   - tree block parentness checks consolidated and done at metadata read
     time

   - improved error handling

   - cleanups:
      - move a lot of code for better synchronization between kernel and
        user space sources, split big files
      - enum cleanups
      - GFP flag cleanups
      - header file cleanups, prototypes, dependencies
      - redundant parameter cleanups
      - inline extent handling simplifications
      - inode parameter conversion
      - data structure cleanups, reductions, renames, merges"

* tag 'for-6.2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (249 commits)
  btrfs: print transaction aborted messages with an error level
  btrfs: sync some cleanups from progs into uapi/btrfs.h
  btrfs: do not BUG_ON() on ENOMEM when dropping extent items for a range
  btrfs: fix extent map use-after-free when handling missing device in read_one_chunk
  btrfs: remove outdated logic from overwrite_item() and add assertion
  btrfs: unify overwrite_item() and do_overwrite_item()
  btrfs: replace strncpy() with strscpy()
  btrfs: fix uninitialized variable in find_first_clear_extent_bit
  btrfs: fix uninitialized parent in insert_state
  btrfs: add might_sleep() annotations
  btrfs: add stack helpers for a few btrfs items
  btrfs: add nr_global_roots to the super block definition
  btrfs: remove BTRFS_LEAF_DATA_OFFSET
  btrfs: add helpers for manipulating leaf items and data
  btrfs: add eb to btrfs_node_key_ptr_offset
  btrfs: pass the extent buffer for the btrfs_item_nr helpers
  btrfs: move the csum helpers into ctree.h
  btrfs: move eb offset helpers into extent_io.h
  btrfs: move file_extent_item helpers into file-item.h
  btrfs: move leaf_data_end into ctree.c
  ...
2022-12-12 20:47:51 -08:00
Linus Torvalds
97971df811 dlm for 6.2
These patches include the usual cleanups and minor fixes, the removal of
 code that is no longer needed due to recent improvements, and
 improvements to processing large volumes of messages during heavy
 locking activity.
 
 - Misc code cleanup.
 
 - Fix a couple socket handling bugs: a double release on an error path
   and a data-ready race in an accept loop.
 
 - Remove code for resending dir-remove messages. This code is no longer
   needed since the midcomms layer now ensures the messages are resent if
   needed.
 
 - Add tracepoints for dlm messages.
 
 - Improve callback queueing by replacing the fixed array with a list.
 
 - Simplify the handling of a remove message followed by a lookup
   message by sending both without releasing a spinlock in between.
 
 - Improve the concurrency of sending and receiving messages by holding
   locks for a shorter time, and changing how workqueues are used.
 
 - Remove old code for shutting down sockets, which is no longer needed
   with the reliable connection handling that was recently added.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJjl2TfAAoJEDgbc8f8gGmqvj8P/ibVRzW00DeKJJU8Gz5qlcQo
 YziuUxR2xYffDwif/VB2FFW1dE9lY9bTTYHz6ctiIKip51XsVTVj9oaXEebSt5Xu
 /jEMJXUnMCUPcXz6T5FoaXAt+oMKqo7QhitCD4kqdM8AUZQLXKu/YM/FC1EyIMgl
 zGmHch99ojGPQmmPrFdJq8JQSF75+r28RfH9qmy7dzctX0Iok12J8Gsgd1J9Hwti
 PITmNBs0YauT7AdYwR3UhuGC9g73RISJ4QTjL+wpj7+P2F6pNYb8XXUXsEZ2wZYH
 sgo325HbIXZ8AbxIw0Tbx2NNmIQaTELZRCOSxgEjcN7AMhn1xkXYPE3mfMCU1bDZ
 TtfN86HvWYFRgNotGpDUrNnd9XJoyeHtCWBDikEepYRWEiyYkzipj5mFGRzINlvL
 Orou1tKBchvtDgQzGg6XwNkxnxR6l/xfhg1AYizYXCgOCxfEzJgIqCUttzMCPkec
 HXi/y17IDIvNvIhtmZ0EYh/6W40UYlnncREszerg5IR3ssuDmF5aA6MHdiSt0dn/
 PkjZ5DRbZaEbNl37mjShJHxztP1QKW85c3isgEBe4jvncu8VY3WDCbQRw5gjZvC4
 LnQF4hcGLm0y51fU5DXv+CBlQGRWaqYan8bWaIdVoA2UgDbM0x8eWC8IUUYBO5ak
 rCBOpWJudzZFH6ynyfwD
 =TJvP
 -----END PGP SIGNATURE-----

Merge tag 'dlm-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm

Pull dlm updates from David Teigland:
 "These patches include the usual cleanups and minor fixes, the removal
  of code that is no longer needed due to recent improvements, and
  improvements to processing large volumes of messages during heavy
  locking activity.

  Summary:

   - Misc code cleanup

   - Fix a couple of socket handling bugs: a double release on an error
     path and a data-ready race in an accept loop

   - Remove code for resending dir-remove messages. This code is no
     longer needed since the midcomms layer now ensures the messages are
     resent if needed

   - Add tracepoints for dlm messages

   - Improve callback queueing by replacing the fixed array with a list

   - Simplify the handling of a remove message followed by a lookup
     message by sending both without releasing a spinlock in between

   - Improve the concurrency of sending and receiving messages by
     holding locks for a shorter time, and changing how workqueues are
     used

   - Remove old code for shutting down sockets, which is no longer
     needed with the reliable connection handling that was recently
     added"

* tag 'dlm-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm: (37 commits)
  fs: dlm: fix building without lockdep
  fs: dlm: parallelize lowcomms socket handling
  fs: dlm: don't init error value
  fs: dlm: use saved sk_error_report()
  fs: dlm: use sock2con without checking null
  fs: dlm: remove dlm_node_addrs lookup list
  fs: dlm: don't put dlm_local_addrs on heap
  fs: dlm: cleanup listen sock handling
  fs: dlm: remove socket shutdown handling
  fs: dlm: use listen sock as dlm running indicator
  fs: dlm: use list_first_entry_or_null
  fs: dlm: remove twice INIT_WORK
  fs: dlm: add midcomms init/start functions
  fs: dlm: add dst nodeid for msg tracing
  fs: dlm: rename seq to h_seq for msg tracing
  fs: dlm: rename DLM_IFL_NEED_SCHED to DLM_IFL_CB_PENDING
  fs: dlm: ast do WARN_ON_ONCE() on hotpath
  fs: dlm: drop lkb ref in bug case
  fs: dlm: avoid false-positive checker warning
  fs: dlm: use WARN_ON_ONCE() instead of WARN_ON()
  ...
2022-12-12 20:41:50 -08:00
Linus Torvalds
56c003e4db Assorted JFS fixes for 6.2
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEIodevzQLVs53l6BhNqiEXrVAjGQFAmOXkwgACgkQNqiEXrVA
 jGSo/hAAiw68BXbUkHqCLwaobNU8cqEB+NjnXII1VacwQa+tZBh9rB0+VtChxx4j
 AcHc2Uys3+W7JDCP/eCMACyUwD5xFzSknqlRa0FEHHVnAQybrtfO+dlsBmEVlZdX
 yFPwRGND9rOuGcNOWpPkiNp82FxN+JNQVKY1+ve5XoUFTBG9jf5pieQLLYmKGHHt
 vlpCIx3F70/88MTz+BaniarMviPjLRlzgWs2PtqFy0lOAkwYbTqBIqCY5tQaP1iy
 8uJCufHu87nS2WhxwaV/FDhFsvRbiuIze0ES8GCE6jipDKl8f+fDmlOOJtgoUnCN
 WSo36OUacBnbmr3OJ+sT3j/k0/KzgSx+m43qWos/ZZypnNMIZC5a2u6Df33/rg11
 sG3sr7HiP60EstfY1Cmf8xwmy6scfGEHq0aAD/1VoEQp/pxRiOPft9RSQeRtuz26
 nlRzFaZ4QGzQfMWos7jEmCNcVB2hkIWj5lZC+FiedHDXtISHdz7+u065P+u5Z9Zg
 mb5yB0XqMwMai7I8IwzASuH8//cdlofOdEjiLGQ7eWaRevrw2lmUlp3hkS+RGcE1
 G/sTYvTqxMOyb77TXLcmfbGTgvArfimkhR0OLweQmfTqmI52mevPtVBkzTP6ZB4n
 kOpuCD+LuWumbn8rFJ/6sjjR4FmSYM/XqnkEIKIwKXevPtR/8G8=
 =fd5k
 -----END PGP SIGNATURE-----

Merge tag 'jfs-6.2' of https://github.com/kleikamp/linux-shaggy

Pull jfs updates from David Kleikamp:
 "Assorted JFS fixes for 6.2"

* tag 'jfs-6.2' of https://github.com/kleikamp/linux-shaggy:
  jfs: makes diUnmount/diMount in jfs_mount_rw atomic
  jfs: Fix a typo in function jfs_umount
  fs: jfs: fix shift-out-of-bounds in dbDiscardAG
  jfs: Fix fortify moan in symlink
  jfs: remove redundant assignments to ipaimap and ipaimap2
  jfs: remove unused declarations for jfs
  fs/jfs/jfs_xattr.h: Fix spelling typo in comment
  MAINTAINERS: git://github -> https://github.com for kleikamp
  fs/jfs: replace ternary operator with min_t()
  fs: jfs: fix shift-out-of-bounds in dbAllocAG
2022-12-12 20:38:28 -08:00
Linus Torvalds
cda6a60acc \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmOXWlMACgkQnJ2qBz9k
 QNltfwf9GIEZASzX/v2LXYInbgvJnuRxeNoNFDZjatz+Nk17fJ9P0Fzcu7OztMTk
 +4HX3pxXYn/eFTyxVf5c3C2gU4KDV2InrYk+IyA7unZ92ROO5uaxrrknSPouYXoO
 fd0zwlMQ8bxk7wgjSnG+0Q38dbWr9XgYRqcURjXvRG9e68o49SXTc333lXc+l25X
 WphjDK6d1gXWiHKdYVYiROF7HjAjaeRk8clXtFhHmyGvhi+wvfP6mqOhzMCRuR7U
 M1dYR/B2+AJieOmVK1gqsLFc2f/TN3AEYMsRi256vYEuQhY7WRkxQw6afmYsLc8J
 sjj4mR15SwZewLtIlNbX3phvi1OBWA==
 =Q4qw
 -----END PGP SIGNATURE-----

Merge tag 'fixes_for_v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull udf and ext2 fixes from Jan Kara:

 - a couple of smaller cleanups and fixes for ext2

 - fixes of a data corruption issues in udf when handling holes and
   preallocation extents

 - fixes and cleanups of several smaller issues in udf

 - add maintainer entry for isofs

* tag 'fixes_for_v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  udf: Fix extending file within last block
  udf: Discard preallocation before extending file with a hole
  udf: Do not bother looking for prealloc extents if i_lenExtents matches i_size
  udf: Fix preallocation discarding at indirect extent boundary
  udf: Increase UDF_MAX_READ_VERSION to 0x0260
  fs/ext2: Fix code indentation
  ext2: unbugger ext2_empty_dir()
  udf: remove ->writepage
  ext2: remove ->writepage
  ext2: Don't flush page immediately for DIRSYNC directories
  ext2: Fix some kernel-doc warnings
  maintainers: Add ISOFS entry
  udf: Avoid double brelse() in udf_rename()
  fs: udf: Optimize udf_free_in_core_inode and udf_find_fileset function
2022-12-12 20:32:50 -08:00
Linus Torvalds
07d7a4d696 fs.xattr.simple.noaudit.v6.2
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEkJ7fqygL5xE8uB3PUwOZruX0MckFAmOTX7AACgkQUwOZruX0
 MclZNgf/XAoQbNRkB0H8oi3CS7RWl0iBlmPLKtNo39NuNgW6RCDY4F9rYt6F+dfq
 P4EuYayZwYw9g/JotEVXtg6Sp7FINKHkMMvlvwa+VsN1fPVM1AbD4mos5imDsj0f
 tkDVNXHsEOIc2tq0Ov9KUARCN59WpjV5dS0YQmCEsABIJQ1yu9FbCfm6WX3tH4Wd
 l+z6SSU38ZjQw4fSsc4rEqxf3imynCz9fVJYdfUKsg5WH7TswBQbYEAExMPhd6df
 qioM1t9u/nBFhOjw1VVgElx7PUijANczyiatU00jNRbMjCLlZcs5yjvv5XTgfRIe
 u2UJZJB0+6eTqAjA8neC4Z6eKnezwQ==
 =Cizh
 -----END PGP SIGNATURE-----

Merge tag 'fs.xattr.simple.noaudit.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping

Pull xattr audit fix from Seth Forshee:
 "This is a single patch to remove auditing of the capability check in
  simple_xattr_list().

  This check is done to check whether trusted xattrs should be included
  by listxattr(2). SELinux will normally log a denial when capable() is
  called and the task's SELinux context doesn't have the corresponding
  capability permission allowed, which can end up spamming the log.

  Since a failed check here cannot be used to infer malicious intent,
  auditing is of no real value, and it makes sense to stop auditing the
  capability check"

* tag 'fs.xattr.simple.noaudit.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
  fs: don't audit the capability check in simple_xattr_list()
2022-12-12 20:29:45 -08:00
Linus Torvalds
6e8948a063 fs.idmapped.squashfs.v6.2
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCgAdFiEEkJ7fqygL5xE8uB3PUwOZruX0MckFAmOTX4IACgkQUwOZruX0
 McmDJgf8DlnoL7R88j0OZCLEpuxCOw+j1ZmQhUYz7XmDYPi8zDQCZtcx7scqQIlR
 oYb6fxP3NOLdsPzt/K2QNuzM8Yv2kRGskZPzojcBzkMwGBlFgyjLjGCOG93hmTcD
 +nXtod7QzgUvpv7w9bnLXZMw8WvU78UUsUzwhN5jgSmnpmhozZKV8xfIzJYlRUJ+
 dNYRZ9O374W+NZOX97nki7oR19cdL5uzpU53ZXMm9DMlLVgcB7yqqS37ABmqS37K
 I9wJsfcPOsd60f5FUWIm0uxGvAC+p5Z+rG4MJytbDFBfvLlhnJOHRwE+sn5b53wI
 LuAr9AHF1AHuwuNswatjumJdW9qr4Q==
 =/IDh
 -----END PGP SIGNATURE-----

Merge tag 'fs.idmapped.squashfs.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping

Pull squashfs update from Seth Forshee:
 "This is a simple patch to enable idmapped mounts for squashfs.

  All functionality squashfs needs to support idmapped mounts is already
  implemented in generic VFS code, so all that is needed is to set
  FS_ALLOW_IDMAP in fs_flags"

* tag 'fs.idmapped.squashfs.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
  squashfs: enable idmapped mounts
2022-12-12 20:24:51 -08:00
Linus Torvalds
043930b1c8 fuse update for 6.2
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCY5cH0wAKCRDh3BK/laaZ
 PAIJAQC658ZaXRgWBZ/XmCqbb+c8g/InrccE+PXhtVGYTiWTiwD/ZEy7r/X/uJaO
 gb5anxJT5jVJ9Qfk1VbyZqdwqUqydAo=
 =If6t
 -----END PGP SIGNATURE-----

Merge tag 'fuse-update-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse

Pull fuse update from Miklos Szeredi:

 - Allow some write requests to proceed in parallel

 - Fix a performance problem with allow_sys_admin_access

 - Add a special kind of invalidation that doesn't immediately purge
   submounts

 - On revalidation treat the target of rename(RENAME_NOREPLACE) the same
   as open(O_EXCL)

 - Use type safe helpers for some mnt_userns transformations

* tag 'fuse-update-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: Rearrange fuse_allow_current_process checks
  fuse: allow non-extending parallel direct writes on the same file
  fuse: remove the unneeded result variable
  fuse: port to vfs{g,u}id_t and associated helpers
  fuse: Remove user_ns check for FUSE_DEV_IOC_CLONE
  fuse: always revalidate rename target dentry
  fuse: add "expire only" mode to FUSE_NOTIFY_INVAL_ENTRY
  fs/fuse: Replace kmap() with kmap_local_page()
2022-12-12 20:22:09 -08:00
Linus Torvalds
6df7cc2268 overlayfs update for 6.2
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSQHSd0lITzzeNWNm3h3BK/laaZPAUCY5b3+gAKCRDh3BK/laaZ
 PIPxAQCPgyV/X/yJFd3wVgKa3/JxcHl5qdPbwHXFuYiJCBd69QEA9LYQEeEoTLCY
 veGiQPkl6Sp8ZqmTbDBxqw5OaBTSMwM=
 =7TiE
 -----END PGP SIGNATURE-----

Merge tag 'ovl-update-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs

Pull overlayfs update from Miklos Szeredi:

 - Fix a couple of bugs found by syzbot

 - Don't ingore some open flags set by fcntl(F_SETFL)

 - Fix failure to create a hard link in certain cases

 - Use type safe helpers for some mnt_userns transformations

 - Improve performance of mount

 - Misc cleanups

* tag 'ovl-update-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
  ovl: Kconfig: Fix spelling mistake "undelying" -> "underlying"
  ovl: use inode instead of dentry where possible
  ovl: Add comment on upperredirect reassignment
  ovl: use plain list filler in indexdir and workdir cleanup
  ovl: do not reconnect upper index records in ovl_indexdir_cleanup()
  ovl: fix comment typos
  ovl: port to vfs{g,u}id_t and associated helpers
  ovl: Use ovl mounter's fsuid and fsgid in ovl_link()
  ovl: Use "buf" flexible array for memcpy() destination
  ovl: update ->f_iocb_flags when ovl_change_flags() modifies ->f_flags
  ovl: fix use inode directly in rcu-walk mode
2022-12-12 20:18:26 -08:00
Linus Torvalds
4a6bff1187 Changes since the last update:
- Enable large folios for iomap/fscache mode;
 
  - Avoid sysfs warning due to mounting twice with the same fsid and
    domain_id in fscache mode;
 
  - Refine fscache interface among erofs, fscache, and cachefiles;
 
  - Use kmap_local_page() only for metabuf;
 
  - Fixes around crafted images found by syzbot;
 
  - Minor cleanups and documentation updates.
 -----BEGIN PGP SIGNATURE-----
 
 iIcEABYIAC8WIQThPAmQN9sSA0DVxtI5NzHcH7XmBAUCY5S3khEceGlhbmdAa2Vy
 bmVsLm9yZwAKCRA5NzHcH7XmBLr3AQDA5xpztSsxfe0Gp+bwf12ySuntimJxXmAj
 83EHCfSC+AEAu4fcWkIF38MBBVJvFVjFaXCZKmFossbI5Rp8TuqPpgk=
 =HDsJ
 -----END PGP SIGNATURE-----

Merge tag 'erofs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs updates from Gao Xiang:
 "In this cycle, large folios are now enabled in the iomap/fscache mode
  for uncompressed files first. In order to do that, we've also cleaned
  up better interfaces between erofs and fscache, which are acked by
  fscache/netfs folks and included in this pull request.

  Other than that, there are random fixes around erofs over fscache and
  crafted images by syzbot, minor cleanups and documentation updates.

  Summary:

   - Enable large folios for iomap/fscache mode

   - Avoid sysfs warning due to mounting twice with the same fsid and
     domain_id in fscache mode

   - Refine fscache interface among erofs, fscache, and cachefiles

   - Use kmap_local_page() only for metabuf

   - Fixes around crafted images found by syzbot

   - Minor cleanups and documentation updates"

* tag 'erofs-for-6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: validate the extent length for uncompressed pclusters
  erofs: fix missing unmap if z_erofs_get_extent_compressedlen() fails
  erofs: Fix pcluster memleak when its block address is zero
  erofs: use kmap_local_page() only for erofs_bread()
  erofs: enable large folios for fscache mode
  erofs: support large folios for fscache mode
  erofs: switch to prepare_ondemand_read() in fscache mode
  fscache,cachefiles: add prepare_ondemand_read() callback
  erofs: clean up cached I/O strategies
  erofs: update documentation
  erofs: check the uniqueness of fsid in shared domain in advance
  erofs: enable large folios for iomap mode
2022-12-12 20:14:04 -08:00
Linus Torvalds
ad0d9da164 fsverity updates for 6.2
The main change this cycle is to stop using the PG_error flag to track
 verity failures, and instead just track failures at the bio level.  This
 follows a similar fscrypt change that went into 6.1, and it is a step
 towards freeing up PG_error for other uses.
 
 There's also one other small cleanup.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCY5anyRQcZWJpZ2dlcnNA
 Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK1IPAP0SMSKJRgehpXHKp5QZxHSpAjkFlcGa
 2y8Lc+DlHOrfLQEAmpGAxewowkMzpYVXmlAVVHRgUPWLjoMQQELEUQ8mWgU=
 =M+pB
 -----END PGP SIGNATURE-----

Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt

Pull fsverity updates from Eric Biggers:
 "The main change this cycle is to stop using the PG_error flag to track
  verity failures, and instead just track failures at the bio level.
  This follows a similar fscrypt change that went into 6.1, and it is a
  step towards freeing up PG_error for other uses.

  There's also one other small cleanup"

* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
  fsverity: simplify fsverity_get_digest()
  fsverity: stop using PG_error to track error status
2022-12-12 20:06:35 -08:00
Linus Torvalds
8129bac60f fscrypt updates for 6.2
This release adds SM4 encryption support, contributed by Tianjia Zhang.
 SM4 is a Chinese block cipher that is an alternative to AES.
 
 I recommend against using SM4, but (according to Tianjia) some people
 are being required to use it.  Since SM4 has been turning up in many
 other places (crypto API, wireless, TLS, OpenSSL, ARMv8 CPUs, etc.), it
 hasn't been very controversial, and some people have to use it, I don't
 think it would be fair for me to reject this optional feature.
 
 Besides the above, there are a couple cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCY5auyBQcZWJpZ2dlcnNA
 Z29vZ2xlLmNvbQAKCRDzXCl4vpKOK1u4AP4lhLxaEJ9upkHZrPAvEdF7QjLhO/ju
 h1LrvWHcEbvr6AEA/8ptc5RA1BAoSTDcqIWxIAWRztvptP4gUETb1b9C/ws=
 =An5w
 -----END PGP SIGNATURE-----

Merge tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt

Pull fscrypt updates from Eric Biggers:
 "This release adds SM4 encryption support, contributed by Tianjia
  Zhang. SM4 is a Chinese block cipher that is an alternative to AES.

  I recommend against using SM4, but (according to Tianjia) some people
  are being required to use it. Since SM4 has been turning up in many
  other places (crypto API, wireless, TLS, OpenSSL, ARMv8 CPUs, etc.),
  it hasn't been very controversial, and some people have to use it, I
  don't think it would be fair for me to reject this optional feature.

  Besides the above, there are a couple cleanups"

* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/fscrypt:
  fscrypt: add additional documentation for SM4 support
  fscrypt: remove unused Speck definitions
  fscrypt: Add SM4 XTS/CTS symmetric algorithm support
  blk-crypto: Add support for SM4-XTS blk crypto mode
  fscrypt: add comment for fscrypt_valid_enc_modes_v1()
  fscrypt: pass super_block to fscrypt_put_master_key_activeref()
2022-12-12 20:03:50 -08:00
Linus Torvalds
deb9acc122 A large number of cleanups and bug fixes, with many of the bug fixes
found by Syzbot and fuzzing.  (Many of the bug fixes involve less-used
 ext4 features such as fast_commit, inline_data and bigalloc.)
 
 In addition, remove the writepage function for ext4, since the
 medium-term plan is to remove ->writepage() entirely.  (The VM doesn't
 need or want writepage() for writeback, since it is fine with
 ->writepages() so long as ->migrate_folio() is implemented.)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAmOWqrMACgkQ8vlZVpUN
 gaMvmgf+P2C6vzjn13ZdF+GwFTi4fx4TJ5BZT78LQqvTZqhkfk4k1q2SFfHI7nXT
 ZWdu1KUQ0SYLo64oaSU9W+2B2pmGi/KgUlrwNhy8DFeGStogPuDVfmGWB63p1UQL
 ld42mE9q7bjY6nCZSKYXPp2jfSwsHuliHBJ4UfzVNAIwjiUEJ7pGeIrMFdLAEkVm
 TVNzvlUZaHUnVxhpsP6hs+5WNhHQ2IhWz4rwX01ussNgHTijYac4iaL05wpTvF5e
 6NtvfmpOEMAbYrmIkJX4RVss4JNsHNOC0E8fjEHlgXJxBiAI6w8GxTxrS52Y4ELH
 nHXl/pc0L+I8+yh9B9+s0LBaSuPuTg==
 =lezv
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:
 "A large number of cleanups and bug fixes, with many of the bug fixes
  found by Syzbot and fuzzing. (Many of the bug fixes involve less-used
  ext4 features such as fast_commit, inline_data and bigalloc)

  In addition, remove the writepage function for ext4, since the
  medium-term plan is to remove ->writepage() entirely. (The VM doesn't
  need or want writepage() for writeback, since it is fine with
  ->writepages() so long as ->migrate_folio() is implemented)"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (58 commits)
  ext4: fix reserved cluster accounting in __es_remove_extent()
  ext4: fix inode leak in ext4_xattr_inode_create() on an error path
  ext4: allocate extended attribute value in vmalloc area
  ext4: avoid unaccounted block allocation when expanding inode
  ext4: initialize quota before expanding inode in setproject ioctl
  ext4: stop providing .writepage hook
  mm: export buffer_migrate_folio_norefs()
  ext4: switch to using write_cache_pages() for data=journal writeout
  jbd2: switch jbd2_submit_inode_data() to use fs-provided hook for data writeout
  ext4: switch to using ext4_do_writepages() for ordered data writeout
  ext4: move percpu_rwsem protection into ext4_writepages()
  ext4: provide ext4_do_writepages()
  ext4: add support for writepages calls that cannot map blocks
  ext4: drop pointless IO submission from ext4_bio_write_page()
  ext4: remove nr_submitted from ext4_bio_write_page()
  ext4: move keep_towrite handling to ext4_bio_write_page()
  ext4: handle redirtying in ext4_bio_write_page()
  ext4: fix kernel BUG in 'ext4_write_inline_data_end()'
  ext4: make ext4_mb_initialize_context return void
  ext4: fix deadlock due to mbcache entry corruption
  ...
2022-12-12 19:56:37 -08:00
Linus Torvalds
9b93f5069f fs.idmapped.mnt_idmap.v6.2
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCY5by6AAKCRCRxhvAZXjc
 onblAPsFzodV8/9UoCIkKxwn0aiclbiAITTWI9ZLulmKhm0I6wD/RUOLKjt12uZJ
 m81UTfkWHopWKtQ+X3saZEcyYTNLugE=
 =AtGb
 -----END PGP SIGNATURE-----

Merge tag 'fs.idmapped.mnt_idmap.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping

Pull idmapping updates from Christian Brauner:
 "Last cycle we've already made the interaction with idmapped mounts
  more robust and type safe by introducing the vfs{g,u}id_t type. This
  cycle we concluded the conversion and removed the legacy helpers.

  Currently we still pass around the plain namespace that was attached
  to a mount. This is in general pretty convenient but it makes it easy
  to conflate namespaces that are relevant on the filesystem - with
  namespaces that are relevent on the mount level. Especially for
  filesystem developers without detailed knowledge in this area this can
  be a potential source for bugs.

  Instead of passing the plain namespace we introduce a dedicated type
  struct mnt_idmap and replace the pointer with a pointer to a struct
  mnt_idmap. There are no semantic or size changes for the mount struct
  caused by this.

  We then start converting all places aware of idmapped mounts to rely
  on struct mnt_idmap. Once the conversion is done all helpers down to
  the really low-level make_vfs{g,u}id() and from_vfs{g,u}id() will take
  a struct mnt_idmap argument instead of two namespace arguments. This
  way it becomes impossible to conflate the two removing and thus
  eliminating the possibility of any bugs. Fwiw, I fixed some issues in
  that area a while ago in ntfs3 and ksmbd in the past. Afterwards only
  low-level code can ultimately use the associated namespace for any
  permission checks. Even most of the vfs can be completely obivious
  about this ultimately and filesystems will never interact with it in
  any form in the future.

  A struct mnt_idmap currently encompasses a simple refcount and pointer
  to the relevant namespace the mount is idmapped to. If a mount isn't
  idmapped then it will point to a static nop_mnt_idmap and if it
  doesn't that it is idmapped. As usual there are no allocations or
  anything happening for non-idmapped mounts. Everthing is carefully
  written to be a nop for non-idmapped mounts as has always been the
  case.

  If an idmapped mount is created a struct mnt_idmap is allocated and a
  reference taken on the relevant namespace. Each mount that gets
  idmapped or inherits the idmap simply bumps the reference count on
  struct mnt_idmap. Just a reminder that we only allow a mount to change
  it's idmapping a single time and only if it hasn't already been
  attached to the filesystems and has no active writers.

  The actual changes are fairly straightforward but this will have huge
  benefits for maintenance and security in the long run even if it
  causes some churn.

  Note that this also makes it possible to extend struct mount_idmap in
  the future. For example, it would be possible to place the namespace
  pointer in an anonymous union together with an idmapping struct. This
  would allow us to expose an api to userspace that would let it specify
  idmappings directly instead of having to go through the detour of
  setting up namespaces at all"

* tag 'fs.idmapped.mnt_idmap.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
  acl: conver higher-level helpers to rely on mnt_idmap
  fs: introduce dedicated idmap type for mounts
2022-12-12 19:30:18 -08:00
Linus Torvalds
e1212e9b6f fs.vfsuid.conversion.v6.2
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCY5bspgAKCRCRxhvAZXjc
 opEWAQDpF5rnZn1vv4/uOTij9ztcA4yLxu/Q19CdqBaoHlWZ9AD/d3eecee3bh5h
 iPHtlUK5/VspfD9LPpdc5ZbPCdZ2pA4=
 =t6NN
 -----END PGP SIGNATURE-----

Merge tag 'fs.vfsuid.conversion.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping

Pull vfsuid updates from Christian Brauner:
 "Last cycle we introduced the vfs{g,u}id_t types and associated helpers
  to gain type safety when dealing with idmapped mounts. That initial
  work already converted a lot of places over but there were still some
  left,

  This converts all remaining places that still make use of non-type
  safe idmapping helpers to rely on the new type safe vfs{g,u}id based
  helpers.

  Afterwards it removes all the old non-type safe helpers"

* tag 'fs.vfsuid.conversion.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
  fs: remove unused idmapping helpers
  ovl: port to vfs{g,u}id_t and associated helpers
  fuse: port to vfs{g,u}id_t and associated helpers
  ima: use type safe idmapping helpers
  apparmor: use type safe idmapping helpers
  caps: use type safe idmapping helpers
  fs: use type safe idmapping helpers
  mnt_idmapping: add missing helpers
2022-12-12 19:20:05 -08:00
Linus Torvalds
cf619f8919 fs.ovl.setgid.v6.2
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCY5bt7AAKCRCRxhvAZXjc
 ovAOAP9qcrUqs2MoyBDe6qUXThYY9w2rgX/ZI4ZZmbtsXEDGtQEA/LPddq8lD8o9
 m17zpvMGbXXRwz4/zVGuyWsHgg0HsQ0=
 =ioRq
 -----END PGP SIGNATURE-----

Merge tag 'fs.ovl.setgid.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping

Pull setgid inheritance updates from Christian Brauner:
 "This contains the work to make setgid inheritance consistent between
  modifying a file and when changing ownership or mode as this has been
  a repeated source of very subtle bugs. The gist is that we perform the
  same permission checks in the write path as we do in the ownership and
  mode changing paths after this series where we're currently doing
  different things.

  We've already made setgid inheritance a lot more consistent and
  reliable in the last releases by moving setgid stripping from the
  individual filesystems up into the vfs. This aims to make the logic
  even more consistent and easier to understand and also to fix
  long-standing overlayfs setgid inheritance bugs. Miklos was nice
  enough to just let me carry the trivial overlayfs patches from Amir
  too.

  Below is a more detailed explanation how the current difference in
  setgid handling lead to very subtle bugs exemplified via overlayfs
  which is a victim of the current rules. I hope this explains why I
  think taking the regression risk here is worth it.

  A long while ago I found a few setgid inheritance bugs in overlayfs in
  the write path in certain conditions. Amir recently picked this back
  up in [1] and I jumped on board to fix this more generally.

  On the surface all that overlayfs would need to fix setgid inheritance
  would be to call file_remove_privs() or file_modified() but actually
  that isn't enough because the setgid inheritance api is wildly
  inconsistent in that area.

  Before this pr setgid stripping in file_remove_privs()'s old
  should_remove_suid() helper was inconsistent with other parts of the
  vfs. Specifically, it only raises ATTR_KILL_SGID if the inode is
  S_ISGID and S_IXGRP but not if the inode isn't in the caller's groups
  and the caller isn't privileged over the inode although we require
  this already in setattr_prepare() and setattr_copy() and so all
  filesystem implement this requirement implicitly because they have to
  use setattr_{prepare,copy}() anyway.

  But the inconsistency shows up in setgid stripping bugs for overlayfs
  in xfstests (e.g., generic/673, generic/683, generic/685, generic/686,
  generic/687). For example, we test whether suid and setgid stripping
  works correctly when performing various write-like operations as an
  unprivileged user (fallocate, reflink, write, etc.):

      echo "Test 1 - qa_user, non-exec file $verb"
      setup_testfile
      chmod a+rws $junk_file
      commit_and_check "$qa_user" "$verb" 64k 64k

  The test basically creates a file with 6666 permissions. While the
  file has the S_ISUID and S_ISGID bits set it does not have the S_IXGRP
  set.

  On a regular filesystem like xfs what will happen is:

      sys_fallocate()
      -> vfs_fallocate()
         -> xfs_file_fallocate()
            -> file_modified()
               -> __file_remove_privs()
                  -> dentry_needs_remove_privs()
                     -> should_remove_suid()
                  -> __remove_privs()
                     newattrs.ia_valid = ATTR_FORCE | kill;
                     -> notify_change()
                        -> setattr_copy()

  In should_remove_suid() we can see that ATTR_KILL_SUID is raised
  unconditionally because the file in the test has S_ISUID set.

  But we also see that ATTR_KILL_SGID won't be set because while the
  file is S_ISGID it is not S_IXGRP (see above) which is a condition for
  ATTR_KILL_SGID being raised.

  So by the time we call notify_change() we have attr->ia_valid set to
  ATTR_KILL_SUID | ATTR_FORCE.

  Now notify_change() sees that ATTR_KILL_SUID is set and does:

      ia_valid      = attr->ia_valid |= ATTR_MODE
      attr->ia_mode = (inode->i_mode & ~S_ISUID);

  which means that when we call setattr_copy() later we will definitely
  update inode->i_mode. Note that attr->ia_mode still contains S_ISGID.

  Now we call into the filesystem's ->setattr() inode operation which
  will end up calling setattr_copy(). Since ATTR_MODE is set we will
  hit:

      if (ia_valid & ATTR_MODE) {
              umode_t mode = attr->ia_mode;
              vfsgid_t vfsgid = i_gid_into_vfsgid(mnt_userns, inode);
              if (!vfsgid_in_group_p(vfsgid) &&
                  !capable_wrt_inode_uidgid(mnt_userns, inode, CAP_FSETID))
                      mode &= ~S_ISGID;
              inode->i_mode = mode;
      }

  and since the caller in the test is neither capable nor in the group
  of the inode the S_ISGID bit is stripped.

  But assume the file isn't suid then ATTR_KILL_SUID won't be raised
  which has the consequence that neither the setgid nor the suid bits
  are stripped even though it should be stripped because the inode isn't
  in the caller's groups and the caller isn't privileged over the inode.

  If overlayfs is in the mix things become a bit more complicated and
  the bug shows up more clearly.

  When e.g., ovl_setattr() is hit from ovl_fallocate()'s call to
  file_remove_privs() then ATTR_KILL_SUID and ATTR_KILL_SGID might be
  raised but because the check in notify_change() is questioning the
  ATTR_KILL_SGID flag again by requiring S_IXGRP for it to be stripped
  the S_ISGID bit isn't removed even though it should be stripped:

      sys_fallocate()
      -> vfs_fallocate()
         -> ovl_fallocate()
            -> file_remove_privs()
               -> dentry_needs_remove_privs()
                  -> should_remove_suid()
               -> __remove_privs()
                  newattrs.ia_valid = ATTR_FORCE | kill;
                  -> notify_change()
                     -> ovl_setattr()
                        /* TAKE ON MOUNTER'S CREDS */
                        -> ovl_do_notify_change()
                           -> notify_change()
                        /* GIVE UP MOUNTER'S CREDS */
           /* TAKE ON MOUNTER'S CREDS */
           -> vfs_fallocate()
              -> xfs_file_fallocate()
                 -> file_modified()
                    -> __file_remove_privs()
                       -> dentry_needs_remove_privs()
                          -> should_remove_suid()
                       -> __remove_privs()
                          newattrs.ia_valid = attr_force | kill;
                          -> notify_change()

  The fix for all of this is to make file_remove_privs()'s
  should_remove_suid() helper perform the same checks as we already
  require in setattr_prepare() and setattr_copy() and have
  notify_change() not pointlessly requiring S_IXGRP again. It doesn't
  make any sense in the first place because the caller must calculate
  the flags via should_remove_suid() anyway which would raise
  ATTR_KILL_SGID

  Note that some xfstests will now fail as these patches will cause the
  setgid bit to be lost in certain conditions for unprivileged users
  modifying a setgid file when they would've been kept otherwise. I
  think this risk is worth taking and I explained and mentioned this
  multiple times on the list [2].

  Enforcing the rules consistently across write operations and
  chmod/chown will lead to losing the setgid bit in cases were it
  might've been retained before.

  While I've mentioned this a few times but it's worth repeating just to
  make sure that this is understood. For the sake of maintainability,
  consistency, and security this is a risk worth taking.

  If we really see regressions for workloads the fix is to have special
  setgid handling in the write path again with different semantics from
  chmod/chown and possibly additional duct tape for overlayfs. I'll
  update the relevant xfstests with if you should decide to merge this
  second setgid cleanup.

  Before that people should be aware that there might be failures for
  fstests where unprivileged users modify a setgid file"

Link: https://lore.kernel.org/linux-fsdevel/20221003123040.900827-1-amir73il@gmail.com [1]
Link: https://lore.kernel.org/linux-fsdevel/20221122142010.zchf2jz2oymx55qi@wittgenstein [2]

* tag 'fs.ovl.setgid.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping:
  fs: use consistent setgid checks in is_sxid()
  ovl: remove privs in ovl_fallocate()
  ovl: remove privs in ovl_copyfile()
  attr: use consistent sgid stripping checks
  attr: add setattr_should_drop_sgid()
  fs: move should_remove_suid()
  attr: add in_group_or_capable()
2022-12-12 19:03:10 -08:00
Linus Torvalds
6a518afcc2 fs.acl.rework.v6.2
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCY5bwTgAKCRCRxhvAZXjc
 ovd2AQCK00NAtGjQCjQPQGyTa4GAPqvWgq1ef0lnhv+TL5US5gD9FncQ8UofeMXt
 pBfjtAD6ettTPCTxUQfnTwWEU4rc7Qg=
 =27Wm
 -----END PGP SIGNATURE-----

Merge tag 'fs.acl.rework.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping

Pull VFS acl updates from Christian Brauner:
 "This contains the work that builds a dedicated vfs posix acl api.

  The origins of this work trace back to v5.19 but it took quite a while
  to understand the various filesystem specific implementations in
  sufficient detail and also come up with an acceptable solution.

  As we discussed and seen multiple times the current state of how posix
  acls are handled isn't nice and comes with a lot of problems: The
  current way of handling posix acls via the generic xattr api is error
  prone, hard to maintain, and type unsafe for the vfs until we call
  into the filesystem's dedicated get and set inode operations.

  It is already the case that posix acls are special-cased to death all
  the way through the vfs. There are an uncounted number of hacks that
  operate on the uapi posix acl struct instead of the dedicated vfs
  struct posix_acl. And the vfs must be involved in order to interpret
  and fixup posix acls before storing them to the backing store, caching
  them, reporting them to userspace, or for permission checking.

  Currently a range of hacks and duct tape exist to make this work. As
  with most things this is really no ones fault it's just something that
  happened over time. But the code is hard to understand and difficult
  to maintain and one is constantly at risk of introducing bugs and
  regressions when having to touch it.

  Instead of continuing to hack posix acls through the xattr handlers
  this series builds a dedicated posix acl api solely around the get and
  set inode operations.

  Going forward, the vfs_get_acl(), vfs_remove_acl(), and vfs_set_acl()
  helpers must be used in order to interact with posix acls. They
  operate directly on the vfs internal struct posix_acl instead of
  abusing the uapi posix acl struct as we currently do. In the end this
  removes all of the hackiness, makes the codepaths easier to maintain,
  and gets us type safety.

  This series passes the LTP and xfstests suites without any
  regressions. For xfstests the following combinations were tested:
   - xfs
   - ext4
   - btrfs
   - overlayfs
   - overlayfs on top of idmapped mounts
   - orangefs
   - (limited) cifs

  There's more simplifications for posix acls that we can make in the
  future if the basic api has made it.

  A few implementation details:

   - The series makes sure to retain exactly the same security and
     integrity module permission checks. Especially for the integrity
     modules this api is a win because right now they convert the uapi
     posix acl struct passed to them via a void pointer into the vfs
     struct posix_acl format to perform permission checking on the mode.

     There's a new dedicated security hook for setting posix acls which
     passes the vfs struct posix_acl not a void pointer. Basing checking
     on the posix acl stored in the uapi format is really unreliable.
     The vfs currently hacks around directly in the uapi struct storing
     values that frankly the security and integrity modules can't
     correctly interpret as evidenced by bugs we reported and fixed in
     this area. It's not necessarily even their fault it's just that the
     format we provide to them is sub optimal.

   - Some filesystems like 9p and cifs need access to the dentry in
     order to get and set posix acls which is why they either only
     partially or not even at all implement get and set inode
     operations. For example, cifs allows setxattr() and getxattr()
     operations but doesn't allow permission checking based on posix
     acls because it can't implement a get acl inode operation.

     Thus, this patch series updates the set acl inode operation to take
     a dentry instead of an inode argument. However, for the get acl
     inode operation we can't do this as the old get acl method is
     called in e.g., generic_permission() and inode_permission(). These
     helpers in turn are called in various filesystem's permission inode
     operation. So passing a dentry argument to the old get acl inode
     operation would amount to passing a dentry to the permission inode
     operation which we shouldn't and probably can't do.

     So instead of extending the existing inode operation Christoph
     suggested to add a new one. He also requested to ensure that the
     get and set acl inode operation taking a dentry are consistently
     named. So for this version the old get acl operation is renamed to
     ->get_inode_acl() and a new ->get_acl() inode operation taking a
     dentry is added. With this we can give both 9p and cifs get and set
     acl inode operations and in turn remove their complex custom posix
     xattr handlers.

     In the future I hope to get rid of the inode method duplication but
     it isn't like we have never had this situation. Readdir is just one
     example. And frankly, the overall gain in type safety and the more
     pleasant api wise are simply too big of a benefit to not accept
     this duplication for a while.

   - We've done a full audit of every codepaths using variant of the
     current generic xattr api to get and set posix acls and
     surprisingly it isn't that many places. There's of course always a
     chance that we might have missed some and if so I'm sure we'll find
     them soon enough.

     The crucial codepaths to be converted are obviously stacking
     filesystems such as ecryptfs and overlayfs.

     For a list of all callers currently using generic xattr api helpers
     see [2] including comments whether they support posix acls or not.

   - The old vfs generic posix acl infrastructure doesn't obey the
     create and replace semantics promised on the setxattr(2) manpage.
     This patch series doesn't address this. It really is something we
     should revisit later though.

  The patches are roughly organized as follows:

   (1) Change existing set acl inode operation to take a dentry
       argument (Intended to be a non-functional change)

   (2) Rename existing get acl method (Intended to be a non-functional
       change)

   (3) Implement get and set acl inode operations for filesystems that
       couldn't implement one before because of the missing dentry.
       That's mostly 9p and cifs (Intended to be a non-functional
       change)

   (4) Build posix acl api, i.e., add vfs_get_acl(), vfs_remove_acl(),
       and vfs_set_acl() including security and integrity hooks
       (Intended to be a non-functional change)

   (5) Implement get and set acl inode operations for stacking
       filesystems (Intended to be a non-functional change)

   (6) Switch posix acl handling in stacking filesystems to new posix
       acl api now that all filesystems it can stack upon support it.

   (7) Switch vfs to new posix acl api (semantical change)

   (8) Remove all now unused helpers

   (9) Additional regression fixes reported after we merged this into
       linux-next

  Thanks to Seth for a lot of good discussion around this and
  encouragement and input from Christoph"

* tag 'fs.acl.rework.v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/idmapping: (36 commits)
  posix_acl: Fix the type of sentinel in get_acl
  orangefs: fix mode handling
  ovl: call posix_acl_release() after error checking
  evm: remove dead code in evm_inode_set_acl()
  cifs: check whether acl is valid early
  acl: make vfs_posix_acl_to_xattr() static
  acl: remove a slew of now unused helpers
  9p: use stub posix acl handlers
  cifs: use stub posix acl handlers
  ovl: use stub posix acl handlers
  ecryptfs: use stub posix acl handlers
  evm: remove evm_xattr_acl_change()
  xattr: use posix acl api
  ovl: use posix acl api
  ovl: implement set acl method
  ovl: implement get acl method
  ecryptfs: implement set acl method
  ecryptfs: implement get acl method
  ksmbd: use vfs_remove_acl()
  acl: add vfs_remove_acl()
  ...
2022-12-12 18:46:39 -08:00
Linus Torvalds
bd90741318 misc pile
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCY5ZzrwAKCRBZ7Krx/gZQ
 6+WrAP9QltAQopxexxpRxTdA3yq7Fy9ZakkS7b1udhRHgRA8GgEA7ZcrqX8IsyDW
 hLW4cQPVUkJD7MCR8P7lw5sLaararAg=
 =TchO
 -----END PGP SIGNATURE-----

Merge tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull misc vfs updates from Al Viro:
 "misc pile"

* tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs: sysv: Fix sysv_nblocks() returns wrong value
  get rid of INT_LIMIT, use type_max() instead
  btrfs: replace INT_LIMIT(loff_t) with OFFSET_MAX
  fs: simplify vfs_get_super
  fs: drop useless condition from inode_needs_update_time
2022-12-12 18:38:47 -08:00
Linus Torvalds
13c574fec8 fix of weird corner case in copy_mnt_ns()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCY5ZzcgAKCRBZ7Krx/gZQ
 67oZAQCJ3ucfmif/P+GPhgNqUV0sb/zL036mAvBw9Cz3q36JcgD9E/NuS0DYWS6+
 fOsNMFFDbXUPAz7Ny3BFV8W3wFrClw4=
 =t2Lt
 -----END PGP SIGNATURE-----

Merge tag 'pull-namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull namespace fix from Al Viro:
 "Fix weird corner case in copy_mnt_ns()"

* tag 'pull-namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  copy_mnt_ns(): handle a corner case (overmounted mntns bindings) saner
2022-12-12 18:36:12 -08:00
Linus Torvalds
75f4d9af8b iov_iter work; most of that is about getting rid of
direction misannotations and (hopefully) preventing
 more of the same for the future.
 
 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHQEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCY5ZzQAAKCRBZ7Krx/gZQ
 65RZAP4nTkvOn0NZLVFkuGOx8pgJelXAvrteyAuecVL8V6CR4AD40qCVY51PJp8N
 MzwiRTeqnGDxTTF7mgd//IB6hoatAA==
 =bcvF
 -----END PGP SIGNATURE-----

Merge tag 'pull-iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull iov_iter updates from Al Viro:
 "iov_iter work; most of that is about getting rid of direction
  misannotations and (hopefully) preventing more of the same for the
  future"

* tag 'pull-iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  use less confusing names for iov_iter direction initializers
  iov_iter: saner checks for attempt to copy to/from iterator
  [xen] fix "direction" argument of iov_iter_kvec()
  [vhost] fix 'direction' argument of iov_iter_{init,bvec}()
  [target] fix iov_iter_bvec() "direction" argument
  [s390] memcpy_real(): WRITE is "data source", not destination...
  [s390] zcore: WRITE is "data source", not destination...
  [infiniband] READ is "data destination", not source...
  [fsi] WRITE is "data source", not destination...
  [s390] copy_oldmem_kernel() - WRITE is "data source", not destination
  csum_and_copy_to_iter(): handle ITER_DISCARD
  get rid of unlikely() on page_copy_sane() calls
2022-12-12 18:29:54 -08:00
Linus Torvalds
405b2fc663 Unification of regset and non-regset sides of ELF coredump
handling.  Collecting per-thread register values is the
 only thing that needs to be ifdefed there...
 
 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCY5ZyNgAKCRBZ7Krx/gZQ
 63MZAQDZDE9Pk9EQ/3qOPNb2cuz8KSB3THUyotvustUUGPTUVAD/Ut1xD03jpWCY
 oQ6tM8dNyh3+Vsx6/XKNd1+pj6IgNQE=
 =XYkZ
 -----END PGP SIGNATURE-----

Merge tag 'pull-elfcore' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull elf coredumping updates from Al Viro:
 "Unification of regset and non-regset sides of ELF coredump handling.

  Collecting per-thread register values is the only thing that needs to
  be ifdefed there..."

* tag 'pull-elfcore' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  [elf] get rid of get_note_info_size()
  [elf] unify regset and non-regset cases
  [elf][non-regset] use elf_core_copy_task_regs() for dumper as well
  [elf][non-regset] uninline elf_core_copy_task_fpregs() (and lose pt_regs argument)
  elf_core_copy_task_regs(): task_pt_regs is defined everywhere
  [elf][regset] simplify thread list handling in fill_note_info()
  [elf][regset] clean fill_note_info() a bit
  kill extern of vsyscall32_sysctl
  kill coredump_params->regs
  kill signal_pt_regs()
2022-12-12 18:18:34 -08:00
Linus Torvalds
8702f2c611 Non-MM patches for 6.2-rc1.
- A ptrace API cleanup series from Sergey Shtylyov
 
 - Fixes and cleanups for kexec from ye xingchen
 
 - nilfs2 updates from Ryusuke Konishi
 
 - squashfs feature work from Xiaoming Ni: permit configuration of the
   filesystem's compression concurrency from the mount command line.
 
 - A series from Akinobu Mita which addresses bound checking errors when
   writing to debugfs files.
 
 - A series from Yang Yingliang to address rapido memory leaks
 
 - A series from Zheng Yejian to address possible overflow errors in
   encode_comp_t().
 
 - And a whole shower of singleton patches all over the place.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY5efRgAKCRDdBJ7gKXxA
 jgvdAP0al6oFDtaSsshIdNhrzcMwfjt6PfVxxHdLmNhF1hX2dwD/SVluS1bPSP7y
 0sZp7Ustu3YTb8aFkMl96Y9m9mY1Nwg=
 =ga5B
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull non-MM updates from Andrew Morton:

 - A ptrace API cleanup series from Sergey Shtylyov

 - Fixes and cleanups for kexec from ye xingchen

 - nilfs2 updates from Ryusuke Konishi

 - squashfs feature work from Xiaoming Ni: permit configuration of the
   filesystem's compression concurrency from the mount command line

 - A series from Akinobu Mita which addresses bound checking errors when
   writing to debugfs files

 - A series from Yang Yingliang to address rapidio memory leaks

 - A series from Zheng Yejian to address possible overflow errors in
   encode_comp_t()

 - And a whole shower of singleton patches all over the place

* tag 'mm-nonmm-stable-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (79 commits)
  ipc: fix memory leak in init_mqueue_fs()
  hfsplus: fix bug causing custom uid and gid being unable to be assigned with mount
  rapidio: devices: fix missing put_device in mport_cdev_open
  kcov: fix spelling typos in comments
  hfs: Fix OOB Write in hfs_asc2mac
  hfs: fix OOB Read in __hfs_brec_find
  relay: fix type mismatch when allocating memory in relay_create_buf()
  ocfs2: always read both high and low parts of dinode link count
  io-mapping: move some code within the include guarded section
  kernel: kcsan: kcsan_test: build without structleak plugin
  mailmap: update email for Iskren Chernev
  eventfd: change int to __u64 in eventfd_signal() ifndef CONFIG_EVENTFD
  rapidio: fix possible UAF when kfifo_alloc() fails
  relay: use strscpy() is more robust and safer
  cpumask: limit visibility of FORCE_NR_CPUS
  acct: fix potential integer overflow in encode_comp_t()
  acct: fix accuracy loss for input value of encode_comp_t()
  linux/init.h: include <linux/build_bug.h> and <linux/stringify.h>
  rapidio: rio: fix possible name leak in rio_register_mport()
  rapidio: fix possible name leaks when rio_add_device() fails
  ...
2022-12-12 17:28:58 -08:00
Linus Torvalds
268325bda5 Random number generator updates for Linux 6.2-rc1.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmOU+U8ACgkQSfxwEqXe
 A67NnQ//Y5DltmvibyPd7r1TFT2gUYv+Rx3sUV9ZE1NYptd/SWhhcL8c5FZ70Fuw
 bSKCa1uiWjOxosjXT1kGrWq3de7q7oUpAPSOGxgxzoaNURIt58N/ajItCX/4Au8I
 RlGAScHy5e5t41/26a498kB6qJ441fBEqCYKQpPLINMBAhe8TQ+NVp0rlpUwNHFX
 WrUGg4oKWxdBIW3HkDirQjJWDkkAiklRTifQh/Al4b6QDbOnRUGGCeckNOhixsvS
 waHWTld+Td8jRrA4b82tUb2uVZ2/b8dEvj/A8CuTv4yC0lywoyMgBWmJAGOC+UmT
 ZVNdGW02Jc2T+Iap8ZdsEmeLHNqbli4+IcbY5xNlov+tHJ2oz41H9TZoYKbudlr6
 /ReAUPSn7i50PhbQlEruj3eg+M2gjOeh8OF8UKwwRK8PghvyWQ1ScW0l3kUhPIhI
 PdIG6j4+D2mJc1FIj2rTVB+Bg933x6S+qx4zDxGlNp62AARUFYf6EgyD6aXFQVuX
 RxcKb6cjRuFkzFiKc8zkqg5edZH+IJcPNuIBmABqTGBOxbZWURXzIQvK/iULqZa4
 CdGAFIs6FuOh8pFHLI3R4YoHBopbHup/xKDEeAO9KZGyeVIuOSERDxxo5f/ITzcq
 APvT77DFOEuyvanr8RMqqh0yUjzcddXqw9+ieufsAyDwjD9DTuE=
 =QRhK
 -----END PGP SIGNATURE-----

Merge tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull random number generator updates from Jason Donenfeld:

 - Replace prandom_u32_max() and various open-coded variants of it,
   there is now a new family of functions that uses fast rejection
   sampling to choose properly uniformly random numbers within an
   interval:

       get_random_u32_below(ceil) - [0, ceil)
       get_random_u32_above(floor) - (floor, U32_MAX]
       get_random_u32_inclusive(floor, ceil) - [floor, ceil]

   Coccinelle was used to convert all current users of
   prandom_u32_max(), as well as many open-coded patterns, resulting in
   improvements throughout the tree.

   I'll have a "late" 6.1-rc1 pull for you that removes the now unused
   prandom_u32_max() function, just in case any other trees add a new
   use case of it that needs to converted. According to linux-next,
   there may be two trivial cases of prandom_u32_max() reintroductions
   that are fixable with a 's/.../.../'. So I'll have for you a final
   conversion patch doing that alongside the removal patch during the
   second week.

   This is a treewide change that touches many files throughout.

 - More consistent use of get_random_canary().

 - Updates to comments, documentation, tests, headers, and
   simplification in configuration.

 - The arch_get_random*_early() abstraction was only used by arm64 and
   wasn't entirely useful, so this has been replaced by code that works
   in all relevant contexts.

 - The kernel will use and manage random seeds in non-volatile EFI
   variables, refreshing a variable with a fresh seed when the RNG is
   initialized. The RNG GUID namespace is then hidden from efivarfs to
   prevent accidental leakage.

   These changes are split into random.c infrastructure code used in the
   EFI subsystem, in this pull request, and related support inside of
   EFISTUB, in Ard's EFI tree. These are co-dependent for full
   functionality, but the order of merging doesn't matter.

 - Part of the infrastructure added for the EFI support is also used for
   an improvement to the way vsprintf initializes its siphash key,
   replacing an sleep loop wart.

 - The hardware RNG framework now always calls its correct random.c
   input function, add_hwgenerator_randomness(), rather than sometimes
   going through helpers better suited for other cases.

 - The add_latent_entropy() function has long been called from the fork
   handler, but is a no-op when the latent entropy gcc plugin isn't
   used, which is fine for the purposes of latent entropy.

   But it was missing out on the cycle counter that was also being mixed
   in beside the latent entropy variable. So now, if the latent entropy
   gcc plugin isn't enabled, add_latent_entropy() will expand to a call
   to add_device_randomness(NULL, 0), which adds a cycle counter,
   without the absent latent entropy variable.

 - The RNG is now reseeded from a delayed worker, rather than on demand
   when used. Always running from a worker allows it to make use of the
   CPU RNG on platforms like S390x, whose instructions are too slow to
   do so from interrupts. It also has the effect of adding in new inputs
   more frequently with more regularity, amounting to a long term
   transcript of random values. Plus, it helps a bit with the upcoming
   vDSO implementation (which isn't yet ready for 6.2).

 - The jitter entropy algorithm now tries to execute on many different
   CPUs, round-robining, in hopes of hitting even more memory latencies
   and other unpredictable effects. It also will mix in a cycle counter
   when the entropy timer fires, in addition to being mixed in from the
   main loop, to account more explicitly for fluctuations in that timer
   firing. And the state it touches is now kept within the same cache
   line, so that it's assured that the different execution contexts will
   cause latencies.

* tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (23 commits)
  random: include <linux/once.h> in the right header
  random: align entropy_timer_state to cache line
  random: mix in cycle counter when jitter timer fires
  random: spread out jitter callback to different CPUs
  random: remove extraneous period and add a missing one in comments
  efi: random: refresh non-volatile random seed when RNG is initialized
  vsprintf: initialize siphash key using notifier
  random: add back async readiness notifier
  random: reseed in delayed work rather than on-demand
  random: always mix cycle counter in add_latent_entropy()
  hw_random: use add_hwgenerator_randomness() for early entropy
  random: modernize documentation comment on get_random_bytes()
  random: adjust comment to account for removed function
  random: remove early archrandom abstraction
  random: use random.trust_{bootloader,cpu} command line option only
  stackprotector: actually use get_random_canary()
  stackprotector: move get_random_canary() into stackprotector.h
  treewide: use get_random_u32_inclusive() when possible
  treewide: use get_random_u32_{above,below}() instead of manual loop
  treewide: use get_random_u32_below() instead of deprecated function
  ...
2022-12-12 16:22:22 -08:00
Yuwei Guan
26a8057a1a f2fs: reset wait_ms to default if any of the victims have been selected
In non-foreground gc mode, if no victim is selected, the gc process
will wait for no_gc_sleep_time before waking up again. In this
subsequent time, even though a victim will be selected, the gc process
still waits for no_gc_sleep_time before waking up. The configuration
of wait_ms is not reasonable.

After any of the victims have been selected, we need to reset wait_ms to
default sleep time from no_gc_sleep_time.

Signed-off-by: Yuwei Guan <Yuwei.Guan@zeekrlife.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-12-12 15:18:36 -08:00
Yangtao Li
7411143f20 f2fs: fix some format WARNING in debug.c and sysfs.c
To fix:

WARNING: function definition argument 'struct f2fs_attr *' should also have an identifier name
+       ssize_t (*show)(struct f2fs_attr *, struct f2fs_sb_info *, char *);

WARNING: return sysfs_emit(...) formats should include a terminating newline
+       return sysfs_emit(buf, "(none)");

WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
+               unsigned npages = NODE_MAPPING(sbi)->nrpages;

WARNING: Missing a blank line after declarations
+               unsigned npages = COMPRESS_MAPPING(sbi)->nrpages;
+               si->page_mem += (unsigned long long)npages << PAGE_SHIFT;

WARNING: quoted string split across lines
+               seq_printf(s, "CP merge (Queued: %4d, Issued: %4d, Total: %4d, "
+                               "Cur time: %4d(ms), Peak time: %4d(ms))\n",

Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-12-12 14:59:39 -08:00
Yangtao Li
25547439f1 f2fs: don't call f2fs_issue_discard_timeout() when discard_cmd_cnt is 0 in f2fs_put_super()
No need to call f2fs_issue_discard_timeout() in f2fs_put_super,
when no discard command requires issue. Since the caller of
f2fs_issue_discard_timeout() usually judges the number of discard
commands before using it. Let's move this logic to
f2fs_issue_discard_timeout().

By the way, use f2fs_realtime_discard_enable to simplify the code.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-12-12 14:59:38 -08:00
Yangtao Li
15e38ee44d f2fs: fix iostat parameter for discard
Just like other data we count uses the number of bytes as the basic unit,
but discard uses the number of cmds as the statistical unit. In fact the
discard command contains the number of blocks, so let's change to the
number of bytes as the base unit.

Fixes: b0af6d491a6b ("f2fs: add app/fs io stat")
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-12-12 14:59:38 -08:00
Colin Ian King
db8dcd25ec f2fs: Fix spelling mistake in label: free_bio_enrty_cache -> free_bio_entry_cache
There is a spelling mistake in a label name. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-12-12 14:59:32 -08:00
Jaegeuk Kim
71644dff48 f2fs: add block_age-based extent cache
This patch introduces a runtime hot/cold data separation method
for f2fs, in order to improve the accuracy for data temperature
classification, reduce the garbage collection overhead after
long-term data updates.

Enhanced hot/cold data separation can record data block update
frequency as "age" of the extent per inode, and take use of the age
info to indicate better temperature type for data block allocation:
 - It records total data blocks allocated since mount;
 - When file extent has been updated, it calculate the count of data
blocks allocated since last update as the age of the extent;
 - Before the data block allocated, it searches for the age info and
chooses the suitable segment for allocation.

Test and result:
 - Prepare: create about 30000 files
  * 3% for cold files (with cold file extension like .apk, from 3M to 10M)
  * 50% for warm files (with random file extension like .FcDxq, from 1K
to 4M)
  * 47% for hot files (with hot file extension like .db, from 1K to 256K)
 - create(5%)/random update(90%)/delete(5%) the files
  * total write amount is about 70G
  * fsync will be called for .db files, and buffered write will be used
for other files

The storage of test device is large enough(128G) so that it will not
switch to SSR mode during the test.

Benefit: dirty segment count increment reduce about 14%
 - before: Dirty +21110
 - after:  Dirty +18286

Signed-off-by: qixiaoyu1 <qixiaoyu1@xiaomi.com>
Signed-off-by: xiongping1 <xiongping1@xiaomi.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-12-12 14:53:56 -08:00
Jaegeuk Kim
72840cccc0 f2fs: allocate the extent_cache by default
Let's allocate it to remove the runtime complexity.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-12-12 14:53:56 -08:00
Jaegeuk Kim
e7547daccd f2fs: refactor extent_cache to support for read and more
This patch prepares extent_cache to be ready for addition.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-12-12 14:53:56 -08:00
Jaegeuk Kim
749d543c0d f2fs: remove unnecessary __init_extent_tree
Added into the caller.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-12-12 14:53:56 -08:00
Jaegeuk Kim
3bac20a8f0 f2fs: move internal functions into extent_cache.c
No functional change.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-12-12 14:53:55 -08:00
Jaegeuk Kim
12607c1ba7 f2fs: specify extent cache for read explicitly
Let's descrbie it's read extent cache.

Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-12-12 14:53:55 -08:00
Yangtao Li
ed8ac22b6b f2fs: introduce f2fs_is_readonly() for readability
Introduce f2fs_is_readonly() and use it to simplify code.

Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-12-12 14:53:48 -08:00
Yangtao Li
e480751970 f2fs: remove F2FS_SET_FEATURE() and F2FS_CLEAR_FEATURE() macro
F2FS_SET_FEATURE() and F2FS_CLEAR_FEATURE() have never
been used since they were introduced by this commit
76f105a2dbcd("f2fs: add feature facility in superblock").

So let's remove them. BTW, convert f2fs_sb_has_##name to return bool.

Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2022-12-12 14:53:20 -08:00
Miaohe Lin
23e188a164 writeback: remove obsolete macro EXPIRE_DIRTY_ATIME
EXPIRE_DIRTY_ATIME is not used anymore. Remove it.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20221210101042.2012931-1-linmiaohe@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-12 13:08:42 -07:00
Jan Kara
a9438b44bc writeback: Add asserts for adding freed inode to lists
In the past we had several use-after-free issues with inodes getting
added to writeback lists after evict() removed them. These are painful
to debug so add some asserts to catch the problem earlier. The only
non-obvious change in the commit is that we need to tweak
redirty_tail_locked() to avoid triggering assertion in
inode_io_list_move_locked().

Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20221212113633.29181-1-jack@suse.cz
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-12-12 13:08:34 -07:00
Paulo Alcantara
f7f291e14d cifs: fix oops during encryption
When running xfstests against Azure the following oops occurred on an
arm64 system

  Unable to handle kernel write to read-only memory at virtual address
  ffff0001221cf000
  Mem abort info:
    ESR = 0x9600004f
    EC = 0x25: DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
    FSC = 0x0f: level 3 permission fault
  Data abort info:
    ISV = 0, ISS = 0x0000004f
    CM = 0, WnR = 1
  swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000294f3000
  [ffff0001221cf000] pgd=18000001ffff8003, p4d=18000001ffff8003,
  pud=18000001ff82e003, pmd=18000001ff71d003, pte=00600001221cf787
  Internal error: Oops: 9600004f [#1] PREEMPT SMP
  ...
  pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
  pc : __memcpy+0x40/0x230
  lr : scatterwalk_copychunks+0xe0/0x200
  sp : ffff800014e92de0
  x29: ffff800014e92de0 x28: ffff000114f9de80 x27: 0000000000000008
  x26: 0000000000000008 x25: ffff800014e92e78 x24: 0000000000000008
  x23: 0000000000000001 x22: 0000040000000000 x21: ffff000000000000
  x20: 0000000000000001 x19: ffff0001037c4488 x18: 0000000000000014
  x17: 235e1c0d6efa9661 x16: a435f9576b6edd6c x15: 0000000000000058
  x14: 0000000000000001 x13: 0000000000000008 x12: ffff000114f2e590
  x11: ffffffffffffffff x10: 0000040000000000 x9 : ffff8000105c3580
  x8 : 2e9413b10000001a x7 : 534b4410fb86b005 x6 : 534b4410fb86b005
  x5 : ffff0001221cf008 x4 : ffff0001037c4490 x3 : 0000000000000001
  x2 : 0000000000000008 x1 : ffff0001037c4488 x0 : ffff0001221cf000
  Call trace:
   __memcpy+0x40/0x230
   scatterwalk_map_and_copy+0x98/0x100
   crypto_ccm_encrypt+0x150/0x180
   crypto_aead_encrypt+0x2c/0x40
   crypt_message+0x750/0x880
   smb3_init_transform_rq+0x298/0x340
   smb_send_rqst.part.11+0xd8/0x180
   smb_send_rqst+0x3c/0x100
   compound_send_recv+0x534/0xbc0
   smb2_query_info_compound+0x32c/0x440
   smb2_set_ea+0x438/0x4c0
   cifs_xattr_set+0x5d4/0x7c0

This is because in scatterwalk_copychunks(), we attempted to write to
a buffer (@sign) that was allocated in the stack (vmalloc area) by
crypt_message() and thus accessing its remaining 8 (x2) bytes ended up
crossing a page boundary.

To simply fix it, we could just pass @sign kmalloc'd from
crypt_message() and then we're done.  Luckily, we don't seem to pass
any other vmalloc'd buffers in smb_rqst::rq_iov...

Instead, let's map the correct pages and offsets from vmalloc buffers
as well in cifs_sg_set_buf() and then avoiding such oopses.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-12-12 13:08:22 -06:00
Steve French
9d91f8108e cifs: print warning when conflicting soft vs. hard mount options specified
If the user specifies conflicting hard vs. soft mount options
(or nosoft vs. nohard) print a warning to dmesg

We were missing a warning when a user e.g. mounted with both
"hard,soft" mount options.

Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-12-12 13:08:22 -06:00
Steve French
2bfd81043e cifs: fix missing display of three mount options
Three mount options: "tcpnodelay" and "noautotune" and "noblocksend"
were not displayed when passed in on cifs/smb3 mounts (e.g. displayed
in /proc/mounts e.g.).  No change to defaults so these are not
displayed if not specified on mount.

Cc: stable@vger.kernel.org
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-12-12 13:08:22 -06:00
Steve French
9544597b5b cifs: fix various whitespace errors in headers
Fix some extra spaces and a few comments that were unnecessarily split over
two lines. These were some trivial issues pointed out by checkpatch)

Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-12-12 13:08:22 -06:00
Steve French
c19204cbd6 cifs: minor cleanup of some headers
checkpatch showed formatting problems with extra spaces,
and extra semicolon and some missing blank lines in some
cifs headers.

Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Germano Percossi <germano.percossi@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-12-12 13:08:06 -06:00
Xiubo Li
68c62bee9d ceph: try to check caps immediately after async creating finishes
We should call the check_caps() again immediately after the async
creating finishes in case the MDS is waiting for caps revocation
to finish.

Link: https://tracker.ceph.com/issues/46904
Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-12-12 19:15:39 +01:00
Xiubo Li
e4b731ccb0 ceph: remove useless session parameter for check_caps()
The session parameter makes no sense any more.

Signed-off-by: Xiubo Li <xiubli@redhat.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2022-12-12 19:15:39 +01:00
Linus Torvalds
98d0052d0d printk changes for 6.2
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmORzikACgkQUqAMR0iA
 lPKF/g/7Bmcao3rJkZjEagsYY+s7rGhaFaSbML8FDdyE3UzeXLJOnNxBLrD0JIe9
 XFW7+DMqr2uRxsab5C7APy0mrIWp/zCGyJ8CmBILnrPDNcAQ27OhFzxv6WlMUmEc
 xEjGHrk5dFV96s63gyHGLkKGOZMd/cfcpy/QDOyg0vfF8EZCiPywWMbQQ2Ij8E50
 N6UL70ExkoLjT9tzb8NXQiaDqHxqNRvd15aIomDjRrce7eeaL4TaZIT7fKnEcULz
 0Lmdo8RUknonCI7Y00RWdVXMqqPD2JsKz3+fh0vBnXEN+aItwyxis/YajtN+m6l7
 jhPGt7hNhCKG17auK0/6XVJ3717QwjI3+xLXCvayA8jyewMK14PgzX70hCws0eXM
 +5M+IeXI4ze5qsq+ln9Dt8zfC+5HGmwXODUtaYTBWhB4nVWdL/CZ+nTv349zt+Uc
 VIi/QcPQ4vq6EfsxUZR2r6Y12+sSH40iLIROUfqSchtujbLo7qxSNF5x7x9+rtff
 nWuXo5OsjGE7TZDwn3kr0zSuJ+w/pkWMYQ7jch+A2WqUMYyGC86sL3At7ocL+Esq
 34uvzwEgWnNySV8cLiMh34kBmgBwhAP34RhV0RS9iCv8kev2DV7pLQTs9V3QAjw9
 EZnFDHATUdikgugaFKCeDV86R3wFgnRWWOdlRrRi6aAzFDqNcYk=
 =1PTZ
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk updates from Petr Mladek:

 - Add NMI-safe SRCU reader API. It uses atomic_inc() instead of
   this_cpu_inc() on strong load-store architectures.

 - Introduce new console_list_lock to synchronize a manipulation of the
   list of registered consoles and their flags.

   This is a first step in removing the big-kernel-lock-like behavior of
   console_lock(). This semaphore still serializes console->write()
   calbacks against:

      - each other. It primary prevents potential races between early
        and proper console drivers using the same device.

      - suspend()/resume() callbacks and init() operations in some
        drivers.

      - various other operations in the tty/vt and framebufer
        susbsystems. It is likely that console_lock() serializes even
        operations that are not directly conflicting with the
        console->write() callbacks here. This is the most complicated
        big-kernel-lock aspect of the console_lock() that will be hard
        to untangle.

 - Introduce new console_srcu lock that is used to safely iterate and
   access the registered console drivers under SRCU read lock.

   This is a prerequisite for introducing atomic console drivers and
   console kthreads. It will reduce the complexity of serialization
   against normal consoles and console_lock(). Also it should remove the
   risk of deadlock during critical situations, like Oops or panic, when
   only atomic consoles are registered.

 - Check whether the console is registered instead of enabled on many
   locations. It was a historical leftover.

 - Cleanly force a preferred console in xenfb code instead of a dirty
   hack.

 - A lot of code and comment clean ups and improvements.

* tag 'printk-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: (47 commits)
  printk: htmldocs: add missing description
  tty: serial: sh-sci: use setup() callback for early console
  printk: relieve console_lock of list synchronization duties
  tty: serial: kgdboc: use console_list_lock to trap exit
  tty: serial: kgdboc: synchronize tty_find_polling_driver() and register_console()
  tty: serial: kgdboc: use console_list_lock for list traversal
  tty: serial: kgdboc: use srcu console list iterator
  proc: consoles: use console_list_lock for list iteration
  tty: tty_io: use console_list_lock for list synchronization
  printk, xen: fbfront: create/use safe function for forcing preferred
  netconsole: avoid CON_ENABLED misuse to track registration
  usb: early: xhci-dbc: use console_is_registered()
  tty: serial: xilinx_uartps: use console_is_registered()
  tty: serial: samsung_tty: use console_is_registered()
  tty: serial: pic32_uart: use console_is_registered()
  tty: serial: earlycon: use console_is_registered()
  tty: hvc: use console_is_registered()
  efi: earlycon: use console_is_registered()
  tty: nfcon: use console_is_registered()
  serial_core: replace uart_console_enabled() with uart_console_registered()
  ...
2022-12-12 09:01:36 -08:00
Linus Torvalds
73fa58dca8 File locking changes for v6.2.
-----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEES8DXskRxsqGE6vXTAA5oQRlWghUFAmOPOjwTHGpsYXl0b25A
 a2VybmVsLm9yZwAKCRAADmhBGVaCFZ/jEADDZ1RlXCwuozDzAXFzzsR+kmKJJfXG
 ff3ejXHhyJdYH8kh1IldTCR4RGblTH7dM/gO/ApJlSLbEglQm9AIjZ2lpVstqtzQ
 lnZir+bA6uzOyYMRVXJ+0oDZuv3Gca3W8IhFHCqD7K9oQQbn+c/ZmEWrvNJJXN1j
 Ogi1SXHUNfrFgSbgBKjc2VqewuiTc2I8tZAQyezYoGXKn6LtAgMJhQIS4eWjqjju
 38aageni9doKPnAmMOq+vBcw2bWV5mYijz/pObfsaDlAgFdr9rKjNP5+F4fBply1
 SDW2T1ge8jWYegq39EcDKxd/raSOET/p9vQu6rHniXKfvMQ6Ywbr7qji1a7yTZ+i
 MkuOToNZy/+TTEvFQm48Fa25tcKjjl/uuk5Ugojf/hSWOsNkW1Cy4S33eUzDZiSO
 wox5EFVhFpf8Q8L3dUQY0sZazCyoEftw+bq2cKGHJYfUhBD7u6yLG7EKqYiqpepX
 SSPxuh3GC65xl33hYJL2V+5cgXAV23kSGCdNqDUvYZgJfjhDjQnyoSTcuBjh67kv
 chmSoeUaIkS4yFqsH9kRINMSef2M5LXYbfxTnftokX0cvV6RqQndZl43X5LEBgQL
 GRIxyxPkkKaqFjkqyFzBD0dkVGyjyUmkioy/1xON3pLWz3Sk77U38pEQ7NeUl2Lc
 bK5uysBuvDnCpg==
 =XMv7
 -----END PGP SIGNATURE-----

Merge tag 'locks-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux

Pull file locking updates from Jeff Layton:
 "The main change here is to add the new locks_inode_context helper, and
  convert all of the places that dereference inode->i_flctx directly to
  use that instead.

  There is a new helper to indicate whether any locks are held on an
  inode. This is mostly for Ceph but may be usable elsewhere too.

  Andi Kleen requested that we print the PID when the LOCK_MAND warning
  fires, to help track down applications trying to use it.

  Finally, we added some new warnings to some of the file locking
  functions that fire when the ->fl_file and filp arguments differ. This
  helped us find some long-standing bugs in lockd. Patches for those are
  in Chuck Lever's tree and should be in his v6.2 PR. After that patch,
  people using NFSv2/v3 locking may see some warnings fire until those
  go in.

  Happy Holidays!"

* tag 'locks-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
  Add process name and pid to locks warning
  nfsd: use locks_inode_context helper
  nfs: use locks_inode_context helper
  lockd: use locks_inode_context helper
  ksmbd: use locks_inode_context helper
  cifs: use locks_inode_context helper
  ceph: use locks_inode_context helper
  filelock: add a new locks_inode_context accessor function
  filelock: new helper: vfs_inode_has_locks
  filelock: WARN_ON_ONCE when ->fl_file and filp don't match
2022-12-12 08:52:53 -08:00
Linus Torvalds
7fc035058e execve updates for v6.2-rc1
- Add timens support (when switching mm). This version has survived
   in -next for the entire cycle (Andrei Vagin).
 
 - Various small bug fixes, refactoring, and readability improvements
   (Bernd Edlinger, Rolf Eike Beer, Bo Liu, Li Zetao Liu Shixin).
 
 - Remove FOLL_FORCE for stack setup (Kees Cook).
 
 - Whilespace cleanups (Rolf Eike Beer, Kees Cook).
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmOOjsgWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJqwED/9mjtKL2GwHOYKsfhtc0m4HVGBw
 gxTEKuyo5mRwaRLg2bfuWe1OQfeGWQd9+IZ83Kr2ijzm4R16Gslv9i69Iwdf2tce
 iFf2R+iR7On+zNokHxaNflRH9fMsZLobVFqzLvB73BUF82ybJlTR3WMnQhS6HZQB
 Gse8jRfueOnVgKldRLlgdxIucPVsXYSoBS4B0nvIUuQn3aNzDNuuctMe/5NFK0ud
 +TWMXtKzS3B9pcLTXy3e0bPk/Ptio18CBUEI+iLMAHswtNCoxx1ZCcuvnEcrd5Qr
 h2WGaRvYJ7oSUXeEsqPKuDdhqEJQH2AQoX8FzvD+hyIutQJCJzVYlHvwGCqn/Km6
 0Dalng9Pjb6z2LEie/N42LDXEQmLZO2WtJ4otpORJlsJ7ZkrLjB4u+hDU1JA/Q14
 YPWvth3fMA5vAFKvGCtpEc7YdHmghmXCW+YGXOBm625fPYnwFSXOarHfow1RKNE5
 MOM4l60WwzLIHgmr8AFUaLf8TbutXN+BKvbMRh2ToWzDYXEoywxAedHDyo4LVwEy
 mZEca/3izT1ynBcyZg1t8shf4htgLjcPHqM0B+Hq0iNMIrwtecqAcYL/Oj6XssPx
 OuQYv341KF9fV/hMy84GM2HMr0ygUmrP7b9x+PEvCwzWf/2Glaw6Z4rtCdYC+TjW
 8ZWqPqEY+LRsZsL18Q==
 =ZDYk
 -----END PGP SIGNATURE-----

Merge tag 'execve-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull execve updates from Kees Cook:
 "Most are small refactorings and bug fixes, but three things stand out:
  switching timens (which got reverted before) looks solid now,
  FOLL_FORCE has been removed (no failures seen yet across several weeks
  in -next), and some whitespace cleanups (which are long overdue).

   - Add timens support (when switching mm). This version has survived
     in -next for the entire cycle (Andrei Vagin)

   - Various small bug fixes, refactoring, and readability improvements
     (Bernd Edlinger, Rolf Eike Beer, Bo Liu, Li Zetao Liu Shixin)

   - Remove FOLL_FORCE for stack setup (Kees Cook)

   - Whitespace cleanups (Rolf Eike Beer, Kees Cook)"

* tag 'execve-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  binfmt_misc: fix shift-out-of-bounds in check_special_flags
  binfmt: Fix error return code in load_elf_fdpic_binary()
  exec: Remove FOLL_FORCE for stack setup
  binfmt_elf: replace IS_ERR() with IS_ERR_VALUE()
  binfmt_elf: simplify error handling in load_elf_phdrs()
  binfmt_elf: fix documented return value for load_elf_phdrs()
  exec: simplify initial stack size expansion
  binfmt: Fix whitespace issues
  exec: Add comments on check_unsafe_exec() fs counting
  ELF uapi: add spaces before '{'
  selftests/timens: add a test for vfork+exit
  fs/exec: switch timens when a task gets a new mm
2022-12-12 08:42:29 -08:00
Linus Torvalds
059c4a341d pstore updates for v6.2-rc1
- Reporting improvements and return path fixes (Guilherme G. Piccoli,
   Wang Yufen, Kees Cook).
 
 - Clean up kmsg_bytes module parameter usage (Guilherme G. Piccoli).
 
 - Add Guilherme to pstore MAINTAINERS entry.
 
 - Choose friendlier allocation flags (Qiujun Huang, Stephen Boyd).
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmOOi3cWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJm8QD/901WcETCGFZlkWKsXLym8123rr
 Y87WifzKuI3cTf1oYTtG7zrYBTWMaFYEiPZBltcy0nEbLlUs0YtYukNlkykEt9S4
 CWmyxV7DDFn2sZ/HluPhKvsIZlzcHtW1o5dzxoJadRMN06pjnAFZOHkktpuVniVN
 0IXDOOTTEEBxh11BjbD7UrilnYR6BA9kXGKcZTd6Oo/GmO8EkpzXGnVxLRr6U1/i
 qwxhOZGgVzhFuCogQvOo1VQ0DcJ8l5u3h1UIS3b9vQD/oZlpe4brVGCoD5CGugwQ
 1IpqqiBsLrsXIBtqbtg02MMgSy1bELgyLgb5jHRClfuuEiwcxw1GvAy6JzS78Uye
 5g3eiKh3oVkF9/TojSVMAzD3ObAukH4hBo4y98Jy+X2PYvSzUn/WpW0itnxFIaou
 MqZZeYn2Xz7AMXQ5N3WF3fJLjscKoCT2D0WyyiNOqoWAaYSHeZcILXUGltT+Zjtz
 vyvEhLlzQ+avh6Tx0NOKrnIA91nemuW0TYjtGlKx4X8uBvEmt+cFaKd0oZ2M8grB
 l+B2iRxVMlIrMk63mzy+qISVzLN73XCdmhcpPw60Gqin7TyIOGJ6JvZ3viq9Col7
 os5ii4MZyoerDM0bsdmPQlUq8bn0DMDUV+4kGAiZwczPkB1oigxn37ksDHMNbwRu
 jrFtb+v5Vazmb5Lafg==
 =EsLr
 -----END PGP SIGNATURE-----

Merge tag 'pstore-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull pstore updates from Kees Cook:
 "A small collection of bug fixes, refactorings, and general
  improvements:

   - Reporting improvements and return path fixes (Guilherme G. Piccoli,
     Wang Yufen, Kees Cook)

   - Clean up kmsg_bytes module parameter usage (Guilherme G. Piccoli)

   - Add Guilherme to pstore MAINTAINERS entry

   - Choose friendlier allocation flags (Qiujun Huang, Stephen Boyd)"

* tag 'pstore-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  pstore: Avoid kcore oops by vmap()ing with VM_IOREMAP
  pstore/ram: Fix error return code in ramoops_probe()
  pstore: Alert on backend write error
  MAINTAINERS: Update pstore maintainers
  pstore/ram: Set freed addresses to NULL
  pstore/ram: Move internal definitions out of kernel-wide include
  pstore/ram: Move pmsg init earlier
  pstore/ram: Consolidate kfree() paths
  efi: pstore: Follow convention for the efi-pstore backend name
  pstore: Inform unregistered backend names as well
  pstore: Expose kmsg_bytes as a module parameter
  pstore: Improve error reporting in case of backend overlap
  pstore/zone: Use GFP_ATOMIC to allocate zone buffer
2022-12-12 08:31:13 -08:00
Dan Aloni
3bc8edc98b nfsd: under NFSv4.1, fix double svc_xprt_put on rpc_create failure
On error situation `clp->cl_cb_conn.cb_xprt` should not be given
a reference to the xprt otherwise both client cleanup and the
error handling path of the caller call to put it. Better to
delay handing over the reference to a later branch.

[   72.530665] refcount_t: underflow; use-after-free.
[   72.531933] WARNING: CPU: 0 PID: 173 at lib/refcount.c:28 refcount_warn_saturate+0xcf/0x120
[   72.533075] Modules linked in: nfsd(OE) nfsv4(OE) nfsv3(OE) nfs(OE) lockd(OE) compat_nfs_ssc(OE) nfs_acl(OE) rpcsec_gss_krb5(OE) auth_rpcgss(OE) rpcrdma(OE) dns_resolver fscache netfs grace rdma_cm iw_cm ib_cm sunrpc(OE) mlx5_ib mlx5_core mlxfw pci_hyperv_intf ib_uverbs ib_core xt_MASQUERADE nf_conntrack_netlink nft_counter xt_addrtype nft_compat br_netfilter bridge stp llc nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set overlay nf_tables nfnetlink crct10dif_pclmul crc32_pclmul ghash_clmulni_intel xfs serio_raw virtio_net virtio_blk net_failover failover fuse [last unloaded: sunrpc]
[   72.540389] CPU: 0 PID: 173 Comm: kworker/u16:5 Tainted: G           OE     5.15.82-dan #1
[   72.541511] Hardware name: Red Hat KVM/RHEL-AV, BIOS 1.16.0-3.module+el8.7.0+1084+97b81f61 04/01/2014
[   72.542717] Workqueue: nfsd4_callbacks nfsd4_run_cb_work [nfsd]
[   72.543575] RIP: 0010:refcount_warn_saturate+0xcf/0x120
[   72.544299] Code: 55 00 0f 0b 5d e9 01 50 98 00 80 3d 75 9e 39 08 00 0f 85 74 ff ff ff 48 c7 c7 e8 d1 60 8e c6 05 61 9e 39 08 01 e8 f6 51 55 00 <0f> 0b 5d e9 d9 4f 98 00 80 3d 4b 9e 39 08 00 0f 85 4c ff ff ff 48
[   72.546666] RSP: 0018:ffffb3f841157cf0 EFLAGS: 00010286
[   72.547393] RAX: 0000000000000026 RBX: ffff89ac6231d478 RCX: 0000000000000000
[   72.548324] RDX: ffff89adb7c2c2c0 RSI: ffff89adb7c205c0 RDI: ffff89adb7c205c0
[   72.549271] RBP: ffffb3f841157cf0 R08: 0000000000000000 R09: c0000000ffefffff
[   72.550209] R10: 0000000000000001 R11: ffffb3f841157ad0 R12: ffff89ac6231d180
[   72.551142] R13: ffff89ac6231d478 R14: ffff89ac40c06180 R15: ffff89ac6231d4b0
[   72.552089] FS:  0000000000000000(0000) GS:ffff89adb7c00000(0000) knlGS:0000000000000000
[   72.553175] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   72.553934] CR2: 0000563a310506a8 CR3: 0000000109a66000 CR4: 0000000000350ef0
[   72.554874] Call Trace:
[   72.555278]  <TASK>
[   72.555614]  svc_xprt_put+0xaf/0xe0 [sunrpc]
[   72.556276]  nfsd4_process_cb_update.isra.11+0xb7/0x410 [nfsd]
[   72.557087]  ? update_load_avg+0x82/0x610
[   72.557652]  ? cpuacct_charge+0x60/0x70
[   72.558212]  ? dequeue_entity+0xdb/0x3e0
[   72.558765]  ? queued_spin_unlock+0x9/0x20
[   72.559358]  nfsd4_run_cb_work+0xfc/0x270 [nfsd]
[   72.560031]  process_one_work+0x1df/0x390
[   72.560600]  worker_thread+0x37/0x3b0
[   72.561644]  ? process_one_work+0x390/0x390
[   72.562247]  kthread+0x12f/0x150
[   72.562710]  ? set_kthread_struct+0x50/0x50
[   72.563309]  ret_from_fork+0x22/0x30
[   72.563818]  </TASK>
[   72.564189] ---[ end trace 031117b1c72ec616 ]---
[   72.566019] list_add corruption. next->prev should be prev (ffff89ac4977e538), but was ffff89ac4763e018. (next=ffff89ac4763e018).
[   72.567647] ------------[ cut here ]------------

Fixes: a4abc6b12eb1 ("nfsd: Fix svc_xprt refcnt leak when setup callback client failed")
Cc: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Cc: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Dan Aloni <dan.aloni@vastdata.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-12-12 09:18:44 -05:00
Aditya Garg
9f2b5debc0 hfsplus: fix bug causing custom uid and gid being unable to be assigned with mount
Despite specifying UID and GID in mount command, the specified UID and GID
were not being assigned. This patch fixes this issue.

Link: https://lkml.kernel.org/r/C0264BF5-059C-45CF-B8DA-3A3BD2C803A2@live.com
Signed-off-by: Aditya Garg <gargaditya08@live.com>
Reviewed-by: Viacheslav Dubeyko <slava@dubeyko.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-12-11 19:30:20 -08:00