Commit Graph

1120687 Commits

Author SHA1 Message Date
c235698355 Merge tag 'cxl-for-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull cxl updates from Dan Williams:
 "Compute Express Link (CXL) updates for 6.0:

   - Introduce a 'struct cxl_region' object with support for
     provisioning and assembling persistent memory regions.

   - Introduce alloc_free_mem_region() to accompany the existing
     request_free_mem_region() as a method to allocate physical memory
     capacity out of an existing resource.

   - Export insert_resource_expand_to_fit() for the CXL subsystem to
     late-publish CXL platform windows in iomem_resource.

   - Add a polled mode PCI DOE (Data Object Exchange) driver service and
     use it in cxl_pci to retrieve the CDAT (Coherent Device Attribute
     Table)"

* tag 'cxl-for-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (74 commits)
  cxl/hdm: Fix skip allocations vs multiple pmem allocations
  cxl/region: Disallow region granularity != window granularity
  cxl/region: Fix x1 interleave to greater than x1 interleave routing
  cxl/region: Move HPA setup to cxl_region_attach()
  cxl/region: Fix decoder interleave programming
  Documentation: cxl: remove dangling kernel-doc reference
  cxl/region: describe targets and nr_targets members of cxl_region_params
  cxl/regions: add padding for cxl_rr_ep_add nested lists
  cxl/region: Fix IS_ERR() vs NULL check
  cxl/region: Fix region reference target accounting
  cxl/region: Fix region commit uninitialized variable warning
  cxl/region: Fix port setup uninitialized variable warnings
  cxl/region: Stop initializing interleave granularity
  cxl/hdm: Fix DPA reservation vs cxl_endpoint_decoder lifetime
  cxl/acpi: Minimize granularity for x1 interleaves
  cxl/region: Delete 'region' attribute from root decoders
  cxl/acpi: Autoload driver for 'cxl_acpi' test devices
  cxl/region: decrement ->nr_targets on error in cxl_region_attach()
  cxl/region: prevent underflow in ways_to_cxl()
  cxl/region: uninitialized variable in alloc_hpa()
  ...
2022-08-10 11:07:26 -07:00
5e2e7383b5 Merge tag 'pinctrl-v6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control updates from Linus Walleij:
 "Outside the pinctrl driver and DT bindings we hit some Arm DT files,
  patched by the maintainers.

  Other than that it is business as usual.

  Core changes:

   - Add PINCTRL_PINGROUP() helper macro (and use it in the AMD driver).

  New drivers:

   - Intel Meteor Lake support.

   - Reneasas RZ/V2M and r8a779g0 (R-Car V4H).

   - AXP209 variants AXP221, AXP223 and AXP809.

   - Qualcomm MSM8909, PM8226, PMP8074 and SM6375.

   - Allwinner D1.

  Improvements:

   - Proper pin multiplexing in the AMD driver.

   - Mediatek MT8192 can use generic drive strength and pin bias, then
     fixes on top plus some I2C pin group fixes.

   - Have the Allwinner Sunplus SP7021 use the generic DT schema and
     make interrupts optional.

   - Handle Qualcomm SC7280 ADSP.

   - Handle Qualcomm MSM8916 CAMSS GP clock muxing.

   - High impedance bias on ZynqMP.

   - Serialize StarFive access to MMIO.

   - Immutable gpiochip for BCM2835, Ingenic, Qualcomm SPMI GPIO"

* tag 'pinctrl-v6.0-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (117 commits)
  dt-bindings: pinctrl: qcom,pmic-gpio: add PM8226 constraints
  pinctrl: qcom: Make PINCTRL_SM8450 depend on PINCTRL_MSM
  pinctrl: qcom: sm8250: Fix PDC map
  pinctrl: amd: Fix an unused variable
  dt-bindings: pinctrl: mt8186: Add and use drive-strength-microamp
  dt-bindings: pinctrl: mt8186: Add gpio-line-names property
  ARM: dts: imxrt1170-pinfunc: Add pinctrl binding header
  pinctrl: amd: Use unicode for debugfs output
  pinctrl: amd: Fix newline declaration in debugfs output
  pinctrl: at91: Fix typo 'the the' in comment
  dt-bindings: pinctrl: st,stm32: Correct 'resets' property name
  pinctrl: mvebu: Missing a blank line after declarations.
  pinctrl: qcom: Add SM6375 TLMM driver
  dt-bindings: pinctrl: Add DT schema for SM6375 TLMM
  dt-bindings: pinctrl: mt8195: Use drive-strength-microamp in examples
  Revert "pinctrl: qcom: spmi-gpio: make the irqchip immutable"
  pinctrl: imx93: Add MODULE_DEVICE_TABLE()
  pinctrl: sunxi: Add driver for Allwinner D1
  pinctrl: sunxi: Make some layout parameters dynamic
  pinctrl: sunxi: Refactor register/offset calculation
  ...
2022-08-10 11:01:44 -07:00
00aa9d0bbf Merge tag 'apparmor-pr-2022-08-08' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor
Pull AppArmor updates from John Johansen:
 "This is mostly cleanups and bug fixes with the one bigger change being
  Mathew Wilcox's patch to use XArrays instead of the IDR from the
  thread around the locking weirdness.

  Features:
   - Convert secid mapping to XArrays instead of IDR
   - Add a kernel label to use on kernel objects
   - Extend policydb permission set by making use of the xbits
   - Make export of raw binary profile to userspace optional
   - Enable tuning of policy paranoid load for embedded systems
   - Don't create raw_sha1 symlink if sha1 hashing is disabled
   - Allow labels to carry debug flags

  Cleanups:
   - Update MAINTAINERS file
   - Use struct_size() helper in kmalloc()
   - Move ptrace mediation to more logical task.{h,c}
   - Resolve uninitialized symbol warnings
   - Remove redundant ret variable
   - Mark alloc_unconfined() as static
   - Update help description of policy hash for introspection
   - Remove some casts which are no-longer required

  Bug Fixes:
   - Fix aa_label_asxprint return check
   - Fix reference count leak in aa_pivotroot()
   - Fix memleak in aa_simple_write_to_buffer()
   - Fix kernel doc comments
   - Fix absroot causing audited secids to begin with =
   - Fix quiet_denied for file rules
   - Fix failed mount permission check error message
   - Disable showing the mode as part of a secid to secctx
   - Fix setting unconfined mode on a loaded profile
   - Fix overlapping attachment computation
   - Fix undefined reference to `zlib_deflate_workspacesize'"

* tag 'apparmor-pr-2022-08-08' of git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor: (34 commits)
  apparmor: Update MAINTAINERS file with new email address
  apparmor: correct config reference to intended one
  apparmor: move ptrace mediation to more logical task.{h,c}
  apparmor: extend policydb permission set by making use of the xbits
  apparmor: allow label to carry debug flags
  apparmor: fix overlapping attachment computation
  apparmor: fix setting unconfined mode on a loaded profile
  apparmor: Fix some kernel-doc comments
  apparmor: Mark alloc_unconfined() as static
  apparmor: disable showing the mode as part of a secid to secctx
  apparmor: Convert secid mapping to XArrays instead of IDR
  apparmor: add a kernel label to use on kernel objects
  apparmor: test: Remove some casts which are no-longer required
  apparmor: Fix memleak in aa_simple_write_to_buffer()
  apparmor: fix reference count leak in aa_pivotroot()
  apparmor: Fix some kernel-doc comments
  apparmor: Fix undefined reference to `zlib_deflate_workspacesize'
  apparmor: fix aa_label_asxprint return check
  apparmor: Fix some kernel-doc comments
  apparmor: Fix some kernel-doc comments
  ...
2022-08-10 10:53:22 -07:00
0af5cb349a Merge tag 'kbuild-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:

 - Remove the support for -O3 (CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE_O3)

 - Fix error of rpm-pkg cross-builds

 - Support riscv for checkstack tool

 - Re-enable -Wformwat warnings for Clang

 - Clean up modpost, Makefiles, and misc scripts

* tag 'kbuild-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (30 commits)
  modpost: remove .symbol_white_list field entirely
  modpost: remove unneeded .symbol_white_list initializers
  modpost: add PATTERNS() helper macro
  modpost: shorten warning messages in report_sec_mismatch()
  Revert "Kbuild, lto, workaround: Don't warn for initcall_reference in modpost"
  modpost: use more reliable way to get fromsec in section_rel(a)()
  modpost: add array range check to sec_name()
  modpost: refactor get_secindex()
  kbuild: set EXIT trap before creating temporary directory
  modpost: remove unused Elf_Sword macro
  Makefile.extrawarn: re-enable -Wformat for clang
  kbuild: add dtbs_prepare target
  kconfig: Qt5: tell the user which packages are required
  modpost: use sym_get_data() to get module device_table data
  modpost: drop executable ELF support
  checkstack: add riscv support for scripts/checkstack.pl
  kconfig: shorten the temporary directory name for cc-option
  scripts: headers_install.sh: Update config leak ignore entries
  kbuild: error out if $(INSTALL_MOD_PATH) contains % or :
  kbuild: error out if $(KBUILD_EXTMOD) contains % or :
  ...
2022-08-10 10:40:41 -07:00
d4252071b9 add barriers to buffer_uptodate and set_buffer_uptodate
Let's have a look at this piece of code in __bread_slow:

	get_bh(bh);
	bh->b_end_io = end_buffer_read_sync;
	submit_bh(REQ_OP_READ, 0, bh);
	wait_on_buffer(bh);
	if (buffer_uptodate(bh))
		return bh;

Neither wait_on_buffer nor buffer_uptodate contain any memory barrier.
Consequently, if someone calls sb_bread and then reads the buffer data,
the read of buffer data may be executed before wait_on_buffer(bh) on
architectures with weak memory ordering and it may return invalid data.

Fix this bug by adding a memory barrier to set_buffer_uptodate and an
acquire barrier to buffer_uptodate (in a similar way as
folio_test_uptodate and folio_mark_uptodate).

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-08-09 15:03:02 -07:00
e394ff83bb Merge tag 'nfsd-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd updates from Chuck Lever:
 "Work on 'courteous server', which was introduced in 5.19, continues
  apace. This release introduces a more flexible limit on the number of
  NFSv4 clients that NFSD allows, now that NFSv4 clients can remain in
  courtesy state long after the lease expiration timeout. The client
  limit is adjusted based on the physical memory size of the server.

  The NFSD filecache is a cache of files held open by NFSv4 clients or
  recently touched by NFSv2 or NFSv3 clients. This cache had some
  significant scalability constraints that have been relieved in this
  release. Thanks to all who contributed to this work.

  A data corruption bug found during the most recent NFS bake-a-thon
  that involves NFSv3 and NFSv4 clients writing the same file has been
  addressed in this release.

  This release includes several improvements in CPU scalability for
  NFSv4 operations. In addition, Neil Brown provided patches that
  simplify locking during file lookup, creation, rename, and removal
  that enables subsequent work on making these operations more scalable.
  We expect to see that work materialize in the next release.

  There are also numerous single-patch fixes, clean-ups, and the usual
  improvements in observability"

* tag 'nfsd-6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (78 commits)
  lockd: detect and reject lock arguments that overflow
  NFSD: discard fh_locked flag and fh_lock/fh_unlock
  NFSD: use (un)lock_inode instead of fh_(un)lock for file operations
  NFSD: use explicit lock/unlock for directory ops
  NFSD: reduce locking in nfsd_lookup()
  NFSD: only call fh_unlock() once in nfsd_link()
  NFSD: always drop directory lock in nfsd_unlink()
  NFSD: change nfsd_create()/nfsd_symlink() to unlock directory before returning.
  NFSD: add posix ACLs to struct nfsd_attrs
  NFSD: add security label to struct nfsd_attrs
  NFSD: set attributes when creating symlinks
  NFSD: introduce struct nfsd_attrs
  NFSD: verify the opened dentry after setting a delegation
  NFSD: drop fh argument from alloc_init_deleg
  NFSD: Move copy offload callback arguments into a separate structure
  NFSD: Add nfsd4_send_cb_offload()
  NFSD: Remove kmalloc from nfsd4_do_async_copy()
  NFSD: Refactor nfsd4_do_copy()
  NFSD: Refactor nfsd4_cleanup_inter_ssc() (2/2)
  NFSD: Refactor nfsd4_cleanup_inter_ssc() (1/2)
  ...
2022-08-09 14:56:49 -07:00
15205c2829 Merge tag 'fscache-fixes-20220809' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull fscache updates from David Howells:

 - Fix a cookie access ref leak if a cookie is invalidated a second time
   before the first invalidation is actually processed.

 - Add a tracepoint to log cookie lookup failure

* tag 'fscache-fixes-20220809' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  fscache: add tracepoint when failing cookie
  fscache: don't leak cookie access refs if invalidation is in progress or failed
2022-08-09 10:11:56 -07:00
4b22e20741 Merge tag 'afs-fixes-20220802' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
Pull AFS fixes from David Howells:
 "Fix AFS refcount handling.

  The first patch converts afs to use refcount_t for its refcounts and
  the second patch fixes afs_put_call() and afs_put_server() to save the
  values they're going to log in the tracepoint before decrementing the
  refcount"

* tag 'afs-fixes-20220802' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  afs: Fix access after dec in put functions
  afs: Use refcount_t rather than atomic_t
2022-08-09 10:08:08 -07:00
426b4ca2d6 Merge tag 'fs.setgid.v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull setgid updates from Christian Brauner:
 "This contains the work to move setgid stripping out of individual
  filesystems and into the VFS itself.

  Creating files that have both the S_IXGRP and S_ISGID bit raised in
  directories that themselves have the S_ISGID bit set requires
  additional privileges to avoid security issues.

  When a filesystem creates a new inode it needs to take care that the
  caller is either in the group of the newly created inode or they have
  CAP_FSETID in their current user namespace and are privileged over the
  parent directory of the new inode. If any of these two conditions is
  true then the S_ISGID bit can be raised for an S_IXGRP file and if not
  it needs to be stripped.

  However, there are several key issues with the current implementation:

   - S_ISGID stripping logic is entangled with umask stripping.

     For example, if the umask removes the S_IXGRP bit from the file
     about to be created then the S_ISGID bit will be kept.

     The inode_init_owner() helper is responsible for S_ISGID stripping
     and is called before posix_acl_create(). So we can end up with two
     different orderings:

     1. FS without POSIX ACL support

        First strip umask then strip S_ISGID in inode_init_owner().

        In other words, if a filesystem doesn't support or enable POSIX
        ACLs then umask stripping is done directly in the vfs before
        calling into the filesystem:

     2. FS with POSIX ACL support

        First strip S_ISGID in inode_init_owner() then strip umask in
        posix_acl_create().

        In other words, if the filesystem does support POSIX ACLs then
        unmask stripping may be done in the filesystem itself when
        calling posix_acl_create().

     Note that technically filesystems are free to impose their own
     ordering between posix_acl_create() and inode_init_owner() meaning
     that there's additional ordering issues that influence S_ISGID
     inheritance.

     (Note that the commit message of commit 1639a49ccd ("fs: move
     S_ISGID stripping into the vfs_*() helpers") gets the ordering
     between inode_init_owner() and posix_acl_create() the wrong way
     around. I realized this too late.)

   - Filesystems that don't rely on inode_init_owner() don't get S_ISGID
     stripping logic.

     While that may be intentional (e.g. network filesystems might just
     defer setgid stripping to a server) it is often just a security
     issue.

     Note that mandating the use of inode_init_owner() was proposed as
     an alternative solution but that wouldn't fix the ordering issues
     and there are examples such as afs where the use of
     inode_init_owner() isn't possible.

     In any case, we should also try the cleaner and generalized
     solution first before resorting to this approach.

   - We still have S_ISGID inheritance bugs years after the initial
     round of S_ISGID inheritance fixes:

       e014f37db1 ("xfs: use setattr_copy to set vfs inode attributes")
       01ea173e10 ("xfs: fix up non-directory creation in SGID directories")
       fd84bfdddd ("ceph: fix up non-directory creation in SGID directories")

  All of this led us to conclude that the current state is too messy.
  While we won't be able to make it completely clean as
  posix_acl_create() is still a filesystem specific call we can improve
  the S_SIGD stripping situation quite a bit by hoisting it out of
  inode_init_owner() and into the respective vfs creation operations.

  The obvious advantage is that we don't need to rely on individual
  filesystems getting S_ISGID stripping right and instead can
  standardize the ordering between S_ISGID and umask stripping directly
  in the VFS.

  A few short implementation notes:

   - The stripping logic needs to happen in vfs_*() helpers for the sake
     of stacking filesystems such as overlayfs that rely on these
     helpers taking care of S_ISGID stripping.

   - Security hooks have never seen the mode as it is ultimately seen by
     the filesystem because of the ordering issue we mentioned. Nothing
     is changed for them. We simply continue to strip the umask before
     passing the mode down to the security hooks.

   - The following filesystems use inode_init_owner() and thus relied on
     S_ISGID stripping: spufs, 9p, bfs, btrfs, ext2, ext4, f2fs,
     hfsplus, hugetlbfs, jfs, minix, nilfs2, ntfs3, ocfs2, omfs,
     overlayfs, ramfs, reiserfs, sysv, ubifs, udf, ufs, xfs, zonefs,
     bpf, tmpfs.

     We've audited all callchains as best as we could. More details can
     be found in the commit message to 1639a49ccd ("fs: move S_ISGID
     stripping into the vfs_*() helpers")"

* tag 'fs.setgid.v6.0' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  ceph: rely on vfs for setgid stripping
  fs: move S_ISGID stripping into the vfs_*() helpers
  fs: Add missing umask strip in vfs_tmpfile
  fs: add mode_strip_sgid() helper
2022-08-09 09:52:28 -07:00
b8dcef877a Merge tag 'memblock-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock
Pull memblock updates from Mike Rapoport:

 - An optimization in memblock_add_range() to reduce array traversals

 - Improvements to the memblock test suite

* tag 'memblock-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock:
  memblock test: Modify the obsolete description in README
  memblock tests: fix compilation errors
  memblock tests: change build options to run-time options
  memblock tests: remove completed TODO items
  memblock tests: set memblock_debug to enable memblock_dbg() messages
  memblock tests: add verbose output to memblock tests
  memblock tests: Makefile: add arguments to control verbosity
  memblock: avoid some repeat when add new range
2022-08-09 09:48:30 -07:00
15886321a4 Merge tag 'm68knommu-for-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
Pull m68knommu fixes from Greg Ungerer:

 - spelling in comment

 - compilation when flexcan driver enabled

 - sparse warning

* tag 'm68knommu-for-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
  m68k: Fix syntax errors in comments
  m68k: coldfire: make symbol m523x_clk_lookup static
  m68k: coldfire/device.c: protect FLEXCAN blocks
2022-08-09 09:39:25 -07:00
5318b987fe Merge tag 'x86_bugs_pbrsb' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 eIBRS fixes from Borislav Petkov:
 "More from the CPU vulnerability nightmares front:

  Intel eIBRS machines do not sufficiently mitigate against RET
  mispredictions when doing a VM Exit therefore an additional RSB,
  one-entry stuffing is needed"

* tag 'x86_bugs_pbrsb' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/speculation: Add LFENCE to RSB fill sequence
  x86/speculation: Add RSB VM Exit protections
2022-08-09 09:29:07 -07:00
1a1e3aca9d fscache: add tracepoint when failing cookie
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
2022-08-09 14:13:59 +01:00
fb24771faf fscache: don't leak cookie access refs if invalidation is in progress or failed
It's possible for a request to invalidate a fscache_cookie will come in
while we're already processing an invalidation. If that happens we
currently take an extra access reference that will leak. Only call
__fscache_begin_cookie_access if the FSCACHE_COOKIE_DO_INVALIDATE bit
was previously clear.

Also, ensure that we attempt to clear the bit when the cookie is
"FAILED" and put the reference to avoid an access leak.

Fixes: 85e4ea1049 ("fscache: Fix invalidation/lookup race")
Suggested-by: David Howells <dhowells@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
2022-08-09 14:13:55 +01:00
eb555cb5b7 Merge tag '5.20-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd
Pull ksmbd updates from Steve French:

 - fixes for memory access bugs (out of bounds access, oops, leak)

 - multichannel fixes

 - session disconnect performance improvement, and session register
   improvement

 - cleanup

* tag '5.20-rc-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: fix heap-based overflow in set_ntacl_dacl()
  ksmbd: prevent out of bound read for SMB2_TREE_CONNNECT
  ksmbd: prevent out of bound read for SMB2_WRITE
  ksmbd: fix use-after-free bug in smb2_tree_disconect
  ksmbd: fix memory leak in smb2_handle_negotiate
  ksmbd: fix racy issue while destroying session on multichannel
  ksmbd: use wait_event instead of schedule_timeout()
  ksmbd: fix kernel oops from idr_remove()
  ksmbd: add channel rwlock
  ksmbd: replace sessions list in connection with xarray
  MAINTAINERS: ksmbd: add entry for documentation
  ksmbd: remove unused ksmbd_share_configs_cleanup function
2022-08-08 20:15:13 -07:00
f30adc0d33 Merge tag 'pull-work.iov_iter-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull more iov_iter updates from Al Viro:

 - more new_sync_{read,write}() speedups - ITER_UBUF introduction

 - ITER_PIPE cleanups

 - unification of iov_iter_get_pages/iov_iter_get_pages_alloc and
   switching them to advancing semantics

 - making ITER_PIPE take high-order pages without splitting them

 - handling copy_page_from_iter() for high-order pages properly

* tag 'pull-work.iov_iter-rebased' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (32 commits)
  fix copy_page_from_iter() for compound destinations
  hugetlbfs: copy_page_to_iter() can deal with compound pages
  copy_page_to_iter(): don't split high-order page in case of ITER_PIPE
  expand those iov_iter_advance()...
  pipe_get_pages(): switch to append_pipe()
  get rid of non-advancing variants
  ceph: switch the last caller of iov_iter_get_pages_alloc()
  9p: convert to advancing variant of iov_iter_get_pages_alloc()
  af_alg_make_sg(): switch to advancing variant of iov_iter_get_pages()
  iter_to_pipe(): switch to advancing variant of iov_iter_get_pages()
  block: convert to advancing variants of iov_iter_get_pages{,_alloc}()
  iov_iter: advancing variants of iov_iter_get_pages{,_alloc}()
  iov_iter: saner helper for page array allocation
  fold __pipe_get_pages() into pipe_get_pages()
  ITER_XARRAY: don't open-code DIV_ROUND_UP()
  unify the rest of iov_iter_get_pages()/iov_iter_get_pages_alloc() guts
  unify xarray_get_pages() and xarray_get_pages_alloc()
  unify pipe_get_pages() and pipe_get_pages_alloc()
  iov_iter_get_pages(): sanity-check arguments
  iov_iter_get_pages_alloc(): lift freeing pages array on failure exits into wrapper
  ...
2022-08-08 20:04:35 -07:00
c03f05f183 fix copy_page_from_iter() for compound destinations
had been broken for ITER_BVEC et.al. since ever (OK, v3.17 when
ITER_BVEC had first appeared)...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:26 -04:00
c7d57ab163 hugetlbfs: copy_page_to_iter() can deal with compound pages
... since April 2021

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:26 -04:00
f0f6b614f8 copy_page_to_iter(): don't split high-order page in case of ITER_PIPE
... just shove it into one pipe_buffer.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:25 -04:00
310d9d5a50 expand those iov_iter_advance()...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:25 -04:00
746de1f86f pipe_get_pages(): switch to append_pipe()
now that we are advancing the iterator, there's no need to
treat the first page separately - just call append_pipe()
in a loop.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:25 -04:00
eba2d3d798 get rid of non-advancing variants
mechanical change; will be further massaged in subsequent commits

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:24 -04:00
b53589927d ceph: switch the last caller of iov_iter_get_pages_alloc()
here nothing even looks at the iov_iter after the call, so we couldn't
care less whether it advances or not.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:24 -04:00
7f02464739 9p: convert to advancing variant of iov_iter_get_pages_alloc()
that one is somewhat clumsier than usual and needs serious testing.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:23 -04:00
dc5801f60b af_alg_make_sg(): switch to advancing variant of iov_iter_get_pages()
... and adjust the callers

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:23 -04:00
7d690c157c iter_to_pipe(): switch to advancing variant of iov_iter_get_pages()
... and untangle the cleanup on failure to add into pipe.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:23 -04:00
480cb846c2 block: convert to advancing variants of iov_iter_get_pages{,_alloc}()
... doing revert if we end up not using some pages

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:22 -04:00
1ef255e257 iov_iter: advancing variants of iov_iter_get_pages{,_alloc}()
Most of the users immediately follow successful iov_iter_get_pages()
with advancing by the amount it had returned.

Provide inline wrappers doing that, convert trivial open-coded
uses of those.

BTW, iov_iter_get_pages() never returns more than it had been asked
to; such checks in cifs ought to be removed someday...

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:22 -04:00
3cf42da327 iov_iter: saner helper for page array allocation
All call sites of get_pages_array() are essenitally identical now.
Replace with common helper...

Returns number of slots available in resulting array or 0 on OOM;
it's up to the caller to make sure it doesn't ask to zero-entry
array (i.e. neither maxpages nor size are allowed to be zero).

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:22 -04:00
8520008417 fold __pipe_get_pages() into pipe_get_pages()
... and don't mangle maxsize there - turn the loop into counting
one instead.  Easier to see that we won't run out of array that
way.  Note that special treatment of the partial buffer in that
thing is an artifact of the non-advancing semantics of
iov_iter_get_pages() - if not for that, it would be append_pipe(),
same as the body of the loop that follows it.  IOW, once we make
iov_iter_get_pages() advancing, the whole thing will turn into
	calculate how many pages do we want
	allocate an array (if needed)
	call append_pipe() that many times.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:21 -04:00
0aa4fc32f5 ITER_XARRAY: don't open-code DIV_ROUND_UP()
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:21 -04:00
451c0ba947 unify the rest of iov_iter_get_pages()/iov_iter_get_pages_alloc() guts
same as for pipes and xarrays; after that iov_iter_get_pages() becomes
a wrapper for __iov_iter_get_pages_alloc().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:21 -04:00
68fe506f37 unify xarray_get_pages() and xarray_get_pages_alloc()
same as for pipes

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:20 -04:00
acbdeb8320 unify pipe_get_pages() and pipe_get_pages_alloc()
The differences between those two are
* pipe_get_pages() gets a non-NULL struct page ** value pointing to
preallocated array + array size.
* pipe_get_pages_alloc() gets an address of struct page ** variable that
contains NULL, allocates the array and (on success) stores its address in
that variable.

	Not hard to combine - always pass struct page ***, have
the previous pipe_get_pages_alloc() caller pass ~0U as cap for
array size.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:20 -04:00
c81ce28df5 iov_iter_get_pages(): sanity-check arguments
zero maxpages is bogus, but best treated as "just return 0";
NULL pages, OTOH, should be treated as a hard bug.

get rid of now completely useless checks in xarray_get_pages{,_alloc}().

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:20 -04:00
91329559eb iov_iter_get_pages_alloc(): lift freeing pages array on failure exits into wrapper
Incidentally, ITER_XARRAY did *not* free the sucker in case when
iter_xarray_populate_pages() returned 0...

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:19 -04:00
12d426ab64 ITER_PIPE: fold data_start() and pipe_space_for_user() together
All their callers are next to each other; all of them
want the total amount of pages and, possibly, the
offset in the partial final buffer.

Combine into a new helper (pipe_npages()), fix the
bogosity in pipe_space_for_user(), while we are at it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:19 -04:00
10f525a8cd ITER_PIPE: cache the type of last buffer
We often need to find whether the last buffer is anon or not, and
currently it's rather clumsy:
	check if ->iov_offset is non-zero (i.e. that pipe is not empty)
	if so, get the corresponding pipe_buffer and check its ->ops
	if it's &default_pipe_buf_ops, we have an anon buffer.

Let's replace the use of ->iov_offset (which is nowhere near similar to
its role for other flavours) with signed field (->last_offset), with
the following rules:
	empty, no buffers occupied:		0
	anon, with bytes up to N-1 filled:	N
	zero-copy, with bytes up to N-1 filled:	-N

That way abs(i->last_offset) is equal to what used to be in i->iov_offset
and empty vs. anon vs. zero-copy can be distinguished by the sign of
i->last_offset.

	Checks for "should we extend the last buffer or should we start
a new one?" become easier to follow that way.

	Note that most of the operations can only be done in a sane
state - i.e. when the pipe has nothing past the current position of
iterator.  About the only thing that could be done outside of that
state is iov_iter_advance(), which transitions to the sane state by
truncating the pipe.  There are only two cases where we leave the
sane state:
	1) iov_iter_get_pages()/iov_iter_get_pages_alloc().  Will be
dealt with later, when we make get_pages advancing - the callers are
actually happier that way.
	2) iov_iter copied, then something is put into the copy.  Since
they share the underlying pipe, the original gets behind.  When we
decide that we are done with the copy (original is not usable until then)
we advance the original.  direct_io used to be done that way; nowadays
it operates on the original and we do iov_iter_revert() to discard
the excessive data.  At the moment there's nothing in the kernel that
could do that to ITER_PIPE iterators, so this reason for insane state
is theoretical right now.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:18 -04:00
92acdc4f37 ITER_PIPE: clean iov_iter_revert()
Fold pipe_truncate() into it, clean up.  We can release buffers
in the same loop where we walk backwards to the iterator beginning
looking for the place where the new position will be.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:18 -04:00
2c855de933 ITER_PIPE: clean pipe_advance() up
instead of setting ->iov_offset for new position and calling
pipe_truncate() to adjust ->len of the last buffer and discard
everything after it, adjust ->len at the same time we set ->iov_offset
and use pipe_discard_from() to deal with buffers past that.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:18 -04:00
ca59196754 ITER_PIPE: lose iter_head argument of __pipe_get_pages()
it's only used to get to the partial buffer we can add to,
and that's always the last one, i.e. pipe->head - 1.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:17 -04:00
e3b42964f8 ITER_PIPE: fold push_pipe() into __pipe_get_pages()
Expand the only remaining call of push_pipe() (in
__pipe_get_pages()), combine it with the page-collecting loop there.

Note that the only reason it's not a loop doing append_pipe() is
that append_pipe() is advancing, while iov_iter_get_pages() is not.
As soon as it switches to saner semantics, this thing will switch
to using append_pipe().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:17 -04:00
8fad7767ed ITER_PIPE: allocate buffers as we go in copy-to-pipe primitives
New helper: append_pipe().  Extends the last buffer if possible,
allocates a new one otherwise.  Returns page and offset in it
on success, NULL on failure.  iov_iter is advanced past the
data we've got.

Use that instead of push_pipe() in copy-to-pipe primitives;
they get simpler that way.  Handling of short copy (in "mc" one)
is done simply by iov_iter_revert() - iov_iter is in consistent
state after that one, so we can use that.

[Fix for braino caught by Liu Xinpeng <liuxp11@chinatelecom.cn> folded in]
[another braino fix, this time in copy_pipe_to_iter() and pipe_zero();
caught by testcase from Hugh Dickins]

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:17 -04:00
47b7fcae41 ITER_PIPE: helpers for adding pipe buffers
There are only two kinds of pipe_buffer in the area used by ITER_PIPE.

1) anonymous - copy_to_iter() et.al. end up creating those and copying
data there.  They have zero ->offset, and their ->ops points to
default_pipe_page_ops.

2) zero-copy ones - those come from copy_page_to_iter(), and page
comes from caller.  ->offset is also caller-supplied - it might be
non-zero.  ->ops points to page_cache_pipe_buf_ops.

Move creation and insertion of those into helpers - push_anon(pipe, size)
and push_page(pipe, page, offset, size) resp., separating them from
the "could we avoid creating a new buffer by merging with the current
head?" logics.

Acked-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:16 -04:00
2dcedb2a54 ITER_PIPE: helper for getting pipe buffer by index
pipe_buffer instances of a pipe are organized as a ring buffer,
with power-of-2 size.  Indices are kept *not* reduced modulo ring
size, so the buffer refered to by index N is
	pipe->bufs[N & (pipe->ring_size - 1)].

Ring size can change over the lifetime of a pipe, but not while
the pipe is locked.  So for any iov_iter primitives it's a constant.
Original conversion of pipes to this layout went overboard trying
to microoptimize that - calculating pipe->ring_size - 1, storing
it in a local variable and using through the function.  In some
cases it might be warranted, but most of the times it only
obfuscates what's going on in there.

Introduce a helper (pipe_buf(pipe, N)) that would encapsulate
that and use it in the obvious cases.  More will follow...

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:16 -04:00
0d96493413 splice: stop abusing iov_iter_advance() to flush a pipe
Use pipe_discard_from() explicitly in generic_file_read_iter(); don't bother
with rather non-obvious use of iov_iter_advance() in there.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:16 -04:00
3e20a751af switch new_sync_{read,write}() to ITER_UBUF
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:15 -04:00
fcb14cb1bd new iov_iter flavour - ITER_UBUF
Equivalent of single-segment iovec.  Initialized by iov_iter_ubuf(),
checked for by iter_is_ubuf(), otherwise behaves like ITER_IOVEC
ones.

We are going to expose the things like ->write_iter() et.al. to those
in subsequent commits.

New predicate (user_backed_iter()) that is true for ITER_IOVEC and
ITER_UBUF; places like direct-IO handling should use that for
checking that pages we modify after getting them from iov_iter_get_pages()
would need to be dirtied.

DO NOT assume that replacing iter_is_iovec() with user_backed_iter()
will solve all problems - there's code that uses iter_is_iovec() to
decide how to poke around in iov_iter guts and for that the predicate
replacement obviously won't suffice.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-08-08 22:37:15 -04:00
5d5d353bed Merge tag 'rproc-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux
Pull remoteproc updates from Bjorn Andersson:
 "This introduces support for the remoteproc on Mediatek MT8188, and
  enables caches for MT8186 SCP. It adds support for PRU cores found on
  the TI K3 AM62x SoCs.

  It moves the recovery work after a firmware crash to an unbound
  workqueue, to allow recovery to happen in parallel.

  A new DMA API is introduced to release dma_mem for a device.

  It adds support a panic handler for the Qualcomm modem remoteproc,
  with the goal of having caches flushed in memory dumps for post-mortem
  debugging and it introduces a mechanism to wait for the modem firmware
  on SM8450 to decrypt part of its memory for post-mortem debugging.

  Qualcomm sysmon is restricted to only inform remote processors about
  peers that are actually running, to avoid a race where Linux tries to
  notify a recovering remote processor about its peers new state. A
  mechanism for waiting for the sysmon connection to be established is
  also introduced, to avoid out-of-sync updates for rapidly restarting
  remote processors.

  A number of Devicetree binding cleanups and conversions to YAML are
  introduced, to facilitate Devicetree validation. Lastly it introduces
  a number of smaller fixes and cleanups in the core and a few different
  drivers"

* tag 'rproc-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux: (42 commits)
  remoteproc: qcom_q6v5_pas: Do not fail if regulators are not found
  drivers/remoteproc: fix repeated words in comments
  remoteproc: Directly use ida_alloc()/free()
  remoteproc: Use unbounded workqueue for recovery work
  remoteproc: using pm_runtime_resume_and_get instead of pm_runtime_get_sync
  remoteproc: qcom_q6v5_pas: Deal silently with optional px and cx regulators
  remoteproc: sysmon: Send sysmon state only for running rprocs
  remoteproc: sysmon: Wait for SSCTL service to come up
  remoteproc: qcom: q6v5: Set q6 state to offline on receiving wdog irq
  remoteproc: qcom: pas: Check if coredump is enabled
  remoteproc: qcom: pas: Mark devices as wakeup capable
  remoteproc: qcom: pas: Mark va as io memory
  remoteproc: qcom: pas: Add decrypt shutdown support for modem
  remoteproc: qcom: q6v5-mss: add powerdomains to MSM8996 config
  remoteproc: qcom_q6v5: Introduce panic handler for MSS
  remoteproc: qcom_q6v5_mss: Update MBA log info
  remoteproc: qcom: correct kerneldoc
  remoteproc: qcom_q6v5_mss: map/unmap metadata region before/after use
  remoteproc: qcom: using pm_runtime_resume_and_get to simplify the code
  remoteproc: mediatek: Support MT8188 SCP
  ...
2022-08-08 15:16:29 -07:00
c72687614b Merge tag 'rpmsg-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux
Pull rpmsg updates from Bjorn Andersson:
 "This contains fixes and cleanups in the rpmsg core, Qualcomm SMD and
  GLINK drivers, a circular lock dependency in the Mediatek driver and
  a possible race condition in the rpmsg_char driver is resolved"

* tag 'rpmsg-v5.20' of git://git.kernel.org/pub/scm/linux/kernel/git/remoteproc/linux:
  rpmsg: convert sysfs snprintf to sysfs_emit
  rpmsg: qcom_smd: Fix refcount leak in qcom_smd_parse_edge
  rpmsg: qcom: correct kerneldoc
  rpmsg: qcom: glink: remove unused name
  rpmsg: qcom: glink: replace strncpy() with strscpy_pad()
  rpmsg: Strcpy is not safe, use strscpy_pad() instead
  rpmsg: Fix possible refcount leak in rpmsg_register_device_override()
  rpmsg: Fix parameter naming for announce_create/destroy ops
  rpmsg: mtk_rpmsg: Fix circular locking dependency
  rpmsg: char: Add mutex protection for rpmsg_eptdev_open()
2022-08-08 15:14:43 -07:00