Commit Graph

1138430 Commits

Author SHA1 Message Date
eeff73f7c1 NFSD: Clean up nfs4_preprocess_stateid_op() call sites
Remove the lame-duck dprintk()s around nfs4_preprocess_stateid_op()
call sites.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
2022-11-28 12:54:46 -05:00
b3276c1f5b NFSD: Flesh out a documenting comment for filecache.c
Record what we've learned recently about the NFSD filecache in a
documenting comment so our future selves don't forget what all this
is for.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-11-28 12:54:45 -05:00
4d1ea84557 NFSD: Add an NFSD_FILE_GC flag to enable nfsd_file garbage collection
NFSv4 operations manage the lifetime of nfsd_file items they use by
means of NFSv4 OPEN and CLOSE. Hence there's no need for them to be
garbage collected.

Introduce a mechanism to enable garbage collection for nfsd_file
items used only by NFSv2/3 callers.

Note that the change in nfsd_file_put() ensures that both CLOSE and
DELEGRETURN will actually close out and free an nfsd_file on last
reference of a non-garbage-collected file.

Link: https://bugzilla.linux-nfs.org/show_bug.cgi?id=394
Suggested-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
2022-11-28 12:54:45 -05:00
dcf3f80965 NFSD: Revert "NFSD: NFSv4 CLOSE should release an nfsd_file immediately"
This reverts commit 5e138c4a75.

That commit attempted to make files available to other users as soon
as all NFSv4 clients were done with them, rather than waiting until
the filecache LRU had garbage collected them.

It gets the reference counting wrong, for one thing.

But it also misses that DELEGRETURN should release a file in the
same fashion. In fact, any nfsd_file_put() on an file held open
by an NFSv4 client needs potentially to release the file
immediately...

Clear the way for implementing that idea.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
2022-11-28 12:54:45 -05:00
c252849082 NFSD: Pass the target nfsd_file to nfsd_commit()
In a moment I'm going to introduce separate nfsd_file types, one of
which is garbage-collected; the other, not. The garbage-collected
variety is to be used by NFSv2 and v3, and the non-garbage-collected
variety is to be used by NFSv4.

nfsd_commit() is invoked by both NFSv3 and NFSv4 consumers. We want
nfsd_commit() to find and use the correct variety of cached
nfsd_file object for the NFS version that is in use.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: NeilBrown <neilb@suse.de>
2022-11-28 12:54:45 -05:00
e0aa651068 nfsd: don't call nfsd_file_put from client states seqfile display
We had a report of this:

    BUG: sleeping function called from invalid context at fs/nfsd/filecache.c:440

...with a stack trace showing nfsd_file_put being called from
nfs4_show_open. This code has always tried to call fput while holding a
spinlock, but we recently changed this to use the filecache, and that
started triggering the might_sleep() in nfsd_file_put.

states_start takes and holds the cl_lock while iterating over the
client's states, and we can't sleep with that held.

Have the various nfs4_show_* functions instead hold the fi_lock instead
of taking a nfsd_file reference.

Fixes: 78599c42ae ("nfsd4: add file to display list of client's opens")
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2138357
Reported-by: Zhi Li <yieli@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-11-28 12:54:45 -05:00
427505ffea exportfs: use pr_debug for unreachable debug statements
expfs.c has a bunch of dprintk statements which are unusable due to:
 #define dprintk(fmt, args...) do{}while(0)
Use pr_debug so that they can be enabled dynamically.
Also make some minor changes to the debug statements to fix some
incorrect types, and remove __func__ which can be handled by dynamic
debug separately.

Signed-off-by: David Disseldorp <ddiss@suse.de>
Reviewed-by: NeilBrown <neilb@suse.de>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-11-28 12:54:45 -05:00
2f3a4b2ac2 nfsd: allow disabling NFSv2 at compile time
rpc.nfsd stopped supporting NFSv2 a year ago. Take the next logical
step toward deprecating it and allow NFSv2 support to be compiled out.

Add a new CONFIG_NFSD_V2 option that can be turned off and rework the
CONFIG_NFSD_V?_ACL option dependencies. Add a description that
discourages enabling it.

Also, change the description of CONFIG_NFSD to state that the always-on
version is now 3 instead of 2.

Finally, add an #ifdef around "case 2:" in __write_versions. When NFSv2
is disabled at compile time, this should make the kernel ignore attempts
to disable it at runtime, but still error out when trying to enable it.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-11-28 12:54:45 -05:00
cb12fae1c3 nfsd: move nfserrno() to vfs.c
nfserrno() is common to all nfs versions, but nfsproc.c is specifically
for NFSv2. Move it to vfs.c, and the prototype to vfs.h.

While we're in here, remove the #ifdef EDQUOT check in this function.
It's apparently a holdover from the initial merge of the nfsd code in
1997. No other place in the kernel checks that that symbol is defined
before using it, so I think we can dispense with it here.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-11-28 12:54:44 -05:00
8e823bafff nfsd: ignore requests to disable unsupported versions
The kernel currently errors out if you attempt to enable or disable a
version that it doesn't recognize. Change it to ignore attempts to
disable an unrecognized version. If we don't support it, then there is
no harm in doing so.

Signed-off-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-11-28 12:54:44 -05:00
841fd0a3cb NFSD: Finish converting the NFSv3 GETACL result encoder
For some reason, the NFSv2 GETACL result encoder was fully converted
to use the new nfs_stream_encode_acl(), but the NFSv3 equivalent was
not similarly converted.

Fixes: 20798dfe24 ("NFSD: Update the NFSv3 GETACL result encoder to use struct xdr_stream")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-11-28 12:54:44 -05:00
ea5021e911 NFSD: Finish converting the NFSv2 GETACL result encoder
The xdr_stream conversion inadvertently left some code that set the
page_len of the send buffer. The XDR stream encoders should handle
this automatically now.

This oversight adds garbage past the end of the Reply message.
Clients typically ignore the garbage, but NFSD does not need to send
it, as it leaks stale memory contents onto the wire.

Fixes: f8cba47344 ("NFSD: Update the NFSv2 GETACL result encoder to use struct xdr_stream")
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-11-28 12:54:44 -05:00
5a717a6e01 SUNRPC: Remove unused svc_rqst::rq_lock field
Clean up after commit 22700f3c6d ("SUNRPC: Improve ordering of
transport processing").

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-11-28 12:54:44 -05:00
69eed23baf NFSD: Remove redundant assignment to variable host_err
Variable host_err is assigned a value that is never read, it is being
re-assigned a value in every different execution path in the following
switch statement. The assignment is redundant and can be removed.

Cleans up clang-scan warning:
warning: Value stored to 'host_err' is never read [deadcode.DeadStores]

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-11-28 12:54:44 -05:00
eeadcb7579 NFSD: Simplify READ_PLUS
Chuck had suggested reverting READ_PLUS so it returns a single DATA
segment covering the requested read range. This prepares the server for
a future "sparse read" function so support can easily be added without
needing to rip out the old READ_PLUS code at the same time.

Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2022-11-28 12:54:43 -05:00
b7b275e60b Linux 6.1-rc7 v6.1-rc7 2022-11-27 13:31:48 -08:00
cf562a45a0 Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fix from Al Viro:
 "Amir's copy_file_range() fix"

* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  vfs: fix copy_file_range() averts filesystem freeze protection
2022-11-27 12:40:06 -08:00
9066e15186 Merge tag 'usb-6.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
 "Here are some small USB fixes for 6.1-rc7 that resolve some reported
  problems:

   - cdnsp driver fixes for reported problems

   - dwc3 fixes for some small reported problems

   - uvc gadget driver fix for reported regression

  All of these have been in linux-next with no reported problems"

* tag 'usb-6.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: cdnsp: fix issue with ZLP - added TD_SIZE = 1
  usb: dwc3: gadget: Clear ep descriptor last
  usb: dwc3: exynos: Fix remove() function
  usb: cdnsp: Fix issue with Clear Feature Halt Endpoint
  usb: dwc3: gadget: Disable GUSB2PHYCFG.SUSPHY for End Transfer
  usb: gadget: uvc: also use try_format in set_format
2022-11-27 12:30:57 -08:00
db3182484f Merge tag 'char-misc-6.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
 "Here are some small driver fixes for 6.1-rc7, they include:

   - build warning fix for the vdso when using new versions of grep

   - iio driver fixes for reported issues

   - small nvmem driver fixes

   - fpga Kconfig fix

   - interconnect dt binding fix

  All of these have been in linux-next with no reported issues"

* tag 'char-misc-6.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  lib/vdso: use "grep -E" instead of "egrep"
  nvmem: lan9662-otp: Change return type of lan9662_otp_wait_flag_clear()
  nvmem: rmem: Fix return value check in rmem_read()
  fpga: m10bmc-sec: Fix kconfig dependencies
  dt-bindings: iio: adc: Remove the property "aspeed,trim-data-valid"
  iio: adc: aspeed: Remove the trim valid dts property.
  iio: core: Fix entry not deleted when iio_register_sw_trigger_type() fails
  iio: accel: bma400: Fix memory leak in bma400_get_steps_reg()
  iio: light: rpr0521: add missing Kconfig dependencies
  iio: health: afe4404: Fix oob read in afe4404_[read|write]_raw
  iio: health: afe4403: Fix oob read in afe4403_read_raw
  iio: light: apds9960: fix wrong register for gesture gain
  dt-bindings: interconnect: qcom,msm8998-bwmon: Correct SC7280 CPU compatible
2022-11-27 12:17:10 -08:00
715d2d9608 Merge tag 'timers_urgent_for_v6.1_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Borislav Petkov:

 - Return the proper timer register width (31 bits) for a 32-bit signed
   register in order to avoid a timer interrupt storm on ARM XGene-1
   hardware running in NO_HZ mode

* tag 'timers_urgent_for_v6.1_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource/drivers/arm_arch_timer: Fix XGene-1 TVAL register math error
2022-11-27 12:11:00 -08:00
b465cf1773 Merge tag 'objtool_urgent_for_v6.1_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool fix from Borislav Petkov:

 - Handle different output of readelf on different distros running
   ppc64le which confuses faddr2line's function offsets conversion

* tag 'objtool_urgent_for_v6.1_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  scripts/faddr2line: Fix regression in name resolution on ppc64le
2022-11-27 12:08:17 -08:00
08b0644126 Merge tag 'x86_urgent_for_v6.1_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:

 - ioremap: mask out the bits which are not part of the physical address
   *after* the size computation is done to prevent any hypothetical
   ioremap failures

 - Change the MSR save/restore functionality during suspend to rely on
   flags denoting that the related MSRs are actually supported vs
   reading them and assuming they are (an Atom one allows reading but
   not writing, thus breaking this scheme at resume time)

 - prevent IV reuse in the AES-GCM communication scheme between SNP
   guests and the AMD secure processor

* tag 'x86_urgent_for_v6.1_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ioremap: Fix page aligned size calculation in __ioremap_caller()
  x86/pm: Add enumeration check before spec MSRs save/restore setup
  x86/tsx: Add a feature bit for TSX control MSR support
  virt/sev-guest: Prevent IV reuse in the SNP guest driver
2022-11-27 11:59:14 -08:00
5afcab2217 Merge tag 'perf_urgent_for_v6.1_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Borislav Petkov:
 "Two more fixes to the perf sigtrap handling:

   - output the address in the sample only when it has been requested

   - handle the case where user-only events can hit in kernel and thus
     upset the sigtrap sanity checking"

* tag 'perf_urgent_for_v6.1_rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Consider OS filter fail
  perf: Fixup SIGTRAP and sample_flags interaction
2022-11-27 11:53:41 -08:00
bf82d38c91 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "x86:

   - Fixes for Xen emulation. While nobody should be enabling it in the
     kernel (the only public users of the feature are the selftests),
     the bug effectively allows userspace to read arbitrary memory.

   - Correctness fixes for nested hypervisors that do not intercept INIT
     or SHUTDOWN on AMD; the subsequent CPU reset can cause a
     use-after-free when it disables virtualization extensions. While
     downgrading the panic to a WARN is quite easy, the full fix is a
     bit more laborious; there are also tests. This is the bulk of the
     pull request.

   - Fix race condition due to incorrect mmu_lock use around
     make_mmu_pages_available().

  Generic:

   - Obey changes to the kvm.halt_poll_ns module parameter in VMs not
     using KVM_CAP_HALT_POLL, restoring behavior from before the
     introduction of the capability"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: Update gfn_to_pfn_cache khva when it moves within the same page
  KVM: x86/xen: Only do in-kernel acceleration of hypercalls for guest CPL0
  KVM: x86/xen: Validate port number in SCHEDOP_poll
  KVM: x86/mmu: Fix race condition in direct_page_fault
  KVM: x86: remove exit_int_info warning in svm_handle_exit
  KVM: selftests: add svm part to triple_fault_test
  KVM: x86: allow L1 to not intercept triple fault
  kvm: selftests: add svm nested shutdown test
  KVM: selftests: move idt_entry to header
  KVM: x86: forcibly leave nested mode on vCPU reset
  KVM: x86: add kvm_leave_nested
  KVM: x86: nSVM: harden svm_free_nested against freeing vmcb02 while still in use
  KVM: x86: nSVM: leave nested mode on vCPU free
  KVM: Obey kvm.halt_poll_ns in VMs not using KVM_CAP_HALT_POLL
  KVM: Avoid re-reading kvm->max_halt_poll_ns during halt-polling
  KVM: Cap vcpu->halt_poll_ns before halting rather than after
2022-11-27 09:08:40 -08:00
30a853c1bd Merge tag '6.1-rc6-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
 "Two small cifs/smb3 client fixes:

   - an unlock missing in an error path in copychunk_range found by
     xfstest 476

   - a fix for a use after free in a debug code path"

* tag '6.1-rc6-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: fix missing unlock in cifs_file_copychunk_range()
  cifs: Use after free in debug code
2022-11-27 08:48:26 -08:00
faf68e3523 Merge tag 'kbuild-fixes-v6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:

 - Fix CC_HAS_ASM_GOTO_TIED_OUTPUT test in Kconfig

 - Fix noisy "No such file or directory" message when
   KBUILD_BUILD_VERSION is passed

 - Include rust/ in source tarballs

 - Fix missing FORCE for ARCH=nios2 builds

* tag 'kbuild-fixes-v6.1-4' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  nios2: add FORCE for vmlinuz.gz
  scripts: add rust in scripts/Makefile.package
  kbuild: fix "cat: .version: No such file or directory"
  init/Kconfig: fix CC_HAS_ASM_GOTO_TIED_OUTPUT test with dash
2022-11-26 16:38:56 -08:00
869e4ae4cd nios2: add FORCE for vmlinuz.gz
Add FORCE to placate a warning from make:

arch/nios2/boot/Makefile:24: FORCE prerequisite is missing

Fixes: 2fc8483fdc ("nios2: Build infrastructure")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Masahiro Yamada <masahiroy@kernel.org>
2022-11-27 08:28:41 +09:00
e5f3ec38c8 Merge tag 'nfsd-6.1-6' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux
Pull nfsd fix from Chuck Lever:

 - Fix rare data corruption on READ operations

* tag 'nfsd-6.1-6' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  NFSD: Fix reads with a non-zero offset that don't end on a page boundary
2022-11-26 12:25:49 -08:00
644e952438 Merge tag 'for-v6.1-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply
Pull power supply fixes from Sebastian Reichel:

 - rk817: Two error handling fixes

 - ip5xxx: fix inter overflow in current calculation

 - ab8500: fix thermal zone probing

* tag 'for-v6.1-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply:
  power: supply: ab8500: Defer thermal zone probe
  power: supply: ip5xxx: Fix integer overflow in current_now calculation
  power: supply: rk817: Change rk817_chg_cur_to_reg to int
  power: supply: rk817: check correct variable
2022-11-25 18:02:49 -08:00
990f320031 Merge tag 'block-6.1-2022-11-25' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:

 - A few fixes for s390 sads (Stefan, Colin)

 - Ensure that ublk doesn't reorder requests, as that can be problematic
   on devices that need specific ordering (Ming)

 - Fix a queue reference leak in disk allocation handling (Christoph)

* tag 'block-6.1-2022-11-25' of git://git.kernel.dk/linux:
  ublk_drv: don't forward io commands in reserve order
  s390/dasd: fix possible buffer overflow in copy_pair_show
  s390/dasd: fix no record found for raw_track_access
  s390/dasd: increase printing of debug data payload
  s390/dasd: Fix spelling mistake "Ivalid" -> "Invalid"
  blk-mq: fix queue reference leak on blk_mq_alloc_disk_for_queue failure
2022-11-25 17:50:57 -08:00
364eb61834 Merge tag 'io_uring-6.1-2022-11-25' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:

 - A few poll related fixes. One fixing a race condition between poll
   cancelation and trigger, and one making the overflow handling a bit
   more robust (Lin, Pavel)

 - Fix an fput() for error handling in the direct file table (Lin)

 - Fix for a regression introduced in this cycle, where we don't always
   get TIF_NOTIFY_SIGNAL cleared appropriately (me)

* tag 'io_uring-6.1-2022-11-25' of git://git.kernel.dk/linux:
  io_uring: clear TIF_NOTIFY_SIGNAL if set and task_work not available
  io_uring/poll: fix poll_refs race with cancelation
  io_uring/filetable: fix file reference underflow
  io_uring: make poll refs more robust
  io_uring: cmpxchg for poll arm refs release
2022-11-25 17:46:04 -08:00
3e0d88f911 Merge tag 'zonefs-6.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs
Pull zonefs fixes from Damien Le Moal:

 - Fix a race between zonefs module initialization of sysfs attribute
   directory and mounting a drive (from Xiaoxu).

 - Fix active zone accounting in the rare case of an IO error due to a
   zone transition to offline or read-only state (from me).

* tag 'zonefs-6.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs:
  zonefs: Fix active zone accounting
  zonefs: Fix race between modprobe and mount
2022-11-25 16:34:39 -08:00
f10b439638 Merge tag 'regulator-fix-v6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Pull regulator fixes from Mark Brown:
 "This is more changes than I'd like this late although the diffstat is
  still fairly small, I kept on holding off as new fixes came in to give
  things time to soak in -next but should probably have tagged and sent
  an additional pull request earlier.

  There's some relatively large fixes to the twl6030 driver to fix
  issues with the TWL6032 variant which resulted from some work on the
  core TWL6030 driver, a couple of fixes for error handling paths
  (mostly in the core), and a nice stability fix for the sgl51000 driver
  that's been pulled out of a BSP"

* tag 'regulator-fix-v6.1-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: twl6030: fix get status of twl6032 regulators
  regulator: twl6030: re-add TWL6032_SUBCLASS
  regulator: slg51000: Wait after asserting CS pin
  regulator: core: fix UAF in destroy_regulator()
  regulator: rt5759: fix OOB in validate_desc()
  regulator: core: fix kobject release warning and memory leak in regulator_register()
2022-11-25 13:54:48 -08:00
3eaea0db25 Merge tag 'for-6.1-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:

 - fix a regression in nowait + buffered write

 - in zoned mode fix endianness when comparing super block generation

 - locking and lockdep fixes:
     - fix potential sleeping under spinlock when setting qgroup limit
     - lockdep warning fixes when btrfs_path is freed after copy_to_user
     - do not modify log tree while holding a leaf from fs tree locked

 - fix freeing of sysfs files of static features on error

 - use kv.alloc for zone map allocation as a fallback to avoid warnings
   due to high order allocation

 - send, avoid unaligned encoded writes when attempting to clone range

* tag 'for-6.1-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: sysfs: normalize the error handling branch in btrfs_init_sysfs()
  btrfs: do not modify log tree while holding a leaf from fs tree locked
  btrfs: use kvcalloc in btrfs_get_dev_zone_info
  btrfs: qgroup: fix sleep from invalid context bug in btrfs_qgroup_inherit()
  btrfs: send: avoid unaligned encoded writes when attempting to clone range
  btrfs: zoned: fix missing endianness conversion in sb_write_pointer
  btrfs: free btrfs_path before copying subvol info to userspace
  btrfs: free btrfs_path before copying fspath to userspace
  btrfs: free btrfs_path before copying inodes to userspace
  btrfs: free btrfs_path before copying root refs to userspace
  btrfs: fix assertion failure and blocking during nowait buffered write
2022-11-25 13:24:05 -08:00
88817acb8b Merge tag 'pm-6.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
 "These revert a recent change in the schedutil cpufreq governor that
  had not been expected to make any functional difference, but turned
  out to introduce a performance regression, fix an initialization issue
  in the amd-pstate driver and make it actually replace the venerable
  ACPI cpufreq driver on the supported systems by default.

  Specifics:

   - Revert a recent schedutil cpufreq governor change that introduced a
     performace regression on Pixel 6 (Sam Wu)

   - Fix amd-pstate driver initialization after running the kernel via
     kexec (Wyes Karny)

   - Turn amd-pstate into a built-in driver which allows it to take
     precedence over acpi-cpufreq by default on supported systems and
     amend it with a mechanism to disable this behavior (Perry Yuan)

   - Update amd-pstate documentation in accordance with the other
     changes made to it (Perry Yuan)"

* tag 'pm-6.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  Documentation: add amd-pstate kernel command line options
  Documentation: amd-pstate: add driver working mode introduction
  cpufreq: amd-pstate: add amd-pstate driver parameter for mode selection
  cpufreq: amd-pstate: change amd-pstate driver to be built-in type
  cpufreq: amd-pstate: cpufreq: amd-pstate: reset MSR_AMD_PERF_CTL register at init
  Revert "cpufreq: schedutil: Move max CPU capacity to sugov_policy"
2022-11-25 12:43:33 -08:00
e3ebac80b6 Merge tag 's390-6.1-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Alexander Gordeev:

 - Fix size of incorrectly increased from four to eight bytes TOD field
   of crash dump save area. As result in case of kdump NT_S390_TODPREG
   ELF notes section contains correct value and "detected read beyond
   size of field" compiler warning goes away.

 - Fix memory leak in cryptographic Adjunct Processors (AP) module on
   initialization failure path.

 - Add Gerald Schaefer <gerald.schaefer@linux.ibm.com> and Alexander
   Gordeev <agordeev@linux.ibm.com> as S390 memory management
   maintainers. Also rename the S390 section to S390 ARCHITECTURE to be
   a bit more precise.

* tag 's390-6.1-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  MAINTAINERS: add S390 MM section
  s390/crashdump: fix TOD programmable field size
  s390/ap: fix memory leak in ap_init_qci_info()
2022-11-25 12:37:24 -08:00
081f359ef5 Merge tag 'hyperv-fixes-signed-20221125' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv fixes from Wei Liu:

 - Fix IRTE allocation in Hyper-V PCI controller (Dexuan Cui)

 - Fix handling of SCSI srb_status and capacity change events (Michael
   Kelley)

 - Restore VP assist page after CPU offlining and onlining (Vitaly
   Kuznetsov)

 - Fix some memory leak issues in VMBus (Yang Yingliang)

* tag 'hyperv-fixes-signed-20221125' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  Drivers: hv: vmbus: fix possible memory leak in vmbus_device_register()
  Drivers: hv: vmbus: fix double free in the error path of vmbus_add_channel_work()
  PCI: hv: Only reuse existing IRTE allocation for Multi-MSI
  scsi: storvsc: Fix handling of srb_status and capacity change events
  x86/hyperv: Restore VP assist page after cpu offlining/onlining
2022-11-25 12:32:42 -08:00
0b1dcc2cf5 Merge tag 'mm-hotfixes-stable-2022-11-24' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull hotfixes from Andrew Morton:
 "24 MM and non-MM hotfixes. 8 marked cc:stable and 16 for post-6.0
  issues.

  There have been a lot of hotfixes this cycle, and this is quite a
  large batch given how far we are into the -rc cycle. Presumably a
  reflection of the unusually large amount of MM material which went
  into 6.1-rc1"

* tag 'mm-hotfixes-stable-2022-11-24' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (24 commits)
  test_kprobes: fix implicit declaration error of test_kprobes
  nilfs2: fix nilfs_sufile_mark_dirty() not set segment usage as dirty
  mm/cgroup/reclaim: fix dirty pages throttling on cgroup v1
  mm: fix unexpected changes to {failslab|fail_page_alloc}.attr
  swapfile: fix soft lockup in scan_swap_map_slots
  hugetlb: fix __prep_compound_gigantic_page page flag setting
  kfence: fix stack trace pruning
  proc/meminfo: fix spacing in SecPageTables
  mm: multi-gen LRU: retry folios written back while isolated
  mailmap: update email address for Satya Priya
  mm/migrate_device: return number of migrating pages in args->cpages
  kbuild: fix -Wimplicit-function-declaration in license_is_gpl_compatible
  MAINTAINERS: update Alex Hung's email address
  mailmap: update Alex Hung's email address
  mm: mmap: fix documentation for vma_mas_szero
  mm/damon/sysfs-schemes: skip stats update if the scheme directory is removed
  mm/memory: return vm_fault_t result from migrate_to_ram() callback
  mm: correctly charge compressed memory to its memcg
  ipc/shm: call underlying open/close vm_ops
  gcov: clang: fix the buffer overflow issue
  ...
2022-11-25 10:18:25 -08:00
b308570957 Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs fixes from Al Viro:
 "A couple of fixes, one of them for this cycle regression..."

* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  vfs: vfs_tmpfile: ensure O_EXCL flag is enforced
  fs: use acquire ordering in __fget_light()
2022-11-25 10:12:43 -08:00
7cfe7a0948 io_uring: clear TIF_NOTIFY_SIGNAL if set and task_work not available
With how task_work is added and signaled, we can have TIF_NOTIFY_SIGNAL
set and no task_work pending as it got run in a previous loop. Treat
TIF_NOTIFY_SIGNAL like get_signal(), always clear it if set regardless
of whether or not task_work is pending to run.

Cc: stable@vger.kernel.org
Fixes: 46a525e199 ("io_uring: don't gate task_work run on TIF_NOTIFY_SIGNAL")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-25 10:55:08 -07:00
ca66e58001 Merge tag 'sound-6.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "A few more last-minute fixes for 6.1 that have been gathered in the
  last week; nothing looks too worrisome, mostly device-specific small
  fixes, including the ABI fix for ASoC SOF"

* tag 'sound-6.1-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ASoC: soc-pcm: Add NULL check in BE reparenting
  ALSA: seq: Fix function prototype mismatch in snd_seq_expand_var_event
  ASoC: SOF: dai: move AMD_HS to end of list to restore backwards-compatibility
  ASoC: max98373: Add checks for devm_kcalloc
  ASoC: rt711-sdca: fix the latency time of clock stop prepare state machine transitions
  ASoC: soc-pcm: Don't zero TDM masks in __soc_pcm_open()
  ASoC: sgtl5000: Reset the CHIP_CLK_CTRL reg on remove
  ASoC: hdac_hda: fix hda pcm buffer overflow issue
  ASoC: stm32: i2s: remove irqf_oneshot flag
  ASoC: wm8962: Wait for updated value of WM8962_CLOCKING1 register
2022-11-25 09:26:18 -08:00
6fe0e074e7 Merge tag 'drm-fixes-2022-11-25' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "Weekly fixes, amdgpu has not quite settled down.

  Most of the changes are small, and the non-amdgpu ones are all fine.
  There are a bunch of DP MST DSC fixes that fix some issues introduced
  in a previous larger MST rework.

  The biggest one is mainly propagating some error values properly
  instead of bool returns, and I think it just looks large but doesn't
  really change anything too much, except propagating errors that are
  required to avoid deadlocks. I've gone over it and a few others and
  they've had some decent testing over the last few weeks.

  Summary:

  amdgpu:
   - amdgpu gang submit fix
   - DCN 3.1.4 fixes
   - DP MST DSC deadlock fixes
   - HMM userptr fixes
   - Fix Aldebaran CU occupancy reporting
   - GFX11 fixes
   - PSP suspend/resume fix
   - DCE12 KASAN fix
   - DCN 3.2.x fixes
   - Rotated cursor fix
   - SMU 13.x fix
   - DELL platform suspend/resume fixes
   - VCN4 SR-IOV fix
   - Display regression fix for polled connectors

  i915:
   - Fix GVT KVM reference count handling
   - Never purge busy TTM objects
   - Fix warn in intel_display_power_*_domain() functions

  dma-buf:
   - Use dma_fence_unwrap_for_each when importing sync files
   - Fix race in dma_heap_add()

  fbcon:
   - Fix use of uninitialized memory in logo"

* tag 'drm-fixes-2022-11-25' of git://anongit.freedesktop.org/drm/drm: (30 commits)
  drm/amdgpu/vcn: re-use original vcn0 doorbell value
  drm/amdgpu: Partially revert "drm/amdgpu: update drm_display_info correctly when the edid is read"
  drm/amd/display: No display after resume from WB/CB
  drm/amdgpu: fix use-after-free during gpu recovery
  drm/amd/pm: update driver if header for smu_13_0_7
  drm/amd/display: Fix rotated cursor offset calculation
  drm/amd/display: Use new num clk levels struct for max mclk index
  drm/amd/display: Avoid setting pixel rate divider to N/A
  drm/amd/display: Use viewport height for subvp mall allocation size
  drm/amd/display: Update soc bounding box for dcn32/dcn321
  drm/amd/dc/dce120: Fix audio register mapping, stop triggering KASAN
  drm/amdgpu/psp: don't free PSP buffers on suspend
  fbcon: Use kzalloc() in fbcon_prepare_logo()
  dma-buf: fix racing conflict of dma_heap_add()
  drm/amd/amdgpu: reserve vm invalidation engine for firmware
  drm/amdgpu: Enable Aldebaran devices to report CU Occupancy
  drm/amdgpu: fix userptr HMM range handling v2
  drm/amdgpu: always register an MMU notifier for userptr
  drm/amdgpu/dm/mst: Fix uninitialized var in pre_compute_mst_dsc_configs_for_state()
  drm/amdgpu/dm/dp_mst: Don't grab mst_mgr->lock when computing DSC state
  ...
2022-11-25 09:20:42 -08:00
12ad3d2d6c io_uring/poll: fix poll_refs race with cancelation
There is an interesting race condition of poll_refs which could result
in a NULL pointer dereference. The crash trace is like:

KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
CPU: 0 PID: 30781 Comm: syz-executor.2 Not tainted 6.0.0-g493ffd6605b2 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:io_poll_remove_entry io_uring/poll.c:154 [inline]
RIP: 0010:io_poll_remove_entries+0x171/0x5b4 io_uring/poll.c:190
Code: ...
RSP: 0018:ffff88810dfefba0 EFLAGS: 00010202
RAX: 0000000000000001 RBX: 0000000000000000 RCX: 0000000000040000
RDX: ffffc900030c4000 RSI: 000000000003ffff RDI: 0000000000040000
RBP: 0000000000000008 R08: ffffffff9764d3dd R09: fffffbfff3836781
R10: fffffbfff3836781 R11: 0000000000000000 R12: 1ffff11003422d60
R13: ffff88801a116b04 R14: ffff88801a116ac0 R15: dffffc0000000000
FS:  00007f9c07497700(0000) GS:ffff88811a600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffb5c00ea98 CR3: 0000000105680005 CR4: 0000000000770ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <TASK>
 io_apoll_task_func+0x3f/0xa0 io_uring/poll.c:299
 handle_tw_list io_uring/io_uring.c:1037 [inline]
 tctx_task_work+0x37e/0x4f0 io_uring/io_uring.c:1090
 task_work_run+0x13a/0x1b0 kernel/task_work.c:177
 get_signal+0x2402/0x25a0 kernel/signal.c:2635
 arch_do_signal_or_restart+0x3b/0x660 arch/x86/kernel/signal.c:869
 exit_to_user_mode_loop kernel/entry/common.c:166 [inline]
 exit_to_user_mode_prepare+0xc2/0x160 kernel/entry/common.c:201
 __syscall_exit_to_user_mode_work kernel/entry/common.c:283 [inline]
 syscall_exit_to_user_mode+0x58/0x160 kernel/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

The root cause for this is a tiny overlooking in
io_poll_check_events() when cocurrently run with poll cancel routine
io_poll_cancel_req().

The interleaving to trigger use-after-free:

CPU0                                       |  CPU1
                                           |
io_apoll_task_func()                       |  io_poll_cancel_req()
 io_poll_check_events()                    |
  // do while first loop                   |
  v = atomic_read(...)                     |
  // v = poll_refs = 1                     |
  ...                                      |  io_poll_mark_cancelled()
                                           |   atomic_or()
                                           |   // poll_refs =
IO_POLL_CANCEL_FLAG | 1
                                           |
  atomic_sub_return(...)                   |
  // poll_refs = IO_POLL_CANCEL_FLAG       |
  // loop continue                         |
                                           |
                                           |  io_poll_execute()
                                           |   io_poll_get_ownership()
                                           |   // poll_refs =
IO_POLL_CANCEL_FLAG | 1
                                           |   // gets the ownership
  v = atomic_read(...)                     |
  // poll_refs not change                  |
                                           |
  if (v & IO_POLL_CANCEL_FLAG)             |
   return -ECANCELED;                      |
  // io_poll_check_events return           |
  // will go into                          |
  // io_req_complete_failed() free req     |
                                           |
                                           |  io_apoll_task_func()
                                           |  // also go into
io_req_complete_failed()

And the interleaving to trigger the kernel WARNING:

CPU0                                       |  CPU1
                                           |
io_apoll_task_func()                       |  io_poll_cancel_req()
 io_poll_check_events()                    |
  // do while first loop                   |
  v = atomic_read(...)                     |
  // v = poll_refs = 1                     |
  ...                                      |  io_poll_mark_cancelled()
                                           |   atomic_or()
                                           |   // poll_refs =
IO_POLL_CANCEL_FLAG | 1
                                           |
  atomic_sub_return(...)                   |
  // poll_refs = IO_POLL_CANCEL_FLAG       |
  // loop continue                         |
                                           |
  v = atomic_read(...)                     |
  // v = IO_POLL_CANCEL_FLAG               |
                                           |  io_poll_execute()
                                           |   io_poll_get_ownership()
                                           |   // poll_refs =
IO_POLL_CANCEL_FLAG | 1
                                           |   // gets the ownership
                                           |
  WARN_ON_ONCE(!(v & IO_POLL_REF_MASK)))   |
  // v & IO_POLL_REF_MASK = 0 WARN         |
                                           |
                                           |  io_apoll_task_func()
                                           |  // also go into
io_req_complete_failed()

By looking up the source code and communicating with Pavel, the
implementation of this atomic poll refs should continue the loop of
io_poll_check_events() just to avoid somewhere else to grab the
ownership. Therefore, this patch simply adds another AND operation to
make sure the loop will stop if it finds the poll_refs is exactly equal
to IO_POLL_CANCEL_FLAG. Since io_poll_cancel_req() grabs ownership and
will finally make its way to io_req_complete_failed(), the req will
be reclaimed as expected.

Fixes: aa43477b04 ("io_uring: poll rework")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
[axboe: tweak description and code style]
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-25 07:17:33 -07:00
9d94c04c0d io_uring/filetable: fix file reference underflow
There is an interesting reference bug when -ENOMEM occurs in calling of
io_install_fixed_file(). KASan report like below:

[   14.057131] ==================================================================
[   14.059161] BUG: KASAN: use-after-free in unix_get_socket+0x10/0x90
[   14.060975] Read of size 8 at addr ffff88800b09cf20 by task kworker/u8:2/45
[   14.062684]
[   14.062768] CPU: 2 PID: 45 Comm: kworker/u8:2 Not tainted 6.1.0-rc4 #1
[   14.063099] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[   14.063666] Workqueue: events_unbound io_ring_exit_work
[   14.063936] Call Trace:
[   14.064065]  <TASK>
[   14.064175]  dump_stack_lvl+0x34/0x48
[   14.064360]  print_report+0x172/0x475
[   14.064547]  ? _raw_spin_lock_irq+0x83/0xe0
[   14.064758]  ? __virt_addr_valid+0xef/0x170
[   14.064975]  ? unix_get_socket+0x10/0x90
[   14.065167]  kasan_report+0xad/0x130
[   14.065353]  ? unix_get_socket+0x10/0x90
[   14.065553]  unix_get_socket+0x10/0x90
[   14.065744]  __io_sqe_files_unregister+0x87/0x1e0
[   14.065989]  ? io_rsrc_refs_drop+0x1c/0xd0
[   14.066199]  io_ring_exit_work+0x388/0x6a5
[   14.066410]  ? io_uring_try_cancel_requests+0x5bf/0x5bf
[   14.066674]  ? try_to_wake_up+0xdb/0x910
[   14.066873]  ? virt_to_head_page+0xbe/0xbe
[   14.067080]  ? __schedule+0x574/0xd20
[   14.067273]  ? read_word_at_a_time+0xe/0x20
[   14.067492]  ? strscpy+0xb5/0x190
[   14.067665]  process_one_work+0x423/0x710
[   14.067879]  worker_thread+0x2a2/0x6f0
[   14.068073]  ? process_one_work+0x710/0x710
[   14.068284]  kthread+0x163/0x1a0
[   14.068454]  ? kthread_complete_and_exit+0x20/0x20
[   14.068697]  ret_from_fork+0x22/0x30
[   14.068886]  </TASK>
[   14.069000]
[   14.069088] Allocated by task 289:
[   14.069269]  kasan_save_stack+0x1e/0x40
[   14.069463]  kasan_set_track+0x21/0x30
[   14.069652]  __kasan_slab_alloc+0x58/0x70
[   14.069899]  kmem_cache_alloc+0xc5/0x200
[   14.070100]  __alloc_file+0x20/0x160
[   14.070283]  alloc_empty_file+0x3b/0xc0
[   14.070479]  path_openat+0xc3/0x1770
[   14.070689]  do_filp_open+0x150/0x270
[   14.070888]  do_sys_openat2+0x113/0x270
[   14.071081]  __x64_sys_openat+0xc8/0x140
[   14.071283]  do_syscall_64+0x3b/0x90
[   14.071466]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   14.071791]
[   14.071874] Freed by task 0:
[   14.072027]  kasan_save_stack+0x1e/0x40
[   14.072224]  kasan_set_track+0x21/0x30
[   14.072415]  kasan_save_free_info+0x2a/0x50
[   14.072627]  __kasan_slab_free+0x106/0x190
[   14.072858]  kmem_cache_free+0x98/0x340
[   14.073075]  rcu_core+0x427/0xe50
[   14.073249]  __do_softirq+0x110/0x3cd
[   14.073440]
[   14.073523] Last potentially related work creation:
[   14.073801]  kasan_save_stack+0x1e/0x40
[   14.074017]  __kasan_record_aux_stack+0x97/0xb0
[   14.074264]  call_rcu+0x41/0x550
[   14.074436]  task_work_run+0xf4/0x170
[   14.074619]  exit_to_user_mode_prepare+0x113/0x120
[   14.074858]  syscall_exit_to_user_mode+0x1d/0x40
[   14.075092]  do_syscall_64+0x48/0x90
[   14.075272]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   14.075529]
[   14.075612] Second to last potentially related work creation:
[   14.075900]  kasan_save_stack+0x1e/0x40
[   14.076098]  __kasan_record_aux_stack+0x97/0xb0
[   14.076325]  task_work_add+0x72/0x1b0
[   14.076512]  fput+0x65/0xc0
[   14.076657]  filp_close+0x8e/0xa0
[   14.076825]  __x64_sys_close+0x15/0x50
[   14.077019]  do_syscall_64+0x3b/0x90
[   14.077199]  entry_SYSCALL_64_after_hwframe+0x63/0xcd
[   14.077448]
[   14.077530] The buggy address belongs to the object at ffff88800b09cf00
[   14.077530]  which belongs to the cache filp of size 232
[   14.078105] The buggy address is located 32 bytes inside of
[   14.078105]  232-byte region [ffff88800b09cf00, ffff88800b09cfe8)
[   14.078685]
[   14.078771] The buggy address belongs to the physical page:
[   14.079046] page:000000001bd520e7 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88800b09de00 pfn:0xb09c
[   14.079575] head:000000001bd520e7 order:1 compound_mapcount:0 compound_pincount:0
[   14.079946] flags: 0x100000000010200(slab|head|node=0|zone=1)
[   14.080244] raw: 0100000000010200 0000000000000000 dead000000000001 ffff88800493cc80
[   14.080629] raw: ffff88800b09de00 0000000080190018 00000001ffffffff 0000000000000000
[   14.081016] page dumped because: kasan: bad access detected
[   14.081293]
[   14.081376] Memory state around the buggy address:
[   14.081618]  ffff88800b09ce00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   14.081974]  ffff88800b09ce80: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc
[   14.082336] >ffff88800b09cf00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   14.082690]                                ^
[   14.082909]  ffff88800b09cf80: fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc
[   14.083266]  ffff88800b09d000: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
[   14.083622] ==================================================================

The actual tracing of this bug is shown below:

commit 8c71fe7502 ("io_uring: ensure fput() called correspondingly
when direct install fails") adds an additional fput() in
io_fixed_fd_install() when io_file_bitmap_get() returns error values. In
that case, the routine will never make it to io_install_fixed_file() due
to an early return.

static int io_fixed_fd_install(...)
{
  if (alloc_slot) {
    ...
    ret = io_file_bitmap_get(ctx);
    if (unlikely(ret < 0)) {
      io_ring_submit_unlock(ctx, issue_flags);
      fput(file);
      return ret;
    }
    ...
  }
  ...
  ret = io_install_fixed_file(req, file, issue_flags, file_slot);
  ...
}

In the above scenario, the reference is okay as io_fixed_fd_install()
ensures the fput() is called when something bad happens, either via
bitmap or via inner io_install_fixed_file().

However, the commit 61c1b44a21 ("io_uring: fix deadlock on iowq file
slot alloc") breaks the balance because it places fput() into the common
path for both io_file_bitmap_get() and io_install_fixed_file(). Since
io_install_fixed_file() handles the fput() itself, the reference
underflow come across then.

There are some extra commits make the current code into
io_fixed_fd_install() -> __io_fixed_fd_install() ->
io_install_fixed_file()

However, the fact that there is an extra fput() is called if
io_install_fixed_file() calls fput(). Traversing through the code, I
find that the existing two callers to __io_fixed_fd_install():
io_fixed_fd_install() and io_msg_send_fd() have fput() when handling
error return, this patch simply removes the fput() in
io_install_fixed_file() to fix the bug.

Fixes: 61c1b44a21 ("io_uring: fix deadlock on iowq file slot alloc")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Link: https://lore.kernel.org/r/be4ba4b.5d44.184a0a406a4.Coremail.linma@zju.edu.cn
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-25 06:54:46 -07:00
a26a35e901 io_uring: make poll refs more robust
poll_refs carry two functions, the first is ownership over the request.
The second is notifying the io_poll_check_events() that there was an
event but wake up couldn't grab the ownership, so io_poll_check_events()
should retry.

We want to make poll_refs more robust against overflows. Instead of
always incrementing it, which covers two purposes with one atomic, check
if poll_refs is elevated enough and if so set a retry flag without
attempts to grab ownership. The gap between the bias check and following
atomics may seem racy, but we don't need it to be strict. Moreover there
might only be maximum 4 parallel updates: by the first and the second
poll entries, __io_arm_poll_handler() and cancellation. From those four,
only poll wake ups may be executed multiple times, but they're protected
by a spin.

Cc: stable@vger.kernel.org
Reported-by: Lin Ma <linma@zju.edu.cn>
Fixes: aa43477b04 ("io_uring: poll rework")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/c762bc31f8683b3270f3587691348a7119ef9c9d.1668963050.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-25 06:54:46 -07:00
2f3893437a io_uring: cmpxchg for poll arm refs release
Replace atomically substracting the ownership reference at the end of
arming a poll with a cmpxchg. We try to release ownership by setting 0
assuming that poll_refs didn't change while we were arming. If it did
change, we keep the ownership and use it to queue a tw, which is fully
capable to process all events and (even tolerates spurious wake ups).

It's a bit more elegant as we reduce races b/w setting the cancellation
flag and getting refs with this release, and with that we don't have to
worry about any kinds of underflows. It's not the fastest path for
polling. The performance difference b/w cmpxchg and atomic dec is
usually negligible and it's not the fastest path.

Cc: stable@vger.kernel.org
Fixes: aa43477b04 ("io_uring: poll rework")
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/0c95251624397ea6def568ff040cad2d7926fd51.1668963050.git.asml.silence@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-25 06:54:16 -07:00
db58653ce0 zonefs: Fix active zone accounting
If a file zone transitions to the offline or readonly state from an
active state, we must clear the zone active flag and decrement the
active seq file counter. Do so in zonefs_account_active() using the new
zonefs inode flags ZONEFS_ZONE_OFFLINE and ZONEFS_ZONE_READONLY. These
flags are set if necessary in zonefs_check_zone_condition() based on the
result of report zones operation after an IO error.

Fixes: 87c9ce3ffe ("zonefs: Add active seq file accounting")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
2022-11-25 17:01:22 +09:00
10bc8e4af6 vfs: fix copy_file_range() averts filesystem freeze protection
Commit 868f9f2f8e ("vfs: fix copy_file_range() regression in cross-fs
copies") removed fallback to generic_copy_file_range() for cross-fs
cases inside vfs_copy_file_range().

To preserve behavior of nfsd and ksmbd server-side-copy, the fallback to
generic_copy_file_range() was added in nfsd and ksmbd code, but that
call is missing sb_start_write(), fsnotify hooks and more.

Ideally, nfsd and ksmbd would pass a flag to vfs_copy_file_range() that
will take care of the fallback, but that code would be subtle and we got
vfs_copy_file_range() logic wrong too many times already.

Instead, add a flag to explicitly request vfs_copy_file_range() to
perform only generic_copy_file_range() and let nfsd and ksmbd use this
flag only in the fallback path.

This choise keeps the logic changes to minimum in the non-nfsd/ksmbd code
paths to reduce the risk of further regressions.

Fixes: 868f9f2f8e ("vfs: fix copy_file_range() regression in cross-fs copies")
Tested-by: Namjae Jeon <linkinjeon@kernel.org>
Tested-by: Luis Henriques <lhenriques@suse.de>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-11-25 00:52:28 -05:00
e57702069b Merge tag 'amd-drm-fixes-6.1-2022-11-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.1-2022-11-23:

amdgpu:
- DCN 3.1.4 fixes
- DP MST DSC deadlock fixes
- HMM userptr fixes
- Fix Aldebaran CU occupancy reporting
- GFX11 fixes
- PSP suspend/resume fix
- DCE12 KASAN fix
- DCN 3.2.x fixes
- Rotated cursor fix
- SMU 13.x fix
- DELL platform suspend/resume fixes
- VCN4 SR-IOV fix
- Display regression fix for polled connectors

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221123143453.8977-1-alexander.deucher@amd.com
2022-11-25 10:55:23 +10:00
b65a648865 Merge tag 'drm-intel-fixes-2022-11-24' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Fix GVT KVM reference count handling (Sean Christopherson)
- Never purge busy TTM objects (Matthew Auld)
- Fix warn in intel_display_power_*_domain() functions (Imre Deak)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Y38u44hb1LZfZC+M@tursulin-desk
2022-11-25 09:42:02 +10:00