69713 Commits

Author SHA1 Message Date
Alessio Balsini
f8425c9396 fuse: 32-bit user space ioctl compat for fuse device
With a 64-bit kernel build the FUSE device cannot handle ioctl requests
coming from 32-bit user space.  This is due to the ioctl command
translation that generates different command identifiers that thus cannot
be used for direct comparisons without proper manipulation.

Explicitly extract type and number from the ioctl command to enable 32-bit
user space compatibility on 64-bit kernel builds.

Signed-off-by: Alessio Balsini <balsini@android.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2021-03-16 15:20:16 +01:00
Qu Wenruo
60484cd9d5 btrfs: subpage: make readahead work properly
In readahead infrastructure, we are using a lot of hard coded PAGE_SHIFT
while we're not doing anything specific to PAGE_SIZE.

One of the most affected part is the radix tree operation of
btrfs_fs_info::reada_tree.

If using PAGE_SHIFT, subpage metadata readahead is broken and does no
help reading metadata ahead.

Fix the problem by using btrfs_fs_info::sectorsize_bits so that
readahead could work for subpage.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-03-16 11:06:21 +01:00
Qu Wenruo
d9bb77d51e btrfs: subpage: fix wild pointer access during metadata read failure
[BUG]
When running fstests for btrfs subpage read-write test, it has a very
high chance to crash at generic/475 with the following stack:

 BTRFS warning (device dm-8): direct IO failed ino 510 rw 1,34817 sector 0xcdf0 len 94208 err no 10
 Unable to handle kernel paging request at virtual address ffff80001157e7c0
 CPU: 2 PID: 687125 Comm: kworker/u12:4 Tainted: G        WC        5.12.0-rc2-custom+ #5
 Hardware name: Khadas VIM3 (DT)
 Workqueue: btrfs-endio-meta btrfs_work_helper [btrfs]
 pc : queued_spin_lock_slowpath+0x1a0/0x390
 lr : do_raw_spin_lock+0xc4/0x11c
 Call trace:
  queued_spin_lock_slowpath+0x1a0/0x390
  _raw_spin_lock+0x68/0x84
  btree_readahead_hook+0x38/0xc0 [btrfs]
  end_bio_extent_readpage+0x504/0x5f4 [btrfs]
  bio_endio+0x170/0x1a4
  end_workqueue_fn+0x3c/0x60 [btrfs]
  btrfs_work_helper+0x1b0/0x1b4 [btrfs]
  process_one_work+0x22c/0x430
  worker_thread+0x70/0x3a0
  kthread+0x13c/0x140
  ret_from_fork+0x10/0x30
 Code: 910020e0 8b0200c2 f861d884 aa0203e1 (f8246827)

[CAUSE]
In end_bio_extent_readpage(), if we hit an error during read, we will
handle the error differently for data and metadata.
For data we queue a repair, while for metadata, we record the error and
let the caller choose what to do.

But the code is still using page->private to grab extent buffer, which
no longer points to extent buffer for subpage metadata pages.

Thus this wild pointer access leads to above crash.

[FIX]
Introduce a helper, find_extent_buffer_readpage(), to grab extent
buffer.

The difference against find_extent_buffer_nospinlock() is:

- Also handles regular sectorsize == PAGE_SIZE case
- No extent buffer refs increase/decrease
  As extent buffer under IO must have non-zero refs, so this is safe

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-03-16 11:06:18 +01:00
Damien Le Moal
ebfd68cd0c zonefs: Fix O_APPEND async write handling
zonefs updates the size of a sequential zone file inode only on
completion of direct writes. When executing asynchronous append writes
(with a file open with O_APPEND or using RWF_APPEND), the use of the
current inode size in generic_write_checks() to set an iocb offset thus
leads to unaligned write if an application issues an append write
operation with another write already being executed.

Fix this problem by introducing zonefs_write_checks() as a modified
version of generic_write_checks() using the file inode wp_offset for an
append write iocb offset. Also introduce zonefs_write_check_limits() to
replace generic_write_check_limits() call. This zonefs special helper
makes sure that the maximum file limit used is the maximum size of the
file being accessed.

Since zonefs_write_checks() already truncates the iov_iter, the calls
to iov_iter_truncate() in zonefs_file_dio_write() and
zonefs_file_buffered_write() are removed.

Fixes: 8dcc1a9d90c1 ("fs: New zonefs file system")
Cc: <stable@vger.kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
2021-03-16 17:46:55 +09:00
Damien Le Moal
1601ea068b zonefs: prevent use of seq files as swap file
The sequential write constraint of sequential zone file prevent their
use as swap files. Only allow conventional zone files to be used as swap
files.

Fixes: 8dcc1a9d90c1 ("fs: New zonefs file system")
Cc: <stable@vger.kernel.org>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
2021-03-16 17:38:35 +09:00
Linus Torvalds
1a4431a5db AFS fixes
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEqG5UsNXhtOCrfGQP+7dXa6fLC2sFAmBPlQAACgkQ+7dXa6fL
 C2sfvg/+OQ4aI7aqh8HahAjxLYaRShaKHQAxnjiyL8o7BELYr5PngH7cLyBw+mt8
 WsymoT1dSSl+n8X/bsYRwP+FsL/jsF6PgjerVkSs9Z131lwKqaAaAE6dx1pGL1fj
 7I7uqogytaEUJPe1fD4TDk3i1oozm3MuYfXUu1Gi8sm1mnsfUaWndROYSuqwAs1O
 TatZa0vCdAsBhC3zgE/K1hA7/tZMXLsh+6FwUqkztOcKg7P44yAfuxMZZ5SNs2FG
 fE5N8YQRpSZwg/RxCsZnH3Zi0jIonVr1OKG56LXTKzW96eEZd4TP2YqZGD/LhQOU
 lPkp1LlqhH9meelRP9x7iaDXeIGRTPjlI/Te+xnOIyCQV9gMHgCwOrODaLf5/QJa
 JQpL8nsxCAQb3yAIvmojjld4D2j9e/adickbwZZHdBV8EkXMEeL1d+5eEEDcgOmN
 RQmubY/79KDXHcgmUBcoNu+8Q/MYcboc+ZBwtgZ4B116KpjvGo7s/scUHsSpwgG9
 kR55gx73xoYAzmViRB7uLPUQXkQYHkDZgV42f1Sx1qXucTFMmppYMPfkTuIQcxkb
 zKexFcIq7WZ6OuBHZdAlqW82wpA7/QWscy4y9jMukiX9L0MHliW5v1BH1rHOxe6W
 agFp/O+u920ebymq9NnJvzD02Wyu1nPFslXpQ+vZJ3cEig8pmEg=
 =0TI9
 -----END PGP SIGNATURE-----

Merge tag 'afs-fixes-20210315' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs

Pull AFS fixes from David Howells:

 - Fix an oops in AFS that can be triggered by accessing one of the
   afs.yfs.* xattrs against an OpenAFS server - for instance by commands
   like "cp -a"[1], "rsync -X" or getfattr[2]. These try and copy all of
   the xattrs.

   cp and rsync should pay attention to the list in /etc/xattr.conf, but
   cp doesn't on Ubuntu and rsync doesn't seem to on Ubuntu or Fedora.
   xattr.conf has been modified upstream[3], and a new version has just
   been cut that includes it. I've logged a bug against rsync for the
   problem there[4].

 - Stop listing "afs.*" xattrs[5][6][7], but particularly ACL ones[8] so
   that they don't confuse cp and rsync.

   This removes them from the list returned by listxattr(), but they're
   still available to get/set.

Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003498.html [1]
Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003501.html [2]
Link: https://git.savannah.nongnu.org/cgit/attr.git/commit/?id=74da517cc655a82ded715dea7245ce88ebc91b98 [3]
Link: https://github.com/WayneD/rsync/issues/163 [4]
Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003516.html [5]
Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003524.html [6]
Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003565.html # v1
Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003568.html [7]
Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003570.html [8]
Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003571.html # v2

* tag 'afs-fixes-20210315' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
  afs: Stop listxattr() from listing "afs.*" attributes
  afs: Fix accessing YFS xattrs on a non-YFS server
2021-03-15 16:36:40 -07:00
David Howells
a7889c6320 afs: Stop listxattr() from listing "afs.*" attributes
afs_listxattr() lists all the available special afs xattrs (i.e. those in
the "afs.*" space), no matter what type of server we're dealing with.  But
OpenAFS servers, for example, cannot deal with some of the extra-capable
attributes that AuriStor (YFS) servers provide.  Unfortunately, the
presence of the afs.yfs.* attributes causes errors[1] for anything that
tries to read them if the server is of the wrong type.

Fix the problem by removing afs_listxattr() so that none of the special
xattrs are listed (AFS doesn't support xattrs).  It does mean, however,
that getfattr won't list them, though they can still be accessed with
getxattr() and setxattr().

This can be tested with something like:

	getfattr -d -m ".*" /afs/example.com/path/to/file

With this change, none of the afs.* attributes should be visible.

Changes:
ver #2:
 - Hide all of the afs.* xattrs, not just the ACL ones.

Fixes: ae46578b963f ("afs: Get YFS ACLs and information through xattrs")
Reported-by: Gaja Sophie Peters <gaja.peters@math.uni-hamburg.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Gaja Sophie Peters <gaja.peters@math.uni-hamburg.de>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003502.html [1]
Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003567.html # v1
Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003573.html # v2
2021-03-15 17:09:54 +00:00
David Howells
64fcbb6158 afs: Fix accessing YFS xattrs on a non-YFS server
If someone attempts to access YFS-related xattrs (e.g. afs.yfs.acl) on a
file on a non-YFS AFS server (such as OpenAFS), then the kernel will jump
to a NULL function pointer because the afs_fetch_acl_operation descriptor
doesn't point to a function for issuing an operation on a non-YFS
server[1].

Fix this by making afs_wait_for_operation() check that the issue_afs_rpc
method is set before jumping to it and setting -ENOTSUPP if not.  This fix
also covers other potential operations that also only exist on YFS servers.

afs_xattr_get/set_yfs() then need to translate -ENOTSUPP to -ENODATA as the
former error is internal to the kernel.

The bug shows up as an oops like the following:

	BUG: kernel NULL pointer dereference, address: 0000000000000000
	[...]
	Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
	[...]
	Call Trace:
	 afs_wait_for_operation+0x83/0x1b0 [kafs]
	 afs_xattr_get_yfs+0xe6/0x270 [kafs]
	 __vfs_getxattr+0x59/0x80
	 vfs_getxattr+0x11c/0x140
	 getxattr+0x181/0x250
	 ? __check_object_size+0x13f/0x150
	 ? __fput+0x16d/0x250
	 __x64_sys_fgetxattr+0x64/0xb0
	 do_syscall_64+0x49/0xc0
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9
	RIP: 0033:0x7fb120a9defe

This was triggered with "cp -a" which attempts to copy xattrs, including
afs ones, but is easier to reproduce with getfattr, e.g.:

	getfattr -d -m ".*" /afs/openafs.org/

Fixes: e49c7b2f6de7 ("afs: Build an abstraction around an "operation" concept")
Reported-by: Gaja Sophie Peters <gaja.peters@math.uni-hamburg.de>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Gaja Sophie Peters <gaja.peters@math.uni-hamburg.de>
Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
cc: linux-afs@lists.infradead.org
Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003498.html [1]
Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003566.html # v1
Link: http://lists.infradead.org/pipermail/linux-afs/2021-March/003572.html # v2
2021-03-15 17:01:18 +00:00
Filipe Manana
e3d3b41576 btrfs: zoned: fix linked list corruption after log root tree allocation failure
When using a zoned filesystem, while syncing the log, if we fail to
allocate the root node for the log root tree, we are not removing the
log context we allocated on stack from the list of log contexts of the
log root tree. This means after the return from btrfs_sync_log() we get
a corrupted linked list.

Fix this by allocating the node before adding our stack allocated context
to the list of log contexts of the log root tree.

Fixes: 3ddebf27fcd3a9 ("btrfs: zoned: reorder log node allocation on zoned filesystem")
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-03-15 16:57:19 +01:00
Qu Wenruo
a3ee79bd8f btrfs: fix qgroup data rsv leak caused by falloc failure
[BUG]
When running fsstress with only falloc workload, and a very low qgroup
limit set, we can get qgroup data rsv leak at unmount time.

 BTRFS warning (device dm-0): qgroup 0/5 has unreleased space, type 0 rsv 20480
 BTRFS error (device dm-0): qgroup reserved space leaked

The minimal reproducer looks like:

  #!/bin/bash
  dev=/dev/test/test
  mnt="/mnt/btrfs"
  fsstress=~/xfstests-dev/ltp/fsstress
  runtime=8

  workload()
  {
          umount $dev &> /dev/null
          umount $mnt &> /dev/null
          mkfs.btrfs -f $dev > /dev/null
          mount $dev $mnt

          btrfs quota en $mnt
          btrfs quota rescan -w $mnt
          btrfs qgroup limit 16m 0/5 $mnt

          $fsstress -w -z -f creat=10 -f fallocate=10 -p 2 -n 100 \
  		-d $mnt -v > /tmp/fsstress

          umount $mnt
          if dmesg | grep leak ; then
		echo "!!! FAILED !!!"
  		exit 1
          fi
  }

  for (( i=0; i < $runtime; i++)); do
          echo "=== $i/$runtime==="
          workload
  done

Normally it would fail before round 4.

[CAUSE]
In function insert_prealloc_file_extent(), we first call
btrfs_qgroup_release_data() to know how many bytes are reserved for
qgroup data rsv.

Then use that @qgroup_released number to continue our work.

But after we call btrfs_qgroup_release_data(), we should either queue
@qgroup_released to delayed ref or free them manually in error path.

Unfortunately, we lack the error handling to free the released bytes,
leaking qgroup data rsv.

All the error handling function outside won't help at all, as we have
released the range, meaning in inode io tree, the EXTENT_QGROUP_RESERVED
bit is already cleared, thus all btrfs_qgroup_free_data() call won't
free any data rsv.

[FIX]
Add free_qgroup tag to manually free the released qgroup data rsv.

Reported-by: Nikolay Borisov <nborisov@suse.com>
Reported-by: David Sterba <dsterba@suse.cz>
Fixes: 9729f10a608f ("btrfs: inode: move qgroup reserved space release to the callers of insert_reserved_file_extent()")
CC: stable@vger.kernel.org # 5.10+
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-03-15 16:57:15 +01:00
Qu Wenruo
fbf48bb0b1 btrfs: track qgroup released data in own variable in insert_prealloc_file_extent
There is a piece of weird code in insert_prealloc_file_extent(), which
looks like:

	ret = btrfs_qgroup_release_data(inode, file_offset, len);
	if (ret < 0)
		return ERR_PTR(ret);
	if (trans) {
		ret = insert_reserved_file_extent(trans, inode,
						  file_offset, &stack_fi,
						  true, ret);
	...
	}
	extent_info.is_new_extent = true;
	extent_info.qgroup_reserved = ret;
	...

Note how the variable @ret is abused here, and if anyone is adding code
just after btrfs_qgroup_release_data() call, it's super easy to
overwrite the @ret and cause tons of qgroup related bugs.

Fix such abuse by introducing new variable @qgroup_released, so that we
won't reuse the existing variable @ret.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-03-15 16:57:12 +01:00
Qu Wenruo
d2dcc8ed8e btrfs: fix wrong offset to zero out range beyond i_size
[BUG]
The test generic/091 fails , with the following output:

  fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -W
  mapped writes DISABLED
  Seed set to 1
  main: filesystem does not support fallocate mode FALLOC_FL_COLLAPSE_RANGE, disabling!
  main: filesystem does not support fallocate mode FALLOC_FL_INSERT_RANGE, disabling!
  skipping zero size read
  truncating to largest ever: 0xe400
  copying to largest ever: 0x1f400
  cloning to largest ever: 0x70000
  cloning to largest ever: 0x77000
  fallocating to largest ever: 0x7a120
  Mapped Read: non-zero data past EOF (0x3a7ff) page offset 0x800 is 0xf2e1 <<<
  ...

[CAUSE]
In commit c28ea613fafa ("btrfs: subpage: fix the false data csum mismatch error")
end_bio_extent_readpage() changes to only zero the range inside the bvec
for incoming subpage support.

But that commit is using incorrect offset to calculate the start.

For subpage, we can have a case that the whole bvec is beyond isize,
thus we need to calculate the correct offset.

But the offending commit is using @end (bvec end), other than @start
(bvec start) to calculate the start offset.

This means, we only zero the last byte of the bvec, not from the isize.
This stupid bug makes the range beyond isize is not properly zeroed, and
failed above test.

[FIX]
Use correct @start to calculate the range start.

Reported-by: kernel test robot <oliver.sang@intel.com>
Fixes: c28ea613fafa ("btrfs: subpage: fix the false data csum mismatch error")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2021-03-15 16:56:54 +01:00
Christoph Hellwig
8723d5ba8b xfs: also reject BULKSTAT_SINGLE in a mount user namespace
BULKSTAT_SINGLE exposed the ondisk uids/gids just like bulkstat, and can
be called on any inode, including ones not visible in the current mount.

Fixes: f736d93d76d3 ("xfs: support idmapped mounts")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2021-03-15 08:50:41 -07:00
Darrick J. Wong
d336f7ebc6 xfs: force log and push AIL to clear pinned inodes when aborting mount
If we allocate quota inodes in the process of mounting a filesystem but
then decide to abort the mount, it's possible that the quota inodes are
sitting around pinned by the log.  Now that inode reclaim relies on the
AIL to flush inodes, we have to force the log and push the AIL in
between releasing the quota inodes and kicking off reclaim to tear down
all the incore inodes.  Do this by extracting the bits we need from the
unmount path and reusing them.  As an added bonus, failed writes during
a failed mount will not retry forever now.

This was originally found during a fuzz test of metadata directories
(xfs/1546), but the actual symptom was that reclaim hung up on the quota
inodes.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2021-03-15 08:50:40 -07:00
Pavel Begunkov
b7f5a0bfe2 io_uring: fix sqpoll cancellation via task_work
Running sqpoll cancellations via task_work_run() is a bad idea because
it depends on other task works to be run, but those may be locked in
currently running task_work_run() because of how it's (splicing the list
in batches).

Enqueue and run them through a separate callback head, namely
struct io_sq_data::park_task_work. As a nice bonus we now precisely
control where it's run, that's much safer than guessing where it can
happen as it was before.

Reported-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-15 09:32:40 -06:00
Pavel Begunkov
9b46571142 io_uring: add generic callback_head helpers
We already have helpers to run/add callback_head but taking ctx and
working with ctx->exit_task_work. Extract generic versions of them
implemented in terms of struct callback_head, it will be used later.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-15 09:32:40 -06:00
Pavel Begunkov
9e138a4834 io_uring: fix concurrent parking
If io_sq_thread_park() of one task got rescheduled right after
set_bit(), before it gets back to mutex_lock() there can happen
park()/unpark() by another task with SQPOLL locking again and
continuing running never seeing that first set_bit(SHOULD_PARK),
so won't even try to put the mutex down for parking.

It will get parked eventually when SQPOLL drops the lock for reschedule,
but may be problematic and will get in the way of further fixes.

Account number of tasks waiting for parking with a new atomic variable
park_pending and adjust SHOULD_PARK accordingly. It doesn't entirely
replaces SHOULD_PARK bit with this atomic var because it's convenient
to have it as a bit in the state and will help to do optimisations
later.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-15 09:32:40 -06:00
Pavel Begunkov
f6d54255f4 io_uring: halt SQO submission on ctx exit
io_sq_thread_finish() is called in io_ring_ctx_free(), so SQPOLL task is
potentially running submitting new requests. It's not a disaster because
of using a "try" variant of percpu_ref_get, but is far from nice.

Remove ctx from the sqd ctx list earlier, before cancellation loop, so
SQPOLL can't find it and so won't submit new requests.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-15 09:32:40 -06:00
Pavel Begunkov
09a6f4efaa io_uring: replace sqd rw_semaphore with mutex
The only user of read-locking of sqd->rw_lock is sq_thread itself, which
is by definition alone, so we don't really need rw_semaphore, but mutex
will do. Replace it with a mutex, and kill read-to-write upgrading and
extra task_work handling in io_sq_thread().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-15 09:32:40 -06:00
Pavel Begunkov
180f829fe4 io_uring: fix complete_post use ctx after free
If io_req_complete_post() put not a final ref, we can't rely on the
request's ctx ref, and so ctx may potentially be freed while
complete_post() is in io_cqring_ev_posted()/etc.

In that case get an additional ctx reference, and put it in the end, so
protecting following io_cqring_ev_posted(). And also prolong ctx
lifetime until spin_unlock happens, as we do with mutexes, so added
percpu_ref_get() doesn't race with ctx free.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-15 09:32:24 -06:00
Pavel Begunkov
efe814a471 io_uring: fix ->flags races by linked timeouts
It's racy to modify req->flags from a not owning context, e.g. linked
timeout calling req_set_fail_links() for the master request might race
with that request setting/clearing flags while being executed
concurrently. Just remove req_set_fail_links(prev) from
io_link_timeout_fn(), io_async_find_and_cancel() and functions down the
line take care of setting the fail bit.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-15 09:31:19 -06:00
Andrew Price
62dd0f98a0 gfs2: Flag a withdraw if init_threads() fails
Interrupting mount with ^C quickly enough can cause the kthread_run()
calls in gfs2's init_threads() to fail and the error path leads to a
deadlock on the s_umount rwsem. The abridged chain of events is:

  [mount path]
  get_tree_bdev()
    sget_fc()
      alloc_super()
        down_write_nested(&s->s_umount, SINGLE_DEPTH_NESTING); [acquired]
    gfs2_fill_super()
      gfs2_make_fs_rw()
        init_threads()
          kthread_run()
            ( Interrupted )
      [Error path]
      gfs2_gl_hash_clear()
        flush_workqueue(glock_workqueue)
          wait_for_completion()

  [workqueue context]
  glock_work_func()
    run_queue()
      do_xmote()
        freeze_go_sync()
          freeze_super()
            down_write(&sb->s_umount) [deadlock]

In freeze_go_sync() there is a gfs2_withdrawn() check that we can use to
make sure freeze_super() is not called in the error path, so add a
gfs2_withdraw_delayed() call when init_threads() fails.

Ref: https://bugzilla.kernel.org/show_bug.cgi?id=212231

Reported-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: Andrew Price <anprice@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-03-15 15:32:42 +01:00
Vincent Whitchurch
05946d4b7a cifs: Fix preauth hash corruption
smb311_update_preauth_hash() uses the shash in server->secmech without
appropriate locking, and this can lead to sessions corrupting each
other's preauth hashes.

The following script can easily trigger the problem:

	#!/bin/sh -e

	NMOUNTS=10
	for i in $(seq $NMOUNTS);
		mkdir -p /tmp/mnt$i
		umount /tmp/mnt$i 2>/dev/null || :
	done
	while :; do
		for i in $(seq $NMOUNTS); do
			mount -t cifs //192.168.0.1/test /tmp/mnt$i -o ... &
		done
		wait
		for i in $(seq $NMOUNTS); do
			umount /tmp/mnt$i
		done
	done

Usually within seconds this leads to one or more of the mounts failing
with the following errors, and a "Bad SMB2 signature for message" is
seen in the server logs:

 CIFS: VFS: \\192.168.0.1 failed to connect to IPC (rc=-13)
 CIFS: VFS: cifs_mount failed w/return code = -13

Fix it by holding the server mutex just like in the other places where
the shashes are used.

Fixes: 8bd68c6e47abff34e4 ("CIFS: implement v3.11 preauth integrity")
Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
CC: <stable@vger.kernel.org>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-03-14 18:14:32 -05:00
Shyam Prasad N
5171317dfd cifs: update new ACE pointer after populate_new_aces.
After the fix for retaining externally set ACEs with cifsacl and
modefromsid,idsfromsid, there was an issue in populating the
inherited ACEs after setting the ACEs introduced by these two modes.
Fixed this by updating the ACE pointer again after the call to
populate_new_aces.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Rohith Surabattula <rohiths@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-03-14 18:14:32 -05:00
Linus Torvalds
50eb842fe5 Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "28 patches.

  Subsystems affected by this series: mm (memblock, pagealloc, hugetlb,
  highmem, kfence, oom-kill, madvise, kasan, userfaultfd, memcg, and
  zram), core-kernel, kconfig, fork, binfmt, MAINTAINERS, kbuild, and
  ia64"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (28 commits)
  zram: fix broken page writeback
  zram: fix return value on writeback_store
  mm/memcg: set memcg when splitting page
  mm/memcg: rename mem_cgroup_split_huge_fixup to split_page_memcg and add nr_pages argument
  ia64: fix ptrace(PTRACE_SYSCALL_INFO_EXIT) sign
  ia64: fix ia64_syscall_get_set_arguments() for break-based syscalls
  mm/userfaultfd: fix memory corruption due to writeprotect
  kasan: fix KASAN_STACK dependency for HW_TAGS
  kasan, mm: fix crash with HW_TAGS and DEBUG_PAGEALLOC
  mm/madvise: replace ptrace attach requirement for process_madvise
  include/linux/sched/mm.h: use rcu_dereference in in_vfork()
  kfence: fix reports if constant function prefixes exist
  kfence, slab: fix cache_alloc_debugcheck_after() for bulk allocations
  kfence: fix printk format for ptrdiff_t
  linux/compiler-clang.h: define HAVE_BUILTIN_BSWAP*
  MAINTAINERS: exclude uapi directories in API/ABI section
  binfmt_misc: fix possible deadlock in bm_register_write
  mm/highmem.c: fix zero_user_segments() with start > end
  hugetlb: do early cow when page pinned on src mm
  mm: use is_cow_mapping() across tree where proper
  ...
2021-03-14 12:23:34 -07:00
Jens Axboe
9e15c3a0ce io_uring: convert io_buffer_idr to XArray
Like we did for the personality idr, convert the IO buffer idr to use
XArray. This avoids a use-after-free on removal of entries, since idr
doesn't like doing so from inside an iterator, and it nicely reduces
the amount of code we need to support this feature.

Fixes: 5a2e745d4d43 ("io_uring: buffer registration infrastructure")
Cc: stable@vger.kernel.org
Cc: Matthew Wilcox <willy@infradead.org>
Cc: yangerkun <yangerkun@huawei.com>
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-14 09:56:14 -06:00
Linus Torvalds
420623430a Change since last update:
Fix an urgent regression introduced by commit baa2c7c97153 ("block:
 set .bi_max_vecs as actual allocated vector number"), which could
 cause unexpected hung since linux 5.12-rc1.
 
 Resolve it by avoiding using bio->bi_max_vecs completely.
 -----BEGIN PGP SIGNATURE-----
 
 iIsEABYIADMWIQThPAmQN9sSA0DVxtI5NzHcH7XmBAUCYEpAyBUcaHNpYW5na2Fv
 QHJlZGhhdC5jb20ACgkQOTcx3B+15gS55wD9GnsRm3ABN7AUKEX1lcGBt67dTEfv
 587cRSwJWHHbAl8A/0yLTt1CsnPXXxBchSGkIZ3MmQ/q2OVJ5o4rt9FRjMEC
 =opvX
 -----END PGP SIGNATURE-----

Merge tag 'erofs-for-5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs

Pull erofs fix from Gao Xiang:
 "Fix an urgent regression introduced by commit baa2c7c97153 ("block:
  set .bi_max_vecs as actual allocated vector number"), which could
  cause unexpected hung since linux 5.12-rc1.

  Resolve it by avoiding using bio->bi_max_vecs completely"

* tag 'erofs-for-5.12-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
  erofs: fix bio->bi_max_vecs behavior change
2021-03-13 12:26:22 -08:00
Lior Ribak
e7850f4d84 binfmt_misc: fix possible deadlock in bm_register_write
There is a deadlock in bm_register_write:

First, in the begining of the function, a lock is taken on the binfmt_misc
root inode with inode_lock(d_inode(root)).

Then, if the user used the MISC_FMT_OPEN_FILE flag, the function will call
open_exec on the user-provided interpreter.

open_exec will call a path lookup, and if the path lookup process includes
the root of binfmt_misc, it will try to take a shared lock on its inode
again, but it is already locked, and the code will get stuck in a deadlock

To reproduce the bug:
$ echo ":iiiii:E::ii::/proc/sys/fs/binfmt_misc/bla:F" > /proc/sys/fs/binfmt_misc/register

backtrace of where the lock occurs (#5):
0  schedule () at ./arch/x86/include/asm/current.h:15
1  0xffffffff81b51237 in rwsem_down_read_slowpath (sem=0xffff888003b202e0, count=<optimized out>, state=state@entry=2) at kernel/locking/rwsem.c:992
2  0xffffffff81b5150a in __down_read_common (state=2, sem=<optimized out>) at kernel/locking/rwsem.c:1213
3  __down_read (sem=<optimized out>) at kernel/locking/rwsem.c:1222
4  down_read (sem=<optimized out>) at kernel/locking/rwsem.c:1355
5  0xffffffff811ee22a in inode_lock_shared (inode=<optimized out>) at ./include/linux/fs.h:783
6  open_last_lookups (op=0xffffc9000022fe34, file=0xffff888004098600, nd=0xffffc9000022fd10) at fs/namei.c:3177
7  path_openat (nd=nd@entry=0xffffc9000022fd10, op=op@entry=0xffffc9000022fe34, flags=flags@entry=65) at fs/namei.c:3366
8  0xffffffff811efe1c in do_filp_open (dfd=<optimized out>, pathname=pathname@entry=0xffff8880031b9000, op=op@entry=0xffffc9000022fe34) at fs/namei.c:3396
9  0xffffffff811e493f in do_open_execat (fd=fd@entry=-100, name=name@entry=0xffff8880031b9000, flags=<optimized out>, flags@entry=0) at fs/exec.c:913
10 0xffffffff811e4a92 in open_exec (name=<optimized out>) at fs/exec.c:948
11 0xffffffff8124aa84 in bm_register_write (file=<optimized out>, buffer=<optimized out>, count=19, ppos=<optimized out>) at fs/binfmt_misc.c:682
12 0xffffffff811decd2 in vfs_write (file=file@entry=0xffff888004098500, buf=buf@entry=0xa758d0 ":iiiii:E::ii::i:CF
", count=count@entry=19, pos=pos@entry=0xffffc9000022ff10) at fs/read_write.c:603
13 0xffffffff811defda in ksys_write (fd=<optimized out>, buf=0xa758d0 ":iiiii:E::ii::i:CF
", count=19) at fs/read_write.c:658
14 0xffffffff81b49813 in do_syscall_64 (nr=<optimized out>, regs=0xffffc9000022ff58) at arch/x86/entry/common.c:46
15 0xffffffff81c0007c in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:120

To solve the issue, the open_exec call is moved to before the write
lock is taken by bm_register_write

Link: https://lkml.kernel.org/r/20210228224414.95962-1-liorribak@gmail.com
Fixes: 948b701a607f1 ("binfmt_misc: add persistent opened binary handler for containers")
Signed-off-by: Lior Ribak <liorribak@gmail.com>
Acked-by: Helge Deller <deller@gmx.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-13 11:27:30 -08:00
Peter Xu
ca6eb14d64 mm: use is_cow_mapping() across tree where proper
After is_cow_mapping() is exported in mm.h, replace some manual checks
elsewhere throughout the tree but start to use the new helper.

Link: https://lkml.kernel.org/r/20210217233547.93892-5-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Jason Gunthorpe <jgg@ziepe.ca>
Cc: VMware Graphics <linux-graphics-maintainer@vmware.com>
Cc: Roland Scheidegger <sroland@vmware.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Gal Pressman <galpress@amazon.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Kirill Shutemov <kirill@shutemov.name>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Wei Zhang <wzam@amazon.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-03-13 11:27:30 -08:00
Jens Axboe
16efa4fce3 io_uring: allow IO worker threads to be frozen
With the freezer using the proper signaling to notify us of when it's
time to freeze a thread, we can re-enable normal freezer usage for the
IO threads. Ensure that SQPOLL, io-wq, and the io-wq manager call
try_to_freeze() appropriately, and remove the default setting of
PF_NOFREEZE from create_io_thread().

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-12 20:26:13 -07:00
Linus Torvalds
f296bfd5cd NFS Client Bugfixes for Linux 5.10-rc3
- Other fixes:
   - Fix PNFS_FLEXFILE_LAYOUT kconfig so it is possible to build into the kernel
   - Correct size calculationn for create reply length
   - Set memalloc_nofs_save() for sync tasks to prevent deadlocks
   - Don't revalidate directory permissions on lookup failure
   - Don't clear inode cache when lookup fails
   - Change functions to use nfs_set_cache_invalid() for proper delegation handling
   - Fix return value of _nfs4_get_security_label()
   - Return an error when attempting to remove system.nfs4_acl
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAmBLzNcACgkQ18tUv7Cl
 QOv8cBAAy7vYroCt0CbJpEWplMnIJ1VtbZ/J3Y6qm3pT+ZuS5fzi2XJs2VjA4h4b
 5W9TVmPEH0nYo8yueRa8J/mpAypjQhhvxQZkUEbCWhSsVdSSw5nyTlfTcAamxoSI
 alpEciUQUBjCTZGpyhHxR+TqfY2BKpSvwZtRtrOCqC2iTlfpsDaEpYg01obUvBk/
 BXANoV/vM5KL13WeHdrVT9A5SjQgTDpFlbeIZUxl3hgoDZkHnK7FHiIsClMu1/MA
 R9MDZLymamP4OcHjzT/5zrzgdnroJFoE75Shcd9jWZwONbsi/83JMkiYPHmkkGsu
 UGsdXO2ovEhbH5lq0t+6oNdDZKRhKHGp40RAZkzg+ohpnsM8KLL/UjCfBlznSbRL
 qSByl62/FkQsAB9V91q/Uk2Nvj3mTWkJWxx62X/Q0MP9YYwU7dNULpYAN84HKaUs
 Nw+wSI28V27LpGvLUg4z7AAUtQsFqmWMOuQuGuK5IgOw+r3B9LgeLQNt4LzY7VmH
 ck8KR9n+E6U1+ZeopRNeyepftfn297ZMCd8gRv03yWzuLvUwcsHyrfIlP7zaLOt2
 LYlyVDgqfTFdWzqvBkXigTO4uIfPW3dkGKh3OF4R0vQVfE0Wo9v53Oi6s5Hz3Szz
 g74cNLWXW9HXrk78ViPm6XP0fWRfQGW9YEXlImlR8Y+403QDLn8=
 =UxFm
 -----END PGP SIGNATURE-----

Merge tag 'nfs-for-5.12-2' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client bugfixes from Anna Schumaker:
 "These are mostly fixes for issues discovered at the recent NFS
  bakeathon:

   - Fix PNFS_FLEXFILE_LAYOUT kconfig so it is possible to build
     into the kernel

   - Correct size calculationn for create reply length

   - Set memalloc_nofs_save() for sync tasks to prevent deadlocks

   - Don't revalidate directory permissions on lookup failure

   - Don't clear inode cache when lookup fails

   - Change functions to use nfs_set_cache_invalid() for proper
     delegation handling

   - Fix return value of _nfs4_get_security_label()

   - Return an error when attempting to remove system.nfs4_acl"

* tag 'nfs-for-5.12-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  nfs: we don't support removing system.nfs4_acl
  NFSv4.2: fix return value of _nfs4_get_security_label()
  NFS: Fix open coded versions of nfs_set_cache_invalid() in NFSv4
  NFS: Fix open coded versions of nfs_set_cache_invalid()
  NFS: Clean up function nfs_mark_dir_for_revalidate()
  NFS: Don't gratuitously clear the inode cache when lookup failed
  NFS: Don't revalidate the directory permissions on a lookup failure
  SUNRPC: Set memalloc_nofs_save() for sync tasks
  NFS: Correct size calculation for create reply length
  nfs: fix PNFS_FLEXFILE_LAYOUT Kconfig default
2021-03-12 14:19:35 -08:00
Linus Torvalds
ce307084c9 block-5.12-2021-03-12-v2
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmBLzKsQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpi0ID/9djN1db0OrAjQgWdOQsKwzcPG4fmVRHJAu
 Zi8SPRj0ByonWGaPWjiSi297/j00dfYFFIXaB1Pfo4j0wX0IK8bJINl0G8SN6Dag
 WYBBrT/5rCQgD8fjQ1XhuzuqLwxwcZfYXAnCAlqABG18nPk532D4dX2CMEasl8F7
 XWTTj5PqHDN4bCcriH1GEA5S+2nmoz5YXjNZEDcY3/pQMdyb8Jo9mRfZubkrnRxK
 c9fz2LjUz0IRaSb+9PILY5qDLOSIh+vHOIk/3BKW9DoqU/S3kTTr4twqnOclfVPH
 VgJM9b+sHveVCztCJ9bnNGkW7HWjUQa8gb/B40NBxKEhw7w/HCjykhhxd+QTUQTM
 GJVMRGYWhzuUEuU1M1hArPua0GLmPKSvC0CRgbKRmgPNjshTquZPJnBBFwv2wZKQ
 GkrwktdK9ihE1ya4gu20MupST3PIpT3jtc6NAizr6DCy0wJ0Z1X5KYnFdbtS79No
 I9qPC8lu3AcZq6NXdBfTO9ngIdiUwi9AfSYj7koS/4dmnVccVJmaj0/NNmVp2Ro3
 HtaObanBnTi9v8YHl8WgX6lq5RjuQ204fXmd0No4mHFvgxsl7YaX+JBts7S3A2Nf
 PoQLqmulcLmzT3EVuEg279aXw2rbnyWHARbF/5/tIr4JcugtLJhwFnBA5YgFreq9
 lSbqgoKSHw==
 =qHyO
 -----END PGP SIGNATURE-----

Merge tag 'block-5.12-2021-03-12-v2' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Mostly just random fixes all over the map.

  The only odd-one-out change is finally getting the rename of
  BIO_MAX_PAGES to BIO_MAX_VECS done. This should've been done with the
  multipage bvec change, but it's been left.

  Do it now to avoid hassles around changes piling up for the next merge
  window.

  Summary:

   - NVMe pull request:
      - one more quirk (Dmitry Monakhov)
      - fix max_zone_append_sectors initialization (Chaitanya Kulkarni)
      - nvme-fc reset/create race fix (James Smart)
      - fix status code on aborts/resets (Hannes Reinecke)
      - fix the CSS check for ZNS namespaces (Chaitanya Kulkarni)
      - fix a use after free in a debug printk in nvme-rdma (Lv Yunlong)

   - Follow-up NVMe error fix for NULL 'id' (Christoph)

   - Fixup for the bd_size_lock being IRQ safe, now that the offending
     driver has been dropped (Damien).

   - rsxx probe failure error return (Jia-Ju)

   - umem probe failure error return (Wei)

   - s390/dasd unbind fixes (Stefan)

   - blk-cgroup stats summing fix (Xunlei)

   - zone reset handling fix (Damien)

   - Rename BIO_MAX_PAGES to BIO_MAX_VECS (Christoph)

   - Suppress uevent trigger for hidden devices (Daniel)

   - Fix handling of discard on busy device (Jan)

   - Fix stale cache issue with zone reset (Shin'ichiro)"

* tag 'block-5.12-2021-03-12-v2' of git://git.kernel.dk/linux-block:
  nvme: fix the nsid value to print in nvme_validate_or_alloc_ns
  block: Discard page cache of zone reset target range
  block: Suppress uevent for hidden device when removed
  block: rename BIO_MAX_PAGES to BIO_MAX_VECS
  nvme-pci: add the DISABLE_WRITE_ZEROES quirk for a Samsung PM1725a
  nvme-rdma: Fix a use after free in nvmet_rdma_write_data_done
  nvme-core: check ctrl css before setting up zns
  nvme-fc: fix racing controller reset and create association
  nvme-fc: return NVME_SC_HOST_ABORTED_CMD when a command has been aborted
  nvme-fc: set NVME_REQ_CANCELLED in nvme_fc_terminate_exchange()
  nvme: add NVME_REQ_CANCELLED flag in nvme_cancel_request()
  nvme: simplify error logic in nvme_validate_ns()
  nvme: set max_zone_append_sectors nvme_revalidate_zones
  block: rsxx: fix error return code of rsxx_pci_probe()
  block: Fix REQ_OP_ZONE_RESET_ALL handling
  umem: fix error return code in mm_pci_probe()
  blk-cgroup: Fix the recursive blkg rwstat
  s390/dasd: fix hanging IO request during DASD driver unbind
  s390/dasd: fix hanging DASD driver unbind
  block: Try to handle busy underlying device on discard
2021-03-12 13:25:49 -08:00
Linus Torvalds
9278be92f2 io_uring-5.12-2021-03-12
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmBLtdcQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpqK9D/9sE6QDAmLCvW4+wsFawf+Md9tCE3F15quC
 Tptsa6IoR2UB01d06uavLJ5sGo0LeVQQP8+Nygz0TM7jSV39Odmr8geP8wyqSQwP
 ZHLasrnz3LGINFOmxwMz/xQbrYUXEhRah+nx9Me0ROWmtQ46MRBZlpjsxffKccC9
 SdkS6R8chfc/6HT6oQXMRRDtB4U4SjDdeX6VFIW5E2Z62h0xjhZrmY42fPmChjXR
 mmAa2medSmajlwKrmp/+6sCfu2vVRR7bZ5FbS/SoQyo3ZvMabXI3lWicSgtu1wAK
 iK9NFJEuJ34Fj4RxTSwQrj0eRX5BqZpWHUJ/1ecxc4tDRtaIXZuzPtblYrZ5fwYe
 5pBzXXNpVwhat1AvGp9BFH/4P3kxJDszUAuL7zRut6nHu8xFGDGbNJHezCtws/uZ
 i+90Qt5sfoYyXgMDAZuXS7AkJXKbdnajpwjXmZheL3MEj2EsVylcTVaW0MBdVjx1
 y0eAtOGUVj2rNOSthDT0ZlKql7PY9N3dhkRxJIzRlIIfBfg73UWkis7zOlFE8CCz
 y0rtsu+v/u22mU17v6gdVnTls/vbfiGSg4SutEK2Rv/Qqbjr+po+RXK14BJKBJR9
 JknAkQlBjagZmLZKlzRfCDqa62aFYwxC/eOeLGxSpInj0ncgKmWNpnFjXSyRBdPq
 stOCQF5aHQ==
 =40h0
 -----END PGP SIGNATURE-----

Merge tag 'io_uring-5.12-2021-03-12' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:
 "Not quite as small this week as I had hoped, but at least this should
  be the end of it. All the little known issues have been ironed out -
  most of it little stuff, but cancelations being the bigger part. Only
  minor tweaks and/or regular fixes expected beyond this point.

   - Fix the creds tracking for async (io-wq and SQPOLL)

   - Various SQPOLL fixes related to parking, sharing, forking, IOPOLL,
     completions, and life times. Much simpler now.

   - Make IO threads unfreezable by default, on account of a bug report
     that had them spinning on resume. Honestly not quite sure why
     thawing leaves us with a perpetual signal pending (causing the
     spin), but for now make them unfreezable like there were in 5.11
     and prior.

   - Move personality_idr to xarray, solving a use-after-free related to
     removing an entry from the iterator callback. Buffer idr needs the
     same treatment.

   - Re-org around and task vs context tracking, enabling the fixing of
     cancelations, and then cancelation fixes on top.

   - Various little bits of cleanups and hardening, and removal of now
     dead parts"

* tag 'io_uring-5.12-2021-03-12' of git://git.kernel.dk/linux-block: (34 commits)
  io_uring: fix OP_ASYNC_CANCEL across tasks
  io_uring: cancel sqpoll via task_work
  io_uring: prevent racy sqd->thread checks
  io_uring: remove useless ->startup completion
  io_uring: cancel deferred requests in try_cancel
  io_uring: perform IOPOLL reaping if canceler is thread itself
  io_uring: force creation of separate context for ATTACH_WQ and non-threads
  io_uring: remove indirect ctx into sqo injection
  io_uring: fix invalid ctx->sq_thread_idle
  kernel: make IO threads unfreezable by default
  io_uring: always wait for sqd exited when stopping SQPOLL thread
  io_uring: remove unneeded variable 'ret'
  io_uring: move all io_kiocb init early in io_init_req()
  io-wq: fix ref leak for req in case of exit cancelations
  io_uring: fix complete_post races for linked req
  io_uring: add io_disarm_next() helper
  io_uring: fix io_sq_offload_create error handling
  io-wq: remove unused 'user' member of io_wq
  io_uring: Convert personality_idr to XArray
  io_uring: clean R_DISABLED startup mess
  ...
2021-03-12 13:13:57 -08:00
Linus Torvalds
8d9d53de51 configfs fix for 5.12
- fix a use-after-free in __configfs_open_file
    (Daiyue Zhang)
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmBLs9ELHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYPlehAAxSw2zkzYxRvHQ+zhVRKu9x762/SOt+/3P+OuRoiN
 eGZnJi+ofLsoosj7F2Few0zKMWxMQivINpzsM8dMysvmULgbQ/xAlxPgvuk4B2I2
 l4mleFoah2BA/tnzdj9kdqv/WhzKVVhEmBgzpyTDGdRBduYAWmAhkrhXq+qC9ztY
 ArtY4Rvh61I5q12aIF8tFFbEJZqCKgZZ0W3bKO39upJF+kwoztbUUMpPF+YH4FW+
 7JitlaWFBAE89Vcf7BNqMNVk3DtLPy47WI+FaP1zmpy43XRNq2m6/FeLJhn2/S3X
 n31x6IAa4DiEJEn743czhCdAAltcMXxqVrVKF5tGfGh6mr8b96UwxjEN7U+Z4sgX
 gV8rQuHLdPc3dlTQjTuSvAAfrl3J8UpLVWLbva0vWaiUBu2/WVUfO6wJJ+ODUDaq
 woXrPtTqK8xQK2MhpOhPAvBHFsCSKqS7CvXcjOzTLJNUInN85WSVYA297r6IWr+G
 kRJpj6k8dC9e3/LbNEmrBeToKc7tPzYYcx2hlhfkaQRZUdddjuL98po7cJGeu0CR
 S10Zsry/8Lnhe2zIm7u4Hw4gEx73b+uCgZHK17OIRLHzUPmTmXaNh9eWN/jwp1/N
 wNhiPN2OJw9IRVkmTK8TFOeugEJo2VhUo51e36m5sskhowwEYDxtoZ8D7JbpKYS+
 sU8=
 =QljJ
 -----END PGP SIGNATURE-----

Merge tag 'configfs-for-5.12' of git://git.infradead.org/users/hch/configfs

Pull configfs fix from Christoph Hellwig:

 - fix a use-after-free in __configfs_open_file (Daiyue Zhang)

* tag 'configfs-for-5.12' of git://git.infradead.org/users/hch/configfs:
  configfs: fix a use-after-free in __configfs_open_file
2021-03-12 11:48:14 -08:00
Linus Torvalds
b77b5fdd05 Various gfs2 fixes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAmBLgwEUHGFncnVlbmJh
 QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTpzCxAAhp5mbg+/oQ6c4wULu/q0nm2gaPdN
 Bq8DnkOKLAs3Ncd7Ft3nrGkQZQzwCvu3LVxu4zU6hdylCtZnBsYRWI8nOCn4eQKd
 Le2qZcb00awxd/XqzNmtNZwDAfyCLXaPhZJ1mFUL+IWLm/eCW9/Vi0W6YGb4Egs9
 nKCVmBdnvJSeqSSM5RJ2C6bLSwrWLe98n5r5O2uNeBtmvy2fX6A/dbM+3K03YJYJ
 JAwn1awcnSRyOD+UKSYV1mBz6mHaEKGaGmI3TKhpFGEeyOLWi8EASt2O1NDRkllC
 z9UN6H9V70Fuci8pEkP3ju0T4jbVDMv6PfX17Ah7YfHChgH70Rx64NVyCaftNMyu
 zHxHgn4PmSBgF3J5MxMO7kQUjL8OipbvPEMOTwFT4iBC10O2X7/w+hCPI+coEIB8
 w9KsZPl/5ESWdkrlxzQM3fgFUPosp5z0c3rj0gXR6aWbyumSBNWytJogp72LNfX5
 W+w1OH8nmsSJjlzbYrZjcgBsf9RCPBgyWcePL/7t+kKgjG6LlAumFh1cK+seJXBb
 tYp1WFRP2bztXz57rMD5glOc9mysbgUWwgKbvUgj9PPWyT1S/7f4EZhN/GfZWg/h
 fx+dYtlWgLWQkQwEJ1aE8Hqc+hjYxIqnAHo2h725jKpfOIvWPxBZEvztX2SL3wXO
 DtKPBhmyFtPkbR0=
 =6g8z
 -----END PGP SIGNATURE-----

Merge tag 'gfs2-v5.12-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 fixes from Andreas Gruenbacher:
 "Various gfs2 fixes"

* tag 'gfs2-v5.12-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  gfs2: bypass log flush if the journal is not live
  gfs2: bypass signal_our_withdraw if no journal
  gfs2: fix use-after-free in trans_drain
  gfs2: make function gfs2_make_fs_ro() to void type
2021-03-12 11:46:09 -08:00
Pavel Begunkov
58f9937383 io_uring: fix OP_ASYNC_CANCEL across tasks
IORING_OP_ASYNC_CANCEL tries io-wq cancellation only for current task.
If it fails go over tctx_list and try it out for every single tctx.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-12 09:42:56 -07:00
Pavel Begunkov
521d6a737a io_uring: cancel sqpoll via task_work
1) The first problem is io_uring_cancel_sqpoll() ->
io_uring_cancel_task_requests() basically doing park(); park(); and so
hanging.

2) Another one is more subtle, when the master task is doing cancellations,
but SQPOLL task submits in-between the end of the cancellation but
before finish() requests taking a ref to the ctx, and so eternally
locking it up.

3) Yet another is a dying SQPOLL task doing io_uring_cancel_sqpoll() and
same io_uring_cancel_sqpoll() from the owner task, they race for
tctx->wait events. And there probably more of them.

Instead do SQPOLL cancellations from within SQPOLL task context via
task_work, see io_sqpoll_cancel_sync(). With that we don't need temporal
park()/unpark() during cancellation, which is ugly, subtle and anyway
doesn't allow to do io_run_task_work() properly.

io_uring_cancel_sqpoll() is called only from SQPOLL task context and
under sqd locking, so all parking is removed from there. And so,
io_sq_thread_[un]park() and io_sq_thread_stop() are not used now by
SQPOLL task, and that spare us from some headache.

Also remove ctx->sqd_list early to avoid 2). And kill tctx->sqpoll,
which is not used anymore.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-12 09:42:55 -07:00
Pavel Begunkov
26984fbf3a io_uring: prevent racy sqd->thread checks
SQPOLL thread to which we're trying to attach may be going away, it's
not nice but a more serious problem is if io_sq_offload_create() sees
sqd->thread==NULL, and tries to init it with a new thread. There are
tons of ways it can be exploited or fail.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-12 09:42:53 -07:00
Bob Peterson
0efc4976e3 gfs2: bypass log flush if the journal is not live
Patch fe3e397668775 ("gfs2: Rework the log space allocation logic")
changed gfs2_log_flush to reserve a set of journal blocks in case no
transaction is active.  However, gfs2_log_flush also gets called in
cases where we don't have an active journal, for example, for spectator
mounts.  In that case, trying to reserve blocks would sleep forever, but
we want gfs2_log_flush to be a no-op instead.

Fixes: fe3e397668775 ("gfs2: Rework the log space allocation logic")
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-03-12 15:52:48 +01:00
Pavel Begunkov
0df8ea602b io_uring: remove useless ->startup completion
We always do complete(&sqd->startup) almost right after sqd->thread
creation, either in the success path or in io_sq_thread_finish(). It's
specifically created not started for us to be able to set some stuff
like sqd->thread and io_uring_alloc_task_context() before following
right after wake_up_new_task().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-12 07:23:01 -07:00
Pavel Begunkov
e1915f76a8 io_uring: cancel deferred requests in try_cancel
As io_uring_cancel_files() and others let SQO to run between
io_uring_try_cancel_requests(), SQO may generate new deferred requests,
so it's safer to try to cancel them in it.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-12 07:23:00 -07:00
Bob Peterson
d5bf630f35 gfs2: bypass signal_our_withdraw if no journal
Before this patch, function signal_our_withdraw referenced the journal
inode immediately. But corrupt file systems may have some invalid
journals, in which case our attempt to read it in will withdraw and the
resulting signal_our_withdraw would dereference the NULL value.

This patch adds a check to signal_our_withdraw so that if the journal
has not yet been initialized, it simply returns and does the old-style
withdraw.

Thanks, Andy Price, for his analysis.

Reported-by: syzbot+50a8a9cf8127f2c6f5df@syzkaller.appspotmail.com
Fixes: 601ef0d52e96 ("gfs2: Force withdraw to replay journals and wait for it to finish")
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2021-03-12 14:55:23 +01:00
J. Bruce Fields
4f8be1f53b nfs: we don't support removing system.nfs4_acl
The NFSv4 protocol doesn't have any notion of reomoving an attribute, so
removexattr(path,"system.nfs4_acl") doesn't make sense.

There's no documented return value.  Arguably it could be EOPNOTSUPP but
I'm a little worried an application might take that to mean that we
don't support ACLs or xattrs.  How about EINVAL?

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2021-03-11 13:17:42 -05:00
Jens Axboe
d052d1d685 io_uring: perform IOPOLL reaping if canceler is thread itself
We bypass IOPOLL completion polling (and reaping) for the SQPOLL thread,
but if it's the thread itself invoking cancelations, then we still need
to perform it or no one will.

Fixes: 9936c7c2bc76 ("io_uring: deduplicate core cancellations sequence")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-11 10:49:20 -07:00
Jens Axboe
5c2469e0a2 io_uring: force creation of separate context for ATTACH_WQ and non-threads
Earlier kernels had SQPOLL threads that could share across anything, as
we grabbed the context we needed on a per-ring basis. This is no longer
the case, so only allow attaching directly if we're in the same thread
group. That is the common use case. For non-group tasks, just setup a
new context and thread as we would've done if sharing wasn't set. This
isn't 100% ideal in terms of CPU utilization for the forked and share
case, but hopefully that isn't much of a concern. If it is, there are
plans in motion for how to improve that. Most importantly, we want to
avoid app side regressions where sharing worked before and now doesn't.
With this patch, functionality is equivalent to previous kernels that
supported IORING_SETUP_ATTACH_WQ with SQPOLL.

Reported-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-11 10:17:56 -07:00
Olga Kornievskaia
b4250dd868 NFSD: fix error handling in NFSv4.0 callbacks
When the server tries to do a callback and a client fails it due to
authentication problems, we need the server to set callback down
flag in RENEW so that client can recover.

Suggested-by: Bruce Fields <bfields@redhat.com>
Signed-off-by: Olga Kornievskaia <kolga@netapp.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Benjamin Coddington <bcodding@redhat.com>
Link: https://lore.kernel.org/linux-nfs/FB84E90A-1A03-48B3-8BF7-D9D10AC2C9FE@oracle.com/T/#t
2021-03-11 10:58:49 -05:00
Eric Biggers
f053cf7aa6 ext4: fix error handling in ext4_end_enable_verity()
ext4 didn't properly clean up if verity failed to be enabled on a file:

- It left verity metadata (pages past EOF) in the page cache, which
  would be exposed to userspace if the file was later extended.

- It didn't truncate the verity metadata at all (either from cache or
  from disk) if an error occurred while setting the verity bit.

Fix these bugs by adding a call to truncate_inode_pages() and ensuring
that we truncate the verity metadata (both from cache and from disk) in
all error paths.  Also rework the code to cleanly separate the success
path from the error paths, which makes it much easier to understand.

Reported-by: Yunlei He <heyunlei@hihonor.com>
Fixes: c93d8f885809 ("ext4: add basic fs-verity support")
Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20210302200420.137977-2-ebiggers@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2021-03-11 10:38:50 -05:00
Christoph Hellwig
a8affc03a9 block: rename BIO_MAX_PAGES to BIO_MAX_VECS
Ever since the addition of multipage bio_vecs BIO_MAX_PAGES has been
horribly confusingly misnamed.  Rename it to BIO_MAX_VECS to stop
confusing users of the bio API.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20210311110137.1132391-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-11 07:47:48 -07:00
Daiyue Zhang
14fbbc8297 configfs: fix a use-after-free in __configfs_open_file
Commit b0841eefd969 ("configfs: provide exclusion between IO and removals")
uses ->frag_dead to mark the fragment state, thus no bothering with extra
refcount on config_item when opening a file. The configfs_get_config_item
was removed in __configfs_open_file, but not with config_item_put. So the
refcount on config_item will lost its balance, causing use-after-free
issues in some occasions like this:

Test:
1. Mount configfs on /config with read-only items:
drwxrwx--- 289 root   root            0 2021-04-01 11:55 /config
drwxr-xr-x   2 root   root            0 2021-04-01 11:54 /config/a
--w--w--w-   1 root   root         4096 2021-04-01 11:53 /config/a/1.txt
......

2. Then run:
for file in /config
do
echo $file
grep -R 'key' $file
done

3. __configfs_open_file will be called in parallel, the first one
got called will do:
if (file->f_mode & FMODE_READ) {
	if (!(inode->i_mode & S_IRUGO))
		goto out_put_module;
			config_item_put(buffer->item);
				kref_put()
					package_details_release()
						kfree()

the other one will run into use-after-free issues like this:
BUG: KASAN: use-after-free in __configfs_open_file+0x1bc/0x3b0
Read of size 8 at addr fffffff155f02480 by task grep/13096
CPU: 0 PID: 13096 Comm: grep VIP: 00 Tainted: G        W       4.14.116-kasan #1
TGID: 13096 Comm: grep
Call trace:
dump_stack+0x118/0x160
kasan_report+0x22c/0x294
__asan_load8+0x80/0x88
__configfs_open_file+0x1bc/0x3b0
configfs_open_file+0x28/0x34
do_dentry_open+0x2cc/0x5c0
vfs_open+0x80/0xe0
path_openat+0xd8c/0x2988
do_filp_open+0x1c4/0x2fc
do_sys_open+0x23c/0x404
SyS_openat+0x38/0x48

Allocated by task 2138:
kasan_kmalloc+0xe0/0x1ac
kmem_cache_alloc_trace+0x334/0x394
packages_make_item+0x4c/0x180
configfs_mkdir+0x358/0x740
vfs_mkdir2+0x1bc/0x2e8
SyS_mkdirat+0x154/0x23c
el0_svc_naked+0x34/0x38

Freed by task 13096:
kasan_slab_free+0xb8/0x194
kfree+0x13c/0x910
package_details_release+0x524/0x56c
kref_put+0xc4/0x104
config_item_put+0x24/0x34
__configfs_open_file+0x35c/0x3b0
configfs_open_file+0x28/0x34
do_dentry_open+0x2cc/0x5c0
vfs_open+0x80/0xe0
path_openat+0xd8c/0x2988
do_filp_open+0x1c4/0x2fc
do_sys_open+0x23c/0x404
SyS_openat+0x38/0x48
el0_svc_naked+0x34/0x38

To fix this issue, remove the config_item_put in
__configfs_open_file to balance the refcount of config_item.

Fixes: b0841eefd969 ("configfs: provide exclusion between IO and removals")
Signed-off-by: Daiyue Zhang <zhangdaiyue1@huawei.com>
Signed-off-by: Yi Chen <chenyi77@huawei.com>
Signed-off-by: Ge Qiu <qiuge@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2021-03-11 12:13:48 +01:00
Linus Torvalds
a74e6a014c s390 updates for 5.12-rc3
- fix various user space visible copy_to_user() instances which return the
   number of bytes left to copy instead of -EFAULT
 
 - make TMPFS_INODE64 available again for s390 and alpha, now that both
   architectures have been switched to 64-bit ino_t
   see commit 96c0a6a72d18 ("s390,alpha: switch to 64-bit ino_t")
 
 - make sure to release a shared hypervisor resource within the zcore device
   driver also on restart and power down; also remove unneeded surrounding
   debugfs_create return value checks
 
 - for the new hardware counter set device driver rename the uapi header file to
   be a bit more generic; also remove 60 second read limit which is not really
   necessary and without the limit the interface can be easier tested
 
 - some small cleanups, the largest being to convert all long long in our time
   and idle code to longs
 
 - update defconfigs
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAmBJH2QACgkQIg7DeRsp
 bsJtag//VEnPk6YUWcxNPTvyqbmZx+T/7HTR2qfiep3yjnTkyJ3w06BrrH3SNPKF
 8k1pya0tgcXXliOy+pMbl2NbpAP6Kx+GUk3OsV2TXJj8VM6wB9g3dogtpEMwxLvu
 1W5ZLOO9C+t6BGXkPh9gXdrzZAY2AYGZLlCgUocG9UU2AyDyTEPpjod1RJbvccof
 UQ02N6ClOTWYaGG2lW9aBjEr6vJcbTrPVP9OAw2uWcC4uOxr/vcM+KjAZbsrLSma
 cdqNsfWtGnHjI6ktfMCXpwSTwCKYBBiMgPxpa7YJwabTnZxjYXYiUVN+DSvByrF3
 muTnAsEnQYmA0jAcUGe1G9I2+wHOJrXtNq5cvfEpQIIerIlEjdEn5m1w3njccJdy
 9oPlE2apC0ItJBKTgPe2Zn1yU0WstmEZ58+QB5VpIw77U+FwujM/0HMVXF1XWGFb
 vk/ByX6IzkvSVPOT+ywyj81NQXqYqnLzANeMJXFH2ygT16Tr1fJVU4bOX6jXR9t5
 ezj051ZzNx4p2a3NmSpS1MJSz0Ko5coDoFmeACAm20RWRas0JbV4Z50SL/rUILCC
 UxElj4F41OhLYCUAo9eGSVD0Tb2xiOl9k+Wpl5Zn5c9DLJ/kxaBLohT8aWKdumA2
 x8aNjFoCFNLt9Mh2yCY6qv/Bd0477A3SODjnmXA7u+X1JiusJ8Y=
 =PsfM
 -----END PGP SIGNATURE-----

Merge tag 's390-5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 fixes from Heiko Carstens:

 - fix various user space visible copy_to_user() instances which return
   the number of bytes left to copy instead of -EFAULT

 - make TMPFS_INODE64 available again for s390 and alpha, now that both
   architectures have been switched to 64-bit ino_t (see commit
   96c0a6a72d18: "s390,alpha: switch to 64-bit ino_t")

 - make sure to release a shared hypervisor resource within the zcore
   device driver also on restart and power down; also remove unneeded
   surrounding debugfs_create return value checks

 - for the new hardware counter set device driver rename the uapi header
   file to be a bit more generic; also remove 60 second read limit which
   is not really necessary and without the limit the interface can be
   easier tested

 - some small cleanups, the largest being to convert all long long in
   our time and idle code to longs

 - update defconfigs

* tag 's390-5.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390: remove IBM_PARTITION and CONFIGFS_FS from zfcpdump defconfig
  s390: update defconfigs
  s390,alpha: make TMPFS_INODE64 available again
  s390/cio: return -EFAULT if copy_to_user() fails
  s390/tty3270: avoid comma separated statements
  s390/cpumf: remove unneeded semicolon
  s390/crypto: return -EFAULT if copy_to_user() fails
  s390/cio: return -EFAULT if copy_to_user() fails
  s390/cpumf: rename header file to hwctrset.h
  s390/zcore: release dump save area on restart or power down
  s390/zcore: no need to check return value of debugfs_create functions
  s390/cpumf: remove 60 seconds read limit
  s390/topology: remove always false if check
  s390/time,idle: get rid of unsigned long long
2021-03-10 13:15:16 -08:00