7702 Commits

Author SHA1 Message Date
Al Viro
7392ed1734 iov_iter_get_pages{,_alloc}(): cap the maxsize with MAX_RW_COUNT
All callers can and should handle iov_iter_get_pages() returning
fewer pages than requested.  All in-kernel ones do.  And it makes
the arithmetical overflow analysis much simpler...

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-07-06 16:27:17 -04:00
Al Viro
18fa9af726 iov_iter_bvec_advance(): don't bother with bvec_iter
do what we do for iovec/kvec; that ends up generating better code,
AFAICS.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-07-06 16:23:32 -04:00
Uladzislau Rezki (Sony)
5e21f2d577 lib/test_vmalloc: switch to prandom_u32()
A get_random_bytes() function can cause a high contention if it is called
across CPUs simultaneously.  Because it shares one lock per all CPUs:

<snip>
   class name     con-bounces  contentions   waittime-min   waittime-max waittime-total   waittime-avg    acq-bounces   acquisitions   holdtime-min   holdtime-max holdtime-total   holdtime-avg
   &crng->lock:   663145       665886        0.05           8.85         261966.66        0.39            7188152       13731279       0.04           11.89        2181582.30       0.16
   -----------
   &crng->lock    307835       [<00000000acba59cd>] _extract_crng+0x48/0x90
   &crng->lock    358051       [<00000000f0075abc>] _crng_backtrack_protect+0x32/0x90
   -----------
   &crng->lock    234241       [<00000000f0075abc>] _crng_backtrack_protect+0x32/0x90
   &crng->lock    431645       [<00000000acba59cd>] _extract_crng+0x48/0x90
<snip>

Switch from the get_random_bytes() to prandom_u32() that does not have any
internal contention when a random value is needed for the tests.

The reason is to minimize CPU cycles introduced by the test-suite itself
from the vmalloc performance metrics.

Link: https://lkml.kernel.org/r/20220607093449.3100-6-urezki@gmail.com
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-03 18:08:42 -07:00
Roman Gushchin
5035ebc644 mm: shrinkers: introduce debugfs interface for memory shrinkers
This commit introduces the /sys/kernel/debug/shrinker debugfs interface
which provides an ability to observe the state of individual kernel memory
shrinkers.

Because the feature adds some memory overhead (which shouldn't be large
unless there is a huge amount of registered shrinkers), it's guarded by a
config option (enabled by default).

This commit introduces the "count" interface for each shrinker registered
in the system.

The output is in the following format:
<cgroup inode id> <nr of objects on node 0> <nr of objects on node 1>...
<cgroup inode id> <nr of objects on node 0> <nr of objects on node 1>...
...

To reduce the size of output on machines with many thousands cgroups, if
the total number of objects on all nodes is 0, the line is omitted.

If the shrinker is not memcg-aware or CONFIG_MEMCG is off, 0 is printed as
cgroup inode id.  If the shrinker is not numa-aware, 0's are printed for
all nodes except the first one.

This commit gives debugfs entries simple numeric names, which are not very
convenient.  The following commit in the series will provide shrinkers
with more meaningful names.

[akpm@linux-foundation.org: remove WARN_ON_ONCE(), per Roman]
  Reported-by: syzbot+300d27c79fe6d4cbcc39@syzkaller.appspotmail.com
Link: https://lkml.kernel.org/r/20220601032227.4076670-3-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Reviewed-by: Kent Overstreet <kent.overstreet@gmail.com>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Dave Chinner <dchinner@redhat.com>
Cc: Hillf Danton <hdanton@sina.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-07-03 18:08:40 -07:00
Linus Torvalds
b8d5109f50 lockref: remove unused 'lockref_get_or_lock()' function
Looking at the conditional lock acquire functions in the kernel due to
the new sparse support (see commit 4a557a5d1a61 "sparse: introduce
conditional lock acquire function attribute"), it became obvious that
the lockref code has a couple of them, but they don't match the usual
naming convention for the other ones, and their return value logic is
also reversed.

In the other very similar places, the naming pattern is '*_and_lock()'
(eg 'atomic_put_and_lock()' and 'refcount_dec_and_lock()'), and the
function returns true when the lock is taken.

The lockref code is superficially very similar to the refcount code,
only with the special "atomic wrt the embedded lock" semantics.  But
instead of the '*_and_lock()' naming it uses '*_or_lock()'.

And instead of returning true in case it took the lock, it returns true
if it *didn't* take the lock.

Now, arguably the reflock code is quite logical: it really is a "either
decrement _or_ lock" kind of situation - and the return value is about
whether the operation succeeded without any special care needed.

So despite the similarities, the differences do make some sense, and
maybe it's not worth trying to unify the different conditional locking
primitives in this area.

But while looking at this all, it did become obvious that the
'lockref_get_or_lock()' function hasn't actually had any users for
almost a decade.

The only user it ever had was the shortlived 'd_rcu_to_refcount()'
function, and it got removed and replaced with 'lockref_get_not_dead()'
back in 2013 in commits 0d98439ea3c6 ("vfs: use lockred 'dead' flag to
mark unrecoverably dead dentries") and e5c832d55588 ("vfs: fix dentry
RCU to refcounting possibly sleeping dput()")

In fact, that single use was removed less than a week after the whole
function was introduced in commit b3abd80250c1 ("lockref: add
'lockref_get_or_lock() helper") so this function has been around for a
decade, but only had a user for six days.

Let's just put this mis-designed and unused function out of its misery.

We can think about the naming and semantic oddities of the remaining
'lockref_put_or_lock()' later, but at least that function has users.

And while the naming is different and the return value doesn't match,
that function matches the whole '{atomic,refcount}_dec_and_test()'
pattern much better (ie the magic happens when the count goes down to
zero, not when it is incremented from zero).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-07-03 14:40:28 -07:00
Kees Cook
6a022dd29f lib: overflow: Do not define 64-bit tests on 32-bit
The 64-bit overflow tests will trigger 64-bit division on 32-bit hosts,
which is not currently used anywhere in the kernel, and tickles bugs
in at least Clang 13 and earlier:
https://github.com/ClangBuiltLinux/linux/issues/1636

In reality, there shouldn't be a reason to not build the 64-bit test
cases on 32-bit systems, so these #ifdefs can be removed once the minimum
Clang version reaches 13.

In the meantime, silence W=1 warnings given by the current code:

../lib/overflow_kunit.c:191:19: warning: 's64_tests' defined but not used [-Wunused-const-variable=]
  191 | DEFINE_TEST_ARRAY(s64) = {
      |                   ^~~
../lib/overflow_kunit.c:24:11: note: in definition of macro 'DEFINE_TEST_ARRAY'
   24 |         } t ## _tests[]
      |           ^
../lib/overflow_kunit.c:94:19: warning: 'u64_tests' defined but not used [-Wunused-const-variable=]
   94 | DEFINE_TEST_ARRAY(u64) = {
      |                   ^~~
../lib/overflow_kunit.c:24:11: note: in definition of macro 'DEFINE_TEST_ARRAY'
   24 |         } t ## _tests[]
      |           ^

Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/lkml/202205110324.7GrtxG8u-lkp@intel.com
Fixes: 455a35a6cdb6 ("lib: add runtime test of check_*_overflow functions")
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Vitor Massaru Iha <vitor@massaru.org>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Tested-by: Daniel Latypov <dlatypov@google.com>
Link: https://lore.kernel.org/lkml/CAGS_qxokQAjQRip2vPi80toW7hmBnXf=KMTNT51B1wuDqSZuVQ@mail.gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2022-07-01 17:39:49 -07:00
David Gow
c272612cb4 kunit: Taint the kernel when KUnit tests are run
Make KUnit trigger the new TAINT_TEST taint when any KUnit test is run.
Due to KUnit tests not being intended to run on production systems, and
potentially causing problems (or security issues like leaking kernel
addresses), the kernel's state should not be considered safe for
production use after KUnit tests are run.

This both marks KUnit modules as test modules using MODULE_INFO() and
manually taints the kernel when tests are run (which catches builtin
tests).

Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Tested-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: David Gow <davidgow@google.com>
Tested-by: Maíra Canal <mairacanal@riseup.net>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-07-01 16:38:43 -06:00
Linus Torvalds
d516e221e2 block-5.19-2022-07-01
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmK+6BcQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgplI3D/4u/UdnvzqYCLu1wHFZAfGXkI62ccOt9A5h
 R2t9ewK8WNv6eXoZGNo+Lmi+MsTetT+NtpjHX8gv2ZTNfo1MaMX91YnEPOM/Gxze
 TGlgR9uSDosxFpXMa1ylAzIH5Xhzo6gktMqS+dGj34dbrc26YCeLWRstdA1CkRwn
 Ttj6qZaeZcAmmAbKPT0RKhqAGq+5Oz7E8xCWNySFpj6QO4e5moWk9z5wynguiz/p
 jRH1KvzN+mevU7omC8hZR9HN8eOsZE7TWCNzF3g3ieTam/dsX9LxklEhybJ/C1oE
 abiu8ludeCKhV16PE71e0dAs1VOU+kRkPl51DpNPa00TlJCq7etBOpHnOFJmbbRy
 uM0dW3h34qqAXHAItcrBxVtbRW817x9ptI64c3WoCBs59DAVjqQD/m64LefdTBJe
 a7FLYfD2iReHptS5EnVLm1OyJLMiDIFGeJsWZbK0N+BSQEM9YVNxHAvn/tLahr/Y
 Ym78erLttFnpzf2qD1H0RwV0Ngj9IWjJWDkQR23akiTctPq2mKFXe+GAyHlqz87h
 RjyMpMth+s5ov82K/V7duZTmxuxMnCbx9pk4HpGRXajz2Njw3Z6hMmrS574m0Osp
 15y4NNIiJ3DfmqhprGteaLLcGlbFATzdjI5LPkDhKzIodfeLnsBq1QvTNaUH4O6F
 mg3AOo10RQ==
 =YVop
 -----END PGP SIGNATURE-----

Merge tag 'block-5.19-2022-07-01' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - Fix for batch getting of tags in sbitmap (wuchi)

 - NVMe pull request via Christoph:
      - More quirks (Lamarque Vieira Souza, Pablo Greco)
      - Fix a fabrics disconnect regression (Ruozhu Li)
      - Fix a nvmet-tcp data_digest calculation regression (Sagi
        Grimberg)
      - Fix nvme-tcp send failure handling (Sagi Grimberg)
      - Fix a regression with nvmet-loop and passthrough controllers
        (Alan Adamson)

* tag 'block-5.19-2022-07-01' of git://git.kernel.dk/linux-block:
  nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA IM2P33F8ABR1
  nvmet: add a clear_ids attribute for passthru targets
  nvme: fix regression when disconnect a recovering ctrl
  nvme-pci: add NVME_QUIRK_BOGUS_NID for ADATA XPG SX6000LNP (AKA SPECTRIX S40G)
  nvme-tcp: always fail a request when sending it failed
  nvmet-tcp: fix regression in data_digest calculation
  lib/sbitmap: Fix invalid loop in __sbitmap_queue_get_batch()
2022-07-01 10:42:10 -07:00
Alexander Lobakin
dc34d50366 lib: test_bitmap: add compile-time optimization/evaluations assertions
Add a function to the bitmap test suite, which will ensure that
compilers are able to evaluate operations performed by the
bitops/bitmap helpers to compile-time constants when all of the
arguments are compile-time constants as well, or trigger a build
bug otherwise. This should work on all architectures and all the
optimization levels supported by Kbuild.
The function doesn't perform any runtime tests and gets optimized
out to nothing after passing the build assertions.
Unfortunately, Clang for s390 is currently broken (up to the latest
Git snapshots) -- see the comment in the code -- so for now there's
a small workaround for it which doesn't alter the logics. Hope we'll
be able to remove it one day (bugreport is on its way).

Suggested-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-06-30 19:52:42 -07:00
Jason A. Donenfeld
d6c14da474 crypto: lib/blake2s - reduce stack frame usage in self test
Using 3 blocks here doesn't give us much more than using 2, and it
causes a stack frame size warning on certain compiler/config/arch
combinations:

   lib/crypto/blake2s-selftest.c: In function 'blake2s_selftest':
>> lib/crypto/blake2s-selftest.c:632:1: warning: the frame size of 1088 bytes is larger than 1024 bytes [-Wframe-larger-than=]
     632 | }
         | ^

So this patch just reduces the block from 3 to 2, which makes the
warning go away.

Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/linux-crypto/202206200851.gE3MHCgd-lkp@intel.com
Fixes: 2d16803c562e ("crypto: blake2s - remove shash module")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-06-30 15:56:56 +08:00
Al Viro
c3497fd009 fix short copy handling in copy_mc_pipe_to_iter()
Unlike other copying operations on ITER_PIPE, copy_mc_to_iter() can
result in a short copy.  In that case we need to trim the unused
buffers, as well as the length of partially filled one - it's not
enough to set ->head, ->iov_offset and ->count to reflect how
much had we copied.  Not hard to fix, fortunately...

I'd put a helper (pipe_discard_from(pipe, head)) into pipe_fs_i.h,
rather than iov_iter.c - it has nothing to do with iov_iter and
having it will allow us to avoid an ugly kludge in fs/splice.c.
We could put it into lib/iov_iter.c for now and move it later,
but I don't see the point going that way...

Cc: stable@kernel.org # 4.19+
Fixes: ca146f6f091e "lib/iov_iter: Fix pipe handling in _copy_to_iter_mcsafe()"
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-06-28 17:37:11 -04:00
Al Viro
59bb69c67c copy_page_{to,from}_iter(): switch iovec variants to generic
we can do copyin/copyout under kmap_local_page(); it shouldn't overflow
the kmap stack - the maximal footprint increase only by one here.

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-06-28 17:30:57 -04:00
akpm
ee56c3e8ee Merge branch 'master' into mm-nonmm-stable 2022-06-27 10:31:44 -07:00
akpm
46a3b11253 Merge branch 'master' into mm-stable 2022-06-27 10:31:34 -07:00
Keith Busch
cfa320f728 iov: introduce iov_iter_aligned
The existing iov_iter_alignment() function returns the logical OR of
address and length. For cases where address and length need to be
considered separately, introduce a helper function that a caller can
specificy length and address masks that indicate if the iov is
unaligned.

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220610195830.3574005-9-kbusch@fb.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-27 06:29:11 -06:00
wuchi
fbb564a557 lib/sbitmap: Fix invalid loop in __sbitmap_queue_get_batch()
1. Getting next index before continue branch.
2. Checking free bits when setting the target bits. Otherwise,
it may reuse the busying bits.

Signed-off-by: wuchi <wuchi.zero@gmail.com>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Link: https://lore.kernel.org/r/20220605145835.26916-1-wuchi.zero@gmail.com
Fixes: 9672b0d43782 ("sbitmap: add __sbitmap_queue_get_batch()")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-06-25 10:58:55 -06:00
Ignat Korchagin
f145d411a6 crypto: rsa - implement Chinese Remainder Theorem for faster private key operations
Changes from v1:
  * exported mpi_sub and mpi_mul, otherwise the build fails when RSA is a module

The kernel RSA ASN.1 private key parser already supports only private keys with
additional values to be used with the Chinese Remainder Theorem [1], but these
values are currently not used.

This rudimentary CRT implementation speeds up RSA private key operations for the
following Go benchmark up to ~3x.

This implementation also tries to minimise the allocation of additional MPIs,
so existing MPIs are reused as much as possible (hence the variable names are a
bit weird).

The benchmark used:

```
package keyring_test

import (
	"crypto"
	"crypto/rand"
	"crypto/rsa"
	"crypto/x509"
	"io"
	"syscall"
	"testing"
	"unsafe"
)

type KeySerial int32
type Keyring int32

const (
	KEY_SPEC_PROCESS_KEYRING Keyring = -2
	KEYCTL_PKEY_SIGN                 = 27
)

var (
	keyTypeAsym = []byte("asymmetric\x00")
	sha256pkcs1 = []byte("enc=pkcs1 hash=sha256\x00")
)

func (keyring Keyring) LoadAsym(desc string, payload []byte) (KeySerial, error) {
	cdesc := []byte(desc + "\x00")
	serial, _, errno := syscall.Syscall6(syscall.SYS_ADD_KEY, uintptr(unsafe.Pointer(&keyTypeAsym[0])), uintptr(unsafe.Pointer(&cdesc[0])), uintptr(unsafe.Pointer(&payload[0])), uintptr(len(payload)), uintptr(keyring), uintptr(0))
	if errno == 0 {
		return KeySerial(serial), nil
	}

	return KeySerial(serial), errno
}

type pkeyParams struct {
	key_id         KeySerial
	in_len         uint32
	out_or_in2_len uint32
	__spare        [7]uint32
}

// the output signature buffer is an input parameter here, because we want to
// avoid Go buffer allocation leaking into our benchmarks
func (key KeySerial) Sign(info, digest, out []byte) error {
	var params pkeyParams
	params.key_id = key
	params.in_len = uint32(len(digest))
	params.out_or_in2_len = uint32(len(out))

	_, _, errno := syscall.Syscall6(syscall.SYS_KEYCTL, KEYCTL_PKEY_SIGN, uintptr(unsafe.Pointer(&params)), uintptr(unsafe.Pointer(&info[0])), uintptr(unsafe.Pointer(&digest[0])), uintptr(unsafe.Pointer(&out[0])), uintptr(0))
	if errno == 0 {
		return nil
	}

	return errno
}

func BenchmarkSign(b *testing.B) {
	priv, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		b.Fatalf("failed to generate private key: %v", err)
	}

	pkcs8, err := x509.MarshalPKCS8PrivateKey(priv)
	if err != nil {
		b.Fatalf("failed to serialize the private key to PKCS8 blob: %v", err)
	}

	serial, err := KEY_SPEC_PROCESS_KEYRING.LoadAsym("test rsa key", pkcs8)
	if err != nil {
		b.Fatalf("failed to load the private key into the keyring: %v", err)
	}

	b.Logf("loaded test rsa key: %v", serial)

	digest := make([]byte, 32)
	_, err = io.ReadFull(rand.Reader, digest)
	if err != nil {
		b.Fatalf("failed to generate a random digest: %v", err)
	}

	sig := make([]byte, 256)
	for n := 0; n < b.N; n++ {
		err = serial.Sign(sha256pkcs1, digest, sig)
		if err != nil {
			b.Fatalf("failed to sign the digest: %v", err)
		}
	}

	err = rsa.VerifyPKCS1v15(&priv.PublicKey, crypto.SHA256, digest, sig)
	if err != nil {
		b.Fatalf("failed to verify the signature: %v", err)
	}
}
```

[1]: https://en.wikipedia.org/wiki/RSA_(cryptosystem)#Using_the_Chinese_remainder_algorithm

Signed-off-by: Ignat Korchagin <ignat@cloudflare.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-06-24 17:12:29 +08:00
Jian Shen
9676feccac test_bpf: fix incorrect netdev features
The prototype of .features is netdev_features_t, it should use
NETIF_F_LLTX and NETIF_F_HW_VLAN_STAG_TX, not NETIF_F_LLTX_BIT
and NETIF_F_HW_VLAN_STAG_TX_BIT.

Fixes: cf204a718357 ("bpf, testing: Introduce 'gso_linear_no_head_frag' skb_segment test")
Signed-off-by: Jian Shen <shenjian15@huawei.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20220622135002.8263-1-shenjian15@huawei.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-06-22 19:20:20 -07:00
Linus Torvalds
5d770f11a1 Build tool updates:
- Remove obsolete CONFIG_X86_SMAP reference from objtool
 
  - Fix overlapping text section failures in faddr2line for real
 
  - Remove OBJECT_FILES_NON_STANDARD usage from x86 ftrace and replace it
    with finegrained annotations so objtool can validate that code
    correctly.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmKvHbwTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoQlsEADC7BotE1Mi8HITScIlHT19bkZ7Bm6M
 g46RCtUw0u+lo7BclIetNYGWQ0z+D+1DbmZTBv5D22N/Wd6MrwOuuwRVtJ2dEosF
 l/EK1qKLxCGMdQ1x0KOmrjzHB4WmcUH0pQwDHa5N5s5IyrJRRFHtoLk3EYp6QIhi
 mIZTJgGUEe/aS7BtFM5xARprm7JxdbhXgXVbnC1ctzpOi2y6O7F4Ek3x4j5zTGV/
 49iuO2ljqmKdoFlPa1fhuw+O9sTUf+RmKLPEU+2nm7Wg+wlIwKIB3yDDBoZ9Q7lm
 YdorW0/506x4qOwa8KAjxXEb2qgZkL/fQcMS0jQQKNNDD8GHU+3YdjCNbrmZjA+T
 Ib0rs9YsxgLstnmqzU3QfarhgtdzIMhzF0cofoFDK0a8tHLqOhZF4j7wLmYo52ec
 6Hrt8IvJPW2b3TC6KfRAF+uWsDDoaum0OOFVjVu1b9G6fFf0i8h5JGWdkJh8cau9
 0qvAYg0sDjRf7zAZ6kbbGDVCljECyypIdCBF3ihA6yMhJX16VihEWTJyqIaoO+a0
 0XzPvkhX+6mY0kqn+5HvvhLEY4POA8pFKeTPx4eABvKnw7m01I9cJfHa7HEBQHe6
 ZJA7qWMBRrPfK2Ogrx8zohj2sOdHMHlIQU+6Ai+7+BHs13Nje4umGMu2YuZrnfIF
 gQ1ayCN8OsUwVA==
 =12KJ
 -----END PGP SIGNATURE-----

Merge tag 'objtool-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull build tooling updates from Thomas Gleixner:

 - Remove obsolete CONFIG_X86_SMAP reference from objtool

 - Fix overlapping text section failures in faddr2line for real

 - Remove OBJECT_FILES_NON_STANDARD usage from x86 ftrace and replace it
   with finegrained annotations so objtool can validate that code
   correctly.

* tag 'objtool-urgent-2022-06-19' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ftrace: Remove OBJECT_FILES_NON_STANDARD usage
  faddr2line: Fix overlapping text section failures, the sequel
  objtool: Fix obsolete reference to CONFIG_X86_SMAP
2022-06-19 09:54:16 -05:00
Linus Torvalds
79fe0f863f This push fixes a potential build failure when CRYPTO=m.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEn51F/lCuNhUwmDeSxycdCkmxi6cFAmKljZkACgkQxycdCkmx
 i6dCdhAAiXpo4NloUI7H2l5gLOwiQQnJqdLQ8jCnI46/P+RGdLVeZ8vNjLLLRxMl
 Yv+lQ4g/fZ4dQNYM4BG0i2htcrzx2ifgXlRfHQqlQRil/NxbkJ44sv+Gl5tR1erP
 ysfdCdQ41ryTpx9w/xuL5EbtCGB5Zn1AvJLu0AhnxdVfAR98ICnIyub2ExrFsOVs
 JDcBTJvRTiswH5dtc0ngrm7wzKVtUwtQ8G1qDm3HxN5WSe/hl3jI7D+OzgDN1FdZ
 pfPKjYG5kW/CVFS+TJyEXEeHfoue5xVg1d4mm5bTeQw96yeifjOAQBTHN3NtJrXg
 ULGcl9Z0Dz8dL+BnYVfDZ3e6GtcqfJnDJh00WBn8QE1OB3IA4AG3F0ihAMF2rMrR
 DIgMmM5eBd2ZehkyMz0Wm7JDlqJKS7YzM6YC4DI+bhYAn1nPFbhSwMCof3idL7PV
 Kk+JdJI8aUVNhI4tNKUYnjAgiybuM2DvZ2SB5KZWERZkH+hmVwn7umFbKR3P89Pt
 fVun/xpsJl7d37Crozx9eI90gywwvpN+Iz/EnigSNG6Mlg73qHP2B4wtF091eLMm
 GKW/cuvt21HA+g56QGxlOyicV2ELfE5A1fQXIXFRM/geDkgh54jaeLF0tEveEPm9
 Lh0ttMUFkYneuhz/g6yucFpBIahnB2Q+pB2dQ8LOLRgjDMenL4w=
 =kepS
 -----END PGP SIGNATURE-----

Merge tag 'v5.19-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6

Pull crypto fix from Herbert Xu:
 "This fixes a potential build failure when CRYPTO=m"

* tag 'v5.19-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: memneq - move into lib/
2022-06-17 08:27:27 -07:00
wuchi
00c9d56322 lib/error-inject: convert to DEFINE_SEQ_ATTRIBUTE
Use DEFINE_SEQ_ATTRIBUTE helper macro to simplify the code.

Link: https://lkml.kernel.org/r/20220612052015.23283-1-wuchi.zero@gmail.com
Signed-off-by: wuchi <wuchi.zero@gmail.com>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:58:23 -07:00
wuchi
c0af32fdc6 lib/btree: simplify btree_{lookup|update}
btree_{lookup|update} both need to look up node by key, using the common
parts(add function btree_lookup_node) to simplify code.

Link: https://lkml.kernel.org/r/20220607133556.34732-1-wuchi.zero@gmail.com
Signed-off-by: wuchi <wuchi.zero@gmail.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:58:21 -07:00
wuchi
a91befde35 lib/flex_proportions.c: remove local_irq_ops in fprop_new_period()
commit e78d4833c03e28> "lib: Fix possible deadlock in flexible proportion
code" adds the local_irq_ops because percpu_counter_{sum |add} ops'lock
can cause deadlock by interrupts.  Now percpu_counter _{sum|add} ops use
raw_spin_(un)lock_irq*, so revert the commit and resolve the conflict.

Link: https://lkml.kernel.org/r/20220604131502.5190-1-wuchi.zero@gmail.com
Signed-off-by: wuchi <wuchi.zero@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:58:20 -07:00
Guenter Roeck
0cc011c576 lib/list_debug.c: Detect uninitialized lists
In some circumstances, attempts are made to add entries to or to remove
entries from an uninitialized list.  A prime example is
amdgpu_bo_vm_destroy(): It is indirectly called from
ttm_bo_init_reserved() if that function fails, and tries to remove an
entry from a list.  However, that list is only initialized in
amdgpu_bo_create_vm() after the call to ttm_bo_init_reserved() returned
success.  This results in crashes such as

 BUG: kernel NULL pointer dereference, address: 0000000000000000
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] PREEMPT SMP NOPTI
 CPU: 1 PID: 1479 Comm: chrome Not tainted 5.10.110-15768-g29a72e65dae5
 Hardware name: Google Grunt/Grunt, BIOS Google_Grunt.11031.149.0 07/15/2020
 RIP: 0010:__list_del_entry_valid+0x26/0x7d
 ...
 Call Trace:
  amdgpu_bo_vm_destroy+0x48/0x8b
  ttm_bo_init_reserved+0x1d7/0x1e0
  amdgpu_bo_create+0x212/0x476
  ? amdgpu_bo_user_destroy+0x23/0x23
  ? kmem_cache_alloc+0x60/0x271
  amdgpu_bo_create_vm+0x40/0x7d
  amdgpu_vm_pt_create+0xe8/0x24b
 ...

Check if the list's prev and next pointers are NULL to catch such problems.

Link: https://lkml.kernel.org/r/20220531222951.92073-1-linux@roeck-us.net
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Cc: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:58:20 -07:00
Miaohe Lin
ed913b055a lib/test_hmm: avoid accessing uninitialized pages
If make_device_exclusive_range() fails or returns pages marked for
exclusive access less than required, remaining fields of pages will left
uninitialized.  So dmirror_atomic_map() will access those yet
uninitialized fields of pages.  To fix it, do dmirror_atomic_map() iff all
pages are marked for exclusive access (we will break if mapped is less
than required anyway) so we won't access those uninitialized fields of
pages.

Link: https://lkml.kernel.org/r/20220609130835.35110-1-linmiaohe@huawei.com
Fixes: b659baea7546 ("mm: selftests for exclusive device memory")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-06-16 19:48:30 -07:00
Prasad Sodagudi
d593d64f04 lib: Add register read/write tracing support
Generic MMIO read/write i.e., __raw_{read,write}{b,l,w,q} accessors
are typically used to read/write from/to memory mapped registers
and can cause hangs or some undefined behaviour in following few
cases,

* If the access to the register space is unclocked, for example: if
  there is an access to multimedia(MM) block registers without MM
  clocks.

* If the register space is protected and not set to be accessible from
  non-secure world, for example: only EL3 (EL: Exception level) access
  is allowed and any EL2/EL1 access is forbidden.

* If xPU(memory/register protection units) is controlling access to
  certain memory/register space for specific clients.

and more...

Such cases usually results in instant reboot/SErrors/NOC or interconnect
hangs and tracing these register accesses can be very helpful to debug
such issues during initial development stages and also in later stages.

So use ftrace trace events to log such MMIO register accesses which
provides rich feature set such as early enablement of trace events,
filtering capability, dumping ftrace logs on console and many more.

Sample output:

rwmmio_write: __qcom_geni_serial_console_write+0x160/0x1e0 width=32 val=0xa0d5d addr=0xfffffbfffdbff700
rwmmio_post_write: __qcom_geni_serial_console_write+0x160/0x1e0 width=32 val=0xa0d5d addr=0xfffffbfffdbff700
rwmmio_read: qcom_geni_serial_poll_bit+0x94/0x138 width=32 addr=0xfffffbfffdbff610
rwmmio_post_read: qcom_geni_serial_poll_bit+0x94/0x138 width=32 val=0x0 addr=0xfffffbfffdbff610

Co-developed-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com>
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
Signed-off-by: Sai Prakash Ranjan <quic_saipraka@quicinc.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-06-15 17:41:12 +02:00
Joe Lawrence
55eb9a6c8b selftests/livepatch: better synchronize test_klp_callbacks_busy
The test_klp_callbacks_busy module conditionally blocks a future
livepatch transition by busy waiting inside its workqueue function,
busymod_work_func().  After scheduling this work, a test livepatch is
loaded, introducing the transition under test.

Both events are marked in the kernel log for later verification, but
there is no synchronization to ensure that busymod_work_func() logs its
function entry message before subsequent selftest commands log their own
messages.  This can lead to a rare test failure due to unexpected
ordering like:

  --- expected
  +++ result
  @@ -1,7 +1,7 @@
   % modprobe test_klp_callbacks_busy block_transition=Y
   test_klp_callbacks_busy: test_klp_callbacks_busy_init
  -test_klp_callbacks_busy: busymod_work_func enter
   % modprobe test_klp_callbacks_demo
  +test_klp_callbacks_busy: busymod_work_func enter
   livepatch: enabling patch 'test_klp_callbacks_demo'
   livepatch: 'test_klp_callbacks_demo': initializing patching transition
   test_klp_callbacks_demo: pre_patch_callback: vmlinux

Force the module init function to wait until busymod_work_func() has
started (and logged its message), before exiting to the next selftest
steps.

Fixes: 547840bd5ae5 ("selftests/livepatch: simplify test-klp-callbacks busy target tests")
Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220602203233.979681-1-joe.lawrence@redhat.com
2022-06-15 10:29:10 +02:00
Linus Torvalds
3cae0d8475 Random number generator fixes for Linux 5.19-rc2.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmKktaEACgkQSfxwEqXe
 A65Q+RAAkyLfX6kxQlTZbglTTiapbGwpk7+VhM0pJfc3guXhIe8kNZ5cX/mfkGeZ
 f/ebUmWSJhqaNz1OIZeBZQ98ESwh8imfWaxWDkqsFh4c+hGsSp2xwIszMn3Hg+7L
 Sm/0Q71eZaSnRBGxWRVBbz3tTppUBS4nJxvFj8iM3jWWUffZa0m/w1lMvqc8kNJu
 kM1ONqb+CEuHOJyltUaH2qEXD6fQE3IOpPdC6PdsFqFX8qLN/pwZO6tpY8tYPt3j
 AUubp8F3eR4Y7WcmMi1b7BmiRgg47jdsS18aqRSH8CuYvIBbXHNM+tuK54zh/888
 d+s9Bj6ALR4z1/a8HXqtudCazYU+1VozWxVIELcpDWTX4wKgqVZ0HLEz7sAEfCgV
 wSIcIY8TRJlOTL43KenbJqbyIOsvAidqWEYz5ogF9WYUlaD82s2j8pUMj4wQoD5w
 VJqF2CaoewU0BOGK7rZFnElN5rPlfEJtNBZQOEo16BzA2tXilPgRFoOmKs9TMRQo
 dGotfp62rUuS14b9x7zc9be0QmtnmwxQKO9U6SRVd5X2HXU/E5PsqTsbt0PKlSbA
 qemw2CehPB+uFs0cqbbeI5VBpVnPQJkaflTfOVr04h7623KJ5Pnblv7/8iSedOOh
 L4ACxPO5I5MrM8rJAR4g85SYVg3A/HaTufA5XtR5QB/qmEaErXg=
 =1Lf0
 -----END PGP SIGNATURE-----

Merge tag 'random-5.19-rc2-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull random number generator fixes from Jason Donenfeld:

 - A fix for a 5.19 regression for a case in which early device tree
   initializes the RNG, which flips a static branch.

   On most plaforms, jump labels aren't initialized until much later, so
   this caused splats. On a few mailing list threads, we cooked up easy
   fixes for arm64, arm32, and risc-v. But then things looked slightly
   more involved for xtensa, powerpc, arc, and mips. And at that point,
   when we're patching 7 architectures in a place before the console is
   even available, it seems like the cost/risk just wasn't worth it.

   So random.c works around it now by checking the already exported
   `static_key_initialized` boolean, as though somebody already ran into
   this issue in the past. I'm not super jazzed about that; it'd be
   prettier to not have to complicate downstream code. But I suppose
   it's practical.

 - A few small code nits and adding a missing __init annotation.

 - A change to the default config values to use the cpu and bootloader's
   seeds for initializing the RNG earlier.

   This brings them into line with what all the distros do (Fedora/RHEL,
   Debian, Ubuntu, Gentoo, Arch, NixOS, Alpine, SUSE, and Void... at
   least), and moreover will now give us test coverage in various test
   beds that might have caught the above device tree bug earlier.

 - A change to WireGuard CI's configuration to increase test coverage
   around the RNG.

 - A documentation comment fix to unrelated maintainerless CRC code that
   I was asked to take, I guess because it has to do with polynomials
   (which the RNG thankfully no longer uses).

* tag 'random-5.19-rc2-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  wireguard: selftests: use maximum cpu features and allow rng seeding
  random: remove rng_has_arch_random()
  random: credit cpu and bootloader seeds by default
  random: do not use jump labels before they are initialized
  random: account for arch randomness in bits
  random: mark bootloader randomness code as __init
  random: avoid checking crng_ready() twice in random_init()
  crc-itu-t: fix typo in CRC ITU-T polynomial comment
2022-06-12 10:33:38 -07:00
Jason A. Donenfeld
abfed87e2a crypto: memneq - move into lib/
This is used by code that doesn't need CONFIG_CRYPTO, so move this into
lib/ with a Kconfig option so that it can be selected by whatever needs
it.

This fixes a linker error Zheng pointed out when
CRYPTO_MANAGER_DISABLE_TESTS!=y and CRYPTO=m:

  lib/crypto/curve25519-selftest.o: In function `curve25519_selftest':
  curve25519-selftest.c:(.init.text+0x60): undefined reference to `__crypto_memneq'
  curve25519-selftest.c:(.init.text+0xec): undefined reference to `__crypto_memneq'
  curve25519-selftest.c:(.init.text+0x114): undefined reference to `__crypto_memneq'
  curve25519-selftest.c:(.init.text+0x154): undefined reference to `__crypto_memneq'

Reported-by: Zheng Bin <zhengbin13@huawei.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: stable@vger.kernel.org
Fixes: aa127963f1ca ("crypto: lib/curve25519 - re-add selftests")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-06-12 14:51:51 +08:00
Linus Torvalds
1c27f1fc15 iov_iter: fix build issue due to possible type mis-match
Commit 6c77676645ad ("iov_iter: Fix iter_xarray_get_pages{,_alloc}()")
introduced a problem on some 32-bit architectures (at least arm, xtensa,
csky,sparc and mips), that have a 'size_t' that is 'unsigned int'.

The reason is that we now do

    min(nr * PAGE_SIZE - offset, maxsize);

where 'nr' and 'offset' and both 'unsigned int', and PAGE_SIZE is
'unsigned long'.  As a result, the normal C type rules means that the
first argument to 'min()' ends up being 'unsigned long'.

In contrast, 'maxsize' is of type 'size_t'.

Now, 'size_t' and 'unsigned long' are always the same physical type in
the kernel, so you'd think this doesn't matter, and from an actual
arithmetic standpoint it doesn't.

But on 32-bit architectures 'size_t' is commonly 'unsigned int', even if
it could also be 'unsigned long'.  In that situation, both are unsigned
32-bit types, but they are not the *same* type.

And as a result 'min()' will complain about the distinct types (ignore
the "pointer types" part of the error message: that's an artifact of the
way we have made 'min()' check types for being the same):

  lib/iov_iter.c: In function 'iter_xarray_get_pages':
  include/linux/minmax.h:20:35: error: comparison of distinct pointer types lacks a cast [-Werror]
     20 |         (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
        |                                   ^~
  lib/iov_iter.c:1464:16: note: in expansion of macro 'min'
   1464 |         return min(nr * PAGE_SIZE - offset, maxsize);
        |                ^~~

This was not visible on 64-bit architectures (where we always define
'size_t' to be 'unsigned long').

Force these cases to use 'min_t(size_t, x, y)' to make the type explicit
and avoid the issue.

[ Nit-picky note: technically 'size_t' doesn't have to match 'unsigned
  long' arithmetically. We've certainly historically seen environments
  with 16-bit address spaces and 32-bit 'unsigned long'.

  Similarly, even in 64-bit modern environments, 'size_t' could be its
  own type distinct from 'unsigned long', even if it were arithmetically
  identical.

  So the above type commentary is only really descriptive of the kernel
  environment, not some kind of universal truth for the kinds of wild
  and crazy situations that are allowed by the C standard ]

Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Link: https://lore.kernel.org/all/YqRyL2sIqQNDfky2@debian/
Cc: Jeff Layton <jlayton@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-11 10:30:20 -07:00
Linus Torvalds
b098915985 ITER_XARRAY get_pages fix; now the return value is a lot saner
(and more similar to logics for other flavours)
 
 Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCYqOi+gAKCRBZ7Krx/gZQ
 66sQAQC0axU6eieSzk0WkfqfuwNs4AmSGdUN2BQS5oEUdKIyAgD/US9asCVX1joL
 HU+nRLNLGfA2f5IeVxjyByDmmYDNCQE=
 =w2Z9
 -----END PGP SIGNATURE-----

Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

Pull iov_iter fix from Al Viro:
 "ITER_XARRAY get_pages fix; now the return value is a lot saner (and
  more similar to logics for other flavours)"

* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  iov_iter: Fix iter_xarray_get_pages{,_alloc}()
2022-06-10 15:53:09 -07:00
David Howells
6c77676645 iov_iter: Fix iter_xarray_get_pages{,_alloc}()
The maths at the end of iter_xarray_get_pages() to calculate the actual
size doesn't work under some circumstances, such as when it's been asked to
extract a partial single page.  Various terms of the equation cancel out
and you end up with actual == offset.  The same issue exists in
iter_xarray_get_pages_alloc().

Fix these to just use min() to select the lesser amount from between the
amount of page content transcribed into the buffer, minus the offset, and
the size limit specified.

This doesn't appear to have caused a problem yet upstream because network
filesystems aren't getting the pages from an xarray iterator, but rather
passing it directly to the socket, which just iterates over it.  Cachefiles
*does* do DIO from one to/from ext4/xfs/btrfs/etc. but it always asks for
whole pages to be written or read.

Fixes: 7ff5062079ef ("iov_iter: Add ITER_XARRAY")
Reported-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Alexander Viro <viro@zeniv.linux.org.uk>
cc: Dominique Martinet <asmadeus@codewreck.org>
cc: Mike Marshall <hubcap@omnibond.com>
cc: Gao Xiang <xiang@kernel.org>
cc: linux-afs@lists.infradead.org
cc: v9fs-developer@lists.sourceforge.net
cc: devel@lists.orangefs.org
cc: linux-erofs@lists.ozlabs.org
cc: linux-cachefs@redhat.com
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-06-10 15:56:32 -04:00
Jason A. Donenfeld
e052a478a7 random: remove rng_has_arch_random()
With arch randomness being used by every distro and enabled in
defconfigs, the distinction between rng_has_arch_random() and
rng_is_initialized() is now rather small. In fact, the places where they
differ are now places where paranoid users and system builders really
don't want arch randomness to be used, in which case we should respect
that choice, or places where arch randomness is known to be broken, in
which case that choice is all the more important. So this commit just
removes the function and its one user.

Reviewed-by: Petr Mladek <pmladek@suse.com> # for vsprintf.c
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-06-10 11:29:48 +02:00
Jason A. Donenfeld
2d16803c56 crypto: blake2s - remove shash module
BLAKE2s has no currently known use as an shash. Just remove all of this
unnecessary plumbing. Removing this shash was something we talked about
back when we were making BLAKE2s a built-in, but I simply never got
around to doing it. So this completes that project.

Importantly, this fixs a bug in which the lib code depends on
crypto_simd_disabled_for_test, causing linker errors.

Also add more alignment tests to the selftests and compare SIMD and
non-SIMD compression functions, to make up for what we lose from
testmgr.c.

Reported-by: gaochao <gaochao49@huawei.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 6048fdcc5f26 ("lib/crypto: blake2s: include as built-in")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-06-10 16:43:49 +08:00
Jason A. Donenfeld
920b0442b9 crypto: memneq - move into lib/
This is used by code that doesn't need CONFIG_CRYPTO, so move this into
lib/ with a Kconfig option so that it can be selected by whatever needs
it.

This fixes a linker error Zheng pointed out when
CRYPTO_MANAGER_DISABLE_TESTS!=y and CRYPTO=m:

  lib/crypto/curve25519-selftest.o: In function `curve25519_selftest':
  curve25519-selftest.c:(.init.text+0x60): undefined reference to `__crypto_memneq'
  curve25519-selftest.c:(.init.text+0xec): undefined reference to `__crypto_memneq'
  curve25519-selftest.c:(.init.text+0x114): undefined reference to `__crypto_memneq'
  curve25519-selftest.c:(.init.text+0x154): undefined reference to `__crypto_memneq'

Reported-by: Zheng Bin <zhengbin13@huawei.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: stable@vger.kernel.org
Fixes: aa127963f1ca ("crypto: lib/curve25519 - re-add selftests")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-06-10 16:40:19 +08:00
Matthew Wilcox (Oracle)
69a37a8ba1 mm/huge_memory: Fix xarray node memory leak
If xas_split_alloc() fails to allocate the necessary nodes to complete the
xarray entry split, it sets the xa_state to -ENOMEM, which xas_nomem()
then interprets as "Please allocate more memory", not as "Please free
any unnecessary memory" (which was the intended outcome).  It's confusing
to use xas_nomem() to free memory in this context, so call xas_destroy()
instead.

Reported-by: syzbot+9e27a75a8c24f3fe75c1@syzkaller.appspotmail.com
Fixes: 6b24ca4a1a8d ("mm: Use multi-index entries in the page cache")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-06-09 16:24:25 -04:00
Roger Knecht
7799164595 crc-itu-t: fix typo in CRC ITU-T polynomial comment
The code comment says that the polynomial is x^16 + x^12 + x^15 + 1, but
the correct polynomial is x^16 + x^12 + x^5 + 1. Quoting from page 2 in
the ITU-T V.41 specification [1]:

  2 Encoding and checking process

  The service bits and information bits, taken in conjunction,
  correspond to the coefficients of a message polynomial having terms
  from x^(n-1) (n = total number of bits in a block or sequence) down to
  x^16. This polynomial is divided, modulo 2, by the generating
  polynomial x^16 + x^12 + x^5 + 1.

The hex (truncated) polynomial 0x1021 and CRC code implementation are
correct, however.

[1] https://www.itu.int/rec/T-REC-V.41-198811-I/en

Signed-off-by: Roger Knecht <roger@norberthealth.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-06-07 10:27:38 +02:00
Josh Poimboeuf
c2f75a43f5 objtool: Fix obsolete reference to CONFIG_X86_SMAP
CONFIG_X86_SMAP no longer exists.  For objtool's purposes it has been
replaced with CONFIG_HAVE_UACCESS_VALIDATION.

Fixes: 03f16cd020eb ("objtool: Add CONFIG_OBJTOOL")
Reported-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Link: https://lore.kernel.org/r/44c57668768c1ba1b4ba1ff541ec54781636e07c.1654101721.git.jpoimboe@kernel.org
2022-06-06 11:49:55 -07:00
Linus Torvalds
d0e60d46bc Bitmap patches for 5.19-rc1
This series includes the following patchsets:
  - bitmap: optimize bitmap_weight() usage(w/o bitmap_weight_cmp), from me;
  - lib/bitmap.c make bitmap_print_bitmask_to_buf parseable, from Mauro
    Carvalho Chehab;
  - include/linux/find: Fix documentation, from Anna-Maria Behnsen;
  - bitmap: fix conversion from/to fix-sized arrays, from me;
  - bitmap: Fix return values to be unsigned, from Kees Cook.
 
 It has been in linux-next for at least a week with no problems.
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmKaEzYACgkQsUSA/Tof
 vsiGKwv8Dgr3G0mLbSfmHZqdFMIsmSmwhxlEH6eBNtX6vjQbGafe/Buhj/1oSY8N
 NYC4+5Br6s7MmMRth3Kp6UECdl94TS3Ka06T+lVBKkG+C+B1w1/svqUMM2ZCQF3e
 Z5R/HhR6av9X9Qb2mWSasWLkWp629NjdtRsJSDWiVt1emVVwh+iwxQnMH9VuE+ao
 z3mvaQfSRhe4h+xCZOiohzFP+0jZb1EnPrQAIVzNUjigo7mglpNvVyO7p/8LU7gD
 dIjfGmSbtsHU72J+/0lotRqjhjORl1F/EILf8pIzx5Ga7ExUGhOzGWAj7/3uZxfA
 Cp1Z/QV271MGwv/sNdSPwCCJHf51eOmsbyOyUScjb3gFRwIStEa1jB4hKwLhS5wF
 3kh4kqu3WGuIQAdxkUpDBsy3CQjAPDkvtRJorwyWGbjwa9xUETESAgH7XCCTsgWc
 0sIuldWWaxC581+fAP1Dzmo8uuWBURTaOrVmRMILQHMTw54zoFyLY+VI9gEAT9aV
 gnPr3M4F
 =U7DN
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-for-5.19-rc1' of https://github.com/norov/linux

Pull bitmap updates from Yury Norov:

 - bitmap: optimize bitmap_weight() usage, from me

 - lib/bitmap.c make bitmap_print_bitmask_to_buf parseable, from Mauro
   Carvalho Chehab

 - include/linux/find: Fix documentation, from Anna-Maria Behnsen

 - bitmap: fix conversion from/to fix-sized arrays, from me

 - bitmap: Fix return values to be unsigned, from Kees Cook

It has been in linux-next for at least a week with no problems.

* tag 'bitmap-for-5.19-rc1' of https://github.com/norov/linux: (31 commits)
  nodemask: Fix return values to be unsigned
  bitmap: Fix return values to be unsigned
  KVM: x86: hyper-v: replace bitmap_weight() with hweight64()
  KVM: x86: hyper-v: fix type of valid_bank_mask
  ia64: cleanup remove_siblinginfo()
  drm/amd/pm: use bitmap_{from,to}_arr32 where appropriate
  KVM: s390: replace bitmap_copy with bitmap_{from,to}_arr64 where appropriate
  lib/bitmap: add test for bitmap_{from,to}_arr64
  lib: add bitmap_{from,to}_arr64
  lib/bitmap: extend comment for bitmap_(from,to)_arr32()
  include/linux/find: Fix documentation
  lib/bitmap.c make bitmap_print_bitmask_to_buf parseable
  MAINTAINERS: add cpumask and nodemask files to BITMAP_API
  arch/x86: replace nodes_weight with nodes_empty where appropriate
  mm/vmstat: replace cpumask_weight with cpumask_empty where appropriate
  clocksource: replace cpumask_weight with cpumask_empty in clocksource.c
  genirq/affinity: replace cpumask_weight with cpumask_empty where appropriate
  irq: mips: replace cpumask_weight with cpumask_empty where appropriate
  drm/i915/pmu: replace cpumask_weight with cpumask_empty where appropriate
  arch/x86: replace cpumask_weight with cpumask_empty where appropriate
  ...
2022-06-04 14:04:27 -07:00
Linus Torvalds
500a434fc5 Driver core changes for 5.19-rc1
Here is the set of driver core changes for 5.19-rc1.
 
 Note, I'm not really happy with this pull request as-is, see below for
 details, but overall this is all good for everything but a small set of
 systems, which we have a fix for already.
 
 Lots of tiny driver core changes and cleanups happened this cycle,
 but the two major things were:
 
 	- firmware_loader reorganization and additions including the
 	  ability to have XZ compressed firmware images and the ability
 	  for userspace to initiate the firmware load when it needs to,
 	  instead of being always initiated by the kernel. FPGA devices
 	  specifically want this ability to have their firmware changed
 	  over the lifetime of the system boot, and this allows them to
 	  work without having to come up with yet-another-custom-uapi
 	  interface for loading firmware for them.
 	- physical location support added to sysfs so that devices that
 	  know this information, can tell userspace where they are
 	  located in a common way.  Some ACPI devices already support
 	  this today, and more bus types should support this in the
 	  future.
 
 Smaller changes included:
 	- driver_override api cleanups and fixes
 	- error path cleanups and fixes
 	- get_abi script fixes
 	- deferred probe timeout changes.
 
 It's that last change that I'm the most worried about.  It has been
 reported to cause boot problems for a number of systems, and I have a
 tested patch series that resolves this issue.  But I didn't get it
 merged into my tree before 5.18-final came out, so it has not gotten any
 linux-next testing.
 
 I'll send the fixup patches (there are 2) as a follow-on series to this
 pull request if you want to take them directly, _OR_ I can just revert
 the probe timeout changes and they can wait for the next -rc1 merge
 cycle.  Given that the fixes are tested, and pretty simple, I'm leaning
 toward that choice.  Sorry this all came at the end of the merge window,
 I should have resolved this all 2 weeks ago, that's my fault as it was
 in the middle of some travel for me.
 
 All have been tested in linux-next for weeks, with no reported issues
 other than the above-mentioned boot time outs.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYpnv/A8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yk/fACgvmenbo5HipqyHnOmTQlT50xQ9EYAn2eTq6ai
 GkjLXBGNWOPBa5cU52qf
 =yEi/
 -----END PGP SIGNATURE-----

Merge tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull driver core updates from Greg KH:
 "Here is the set of driver core changes for 5.19-rc1.

  Lots of tiny driver core changes and cleanups happened this cycle, but
  the two major things are:

   - firmware_loader reorganization and additions including the ability
     to have XZ compressed firmware images and the ability for userspace
     to initiate the firmware load when it needs to, instead of being
     always initiated by the kernel. FPGA devices specifically want this
     ability to have their firmware changed over the lifetime of the
     system boot, and this allows them to work without having to come up
     with yet-another-custom-uapi interface for loading firmware for
     them.

   - physical location support added to sysfs so that devices that know
     this information, can tell userspace where they are located in a
     common way. Some ACPI devices already support this today, and more
     bus types should support this in the future.

  Smaller changes include:

   - driver_override api cleanups and fixes

   - error path cleanups and fixes

   - get_abi script fixes

   - deferred probe timeout changes.

  It's that last change that I'm the most worried about. It has been
  reported to cause boot problems for a number of systems, and I have a
  tested patch series that resolves this issue. But I didn't get it
  merged into my tree before 5.18-final came out, so it has not gotten
  any linux-next testing.

  I'll send the fixup patches (there are 2) as a follow-on series to this
  pull request.

  All have been tested in linux-next for weeks, with no reported issues
  other than the above-mentioned boot time-outs"

* tag 'driver-core-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits)
  driver core: fix deadlock in __device_attach
  kernfs: Separate kernfs_pr_cont_buf and rename_lock.
  topology: Remove unused cpu_cluster_mask()
  driver core: Extend deferred probe timeout on driver registration
  MAINTAINERS: add Russ Weight as a firmware loader maintainer
  driver: base: fix UAF when driver_attach failed
  test_firmware: fix end of loop test in upload_read_show()
  driver core: location: Add "back" as a possible output for panel
  driver core: location: Free struct acpi_pld_info *pld
  driver core: Add "*" wildcard support to driver_async_probe cmdline param
  driver core: location: Check for allocations failure
  arch_topology: Trace the update thermal pressure
  kernfs: Rename kernfs_put_open_node to kernfs_unlink_open_file.
  export: fix string handling of namespace in EXPORT_SYMBOL_NS
  rpmsg: use local 'dev' variable
  rpmsg: Fix calling device_lock() on non-initialized device
  firmware_loader: describe 'module' parameter of firmware_upload_register()
  firmware_loader: Move definitions from sysfs_upload.h to sysfs.h
  firmware_loader: Fix configs for sysfs split
  selftests: firmware: Add firmware upload selftests
  ...
2022-06-03 11:48:47 -07:00
Linus Torvalds
04d93b2b8b SPDX changes for 5.19-rc1
Here are some SPDX (i.e. licensing) changes for 5.19-rc1
 
 The SPDX-labeling effort has started to pick up again, so here are some
 changes for various parts of the tree that are related to this effort.
 
 Included in here are:
 	- freevxfs license updates
 	- spihash.c license cleanups
 	- spdxcheck script updates to make things easier to work with
 	  going forward
 
 All of the license updates came from the original authors/copyright
 holders of the code involved.
 
 All of these have been in linux-next for weeks with no reported issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCYpngmg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yl41wCgzt9M0/9hLjVV9UIW2l2phyJQZPQAoK7u0RUU
 tYRRT2gSUwAHlu3khZSS
 =fjdf
 -----END PGP SIGNATURE-----

Merge tag 'spdx-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx

Pull SPDX updates from Greg KH:
 "Here are some SPDX license marker changes.

  The SPDX-labeling effort has started to pick up again, so here are
  some changes for various parts of the tree that are related to this
  effort.

  Included in here are:

   - freevxfs license updates

   - spihash.c license cleanups

   - spdxcheck script updates to make things easier to work with going
     forward

  All of the license updates came from the original authors/copyright
  holders of the code involved.

  All of these have been in linux-next for weeks with no reported
  issues"

* tag 'spdx-5.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx:
  siphash: add SPDX tags as sole licensing authority
  scripts/spdxcheck: Exclude top-level README
  scripts/spdxcheck: Exclude MAINTAINERS/CREDITS
  scripts/spdxcheck: Exclude config directories
  scripts/spdxcheck: Put excluded files and directories into a separate file
  scripts/spdxcheck: Add option to display files without SPDX
  scripts/spdxcheck: Add [sub]directory statistics
  scripts/spdxcheck: Add directory statistics
  scripts/spdxcheck: Add percentage to statistics
  freevxfs: relicense to GPLv2 only
2022-06-03 10:34:34 -07:00
Kees Cook
0dfe54071d nodemask: Fix return values to be unsigned
The nodemask routines had mixed return values that provided potentially
signed return values that could never happen. This was leading to the
compiler getting confusing about the range of possible return values
(it was thinking things could be negative where they could not be). Fix
all the nodemask routines that should be returning unsigned
(or bool) values. Silences:

 mm/swapfile.c: In function ‘setup_swap_info’:
 mm/swapfile.c:2291:47: error: array subscript -1 is below array bounds of ‘struct plist_node[]’ [-Werror=array-bounds]
  2291 |                                 p->avail_lists[i].prio = 1;
       |                                 ~~~~~~~~~~~~~~^~~
 In file included from mm/swapfile.c:16:
 ./include/linux/swap.h:292:27: note: while referencing ‘avail_lists’
   292 |         struct plist_node avail_lists[]; /*
       |                           ^~~~~~~~~~~

Reported-by: Christophe de Dinechin <dinechin@redhat.com>
Link: https://lore.kernel.org/lkml/20220414150855.2407137-3-dinechin@redhat.com/
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-06-03 06:52:58 -07:00
Kees Cook
005f17007f bitmap: Fix return values to be unsigned
Both nodemask and bitmap routines had mixed return values that provided
potentially signed return values that could never happen. This was
leading to the compiler getting confusing about the range of possible
return values (it was thinking things could be negative where they could
not be). In preparation for fixing nodemask, fix all the bitmap routines
that should be returning unsigned (or bool) values.

Cc: Yury Norov <yury.norov@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Christophe de Dinechin <dinechin@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-06-03 06:52:58 -07:00
Yury Norov
2c523550b9 lib/bitmap: add test for bitmap_{from,to}_arr64
Test newly added bitmap_{from,to}_arr64() functions similarly to
already existing bitmap_{from,to}_arr32() tests.

CC: Alexander Gordeev <agordeev@linux.ibm.com>
CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
CC: Christian Borntraeger <borntraeger@linux.ibm.com>
CC: Claudio Imbrenda <imbrenda@linux.ibm.com>
CC: David Hildenbrand <david@redhat.com>
CC: Heiko Carstens <hca@linux.ibm.com>
CC: Janosch Frank <frankja@linux.ibm.com>
CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
CC: Sven Schnelle <svens@linux.ibm.com>
CC: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-06-03 06:52:58 -07:00
Yury Norov
0a97953fd2 lib: add bitmap_{from,to}_arr64
Manipulating 64-bit arrays with bitmap functions is potentially dangerous
because on 32-bit BE machines the order of halfwords doesn't match.
Another issue is that compiler may throw a warning about out-of-boundary
access.

This patch adds bitmap_{from,to}_arr64 functions in addition to existing
bitmap_{from,to}_arr32.

CC: Alexander Gordeev <agordeev@linux.ibm.com>
CC: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
CC: Christian Borntraeger <borntraeger@linux.ibm.com>
CC: Claudio Imbrenda <imbrenda@linux.ibm.com>
CC: David Hildenbrand <david@redhat.com>
CC: Heiko Carstens <hca@linux.ibm.com>
CC: Janosch Frank <frankja@linux.ibm.com>
CC: Rasmus Villemoes <linux@rasmusvillemoes.dk>
CC: Sven Schnelle <svens@linux.ibm.com>
CC: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-06-03 06:52:58 -07:00
Mauro Carvalho Chehab
430cd4a28d lib/bitmap.c make bitmap_print_bitmask_to_buf parseable
The documentation of such function is not on a proper ReST format,
as reported by Sphinx:

    Documentation/core-api/kernel-api:81: ./lib/bitmap.c:532: WARNING: Unexpected indentation.
    Documentation/core-api/kernel-api:81: ./lib/bitmap.c:526: WARNING: Inline emphasis start-string without end-string.
    Documentation/core-api/kernel-api:81: ./lib/bitmap.c:532: WARNING: Inline emphasis start-string without end-string.
    Documentation/core-api/kernel-api:81: ./lib/bitmap.c:532: WARNING: Inline emphasis start-string without end-string.
    Documentation/core-api/kernel-api:81: ./lib/bitmap.c:533: WARNING: Block quote ends without a blank line; unexpected unindent.
    Documentation/core-api/kernel-api:81: ./lib/bitmap.c:536: WARNING: Definition list ends without a blank line; unexpected unindent.
    Documentation/core-api/kernel-api:81: ./lib/bitmap.c:542: WARNING: Unexpected indentation.
    Documentation/core-api/kernel-api:81: ./lib/bitmap.c:536: WARNING: Inline emphasis start-string without end-string.
    Documentation/core-api/kernel-api:81: ./lib/bitmap.c:536: WARNING: Inline emphasis start-string without end-string.
    Documentation/core-api/kernel-api:81: ./lib/bitmap.c:543: WARNING: Block quote ends without a blank line; unexpected unindent.
    Documentation/core-api/kernel-api:81: ./lib/bitmap.c:552: WARNING: Unexpected indentation.
    Documentation/core-api/kernel-api:81: ./lib/bitmap.c:545: WARNING: Inline emphasis start-string without end-string.
    Documentation/core-api/kernel-api:81: ./lib/bitmap.c:545: WARNING: Inline emphasis start-string without end-string.
    Documentation/core-api/kernel-api:81: ./lib/bitmap.c:552: WARNING: Inline emphasis start-string without end-string.
    Documentation/core-api/kernel-api:81: ./lib/bitmap.c:552: WARNING: Inline emphasis start-string without end-string.
    Documentation/core-api/kernel-api:81: ./lib/bitmap.c:554: WARNING: Block quote ends without a blank line; unexpected unindent.
    Documentation/core-api/kernel-api:81: ./lib/bitmap.c:556: WARNING: Definition list ends without a blank line; unexpected unindent.
    Documentation/core-api/kernel-api:81: ./lib/bitmap.c:580: WARNING: Unexpected indentation.

So, the produced output at:

	https://www.kernel.org/doc/html/latest/core-api/kernel-api.html?#c.bitmap_print_bitmask_to_buf

is broken. Fix it by adding spaces and marking the literal blocks.

Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2022-06-03 06:52:57 -07:00
Stephen Brennan
d1dc87763f assoc_array: Fix BUG_ON during garbage collect
A rare BUG_ON triggered in assoc_array_gc:

    [3430308.818153] kernel BUG at lib/assoc_array.c:1609!

Which corresponded to the statement currently at line 1593 upstream:

    BUG_ON(assoc_array_ptr_is_meta(p));

Using the data from the core dump, I was able to generate a userspace
reproducer[1] and determine the cause of the bug.

[1]: https://github.com/brenns10/kernel_stuff/tree/master/assoc_array_gc

After running the iterator on the entire branch, an internal tree node
looked like the following:

    NODE (nr_leaves_on_branch: 3)
      SLOT [0] NODE (2 leaves)
      SLOT [1] NODE (1 leaf)
      SLOT [2..f] NODE (empty)

In the userspace reproducer, the pr_devel output when compressing this
node was:

    -- compress node 0x5607cc089380 --
    free=0, leaves=0
    [0] retain node 2/1 [nx 0]
    [1] fold node 1/1 [nx 0]
    [2] fold node 0/1 [nx 2]
    [3] fold node 0/2 [nx 2]
    [4] fold node 0/3 [nx 2]
    [5] fold node 0/4 [nx 2]
    [6] fold node 0/5 [nx 2]
    [7] fold node 0/6 [nx 2]
    [8] fold node 0/7 [nx 2]
    [9] fold node 0/8 [nx 2]
    [10] fold node 0/9 [nx 2]
    [11] fold node 0/10 [nx 2]
    [12] fold node 0/11 [nx 2]
    [13] fold node 0/12 [nx 2]
    [14] fold node 0/13 [nx 2]
    [15] fold node 0/14 [nx 2]
    after: 3

At slot 0, an internal node with 2 leaves could not be folded into the
node, because there was only one available slot (slot 0). Thus, the
internal node was retained. At slot 1, the node had one leaf, and was
able to be folded in successfully. The remaining nodes had no leaves,
and so were removed. By the end of the compression stage, there were 14
free slots, and only 3 leaf nodes. The tree was ascended and then its
parent node was compressed. When this node was seen, it could not be
folded, due to the internal node it contained.

The invariant for compression in this function is: whenever
nr_leaves_on_branch < ASSOC_ARRAY_FAN_OUT, the node should contain all
leaf nodes. The compression step currently cannot guarantee this, given
the corner case shown above.

To fix this issue, retry compression whenever we have retained a node,
and yet nr_leaves_on_branch < ASSOC_ARRAY_FAN_OUT. This second
compression will then allow the node in slot 1 to be folded in,
satisfying the invariant. Below is the output of the reproducer once the
fix is applied:

    -- compress node 0x560e9c562380 --
    free=0, leaves=0
    [0] retain node 2/1 [nx 0]
    [1] fold node 1/1 [nx 0]
    [2] fold node 0/1 [nx 2]
    [3] fold node 0/2 [nx 2]
    [4] fold node 0/3 [nx 2]
    [5] fold node 0/4 [nx 2]
    [6] fold node 0/5 [nx 2]
    [7] fold node 0/6 [nx 2]
    [8] fold node 0/7 [nx 2]
    [9] fold node 0/8 [nx 2]
    [10] fold node 0/9 [nx 2]
    [11] fold node 0/10 [nx 2]
    [12] fold node 0/11 [nx 2]
    [13] fold node 0/12 [nx 2]
    [14] fold node 0/13 [nx 2]
    [15] fold node 0/14 [nx 2]
    internal nodes remain despite enough space, retrying
    -- compress node 0x560e9c562380 --
    free=14, leaves=1
    [0] fold node 2/15 [nx 0]
    after: 3

Changes
=======
DH:
 - Use false instead of 0.
 - Reorder the inserted lines in a couple of places to put retained before
   next_slot.

ver #2)
 - Fix typo in pr_devel, correct comparison to "<="

Fixes: 3cb989501c26 ("Add a generic associative array implementation.")
Cc: <stable@vger.kernel.org>
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Andrew Morton <akpm@linux-foundation.org>
cc: keyrings@vger.kernel.org
Link: https://lore.kernel.org/r/20220511225517.407935-1-stephen.s.brennan@oracle.com/ # v1
Link: https://lore.kernel.org/r/20220512215045.489140-1-stephen.s.brennan@oracle.com/ # v2
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-06-01 18:29:06 -07:00
Linus Torvalds
76bfd3de34 tracing updates for 5.19:
- The majority of the changes are for fixes and clean ups.
 
 Noticeable changes:
 
 - Rework trace event triggers code to be easier to interact with.
 
 - Support for embedding bootconfig with the kernel (as suppose to having it
   embedded in initram). This is useful for embedded boards without initram
   disks.
 
 - Speed up boot by parallelizing the creation of tracefs files.
 
 - Allow absolute ring buffer timestamps handle timestamps that use more than
   59 bits.
 
 - Added new tracing clock "TAI" (International Atomic Time)
 
 - Have weak functions show up in available_filter_function list as:
    __ftrace_invalid_address___<invalid-offset>
   instead of using the name of the function before it.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYpOgXRQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qjkKAQDbpemxvpFyJlZqT8KgEIXubu+ag2/q
 p0XDHaPS0zF9OQEAjTxg6GMEbnFYl6fzxZtOoEbiaQ7ppfdhRI8t6sSMVA8=
 =+nDD
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:
 "The majority of the changes are for fixes and clean ups.

  Notable changes:

   - Rework trace event triggers code to be easier to interact with.

   - Support for embedding bootconfig with the kernel (as suppose to
     having it embedded in initram). This is useful for embedded boards
     without initram disks.

   - Speed up boot by parallelizing the creation of tracefs files.

   - Allow absolute ring buffer timestamps handle timestamps that use
     more than 59 bits.

   - Added new tracing clock "TAI" (International Atomic Time)

   - Have weak functions show up in available_filter_function list as:
     __ftrace_invalid_address___<invalid-offset> instead of using the
     name of the function before it"

* tag 'trace-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (52 commits)
  ftrace: Add FTRACE_MCOUNT_MAX_OFFSET to avoid adding weak function
  tracing: Fix comments for event_trigger_separate_filter()
  x86/traceponit: Fix comment about irq vector tracepoints
  x86,tracing: Remove unused headers
  ftrace: Clean up hash direct_functions on register failures
  tracing: Fix comments of create_filter()
  tracing: Disable kcov on trace_preemptirq.c
  tracing: Initialize integer variable to prevent garbage return value
  ftrace: Fix typo in comment
  ftrace: Remove return value of ftrace_arch_modify_*()
  tracing: Cleanup code by removing init "char *name"
  tracing: Change "char *" string form to "char []"
  tracing/timerlat: Do not wakeup the thread if the trace stops at the IRQ
  tracing/timerlat: Print stacktrace in the IRQ handler if needed
  tracing/timerlat: Notify IRQ new max latency only if stop tracing is set
  kprobes: Fix build errors with CONFIG_KRETPROBES=n
  tracing: Fix return value of trace_pid_write()
  tracing: Fix potential double free in create_var_ref()
  tracing: Use strim() to remove whitespace instead of doing it manually
  ftrace: Deal with error return code of the ftrace_process_locs() function
  ...
2022-05-29 10:31:36 -07:00
Jason A. Donenfeld
ca7984dff9 Revert "crypto: poly1305 - cleanup stray CRYPTO_LIB_POLY1305_RSIZE"
This reverts commit 8bdc2a190105e862dfe7a4033f2fd385b7e58ae8.

It got merged a bit prematurely and shortly after the kernel test robot
and Sudip pointed out build failures:

  arm: imx_v6_v7_defconfig and multi_v7_defconfig
  mips: decstation_64_defconfig, decstation_defconfig, decstation_r4k_defconfig

  In file included from crypto/chacha20poly1305.c:13:
  include/crypto/poly1305.h:56:46: error: 'CONFIG_CRYPTO_LIB_POLY1305_RSIZE' undeclared here (not in a function); did you mean 'CONFIG_CRYPTO_POLY1305_MODULE'?
     56 |                 struct poly1305_key opaque_r[CONFIG_CRYPTO_LIB_POLY1305_RSIZE];
        |                                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

We could attempt to fix this by listing the dependencies piecemeal, but
it's not as obvious as it looks: drivers like caam use this macro in
headers even if there's no .o compiled in that makes use of it.  So
actually fixing this might require a bit more of a comprehensive
approach, rather than whack-a-mole with hunting down which drivers use
which headers which use this macro.

Therefore, this commit just reverts the change, and maybe the problem
can be visited on the next rainy day.

Reported-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 8bdc2a190105 ("crypto: poly1305 - cleanup stray CRYPTO_LIB_POLY1305_RSIZE")
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-05-28 10:04:06 -07:00
Linus Torvalds
9d004b2f4f cxl for 5.19
- Add driver-core infrastructure for lockdep validation of
   device_lock(), and fixup a deadlock report that was previously hidden
   behind the 'lockdep no validate' policy.
 
 - Add CXL _OSC support for claiming native control of CXL hotplug and
   error handling.
 
 - Disable suspend in the presence of CXL memory unless and until a
   protocol is identified for restoring PCI device context from memory
   hosted on CXL PCI devices.
 
 - Add support for snooping CXL mailbox commands to protect against
   inopportune changes, like set-partition with the 'immediate' flag set.
 
 - Rework how the driver detects legacy CXL 1.1 configurations (CXL DVSEC
   / 'mem_enable') before enabling new CXL 2.0 decode configurations (CXL
   HDM Capability).
 
 - Miscellaneous cleanups and fixes from -next exposure.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCYpFUogAKCRDfioYZHlFs
 Zz+VAP9o/NkYhbaM2Ne9ImgsdJii96gA8nN7q/q/ZoXjsSx2WQD+NRC5d3ZwZDCa
 9YKEkntnvbnAZOCs+ZUuyZBgNh6vsgU=
 =p92w
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull cxl updates from Dan Williams:
 "Compute Express Link (CXL) updates for this cycle.

  The highlight is new driver-core infrastructure and CXL subsystem
  changes for allowing lockdep to validate device_lock() usage. Thanks
  to PeterZ for setting me straight on the current capabilities of the
  lockdep API, and Greg acked it as well.

  On the CXL ACPI side this update adds support for CXL _OSC so that
  platform firmware knows that it is safe to still grant Linux native
  control of PCIe hotplug and error handling in the presence of CXL
  devices. A circular dependency problem was discovered between suspend
  and CXL memory for cases where the suspend image might be stored in
  CXL memory where that image also contains the PCI register state to
  restore to re-enable the device. Disable suspend for now until an
  architecture is defined to clarify that conflict.

  Lastly a collection of reworks, fixes, and cleanups to the CXL
  subsystem where support for snooping mailbox commands and properly
  handling the "mem_enable" flow are the highlights.

  Summary:

   - Add driver-core infrastructure for lockdep validation of
     device_lock(), and fixup a deadlock report that was previously
     hidden behind the 'lockdep no validate' policy.

   - Add CXL _OSC support for claiming native control of CXL hotplug and
     error handling.

   - Disable suspend in the presence of CXL memory unless and until a
     protocol is identified for restoring PCI device context from memory
     hosted on CXL PCI devices.

   - Add support for snooping CXL mailbox commands to protect against
     inopportune changes, like set-partition with the 'immediate' flag
     set.

   - Rework how the driver detects legacy CXL 1.1 configurations (CXL
     DVSEC / 'mem_enable') before enabling new CXL 2.0 decode
     configurations (CXL HDM Capability).

   - Miscellaneous cleanups and fixes from -next exposure"

* tag 'cxl-for-5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (47 commits)
  cxl/port: Enable HDM Capability after validating DVSEC Ranges
  cxl/port: Reuse 'struct cxl_hdm' context for hdm init
  cxl/port: Move endpoint HDM Decoder Capability init to port driver
  cxl/pci: Drop @info argument to cxl_hdm_decode_init()
  cxl/mem: Merge cxl_dvsec_ranges() and cxl_hdm_decode_init()
  cxl/mem: Skip range enumeration if mem_enable clear
  cxl/mem: Consolidate CXL DVSEC Range enumeration in the core
  cxl/pci: Move cxl_await_media_ready() to the core
  cxl/mem: Validate port connectivity before dvsec ranges
  cxl/mem: Fix cxl_mem_probe() error exit
  cxl/pci: Drop wait_for_valid() from cxl_await_media_ready()
  cxl/pci: Consolidate wait_for_media() and wait_for_media_ready()
  cxl/mem: Drop mem_enabled check from wait_for_media()
  nvdimm: Fix firmware activation deadlock scenarios
  device-core: Kill the lockdep_mutex
  nvdimm: Drop nd_device_lock()
  ACPI: NFIT: Drop nfit_device_lock()
  nvdimm: Replace lockdep_mutex with local lock classes
  cxl: Drop cxl_device_lock()
  cxl/acpi: Add root device lockdep validation
  ...
2022-05-27 21:24:19 -07:00