7579 Commits

Author SHA1 Message Date
Linus Torvalds
5628b8de12 Random number generator changes for Linux 5.18-rc1.
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmIzwtEACgkQSfxwEqXe
 A67NCBAA1+U01HXx4ethmmy1m2pXHAIwngI7PP0QzyZtmoloWockdN1lRfQ1C0uJ
 Whk/9Hc9G7iujznsxOnCS+LeNwRzd7CjtFbTgK+yGIRKwL9GFcVwA5nrifP9TjqZ
 FWmTIomjjmA06YRYsNOdNSQdN6DdpQz8xLw0EqVOZerI4ITFErYlW8lLqOOKY99N
 f9glQK75kh41SUgo+K3JSn46fhB95HldL6dYSZzjQ6QsVKBQuQTDE9ryfrH2XZDw
 xI2nf/ycXPUBv7Bb+0op+7ES++CoDigM2nIyxapEj3ZkpplxL4M+cCIHq3Juzfwm
 jDdbZbs5SqDszOQM/dvCJSR+S/D3QIKdv3fwwWHDTigByZdgpudT3rr9k7dY60Z8
 aNvOzNWOzGH9/0boLl55WysF6cBQnazbgtzeWpzeuWFhAyfxN/DJx2sf8U+TmN6n
 3bDUafamAvmkkIOoHUzOXfjo2lhXxlmRZ40rWVNX5JvcJj5+5jRmTawrQj+9fn8/
 MhiIZ6KBDV1OxPwJzG6jm++JP6rgXfXsxduomO7cIEWs10itf/cE8WD9qJrtZTtg
 kfjYUguFOd/QyzY0A1w6FD865vy8YhATk71Ywgwj9AI+cfH8QUajpDkXOutjop8x
 8HBxIGx6Itgzilfuo5jpJxlVhNO3G6v1fX/A+mUMAfHufkmnfiQ=
 =cyDR
 -----END PGP SIGNATURE-----

Merge tag 'random-5.18-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random

Pull random number generator updates from Jason Donenfeld:
 "There have been a few important changes to the RNG's crypto, but the
  intent for 5.18 has been to shore up the existing design as much as
  possible with modern cryptographic functions and proven constructions,
  rather than actually changing up anything fundamental to the RNG's
  design.

  So it's still the same old RNG at its core as before: it still counts
  entropy bits, and collects from the various sources with the same
  heuristics as before, and so forth. However, the cryptographic
  algorithms that transform that entropic data into safe random numbers
  have been modernized.

  Just as important, if not more, is that the code has been cleaned up
  and re-documented. As one of the first drivers in Linux, going back to
  1.3.30, its general style and organization was showing its age and
  becoming both a maintenance burden and an auditability impediment.

  Hopefully this provides a more solid foundation to build on for the
  future. I encourage you to open up the file in full, and maybe you'll
  remark, "oh, that's what it's doing," and enjoy reading it. That, at
  least, is the eventual goal, which this pull begins working toward.

  Here's a summary of the various patches in this pull:

   - /dev/urandom and /dev/random now do the same thing, per the patch
     we discussed on the list. I think this is worth trying out. If it
     does appear problematic, I've made sure to keep it standalone and
     revertible without any conflicts.

   - Fixes and cleanups for numerous integer type problems, locking
     issues, and general code quality concerns.

   - The input pool's LFSR has been replaced with a cryptographically
     secure hash function, which has security and performance benefits
     alike, and consequently allows us to count entropy bits linearly.

   - The pre-init injection now uses a real hash function too, instead
     of an LFSR or vanilla xor.

   - The interrupt handler's fast_mix() function now uses one round of
     SipHash, rather than the fake crypto that was there before.

   - All additions of RDRAND and RDSEED now go through the input pool's
     hash function, in part to mitigate ridiculous hypothetical CPU
     backdoors, but more so to have a consistent interface for ingesting
     entropy that's easy to analyze, making everything happen one way,
     instead of a potpourri of different ways.

   - The crng now works on per-cpu data, while also being in accordance
     with the actual "fast key erasure RNG" design. This allows us to
     fix several boot-time race complications associated with the prior
     dynamically allocated model, eliminates much locking, and makes our
     backtrack protection more robust.

   - Batched entropy now erases doled out values so that it's backtrack
     resistant.

   - Working closely with Sebastian, the interrupt handler no longer
     needs to take any locks at all, as we punt the
     synchronized/expensive operations to a workqueue. This is
     especially nice for PREEMPT_RT, where taking spinlocks in irq
     context is problematic. It also makes the handler faster for the
     rest of us.

   - Also working with Sebastian, we now do the right thing on CPU
     hotplug, so that we don't use stale entropy or fail to accumulate
     new entropy when CPUs come back online.

   - We handle virtual machines that fork / clone / snapshot, using the
     "vmgenid" ACPI specification for retrieving a unique new RNG seed,
     which we can use to also make WireGuard (and in the future, other
     things) safe across VM forks.

   - Around boot time, we now try to reseed more often if enough entropy
     is available, before settling on the usual 5 minute schedule.

   - Last, but certainly not least, the documentation in the file has
     been updated considerably"

* tag 'random-5.18-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (60 commits)
  random: check for signal and try earlier when generating entropy
  random: reseed more often immediately after booting
  random: make consistent usage of crng_ready()
  random: use SipHash as interrupt entropy accumulator
  wireguard: device: clear keys on VM fork
  random: provide notifier for VM fork
  random: replace custom notifier chain with standard one
  random: do not export add_vmfork_randomness() unless needed
  virt: vmgenid: notify RNG of VM fork and supply generation ID
  ACPI: allow longer device IDs
  random: add mechanism for VM forks to reinitialize crng
  random: don't let 644 read-only sysctls be written to
  random: give sysctl_random_min_urandom_seed a more sensible value
  random: block in /dev/urandom
  random: do crng pre-init loading in worker rather than irq
  random: unify cycles_t and jiffies usage and types
  random: cleanup UUID handling
  random: only wake up writers after zap if threshold was passed
  random: round-robin registers as ulong, not u32
  random: clear fast pool, crng, and batches in cpuhp bring up
  ...
2022-03-21 14:55:32 -07:00
Kees Cook
02788ebcf5 lib: stackinit: Convert to KUnit
Convert stackinit unit tests to KUnit, for better integration
into the kernel self test framework. Includes a rename of
test_stackinit.c to stackinit_kunit.c, and CONFIG_TEST_STACKINIT to
CONFIG_STACKINIT_KUNIT_TEST.

Adjust expected test results based on which stack initialization method
was chosen:

 $ CMD="./tools/testing/kunit/kunit.py run stackinit --raw_output \
        --arch=x86_64 --kconfig_add"

 $ $CMD | grep stackinit:
 # stackinit: pass:36 fail:0 skip:29 total:65

 $ $CMD CONFIG_GCC_PLUGIN_STRUCTLEAK_USER=y | grep stackinit:
 # stackinit: pass:37 fail:0 skip:28 total:65

 $ $CMD CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF=y | grep stackinit:
 # stackinit: pass:55 fail:0 skip:10 total:65

 $ $CMD CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL=y | grep stackinit:
 # stackinit: pass:62 fail:0 skip:3 total:65

 $ $CMD CONFIG_INIT_STACK_ALL_PATTERN=y --make_option LLVM=1 | grep stackinit:
 # stackinit: pass:60 fail:0 skip:5 total:65

 $ $CMD CONFIG_INIT_STACK_ALL_ZERO=y --make_option LLVM=1 | grep stackinit:
 # stackinit: pass:60 fail:0 skip:5 total:65

Temporarily remove the userspace-build mode, which will be restored in a
later patch.

Expand the size of the pre-case switch variable so it doesn't get
accidentally cleared.

Cc: David Gow <davidgow@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
---
v1: https://lore.kernel.org/lkml/20220224055145.1853657-1-keescook@chromium.org
v2:
 - split "userspace KUnit stub" into separate header and patch (Daniel)
 - Improve commit log and comments (David)
 - Provide mapping of expected XFAIL tests to CONFIGs (David)
2022-03-21 08:13:04 -07:00
Petr Mladek
0834c6f03b Merge branch 'for-5.18-vsprintf-fourcc-fixup' into for-linus 2022-03-21 14:44:49 +01:00
Jiri Olsa
a0019cd7d4 lib/sort: Add priv pointer to swap function
Adding support to have priv pointer in swap callback function.

Following the initial change on cmp callback functions [1]
and adding SWAP_WRAPPER macro to identify sort call of sort_r.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-2-jolsa@kernel.org

[1] 4333fb96ca10 ("media: lib/sort.c: implement sort() variant taking context argument")
2022-03-17 20:17:18 -07:00
Masami Hiramatsu
f4616fabab fprobe: Add a selftest for fprobe
Add a KUnit based selftest for fprobe interface.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/164735295554.1084943.18347620679928750960.stgit@devnote2
2022-03-17 20:17:14 -07:00
Jason A. Donenfeld
5acd35487d random: replace custom notifier chain with standard one
We previously rolled our own randomness readiness notifier, which only
has two users in the whole kernel. Replace this with a more standard
atomic notifier block that serves the same purpose with less code. Also
unexport the symbols, because no modules use it, only unconditional
builtins. The only drawback is that it's possible for a notification
handler returning the "stop" code to prevent further processing, but
given that there are only two users, and that we're unexporting this
anyway, that doesn't seem like a significant drawback for the
simplification we receive here.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-03-12 18:00:56 -07:00
Johannes Berg
2a6852cb8f lib/logic_iomem: correct fallback config references
Due to some renaming, we ended up with the "indirect iomem"
naming in Kconfig, following INDIRECT_PIO. However, clearly
I missed following through on that in the ifdefs, but so far
INDIRECT_IOMEM_FALLBACK isn't used by any architecture.

Reported-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Fixes: ca2e334232b6 ("lib: add iomem emulation (logic_iomem)")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2022-03-11 10:42:56 +01:00
Paul Menzel
5b401e4e9a lib/raid6: Include <asm/ppc-opcode.h> for VPERMXOR
On Ubuntu 21.10 (ppc64le) building raid6test with gcc (Ubuntu
11.2.0-7ubuntu2) 11.2.0 fails with the error below.

    gcc -I.. -I ../../../include -g -O2                       \
             -I../../../arch/powerpc/include -DCONFIG_ALTIVEC \
             -c -o vpermxor1.o vpermxor1.c
    vpermxor1.c: In function ‘raid6_vpermxor1_gen_syndrome_real’:
    vpermxor1.c:64:29: error: expected string literal before ‘VPERMXOR’
       64 |   asm(VPERMXOR(%0,%1,%2,%3):"=v"(wq0):"v"(gf_high), "v"(gf_low), "v"(wq0));
          |       ^~~~~~~~
    make: *** [Makefile:58: vpermxor1.o] Error 1

So, include the header asm/ppc-opcode.h defining this macro also when
not building the Linux kernel but only this too.

Cc: Matt Brown <matthew.brown.dev@gmail.com>
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Song Liu <song@kernel.org>
2022-03-08 15:20:21 -08:00
Paul Menzel
633174a704 lib/raid6/test/Makefile: Use $(pound) instead of \# for Make 4.3
Buidling raid6test on Ubuntu 21.10 (ppc64le) with GNU Make 4.3 shows the
errors below:

    $ cd lib/raid6/test/
    $ make
    <stdin>:1:1: error: stray ‘\’ in program
    <stdin>:1:2: error: stray ‘#’ in program
    <stdin>:1:11: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ \
        before ‘<’ token

    [...]

The errors come from the HAS_ALTIVEC test, which fails, and the POWER
optimized versions are not built. That’s also reason nobody noticed on the
other architectures.

GNU Make 4.3 does not remove the backslash anymore. From the 4.3 release
announcment:

> * WARNING: Backward-incompatibility!
>   Number signs (#) appearing inside a macro reference or function invocation
>   no longer introduce comments and should not be escaped with backslashes:
>   thus a call such as:
>     foo := $(shell echo '#')
>   is legal.  Previously the number sign needed to be escaped, for example:
>     foo := $(shell echo '\#')
>   Now this latter will resolve to "\#".  If you want to write makefiles
>   portable to both versions, assign the number sign to a variable:
>     H := \#
>     foo := $(shell echo '$H')
>   This was claimed to be fixed in 3.81, but wasn't, for some reason.
>   To detect this change search for 'nocomment' in the .FEATURES variable.

So, do the same as commit 9564a8cf422d ("Kbuild: fix # escaping in .cmd
files for future Make") and commit 929bef467771 ("bpf: Use $(pound) instead
of \# in Makefiles") and define and use a $(pound) variable.

Reference for the change in make:
https://git.savannah.gnu.org/cgit/make.git/commit/?id=c6966b323811c37acedff05b57

Cc: Matt Brown <matthew.brown.dev@gmail.com>
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Song Liu <song@kernel.org>
2022-03-08 15:20:21 -08:00
Dirk Müller
a5359ddd05 lib/raid6/test: fix multiple definition linking error
GCC 10+ defaults to -fno-common, which enforces proper declaration of
external references using "extern". without this change a link would
fail with:

  lib/raid6/test/algos.c:28: multiple definition of `raid6_call';
  lib/raid6/test/test.c:22: first defined here

the pq.h header that is included already includes an extern declaration
so we can just remove the redundant one here.

Cc: <stable@vger.kernel.org>
Signed-off-by: Dirk Müller <dmueller@suse.de>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Song Liu <song@kernel.org>
2022-03-08 15:20:21 -08:00
Keith Busch
f3813f4b28 crypto: add rocksoft 64b crc guard tag framework
Hardware specific features may be able to calculate a crc64, so provide
a framework for drivers to register their implementation. If nothing is
registered, fallback to the generic table lookup implementation. The
implementation is modeled after the crct10dif equivalent.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Link: https://lore.kernel.org/r/20220303201312.3255347-7-kbusch@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-07 12:48:35 -07:00
Keith Busch
cbc0a40e17 lib: add rocksoft model crc64
The NVM Express specification extended data integrity fields to 64 bits
using the Rocksoft parameters. Add the poly to the crc64 table
generation, and provide a generic library routine implementing the
algorithm.

The Rocksoft 64-bit CRC model parameters are as follows:
    Poly: 0xAD93D23594C93659
    Initial value: 0xFFFFFFFFFFFFFFFF
    Reflected Input: True
    Reflected Output: True
    Xor Final: 0xFFFFFFFFFFFFFFFF

Since this model used reflected bits, the implementation generates the
reflected table so the result is ordered consistently.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220303201312.3255347-6-kbusch@kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-07 12:48:35 -07:00
Jens Axboe
bc8419944f Merge branch 'for-5.18/block' into for-5.18/64bit-pi
* for-5.18/block: (96 commits)
  block: remove bio_devname
  ext4: stop using bio_devname
  raid5-ppl: stop using bio_devname
  raid1: stop using bio_devname
  md-multipath: stop using bio_devname
  dm-integrity: stop using bio_devname
  dm-crypt: stop using bio_devname
  pktcdvd: remove a pointless debug check in pkt_submit_bio
  block: remove handle_bad_sector
  block: fix and cleanup bio_check_ro
  bfq: fix use-after-free in bfq_dispatch_request
  blk-crypto: show crypto capabilities in sysfs
  block: don't delete queue kobject before its children
  block: simplify calling convention of elv_unregister_queue()
  block: remove redundant semicolon
  block: default BLOCK_LEGACY_AUTOLOAD to y
  block: update io_ticks when io hang
  block, bfq: don't move oom_bfqq
  block, bfq: avoid moving bfqq to it's parent bfqg
  block, bfq: cleanup bfq_bfqq_to_bfqg()
  ...
2022-03-07 12:46:13 -07:00
Jakub Kicinski
6646dc241d Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2022-03-04

We've added 32 non-merge commits during the last 14 day(s) which contain
a total of 59 files changed, 1038 insertions(+), 473 deletions(-).

The main changes are:

1) Optimize BPF stackmap's build_id retrieval by caching last valid build_id,
   as consecutive stack frames are likely to be in the same VMA and therefore
   have the same build id, from Hao Luo.

2) Several improvements to arm64 BPF JIT, that is, support for JITing
   the atomic[64]_fetch_add, atomic[64]_[fetch_]{and,or,xor} and lastly
   atomic[64]_{xchg|cmpxchg}. Also fix the BTF line info dump for JITed
   programs, from Hou Tao.

3) Optimize generic BPF map batch deletion by only enforcing synchronize_rcu()
   barrier once upon return to user space, from Eric Dumazet.

4) For kernel build parse DWARF and generate BTF through pahole with enabled
   multithreading, from Kui-Feng Lee.

5) BPF verifier usability improvements by making log info more concise and
   replacing inv with scalar type name, from Mykola Lysenko.

6) Two follow-up fixes for BPF prog JIT pack allocator, from Song Liu.

7) Add a new Kconfig to allow for loading kernel modules with non-matching
   BTF type info; their BTF info is then removed on load, from Connor O'Brien.

8) Remove reallocarray() usage from bpftool and switch to libbpf_reallocarray()
   in order to fix compilation errors for older glibc, from Mauricio Vásquez.

9) Fix libbpf to error on conflicting name in BTF when type declaration
   appears before the definition, from Xu Kuohai.

10) Fix issue in BPF preload for in-kernel light skeleton where loaded BPF
    program fds prevent init process from setting up fd 0-2, from Yucong Sun.

11) Fix libbpf reuse of pinned perf RB map when max_entries is auto-determined
    by libbpf, from Stijn Tintel.

12) Several cleanups for libbpf and a fix to enforce perf RB map #pages to be
    non-zero, from Yuntao Wang.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (32 commits)
  bpf: Small BPF verifier log improvements
  libbpf: Add a check to ensure that page_cnt is non-zero
  bpf, x86: Set header->size properly before freeing it
  x86: Disable HAVE_ARCH_HUGE_VMALLOC on 32-bit x86
  bpf, test_run: Fix overflow in XDP frags bpf_test_finish
  selftests/bpf: Update btf_dump case for conflicting names
  libbpf: Skip forward declaration when counting duplicated type names
  bpf: Add some description about BPF_JIT_ALWAYS_ON in Kconfig
  bpf, docs: Add a missing colon in verifier.rst
  bpf: Cache the last valid build_id
  libbpf: Fix BPF_MAP_TYPE_PERF_EVENT_ARRAY auto-pinning
  bpf, selftests: Use raw_tp program for atomic test
  bpf, arm64: Support more atomic operations
  bpftool: Remove redundant slashes
  bpf: Add config to allow loading modules with BTF mismatches
  bpf, arm64: Feed byte-offset into bpf line info
  bpf, arm64: Call build_prologue() first in first JIT pass
  bpf: Fix issue with bpf preload module taking over stdout/stdin of kernel.
  bpftool: Bpf skeletons assert type sizes
  bpf: Cleanup comments
  ...
====================

Link: https://lore.kernel.org/r/20220304164313.31675-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-04 19:28:17 -08:00
Jakub Kicinski
80901bff81 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
net/batman-adv/hard-interface.c
  commit 690bb6fb64f5 ("batman-adv: Request iflink once in batadv-on-batadv check")
  commit 6ee3c393eeb7 ("batman-adv: Demote batadv-on-batadv skip error message")
https://lore.kernel.org/all/20220302163049.101957-1-sw@simonwunderlich.de/

net/smc/af_smc.c
  commit 4d08b7b57ece ("net/smc: Fix cleanup when register ULP fails")
  commit 462791bbfa35 ("net/smc: add sysctl interface for SMC")
https://lore.kernel.org/all/20220302112209.355def40@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03 11:55:12 -08:00
Christoph Hellwig
27674ef6c7 mm: remove the extra ZONE_DEVICE struct page refcount
ZONE_DEVICE struct pages have an extra reference count that complicates
the code for put_page() and several places in the kernel that need to
check the reference count to see that a page is not being used (gup,
compaction, migration, etc.). Clean up the code so the reference count
doesn't need to be treated specially for ZONE_DEVICE pages.

Note that this excludes the special idle page wakeup for fsdax pages,
which still happens at refcount 1.  This is a separate issue and will
be sorted out later.  Given that only fsdax pages require the
notifiacation when the refcount hits 1 now, the PAGEMAP_OPS Kconfig
symbol can go away and be replaced with a FS_DAX check for this hook
in the put_page fastpath.

Based on an earlier patch from Ralph Campbell <rcampbell@nvidia.com>.

Link: https://lkml.kernel.org/r/20220210072828.2930359-8-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: "Sierra Guiza, Alejandro (Alex)" <alex.sierra@amd.com>

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Chaitanya Kulkarni <kch@nvidia.com>
Cc: Christian Knig <christian.koenig@amd.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-03-03 12:47:33 -05:00
Christoph Hellwig
dc90f0846d mm: don't include <linux/memremap.h> in <linux/mm.h>
Move the check for the actual pgmap types that need the free at refcount
one behavior into the out of line helper, and thus avoid the need to
pull memremap.h into mm.h.

Link: https://lkml.kernel.org/r/20220210072828.2930359-7-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: "Sierra Guiza, Alejandro (Alex)" <alex.sierra@amd.com>

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Chaitanya Kulkarni <kch@nvidia.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-03-03 12:47:33 -05:00
Christoph Hellwig
730ff52194 mm: remove pointless includes from <linux/hmm.h>
hmm.h pulls in the world for no good reason at all.  Remove the
includes and push a few ones into the users instead.

Link: https://lkml.kernel.org/r/20220210072828.2930359-4-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Tested-by: "Sierra Guiza, Alejandro (Alex)" <alex.sierra@amd.com>

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Christian Knig <christian.koenig@amd.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Karol Herbst <kherbst@redhat.com>
Cc: Lyude Paul <lyude@redhat.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-03-03 12:47:33 -05:00
Linus Torvalds
7e3d76139b ARM further fixes for 5.17-rc:
- Fix kgdb breakpoint for Thumb2
 - Fix dependency for BITREVERSE kconfig
 - Fix nommu early_params and __setup returns
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEuNNh8scc2k/wOAE+9OeQG+StrGQFAmIc4SIACgkQ9OeQG+St
 rGTlCQ//XQD4xLnvM2LScGFVOOvoQwOmv77H6jOVrfO1xs8dD0W5mBG3LdgaAKkW
 CnYRb9qF2i+lq1p3ZH9u+5bSX6ttzlRmvwUQB89YM7gkU5AY535gz1nFKScdT932
 FNftd1h4FJXvdOsVQM3MnwTNFtp3YodkkkNzKS8PkMxSuvQffMXBMo8cTpkkIF+M
 Eq/QRGIavreFqsI7UtN24j1FkDlBGYrVT8aGHwfyYRCIiFw6InaCpZ1eElJl0xdH
 v80h1ihYqIfLgkHH3Bkk8edsNoosJII5B67n1t1ZdkNBKEiPR5tLa5IMmEv2Dy07
 ufUvU1dullN5KXLQzD/8H4BZMGR1m0tDKWqCt1x1wcug/a1R0xPuO5QEOKXU0HpW
 wegV8ueYmGqAN5HN1iRpNctCJSos+qPZYuDDevJMnXjQsDRamUqUy/0V/rgc7qKE
 yzBfzgKM+Vhn5bKhtXu09Z6xAwVa0wknsJ+NF++EbukAW/WK5m3ck1Z0Ab6e3C1i
 phlnCIH083yejpxuoQMxDaGDhWwEE+a9R63BUPUxmdBIxrVc2yZLo+BUDJxaDh8n
 GcsiFnrsziwJIRlL0FsEWh1PbwWd8xhfHCBV7qbRDZ98RfDyjrajJrl9eK9u9pT+
 nUZKTC6Y+v6N3qfGJzvTHgjhOAA+crfgHcgDGZthoz3UJ0A1Tk0=
 =cSgl
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM fixes from Russell King:

 - Fix kgdb breakpoint for Thumb2

 - Fix dependency for BITREVERSE kconfig

 - Fix nommu early_params and __setup returns

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 9182/1: mmu: fix returns from early_param() and __setup() functions
  ARM: 9178/1: fix unmet dependency on BITREVERSE for HAVE_ARCH_BITREVERSE
  ARM: Fix kgdb breakpoint for Thumb2
2022-03-02 16:11:56 -08:00
Nicolai Stange
81771ff241 lib/mpi: export mpi_rshift
A subsequent patch will make the crypto/dh's dh_is_pubkey_valid() to
calculate a safe-prime groups Q parameter from P: Q = (P - 1) / 2. For
implementing this, mpi_rshift() will be needed. Export it so that it's
accessible from crypto/dh.

Signed-off-by: Nicolai Stange <nstange@suse.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2022-03-03 10:47:52 +12:00
Connor O'Brien
5e214f2e43 bpf: Add config to allow loading modules with BTF mismatches
BTF mismatch can occur for a separately-built module even when the ABI is
otherwise compatible and nothing else would prevent successfully loading.

Add a new Kconfig to control how mismatches are handled. By default, preserve
the current behavior of refusing to load the module. If MODULE_ALLOW_BTF_MISMATCH
is enabled, load the module but ignore its BTF information.

Suggested-by: Yonghong Song <yhs@fb.com>
Suggested-by: Michal Suchánek <msuchanek@suse.de>
Signed-off-by: Connor O'Brien <connoro@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/CAADnVQJ+OVPnBz8z3vNu8gKXX42jCUqfuvhWAyCQDu8N_yqqwQ@mail.gmail.com
Link: https://lore.kernel.org/bpf/20220223012814.1898677-1-connoro@google.com
2022-02-28 14:17:10 +01:00
Kees Cook
617f55e207 lib: overflow: Convert to Kunit
Convert overflow unit tests to KUnit, for better integration into the
kernel self test framework. Includes a rename of test_overflow.c to
overflow_kunit.c, and CONFIG_TEST_OVERFLOW to CONFIG_OVERFLOW_KUNIT_TEST.

$ ./tools/testing/kunit/kunit.py run overflow
...
[14:33:51] Starting KUnit Kernel (1/1)...
[14:33:51] ============================================================
[14:33:51] ================== overflow (11 subtests) ==================
[14:33:51] [PASSED] u8_overflow_test
[14:33:51] [PASSED] s8_overflow_test
[14:33:51] [PASSED] u16_overflow_test
[14:33:51] [PASSED] s16_overflow_test
[14:33:51] [PASSED] u32_overflow_test
[14:33:51] [PASSED] s32_overflow_test
[14:33:51] [PASSED] u64_overflow_test
[14:33:51] [PASSED] s64_overflow_test
[14:33:51] [PASSED] overflow_shift_test
[14:33:51] [PASSED] overflow_allocation_test
[14:33:51] [PASSED] overflow_size_helpers_test
[14:33:51] ==================== [PASSED] overflow =====================
[14:33:51] ============================================================
[14:33:51] Testing complete. Passed: 11, Failed: 0, Crashed: 0, Skipped: 0, Errors: 0
[14:33:51] Elapsed time: 12.525s total, 0.001s configuring, 12.402s building, 0.101s running

Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Co-developed-by: Vitor Massaru Iha <vitor@massaru.org>
Signed-off-by: Vitor Massaru Iha <vitor@massaru.org>
Link: https://lore.kernel.org/lkml/20200720224418.200495-1-vitor@massaru.org/
Co-developed-by: Daniel Latypov <dlatypov@google.com>
Signed-off-by: Daniel Latypov <dlatypov@google.com>
Link: https://lore.kernel.org/linux-kselftest/20210503211536.1384578-1-dlatypov@google.com/
Acked-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/lkml/CAKwvOdm62iA1dNiC6Q11UJ-MnTqtc4kXkm-ubPaFMK824_k0nw@mail.gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Gow <davidgow@google.com>
Link: https://lore.kernel.org/lkml/CABVgOS=TWVh649_Vjo3wnMu9gZnq66gkV-LtGgsksAWMqc+MSA@mail.gmail.com
2022-02-27 09:29:02 -08:00
Andrey Konovalov
70effdc375 kasan: test: prevent cache merging in kmem_cache_double_destroy
With HW_TAGS KASAN and kasan.stacktrace=off, the cache created in the
kmem_cache_double_destroy() test might get merged with an existing one.
Thus, the first kmem_cache_destroy() call won't actually destroy it but
will only decrease the refcount.  This causes the test to fail.

Provide an empty constructor for the created cache to prevent the cache
from getting merged.

Link: https://lkml.kernel.org/r/b597bd434c49591d8af00ee3993a42c609dc9a59.1644346040.git.andreyknvl@google.com
Fixes: f98f966cd750 ("kasan: test: add test case for double-kmem_cache_destroy()")
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Reviewed-by: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-02-26 09:51:17 -08:00
David Gow
5debe5bfa0 list: test: Add a test for list_entry_is_head()
The list_entry_is_head() macro was added[1] after the list KUnit tests,
so wasn't tested. Add a new KUnit test to complete the set.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e130816164e244b692921de49771eeb28205152d

Signed-off-by: David Gow <davidgow@google.com>
Acked-by: Daniel Latypov <dlatypov@google.com>
Acked-by: Brendan Higgins <brendanhiggins@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-02-25 08:39:01 -07:00
David Gow
37dc573c0a list: test: Add a test for list_is_head()
list_is_head() was added recently[1], and didn't have a KUnit test. The
implementation is trivial, so it's not a particularly exciting test, but
it'd be nice to get back to full coverage of the list functions.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/include/linux/list.h?id=0425473037db40d9e322631f2d4dc6ef51f97e88

Signed-off-by: David Gow <davidgow@google.com>
Acked-by: Daniel Latypov <dlatypov@google.com>
Acked-by: Brendan Higgins <brendanhiggins@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-02-25 08:38:55 -07:00
David Gow
d7fd696c12 list: test: Add test for list_del_init_careful()
The list_del_init_careful() function was added[1] after the list KUnit
test. Add a very basic test to cover it.

Note that this test only covers the single-threaded behaviour (which
matches list_del_init()), as is already the case with the test for
list_empty_careful().

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c6fe44d96fc1536af5b11cd859686453d1b7bfd1

Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-02-25 08:38:48 -07:00
Arnd Bergmann
967747bbc0 uaccess: remove CONFIG_SET_FS
There are no remaining callers of set_fs(), so CONFIG_SET_FS
can be removed globally, along with the thread_info field and
any references to it.

This turns access_ok() into a cheaper check against TASK_SIZE_MAX.

As CONFIG_SET_FS is now gone, drop all remaining references to
set_fs()/get_fs(), mm_segment_t, user_addr_max() and uaccess_kernel().

Acked-by: Sam Ravnborg <sam@ravnborg.org> # for sparc32 changes
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Tested-by: Sergey Matyukevich <sergey.matyukevich@synopsys.com> # for arc changes
Acked-by: Stafford Horne <shorne@gmail.com> # [openrisc, asm-generic]
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-25 09:36:06 +01:00
Arnd Bergmann
5a06fcb15b lib/test_lockup: fix kernel pointer check for separate address spaces
test_kernel_ptr() uses access_ok() to figure out if a given address
points to user space instead of kernel space. However on architectures
that set CONFIG_ALTERNATE_USER_ADDRESS_SPACE, a pointer can be valid
for both, and the check always fails because access_ok() returns true.

Make the check for user space pointers conditional on the type of
address space layout.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-25 09:36:06 +01:00
Arnd Bergmann
23fc539e81 uaccess: fix type mismatch warnings from access_ok()
On some architectures, access_ok() does not do any argument type
checking, so replacing the definition with a generic one causes
a few warnings for harmless issues that were never caught before.

Fix the ones that I found either through my own test builds or
that were reported by the 0-day bot.

Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-02-25 09:36:05 +01:00
Jakub Kicinski
aaa25a2fa7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
tools/testing/selftests/net/mptcp/mptcp_join.sh
  34aa6e3bccd8 ("selftests: mptcp: add ip mptcp wrappers")

  857898eb4b28 ("selftests: mptcp: add missing join check")
  6ef84b1517e0 ("selftests: mptcp: more robust signal race test")
https://lore.kernel.org/all/20220221131842.468893-1-broonie@kernel.org/

drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.h
drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ct.c
  fb7e76ea3f3b6 ("net/mlx5e: TC, Skip redundant ct clear actions")
  c63741b426e11 ("net/mlx5e: Fix MPLSoUDP encap to use MPLS action information")

  09bf97923224f ("net/mlx5e: TC, Move pedit_headers_action to parse_attr")
  84ba8062e383 ("net/mlx5e: Test CT and SAMPLE on flow attr")
  efe6f961cd2e ("net/mlx5e: CT, Don't set flow flag CT for ct clear flow")
  3b49a7edec1d ("net/mlx5e: TC, Reject rules with multiple CT actions")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 17:54:25 -08:00
Christophe Leroy
8484291132 vsprintf: Fix %pK with kptr_restrict == 0
Although kptr_restrict is set to 0 and the kernel is booted with
no_hash_pointers parameter, the content of /proc/vmallocinfo is
lacking the real addresses.

  / # cat /proc/vmallocinfo
  0x(ptrval)-0x(ptrval)    8192 load_module+0xc0c/0x2c0c pages=1 vmalloc
  0x(ptrval)-0x(ptrval)   12288 start_kernel+0x4e0/0x690 pages=2 vmalloc
  0x(ptrval)-0x(ptrval)   12288 start_kernel+0x4e0/0x690 pages=2 vmalloc
  0x(ptrval)-0x(ptrval)    8192 _mpic_map_mmio.constprop.0+0x20/0x44 phys=0x80041000 ioremap
  0x(ptrval)-0x(ptrval)   12288 _mpic_map_mmio.constprop.0+0x20/0x44 phys=0x80041000 ioremap
    ...

According to the documentation for /proc/sys/kernel/, %pK is
equivalent to %p when kptr_restrict is set to 0.

Fixes: 5ead723a20e0 ("lib/vsprintf: no_hash_pointers prints all addresses as unhashed")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/107476128e59bff11a309b5bf7579a1753a41aca.1645087605.git.christophe.leroy@csgroup.eu
2022-02-24 10:10:31 +01:00
Linus Torvalds
917bbdb107 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull ITER_PIPE fix from Al Viro:
 "Fix for old sloppiness in pipe_buffer reuse"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  lib/iov_iter: initialize "flags" in new pipe_buffer
2022-02-22 10:31:53 -08:00
Jason A. Donenfeld
14c174633f random: remove unused tracepoints
These explicit tracepoints aren't really used and show sign of aging.
It's work to keep these up to date, and before I attempted to keep them
up to date, they weren't up to date, which indicates that they're not
really used. These days there are better ways of introspecting anyway.

Cc: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
2022-02-21 21:14:00 +01:00
Max Kellermann
9d2231c5d7 lib/iov_iter: initialize "flags" in new pipe_buffer
The functions copy_page_to_iter_pipe() and push_pipe() can both
allocate a new pipe_buffer, but the "flags" member initializer is
missing.

Fixes: 241699cd72a8 ("new iov_iter flavour: pipe-backed")
To: Alexander Viro <viro@zeniv.linux.org.uk>
To: linux-fsdevel@vger.kernel.org
To: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2022-02-21 10:16:39 -05:00
Julian Braha
11c57c3ba9 ARM: 9178/1: fix unmet dependency on BITREVERSE for HAVE_ARCH_BITREVERSE
Resending this to properly add it to the patch tracker - thanks for letting
me know, Arnd :)

When ARM is enabled, and BITREVERSE is disabled,
Kbuild gives the following warning:

WARNING: unmet direct dependencies detected for HAVE_ARCH_BITREVERSE
  Depends on [n]: BITREVERSE [=n]
  Selected by [y]:
  - ARM [=y] && (CPU_32v7M [=n] || CPU_32v7 [=y]) && !CPU_32v6 [=n]

This is because ARM selects HAVE_ARCH_BITREVERSE
without selecting BITREVERSE, despite
HAVE_ARCH_BITREVERSE depending on BITREVERSE.

This unmet dependency bug was found by Kismet,
a static analysis tool for Kconfig. Please advise if this
is not the appropriate solution.

Signed-off-by: Julian Braha <julianbraha@gmail.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-02-21 14:58:34 +00:00
Kees Cook
230f6fa2c1 overflow: Provide constant expression struct_size
There have been cases where struct_size() (or flex_array_size()) needs
to be calculated for an initializer, which requires it be a constant
expression. This is possible when the "count" argument is a constant
expression, so provide this ability for the helpers.

Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Tested-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/lkml/20220210010407.GA701603@embeddedor
2022-02-16 14:30:37 -08:00
Kees Cook
e1be43d9b5 overflow: Implement size_t saturating arithmetic helpers
In order to perform more open-coded replacements of common allocation
size arithmetic, the kernel needs saturating (SIZE_MAX) helpers for
multiplication, addition, and subtraction. For example, it is common in
allocators, especially on realloc, to add to an existing size:

    p = krealloc(map->patch,
                 sizeof(struct reg_sequence) * (map->patch_regs + num_regs),
                 GFP_KERNEL);

There is no existing saturating replacement for this calculation, and
just leaving the addition open coded inside array_size() could
potentially overflow as well. For example, an overflow in an expression
for a size_t argument might wrap to zero:

    array_size(anything, something_at_size_max + 1) == 0

Introduce size_mul(), size_add(), and size_sub() helpers that
implicitly promote arguments to size_t and saturated calculations for
use in allocations. With these helpers it is also possible to redefine
array_size(), array3_size(), flex_array_size(), and struct_size() in
terms of the new helpers.

As with the check_*_overflow() helpers, the new helpers use __must_check,
though what is really desired is a way to make sure that assignment is
only to a size_t lvalue. Without this, it's still possible to introduce
overflow/underflow via type conversion (i.e. from size_t to int).
Enforcing this will currently need to be left to static analysis or
future use of -Wconversion.

Additionally update the overflow unit tests to force runtime evaluation
for the pathological cases.

Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Keith Busch <kbusch@kernel.org>
Cc: Len Baker <len.baker@gmx.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2022-02-16 14:29:48 -08:00
Kees Cook
28e77cc1c0 fortify: Detect struct member overflows in memset() at compile-time
As done for memcpy(), also update memset() to use the same tightened
compile-time bounds checking under CONFIG_FORTIFY_SOURCE.

Signed-off-by: Kees Cook <keescook@chromium.org>
2022-02-13 16:50:06 -08:00
Kees Cook
938a000e3f fortify: Detect struct member overflows in memmove() at compile-time
As done for memcpy(), also update memmove() to use the same tightened
compile-time checks under CONFIG_FORTIFY_SOURCE.

Signed-off-by: Kees Cook <keescook@chromium.org>
2022-02-13 16:50:06 -08:00
Kees Cook
f68f2ff915 fortify: Detect struct member overflows in memcpy() at compile-time
memcpy() is dead; long live memcpy()

tl;dr: In order to eliminate a large class of common buffer overflow
flaws that continue to persist in the kernel, have memcpy() (under
CONFIG_FORTIFY_SOURCE) perform bounds checking of the destination struct
member when they have a known size. This would have caught all of the
memcpy()-related buffer write overflow flaws identified in at least the
last three years.

Background and analysis:

While stack-based buffer overflow flaws are largely mitigated by stack
canaries (and similar) features, heap-based buffer overflow flaws continue
to regularly appear in the kernel. Many classes of heap buffer overflows
are mitigated by FORTIFY_SOURCE when using the strcpy() family of
functions, but a significant number remain exposed through the memcpy()
family of functions.

At its core, FORTIFY_SOURCE uses the compiler's __builtin_object_size()
internal[0] to determine the available size at a target address based on
the compile-time known structure layout details. It operates in two
modes: outer bounds (0) and inner bounds (1). In mode 0, the size of the
enclosing structure is used. In mode 1, the size of the specific field
is used. For example:

	struct object {
		u16 scalar1;	/* 2 bytes */
		char array[6];	/* 6 bytes */
		u64 scalar2;	/* 8 bytes */
		u32 scalar3;	/* 4 bytes */
		u32 scalar4;	/* 4 bytes */
	} instance;

__builtin_object_size(instance.array, 0) == 22, since the remaining size
of the enclosing structure starting from "array" is 22 bytes (6 + 8 +
4 + 4).

__builtin_object_size(instance.array, 1) == 6, since the remaining size
of the specific field "array" is 6 bytes.

The initial implementation of FORTIFY_SOURCE used mode 0 because there
were many cases of both strcpy() and memcpy() functions being used to
write (or read) across multiple fields in a structure. For example,
it would catch this, which is writing 2 bytes beyond the end of
"instance":

	memcpy(&instance.array, data, 25);

While this didn't protect against overwriting adjacent fields in a given
structure, it would at least stop overflows from reaching beyond the
end of the structure into neighboring memory, and provided a meaningful
mitigation of a subset of buffer overflow flaws. However, many desirable
targets remain within the enclosing structure (for example function
pointers).

As it happened, there were very few cases of strcpy() family functions
intentionally writing beyond the end of a string buffer. Once all known
cases were removed from the kernel, the strcpy() family was tightened[1]
to use mode 1, providing greater mitigation coverage.

What remains is switching memcpy() to mode 1 as well, but making the
switch is much more difficult because of how frustrating it can be to
find existing "normal" uses of memcpy() that expect to write (or read)
across multiple fields. The root cause of the problem is that the C
language lacks a common pattern to indicate the intent of an author's
use of memcpy(), and is further complicated by the available compile-time
and run-time mitigation behaviors.

The FORTIFY_SOURCE mitigation comes in two halves: the compile-time half,
when both the buffer size _and_ the length of the copy is known, and the
run-time half, when only the buffer size is known. If neither size is
known, there is no bounds checking possible. At compile-time when the
compiler sees that a length will always exceed a known buffer size,
a warning can be deterministically emitted. For the run-time half,
the length is tested against the known size of the buffer, and the
overflowing operation is detected. (The performance overhead for these
tests is virtually zero.)

It is relatively easy to find compile-time false-positives since a warning
is always generated. Fixing the false positives, however, can be very
time-consuming as there are hundreds of instances. While it's possible
some over-read conditions could lead to kernel memory exposures, the bulk
of the risk comes from the run-time flaws where the length of a write
may end up being attacker-controlled and lead to an overflow.

Many of the compile-time false-positives take a form similar to this:

	memcpy(&instance.scalar2, data, sizeof(instance.scalar2) +
					sizeof(instance.scalar3));

and the run-time ones are similar, but lack a constant expression for the
size of the copy:

	memcpy(instance.array, data, length);

The former is meant to cover multiple fields (though its style has been
frowned upon more recently), but has been technically legal. Both lack
any expressivity in the C language about the author's _intent_ in a way
that a compiler can check when the length isn't known at compile time.
A comment doesn't work well because what's needed is something a compiler
can directly reason about. Is a given memcpy() call expected to overflow
into neighbors? Is it not? By using the new struct_group() macro, this
intent can be much more easily encoded.

It is not as easy to find the run-time false-positives since the code path
to exercise a seemingly out-of-bounds condition that is actually expected
may not be trivially reachable. Tightening the restrictions to block an
operation for a false positive will either potentially create a greater
flaw (if a copy is truncated by the mitigation), or destabilize the kernel
(e.g. with a BUG()), making things completely useless for the end user.

As a result, tightening the memcpy() restriction (when there is a
reasonable level of uncertainty of the number of false positives), needs
to first WARN() with no truncation. (Though any sufficiently paranoid
end-user can always opt to set the panic_on_warn=1 sysctl.) Once enough
development time has passed, the mitigation can be further intensified.
(Note that this patch is only the compile-time checking step, which is
a prerequisite to doing run-time checking, which will come in future
patches.)

Given the potential frustrations of weeding out all the false positives
when tightening the run-time checks, it is reasonable to wonder if these
changes would actually add meaningful protection. Looking at just the
last three years, there are 23 identified flaws with a CVE that mention
"buffer overflow", and 11 are memcpy()-related buffer overflows.

(For the remaining 12: 7 are array index overflows that would be
mitigated by systems built with CONFIG_UBSAN_BOUNDS=y: CVE-2019-0145,
CVE-2019-14835, CVE-2019-14896, CVE-2019-14897, CVE-2019-14901,
CVE-2019-17666, CVE-2021-28952. 2 are miscalculated allocation
sizes which could be mitigated with memory tagging: CVE-2019-16746,
CVE-2019-2181. 1 is an iovec buffer bug maybe mitigated by memory tagging:
CVE-2020-10742. 1 is a type confusion bug mitigated by stack canaries:
CVE-2020-10942. 1 is a string handling logic bug with no mitigation I'm
aware of: CVE-2021-28972.)

At my last count on an x86_64 allmodconfig build, there are 35,294
calls to memcpy(). With callers instrumented to report all places
where the buffer size is known but the length remains unknown (i.e. a
run-time bounds check is added), we can count how many new run-time
bounds checks are added when the destination and source arguments of
memcpy() are changed to use "mode 1" bounds checking: 1,276. This means
for the future run-time checking, there is a worst-case upper bounds
of 3.6% false positives to fix. In addition, there were around 150 new
compile-time warnings to evaluate and fix (which have now been fixed).

With this instrumentation it's also possible to compare the places where
the known 11 memcpy() flaw overflows manifested against the resulting
list of potential new run-time bounds checks, as a measure of potential
efficacy of the tightened mitigation. Much to my surprise, horror, and
delight, all 11 flaws would have been detected by the newly added run-time
bounds checks, making this a distinctly clear mitigation improvement: 100%
coverage for known memcpy() flaws, with a possible 2 orders of magnitude
gain in coverage over existing but undiscovered run-time dynamic length
flaws (i.e. 1265 newly covered sites in addition to the 11 known), against
only <4% of all memcpy() callers maybe gaining a false positive run-time
check, with only about 150 new compile-time instances needing evaluation.

Specifically these would have been mitigated:
CVE-2020-24490 https://git.kernel.org/linus/a2ec905d1e160a33b2e210e45ad30445ef26ce0e
CVE-2020-12654 https://git.kernel.org/linus/3a9b153c5591548612c3955c9600a98150c81875
CVE-2020-12653 https://git.kernel.org/linus/b70261a288ea4d2f4ac7cd04be08a9f0f2de4f4d
CVE-2019-14895 https://git.kernel.org/linus/3d94a4a8373bf5f45cf5f939e88b8354dbf2311b
CVE-2019-14816 https://git.kernel.org/linus/7caac62ed598a196d6ddf8d9c121e12e082cac3a
CVE-2019-14815 https://git.kernel.org/linus/7caac62ed598a196d6ddf8d9c121e12e082cac3a
CVE-2019-14814 https://git.kernel.org/linus/7caac62ed598a196d6ddf8d9c121e12e082cac3a
CVE-2019-10126 https://git.kernel.org/linus/69ae4f6aac1578575126319d3f55550e7e440449
CVE-2019-9500  https://git.kernel.org/linus/1b5e2423164b3670e8bc9174e4762d297990deff
no-CVE-yet     https://git.kernel.org/linus/130f634da1af649205f4a3dd86cbe5c126b57914
no-CVE-yet     https://git.kernel.org/linus/d10a87a3535cce2b890897914f5d0d83df669c63

To accelerate the review of potential run-time false positives, it's
also worth noting that it is possible to partially automate checking
by examining the memcpy() buffer argument to check for the destination
struct member having a neighboring array member. It is reasonable to
expect that the vast majority of run-time false positives would look like
the already evaluated and fixed compile-time false positives, where the
most common pattern is neighboring arrays. (And, FWIW, many of the
compile-time fixes were actual bugs, so it is reasonable to assume we'll
have similar cases of actual bugs getting fixed for run-time checks.)

Implementation:

Tighten the memcpy() destination buffer size checking to use the actual
("mode 1") target buffer size as the bounds check instead of their
enclosing structure's ("mode 0") size. Use a common inline for memcpy()
(and memmove() in a following patch), since all the tests are the
same. All new cross-field memcpy() uses must use the struct_group() macro
or similar to target a specific range of fields, so that FORTIFY_SOURCE
can reason about the size and safety of the copy.

For now, cross-member "mode 1" _read_ detection at compile-time will be
limited to W=1 builds, since it is, unfortunately, very common. As the
priority is solving write overflows, read overflows will be part of a
future phase (and can be fixed in parallel, for anyone wanting to look
at W=1 build output).

For run-time, the "mode 0" size checking and mitigation is left unchanged,
with "mode 1" to be added in stages. In this patch, no new run-time
checks are added. Future patches will first bounds-check writes,
and only perform a WARN() for now. This way any missed run-time false
positives can be flushed out over the coming several development cycles,
but system builders who have tested their workloads to be WARN()-free
can enable the panic_on_warn=1 sysctl to immediately gain a mitigation
against this class of buffer overflows. Once that is under way, run-time
bounds-checking of reads can be similarly enabled.

Related classes of flaws that will remain unmitigated:

- memcpy() with flexible array structures, as the compiler does not
  currently have visibility into the size of the trailing flexible
  array. These can be fixed in the future by refactoring such cases
  to use a new set of flexible array structure helpers to perform the
  common serialization/deserialization code patterns doing allocation
  and/or copying.

- memcpy() with raw pointers (e.g. void *, char *, etc), or otherwise
  having their buffer size unknown at compile time, have no good
  mitigation beyond memory tagging (and even that would only protect
  against inter-object overflow, not intra-object neighboring field
  overflows), or refactoring. Some kind of "fat pointer" solution is
  likely needed to gain proper size-of-buffer awareness. (e.g. see
  struct membuf)

- type confusion where a higher level type's allocation size does
  not match the resulting cast type eventually passed to a deeper
  memcpy() call where the compiler cannot see the true type. In
  theory, greater static analysis could catch these, and the use
  of -Warray-bounds will help find some of these.

[0] https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html
[1] https://git.kernel.org/linus/6a39e62abbafd1d58d1722f40c7d26ef379c6a2f

Signed-off-by: Kees Cook <keescook@chromium.org>
2022-02-13 16:50:06 -08:00
Jakub Kicinski
5b91c5cc0e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-10 17:29:56 -08:00
Andy Shevchenko
f74a08fc61 vsprintf: Move space out of string literals in fourcc_string()
The literals "big-endian" and "little-endian" may be potentially
occurred in other places. Dropping space allows linker to
merge them by using only a single copy.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220127181233.72910-2-andriy.shevchenko@linux.intel.com
2022-02-10 13:16:50 +01:00
Andy Shevchenko
d75b26f880 vsprintf: Fix potential unaligned access
The %p4cc specifier in some cases might get an unaligned pointer.
Due to this we need to make copy to local variable once to avoid
potential crashes on some architectures due to improper access.

Fixes: af612e43de6d ("lib/vsprintf: Add support for printing V4L2 and DRM fourccs")
Cc: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20220127181233.72910-1-andriy.shevchenko@linux.intel.com
2022-02-10 13:16:50 +01:00
Jakub Kicinski
1127170d45 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2022-02-09

We've added 126 non-merge commits during the last 16 day(s) which contain
a total of 201 files changed, 4049 insertions(+), 2215 deletions(-).

The main changes are:

1) Add custom BPF allocator for JITs that pack multiple programs into a huge
   page to reduce iTLB pressure, from Song Liu.

2) Add __user tagging support in vmlinux BTF and utilize it from BPF
   verifier when generating loads, from Yonghong Song.

3) Add per-socket fast path check guarding from cgroup/BPF overhead when
   used by only some sockets, from Pavel Begunkov.

4) Continued libbpf deprecation work of APIs/features and removal of their
   usage from samples, selftests, libbpf & bpftool, from Andrii Nakryiko
   and various others.

5) Improve BPF instruction set documentation by adding byte swap
   instructions and cleaning up load/store section, from Christoph Hellwig.

6) Switch BPF preload infra to light skeleton and remove libbpf dependency
   from it, from Alexei Starovoitov.

7) Fix architecture-agnostic macros in libbpf for accessing syscall
   arguments from BPF progs for non-x86 architectures,
   from Ilya Leoshkevich.

8) Rework port members in struct bpf_sk_lookup and struct bpf_sock to be
   of 16-bit field with anonymous zero padding, from Jakub Sitnicki.

9) Add new bpf_copy_from_user_task() helper to read memory from a different
   task than current. Add ability to create sleepable BPF iterator progs,
   from Kenny Yu.

10) Implement XSK batching for ice's zero-copy driver used by AF_XDP and
    utilize TX batching API from XSK buffer pool, from Maciej Fijalkowski.

11) Generate temporary netns names for BPF selftests to avoid naming
    collisions, from Hangbin Liu.

12) Implement bpf_core_types_are_compat() with limited recursion for
    in-kernel usage, from Matteo Croce.

13) Simplify pahole version detection and finally enable CONFIG_DEBUG_INFO_DWARF5
    to be selected with CONFIG_DEBUG_INFO_BTF, from Nathan Chancellor.

14) Misc minor fixes to libbpf and selftests from various folks.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (126 commits)
  selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup
  bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide
  libbpf: Fix compilation warning due to mismatched printf format
  selftests/bpf: Test BPF_KPROBE_SYSCALL macro
  libbpf: Add BPF_KPROBE_SYSCALL macro
  libbpf: Fix accessing the first syscall argument on s390
  libbpf: Fix accessing the first syscall argument on arm64
  libbpf: Allow overriding PT_REGS_PARM1{_CORE}_SYSCALL
  selftests/bpf: Skip test_bpf_syscall_macro's syscall_arg1 on arm64 and s390
  libbpf: Fix accessing syscall arguments on riscv
  libbpf: Fix riscv register names
  libbpf: Fix accessing syscall arguments on powerpc
  selftests/bpf: Use PT_REGS_SYSCALL_REGS in bpf_syscall_macro
  libbpf: Add PT_REGS_SYSCALL_REGS macro
  selftests/bpf: Fix an endianness issue in bpf_syscall_macro test
  bpf: Fix bpf_prog_pack build HPAGE_PMD_SIZE
  bpf: Fix leftover header->pages in sparc and powerpc code.
  libbpf: Fix signedness bug in btf_dump_array_data()
  selftests/bpf: Do not export subtest as standalone test
  bpf, x86_64: Fail gracefully on bpf_jit_binary_pack_finalize failures
  ...
====================

Link: https://lore.kernel.org/r/20220209210050.8425-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-09 18:40:56 -08:00
Kees Cook
8e7c8ca6b9 test_overflow: Regularize test reporting output
Report test run summaries more regularly, so it's easier to understand
the output:
- Remove noisy "ok" reports for shift and allocator tests.
- Reorganize per-type output to the end of each type's tests.
- Replace redundant vmalloc tests with __vmalloc so that __GFP_NO_WARN
  can be used to keep the expected failure warnings out of dmesg,
  similar to commit 8e060c21ae2c ("lib/test_overflow.c: avoid tainting
  the kernel and fix wrap size")

Resulting output:

  test_overflow: 18 u8 arithmetic tests finished
  test_overflow: 19 s8 arithmetic tests finished
  test_overflow: 17 u16 arithmetic tests finished
  test_overflow: 17 s16 arithmetic tests finished
  test_overflow: 17 u32 arithmetic tests finished
  test_overflow: 17 s32 arithmetic tests finished
  test_overflow: 17 u64 arithmetic tests finished
  test_overflow: 21 s64 arithmetic tests finished
  test_overflow: 113 shift tests finished
  test_overflow: 17 overflow size helper tests finished
  test_overflow: 11 allocation overflow tests finished
  test_overflow: all tests passed

Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Link: https://lore.kernel.org/all/eb6d02ae-e2ed-e7bd-c700-8a6d004d84ce@rasmusvillemoes.dk/
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/all/CAKwvOdnYYa+72VhtJ4ug=SJVFn7w+n7Th+hKYE87BRDt4hvqOg@mail.gmail.com/
Signed-off-by: Kees Cook <keescook@chromium.org>
2022-02-09 14:33:41 -08:00
Dan Williams
3c5b903955 cxl: Prove CXL locking
When CONFIG_PROVE_LOCKING is enabled the 'struct device' definition gets
an additional mutex that is not clobbered by
lockdep_set_novalidate_class() like the typical device_lock(). This
allows for local annotation of subsystem locks with mutex_lock_nested()
per the subsystem's object/lock hierarchy. For CXL, this primarily needs
the ability to lock ports by depth and child objects of ports by their
parent parent-port lock.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ben Widawsky <ben.widawsky@intel.com>
Link: https://lore.kernel.org/r/164365853422.99383.1052399160445197427.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-02-08 22:57:29 -08:00
John Garry
3f607293b7 sbitmap: Delete old sbitmap_queue_get_shallow()
Since __sbitmap_queue_get_shallow() was introduced in commit c05e66733788
("sbitmap: add sbitmap_get_shallow() operation"), it has not been used.

Delete __sbitmap_queue_get_shallow() and rename public
__sbitmap_queue_get_shallow() -> sbitmap_queue_get_shallow() as it is odd
to have public __foo but no foo at all.

Signed-off-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/1644322024-105340-1-git-send-email-john.garry@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-08 06:55:49 -07:00
Ming Lei
3301bc5335 lib/sbitmap: kill 'depth' from sbitmap_word
Only the last sbitmap_word can have different depth, and all the others
must have same depth of 1U << sb->shift, so not necessary to store it in
sbitmap_word, and it can be retrieved easily and efficiently by adding
one internal helper of __map_depth(sb, index).

Remove 'depth' field from sbitmap_word, then the annotation of
____cacheline_aligned_in_smp for 'word' isn't needed any more.

Not see performance effect when running high parallel IOPS test on
null_blk.

This way saves us one cacheline(usually 64 words) per each sbitmap_word.

Cc: Martin Wilck <martin.wilck@suse.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Martin Wilck <mwilck@suse.com>
Reviewed-by: John Garry <john.garry@huawei.com>
Link: https://lore.kernel.org/r/20220110072945.347535-1-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-02-08 06:54:50 -07:00
Eric Dumazet
c2d1e3df4a ref_tracker: remove filter_irq_stacks() call
After commit e94006608949 ("lib/stackdepot: always do filter_irq_stacks()
in stack_depot_save()") it became unnecessary to filter the stack
before calling stack_depot_save().

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-06 11:05:28 +00:00
Eric Dumazet
8fd5522f44 ref_tracker: add a count of untracked references
We are still chasing a netdev refcount imbalance, and we suspect
we have one rogue dev_put() that is consuming a reference taken
from a dev_hold_track()

To detect this case, allow ref_tracker_alloc() and ref_tracker_free()
to be called with a NULL @trackerp parameter, and use a dedicated
refcount_t just for them.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-05 15:22:44 +00:00