7457 Commits

Author SHA1 Message Date
Wedson Almeida Filho
ee6d3dd4ed driver core: make kobj_type constant.
This way instances of kobj_type (which contain function pointers) can be
stored in .rodata, which means that they cannot be easily or accidentally
modified at runtime.
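
As a hedged illustration (hypothetical driver object, not part of this
patch), a definition can now read:

	#include <linux/kobject.h>
	#include <linux/slab.h>

	struct example_obj {
		struct kobject kobj;
	};

	static void example_release(struct kobject *kobj)
	{
		kfree(container_of(kobj, struct example_obj, kobj));
	}

	/* const: the instance can live in .rodata */
	static const struct kobj_type example_ktype = {
		.release = example_release,
	};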

Signed-off-by: Wedson Almeida Filho <wedsonaf@google.com>
Link: https://lore.kernel.org/r/20211224231345.777370-1-wedsonaf@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-27 10:40:00 +01:00
Christophe JAILLET
7c63f26cb5 lib: objagg: Use the bitmap API when applicable
Use 'bitmap_zalloc()' to simplify code, improve the semantics and reduce
some open-coded arithmetic in allocator arguments.

Also change the corresponding 'kfree()' into 'bitmap_free()' to keep
consistency.
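
A hedged before/after sketch (hypothetical call site, not a hunk from
objagg itself):

	#include <linux/bitmap.h>
	#include <linux/slab.h>

	static int example(unsigned int nbits)
	{
		unsigned long *mask;

		/* before: open-coded size arithmetic */
		mask = kzalloc(BITS_TO_LONGS(nbits) * sizeof(unsigned long),
			       GFP_KERNEL);
		kfree(mask);

		/* after: the bitmap API hides the arithmetic */
		mask = bitmap_zalloc(nbits, GFP_KERNEL);
		if (!mask)
			return -ENOMEM;
		bitmap_free(mask);
		return 0;
	}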

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/f9541b085ec68e573004e1be200c11c9c901181a.1640295165.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-24 14:54:29 -08:00
Al Viro
5f174ec3c1 logic_io instance of iounmap() needs volatile on argument
... same as the rest of the implementations
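
For reference, a sketch of the adjusted prototype (matching the other
iounmap() implementations):

	void iounmap(volatile void __iomem *addr);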

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
2021-12-21 21:31:08 +01:00
Johannes Berg
4e8a5edac5 lib/logic_iomem: Fix operation on 32-bit
On 32-bit, the first entry might be at 0/NULL, but that's
strange and leads to issues, e.g. where we check "if (ret)".
Use an IOREMAP_BIAS/IOREMAP_MASK of 0x80000000UL to avoid
this. This then requires reducing the number of areas (via
MAX_AREAS), but we still have 128 areas, which is enough.
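
A hedged sketch of the idea (the constants named above; the exact
values live in lib/logic_iomem.c):

	#define IOREMAP_BIAS	0x80000000UL
	#define IOREMAP_MASK	0x80000000UL

	/* every returned token has the top bit set, so a valid mapping
	 * can never compare equal to NULL */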

Fixes: ca2e334232b6 ("lib: add iomem emulation (logic_iomem)")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2021-12-21 21:28:20 +01:00
Johannes Berg
4e84139e14 lib/logic_iomem: Fix 32-bit build
On a 32-bit build, the (unsigned long long) casts throw warnings
(or errors) because a pointer is cast to an integer of a different
size. Cast to
uintptr_t first (with the __force for sparse) and then further
to get the consistent print on 32 and 64-bit.
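
A hedged sketch of the resulting print pattern (illustrative, not the
exact hunk):

	printk(KERN_DEBUG "addr: 0x%llx\n",
	       (unsigned long long)(__force uintptr_t)addr);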

Fixes: ca2e334232b6 ("lib: add iomem emulation (logic_iomem)")
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2021-12-21 21:27:21 +01:00
David Gow
44b7da5fcd kunit: Report test parameter results as (K)TAP subtests
Currently, the results for individual parameters in a parameterised test
are simply output as (K)TAP diagnostic lines.

As kunit_tool now supports nested subtests, report each parameter as its
own subtest.

For example, here's what the output now looks like:
	# Subtest: inode_test_xtimestamp_decoding
	ok 1 - 1901-12-13 Lower bound of 32bit < 0 timestamp, no extra bits
	ok 2 - 1969-12-31 Upper bound of 32bit < 0 timestamp, no extra bits
	ok 3 - 1970-01-01 Lower bound of 32bit >=0 timestamp, no extra bits
	ok 4 - 2038-01-19 Upper bound of 32bit >=0 timestamp, no extra bits
	ok 5 - 2038-01-19 Lower bound of 32bit <0 timestamp, lo extra sec bit on
	ok 6 - 2106-02-07 Upper bound of 32bit <0 timestamp, lo extra sec bit on
	ok 7 - 2106-02-07 Lower bound of 32bit >=0 timestamp, lo extra sec bit on
	ok 8 - 2174-02-25 Upper bound of 32bit >=0 timestamp, lo extra sec bit on
	ok 9 - 2174-02-25 Lower bound of 32bit <0 timestamp, hi extra sec bit on
	ok 10 - 2242-03-16 Upper bound of 32bit <0 timestamp, hi extra sec bit on
	ok 11 - 2242-03-16 Lower bound of 32bit >=0 timestamp, hi extra sec bit on
	ok 12 - 2310-04-04 Upper bound of 32bit >=0 timestamp, hi extra sec bit on
	ok 13 - 2310-04-04 Upper bound of 32bit>=0 timestamp, hi extra sec bit 1. 1 ns
	ok 14 - 2378-04-22 Lower bound of 32bit>= timestamp. Extra sec bits 1. Max ns
	ok 15 - 2378-04-22 Lower bound of 32bit >=0 timestamp. All extra sec bits on
	ok 16 - 2446-05-10 Upper bound of 32bit >=0 timestamp. All extra sec bits on
	# inode_test_xtimestamp_decoding: pass:16 fail:0 skip:0 total:16
	ok 1 - inode_test_xtimestamp_decoding

Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-12-13 13:36:29 -07:00
David Gow
37dbb4c7c7 kunit: Don't crash if no parameters are generated
It's possible that a parameterised test could end up with zero
parameters. At the moment, the test function will nevertheless be called
with NULL as the parameter. Instead, don't try to run the test code, and
just mark the test as SKIPped.
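
A hedged sketch of the guard (field names as in struct kunit;
illustrative, not the exact hunk):

	/* the generator produced nothing: skip instead of calling the
	 * test function with a NULL parameter */
	if (!test->param_value) {
		kunit_mark_skipped(test, "no parameters generated");
		return;
	}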

Reported-by: Daniel Latypov <dlatypov@google.com>
Signed-off-by: David Gow <davidgow@google.com>
Reviewed-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2021-12-13 13:36:21 -07:00
Eric W. Biederman
cead185526 exit: Rename complete_and_exit to kthread_complete_and_exit
Update complete_and_exit to call kthread_exit instead of do_exit.

Change the name to reflect this change in functionality.  All of the
users of complete_and_exit are causing the current kthread to exit so
this change makes it clear what is happening.

Move the implementation of kthread_complete_and_exit from
kernel/exit.c to kernel/kthread.c.  As this function is kthread
specific it makes most sense to live with the kthread functions.

There is no functional change.
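
For reference, a sketch of the moved helper (modulo exact annotations):

	/* kernel/kthread.c */
	void __noreturn kthread_complete_and_exit(struct completion *comp,
						  long code)
	{
		if (comp)
			complete(comp);

		kthread_exit(code);
	}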

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
2021-12-13 12:04:45 -06:00
Mark Rutland
5fb6e8cf53 locking/atomic: atomic64: Remove unusable atomic ops
The generic atomic64 implementation provides:

* atomic64_and_return()
* atomic64_or_return()
* atomic64_xor_return()

... but none of these exist in the standard atomic64 API as described by
scripts/atomic/atomics.tbl, and none of these have prototypes exposed by
<asm-generic/atomic64.h>.

The lkp kernel test robot noted this results in warnings when building with
W=1:

  lib/atomic64.c:82:5: warning: no previous prototype for 'generic_atomic64_and_return' [-Wmissing-prototypes]

  lib/atomic64.c:82:5: warning: no previous prototype for 'generic_atomic64_or_return' [-Wmissing-prototypes]

  lib/atomic64.c:82:5: warning: no previous prototype for 'generic_atomic64_xor_return' [-Wmissing-prototypes]

This appears to have been a thinko in commit:

  28aa2bda2211f432 ("locking/atomic: Implement atomic{,64,_long}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}()")

... where we grouped add/sub separately from and/or/xor, so that we could avoid
implementing _return forms for the latter group, but forgot to remove
ATOMIC64_OP_RETURN() for that group.

This doesn't cause any functional problem, but it's pointless to build code
which cannot be used. Remove the unusable code. This does not affect add/sub,
for which _return forms will still be built.
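
A hedged sketch of the resulting grouping in lib/atomic64.c (macro
bodies elided):

	ATOMIC64_OPS(add, +=)
	ATOMIC64_OPS(sub, -=)

	#undef ATOMIC64_OPS
	/* and/or/xor: plain and _fetch_ ops only, no _return forms */
	#define ATOMIC64_OPS(op, c_op)		\
		ATOMIC64_OP(op, c_op)		\
		ATOMIC64_FETCH_OP(op, c_op)

	ATOMIC64_OPS(and, &=)
	ATOMIC64_OPS(or, |=)
	ATOMIC64_OPS(xor, ^=)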

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Boqun Feng <boqun.feng@gmail.com>
Link: https://lore.kernel.org/r/20211126115923.41489-1-mark.rutland@arm.com
2021-12-13 10:56:09 +01:00
Ingo Molnar
6773cc31a9 Linux 5.16-rc5
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmG2fU0eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGC7EH/3R7Rt+OD8Wn8Ss3
 w8V+dBxVwa2u2oMTyUHPxaeOXZ7bi38XlUdLFPOK/76bGwO0a5TmYZqsWdRbGyT0
 HfcYjHsQ0lbJXk/nh2oM47oJxJXVpThIHXJEk0FZ0Y5t+DYjIYlNHzqZymUyhLem
 St74zgWcyT+MXuqY34vB827FJDUnOxhhhi85tObeunaSPAomy9aiYidSC1ARREnz
 iz2VUntP/QnRnKVvL2nUZNzcz1xL5vfCRSKsRGRSv3qW1Y/1M71ylt6JVmSftWq+
 VmMdFxFhdrb1OK/1ct/930Un/UP2NG9EJsWxote2XYlnVSZHzDqH7lUhbqgdCcLz
 1m2tVNY=
 =7wRd
 -----END PGP SIGNATURE-----

Merge tag 'v5.16-rc5' into locking/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2021-12-13 10:48:46 +01:00
Jakub Kicinski
be3158290d Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Andrii Nakryiko says:

====================
bpf-next 2021-12-10 v2

We've added 115 non-merge commits during the last 26 day(s) which contain
a total of 182 files changed, 5747 insertions(+), 2564 deletions(-).

The main changes are:

1) Various samples fixes, from Alexander Lobakin.

2) BPF CO-RE support in kernel and light skeleton, from Alexei Starovoitov.

3) A batch of new unified APIs for libbpf, logging improvements, version
   querying, etc. Also a batch of old deprecations for old APIs and various
   bug fixes, in preparation for libbpf 1.0, from Andrii Nakryiko.

4) BPF documentation reorganization and improvements, from Christoph Hellwig
   and Dave Tucker.

5) Support for declarative initialization of BPF_MAP_TYPE_PROG_ARRAY in
   libbpf, from Hengqi Chen.

6) Verifier log fixes, from Hou Tao.

7) Runtime-bounded loops support with bpf_loop() helper, from Joanne Koong.

8) Extend branch record capturing to all platforms that support it,
   from Kajol Jain.

9) Light skeleton codegen improvements, from Kumar Kartikeya Dwivedi.

10) bpftool doc-generating script improvements, from Quentin Monnet.

11) Two libbpf v0.6 bug fixes, from Shuyi Cheng and Vincent Minet.

12) Deprecation warning fix for perf/bpf_counter, from Song Liu.

13) MAX_TAIL_CALL_CNT unification and MIPS build fix for libbpf,
    from Tiezhu Yang.

14) BTF_KIND_TYPE_TAG follow-up fixes, from Yonghong Song.

15) Selftests fixes and improvements, from Ilya Leoshkevich, Jean-Philippe
    Brucker, Jiri Olsa, Maxim Mikityanskiy, Tirthendu Sarkar, Yucong Sun,
    and others.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (115 commits)
  libbpf: Add "bool skipped" to struct bpf_map
  libbpf: Fix typo in btf__dedup@LIBBPF_0.0.2 definition
  bpftool: Switch bpf_object__load_xattr() to bpf_object__load()
  selftests/bpf: Remove the only use of deprecated bpf_object__load_xattr()
  selftests/bpf: Add test for libbpf's custom log_buf behavior
  selftests/bpf: Replace all uses of bpf_load_btf() with bpf_btf_load()
  libbpf: Deprecate bpf_object__load_xattr()
  libbpf: Add per-program log buffer setter and getter
  libbpf: Preserve kernel error code and remove kprobe prog type guessing
  libbpf: Improve logging around BPF program loading
  libbpf: Allow passing user log setting through bpf_object_open_opts
  libbpf: Allow passing preallocated log_buf when loading BTF into kernel
  libbpf: Add OPTS-based bpf_btf_load() API
  libbpf: Fix bpf_prog_load() log_buf logic for log_level 0
  samples/bpf: Remove unneeded variable
  bpf: Remove redundant assignment to pointer t
  selftests/bpf: Fix a compilation warning
  perf/bpf_counter: Use bpf_map_create instead of bpf_create_map
  samples: bpf: Fix 'unknown warning group' build warning on Clang
  samples: bpf: Fix xdp_sample_user.o linking with Clang
  ...
====================

Link: https://lore.kernel.org/r/20211210234746.2100561-1-andrii@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-10 15:56:13 -08:00
Marco Elver
bd3d5bd1a0 kcsan: Support WEAK_MEMORY with Clang where no objtool support exists
Clang and GCC behave a little differently when it comes to the
__no_sanitize_thread attribute, which has valid reasons, and depending
on context either one could be right.

Traditionally, user space ThreadSanitizer [1] still expects instrumented
builtin atomics (to avoid false positives) and __tsan_func_{entry,exit}
(to generate meaningful stack traces), even if the function has the
attribute no_sanitize("thread").

[1] https://clang.llvm.org/docs/ThreadSanitizer.html#attribute-no-sanitize-thread

GCC doesn't follow the same policy (for better or worse), and removes
all kinds of instrumentation if no_sanitize is added. Arguably, since
this may be a problem for user space ThreadSanitizer, we expect this may
change in future.

Since KCSAN != ThreadSanitizer, the likelihood of false positives, even
without barrier instrumentation everywhere, is much lower by design.

At least for Clang, however, to fully remove all sanitizer
instrumentation, we must add the disable_sanitizer_instrumentation
attribute, which is available since Clang 14.0.
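
A hedged sketch of how such an attribute can be wired up
(compiler-header style; the actual kernel definition may differ):

	#if __has_attribute(disable_sanitizer_instrumentation)
	# define __disable_sanitizer_instrumentation \
		 __attribute__((disable_sanitizer_instrumentation))
	#else
	# define __disable_sanitizer_instrumentation
	#endif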

Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-12-09 16:42:28 -08:00
Marco Elver
69562e4983 kcsan: Add core support for a subset of weak memory modeling
Add support for modeling a subset of weak memory, which will enable
detection of a subset of data races due to missing memory barriers.

KCSAN's approach to detecting missing memory barriers is based on
modeling access reordering, and enabled if `CONFIG_KCSAN_WEAK_MEMORY=y`,
which depends on `CONFIG_KCSAN_STRICT=y`. The feature can be enabled or
disabled at boot and runtime via the `kcsan.weak_memory` boot parameter.
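
For example (build-time selection of the symbols named above):

	CONFIG_KCSAN=y
	CONFIG_KCSAN_STRICT=y
	CONFIG_KCSAN_WEAK_MEMORY=y

and "kcsan.weak_memory=0" on the kernel command line turns the modeling
off again.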

Each memory access for which a watchpoint is set up is also selected
for simulated reordering within the scope of its function (at most 1
in-flight access).

We are limited to modeling the effects of "buffering" (delaying the
access), since the runtime cannot "prefetch" accesses (therefore no
acquire modeling). Once an access has been selected for reordering, it
is checked along every other access until the end of the function scope.
If an appropriate memory barrier is encountered, the access will no
longer be considered for reordering.

When the result of a memory operation should be ordered by a barrier,
KCSAN can then detect data races where the conflict only occurs as a
result of a missing barrier due to reordering accesses.

Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Marco Elver <elver@google.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-12-09 16:42:26 -08:00
Jakub Kicinski
3150a73366 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09 13:23:02 -08:00
Jakub Kicinski
6efcdadc15 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
bpf 2021-12-08

We've added 12 non-merge commits during the last 22 day(s) which contain
a total of 29 files changed, 659 insertions(+), 80 deletions(-).

The main changes are:

1) Fix an off-by-two error in packet range markings and also add a batch of
   new tests for coverage of these corner cases, from Maxim Mikityanskiy.

2) Fix a compilation issue on MIPS JIT for R10000 CPUs, from Johan Almbladh.

3) Fix two functional regressions and a build warning related to BTF kfunc
   for modules, from Kumar Kartikeya Dwivedi.

4) Fix outdated code and docs regarding BPF's migrate_disable() use on non-
   PREEMPT_RT kernels, from Sebastian Andrzej Siewior.

5) Add missing includes in order to be able to detangle cgroup vs bpf header
   dependencies, from Jakub Kicinski.

6) Fix regression in BPF sockmap tests caused by missing detachment of progs
   from sockets when they are removed from the map, from John Fastabend.

7) Fix a missing "no previous prototype" warning in x86 JIT caused by BPF
   dispatcher, from Björn Töpel.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Add selftests to cover packet access corner cases
  bpf: Fix the off-by-two error in range markings
  treewide: Add missing includes masked by cgroup -> bpf dependency
  tools/resolve_btfids: Skip unresolved symbol warning for empty BTF sets
  bpf: Fix bpf_check_mod_kfunc_call for built-in modules
  bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL
  mips, bpf: Fix reference to non-existing Kconfig symbol
  bpf: Make sure bpf_disable_instrumentation() is safe vs preemption.
  Documentation/locking/locktypes: Update migrate_disable() bits.
  bpf, sockmap: Re-evaluate proto ops when psock is removed from sockmap
  bpf, sockmap: Attach map progs to psock early for feature probes
  bpf, x86: Fix "no previous prototype" warning
====================

Link: https://lore.kernel.org/r/20211208155125.11826-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08 16:06:44 -08:00
Eric Dumazet
4d92b95ff2 net: add net device refcount tracker infrastructure
Net devices are refcounted. Over the years we had numerous bugs
caused by imbalanced dev_hold() and dev_put() calls.

The general idea is to be able to precisely pair each decrement with
a corresponding prior increment. Both share a cookie, basically
a pointer to private data storing stack traces.

This patch adds dev_hold_track() and dev_put_track().

To use these helpers, each data structure owning a refcount
should also use a "netdevice_tracker" to pair the hold and put.

netdevice_tracker dev_tracker;
...
dev_hold_track(dev, &dev_tracker, GFP_ATOMIC);
...
dev_put_track(dev, &dev_tracker);

Whenever a leak happens, we will get precise stack traces
of the point where dev_hold_track() happened, at device dismantle time.

We will also get a stack trace if too many dev_put_track() for the same
netdevice_tracker are attempted.

This is guarded by CONFIG_NET_DEV_REFCNT_TRACKER option.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:05:07 -08:00
Eric Dumazet
914a7b5000 lib: add tests for reference tracker
This module uses the reference tracker, deliberately forcing two issues:

1) Double free of a tracker

2) Leak of two trackers, one being allocated from softirq context.

"modprobe test_ref_tracker" would emit the following traces.
(Use scripts/decode_stacktrace.sh if necessary)

[  171.648681] reference already released.
[  171.653213] allocated in:
[  171.656523]  alloctest_ref_tracker_alloc2+0x1c/0x20 [test_ref_tracker]
[  171.656526]  init_module+0x86/0x1000 [test_ref_tracker]
[  171.656528]  do_one_initcall+0x9c/0x220
[  171.656532]  do_init_module+0x60/0x240
[  171.656536]  load_module+0x32b5/0x3610
[  171.656538]  __do_sys_init_module+0x148/0x1a0
[  171.656540]  __x64_sys_init_module+0x1d/0x20
[  171.656542]  do_syscall_64+0x4a/0xb0
[  171.656546]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.656549] freed in:
[  171.659520]  alloctest_ref_tracker_free+0x13/0x20 [test_ref_tracker]
[  171.659522]  init_module+0xec/0x1000 [test_ref_tracker]
[  171.659523]  do_one_initcall+0x9c/0x220
[  171.659525]  do_init_module+0x60/0x240
[  171.659527]  load_module+0x32b5/0x3610
[  171.659529]  __do_sys_init_module+0x148/0x1a0
[  171.659532]  __x64_sys_init_module+0x1d/0x20
[  171.659534]  do_syscall_64+0x4a/0xb0
[  171.659536]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.659575] ------------[ cut here ]------------
[  171.659576] WARNING: CPU: 5 PID: 13016 at lib/ref_tracker.c:112 ref_tracker_free+0x224/0x270
[  171.659581] Modules linked in: test_ref_tracker(+)
[  171.659591] CPU: 5 PID: 13016 Comm: modprobe Tainted: G S                5.16.0-smp-DEV #290
[  171.659595] RIP: 0010:ref_tracker_free+0x224/0x270
[  171.659599] Code: 5e 41 5f 5d c3 48 c7 c7 04 9c 74 a6 31 c0 e8 62 ee 67 00 83 7b 14 00 75 1a 83 7b 18 00 75 30 4c 89 ff 4c 89 f6 e8 9c 00 69 00 <0f> 0b bb ea ff ff ff eb ae 48 c7 c7 3a 0a 77 a6 31 c0 e8 34 ee 67
[  171.659601] RSP: 0018:ffff89058ba0bbd0 EFLAGS: 00010286
[  171.659603] RAX: 0000000000000029 RBX: ffff890586b19780 RCX: 08895bff57c7d100
[  171.659604] RDX: c0000000ffff7fff RSI: 0000000000000282 RDI: ffffffffc0407000
[  171.659606] RBP: ffff89058ba0bc88 R08: 0000000000000000 R09: ffffffffa6f342e0
[  171.659607] R10: 00000000ffff7fff R11: 0000000000000000 R12: 000000008f000000
[  171.659608] R13: 0000000000000014 R14: 0000000000000282 R15: ffffffffc0407000
[  171.659609] FS:  00007f97ea29d740(0000) GS:ffff8923ff940000(0000) knlGS:0000000000000000
[  171.659611] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  171.659613] CR2: 00007f97ea299000 CR3: 0000000186b4a004 CR4: 00000000001706e0
[  171.659614] Call Trace:
[  171.659615]  <TASK>
[  171.659631]  ? alloctest_ref_tracker_free+0x13/0x20 [test_ref_tracker]
[  171.659633]  ? init_module+0x105/0x1000 [test_ref_tracker]
[  171.659636]  ? do_one_initcall+0x9c/0x220
[  171.659638]  ? do_init_module+0x60/0x240
[  171.659641]  ? load_module+0x32b5/0x3610
[  171.659644]  ? __do_sys_init_module+0x148/0x1a0
[  171.659646]  ? __x64_sys_init_module+0x1d/0x20
[  171.659649]  ? do_syscall_64+0x4a/0xb0
[  171.659652]  ? entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.659656]  ? 0xffffffffc040a000
[  171.659658]  alloctest_ref_tracker_free+0x13/0x20 [test_ref_tracker]
[  171.659660]  init_module+0x105/0x1000 [test_ref_tracker]
[  171.659663]  do_one_initcall+0x9c/0x220
[  171.659666]  do_init_module+0x60/0x240
[  171.659669]  load_module+0x32b5/0x3610
[  171.659672]  __do_sys_init_module+0x148/0x1a0
[  171.659676]  __x64_sys_init_module+0x1d/0x20
[  171.659678]  do_syscall_64+0x4a/0xb0
[  171.659694]  ? exc_page_fault+0x6e/0x140
[  171.659696]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.659698] RIP: 0033:0x7f97ea3dbe7a
[  171.659700] Code: 48 8b 0d 61 8d 06 00 f7 d8 64 89 01 48 83 c8 ff c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2e 8d 06 00 f7 d8 64 89 01 48
[  171.659701] RSP: 002b:00007ffea67ce608 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[  171.659703] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f97ea3dbe7a
[  171.659704] RDX: 00000000013a0ba0 RSI: 0000000000002808 RDI: 00007f97ea299000
[  171.659705] RBP: 00007ffea67ce670 R08: 0000000000000003 R09: 0000000000000000
[  171.659706] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000013a1048
[  171.659707] R13: 00000000013a0ba0 R14: 0000000001399930 R15: 00000000013a1030
[  171.659709]  </TASK>
[  171.659710] ---[ end trace f5dbd6afa41e60a9 ]---
[  171.659712] leaked reference.
[  171.663393]  alloctest_ref_tracker_alloc0+0x1c/0x20 [test_ref_tracker]
[  171.663395]  test_ref_tracker_timer_func+0x9/0x20 [test_ref_tracker]
[  171.663397]  call_timer_fn+0x31/0x140
[  171.663401]  expire_timers+0x46/0x110
[  171.663403]  __run_timers+0x16f/0x1b0
[  171.663404]  run_timer_softirq+0x1d/0x40
[  171.663406]  __do_softirq+0x148/0x2d3
[  171.663408] leaked reference.
[  171.667101]  alloctest_ref_tracker_alloc1+0x1c/0x20 [test_ref_tracker]
[  171.667103]  init_module+0x81/0x1000 [test_ref_tracker]
[  171.667104]  do_one_initcall+0x9c/0x220
[  171.667106]  do_init_module+0x60/0x240
[  171.667108]  load_module+0x32b5/0x3610
[  171.667111]  __do_sys_init_module+0x148/0x1a0
[  171.667113]  __x64_sys_init_module+0x1d/0x20
[  171.667115]  do_syscall_64+0x4a/0xb0
[  171.667117]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.667131] ------------[ cut here ]------------
[  171.667132] WARNING: CPU: 5 PID: 13016 at lib/ref_tracker.c:30 ref_tracker_dir_exit+0x104/0x130
[  171.667136] Modules linked in: test_ref_tracker(+)
[  171.667144] CPU: 5 PID: 13016 Comm: modprobe Tainted: G S      W         5.16.0-smp-DEV #290
[  171.667147] RIP: 0010:ref_tracker_dir_exit+0x104/0x130
[  171.667150] Code: 01 00 00 00 00 ad de 48 89 03 4c 89 63 08 48 89 df e8 20 a0 d5 ff 4c 89 f3 4d 39 ee 75 a8 4c 89 ff 48 8b 75 d0 e8 7c 05 69 00 <0f> 0b eb 0c 4c 89 ff 48 8b 75 d0 e8 6c 05 69 00 41 8b 47 08 83 f8
[  171.667151] RSP: 0018:ffff89058ba0bc68 EFLAGS: 00010286
[  171.667154] RAX: 08895bff57c7d100 RBX: ffffffffc0407010 RCX: 000000000000003b
[  171.667156] RDX: 000000000000003c RSI: 0000000000000282 RDI: ffffffffc0407000
[  171.667157] RBP: ffff89058ba0bc98 R08: 0000000000000000 R09: ffffffffa6f342e0
[  171.667159] R10: 00000000ffff7fff R11: 0000000000000000 R12: dead000000000122
[  171.667160] R13: ffffffffc0407010 R14: ffffffffc0407010 R15: ffffffffc0407000
[  171.667162] FS:  00007f97ea29d740(0000) GS:ffff8923ff940000(0000) knlGS:0000000000000000
[  171.667164] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  171.667166] CR2: 00007f97ea299000 CR3: 0000000186b4a004 CR4: 00000000001706e0
[  171.667169] Call Trace:
[  171.667170]  <TASK>
[  171.667171]  ? 0xffffffffc040a000
[  171.667173]  init_module+0x126/0x1000 [test_ref_tracker]
[  171.667175]  do_one_initcall+0x9c/0x220
[  171.667179]  do_init_module+0x60/0x240
[  171.667182]  load_module+0x32b5/0x3610
[  171.667186]  __do_sys_init_module+0x148/0x1a0
[  171.667189]  __x64_sys_init_module+0x1d/0x20
[  171.667192]  do_syscall_64+0x4a/0xb0
[  171.667194]  ? exc_page_fault+0x6e/0x140
[  171.667196]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  171.667199] RIP: 0033:0x7f97ea3dbe7a
[  171.667200] Code: 48 8b 0d 61 8d 06 00 f7 d8 64 89 01 48 83 c8 ff c3 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2e 8d 06 00 f7 d8 64 89 01 48
[  171.667201] RSP: 002b:00007ffea67ce608 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
[  171.667203] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f97ea3dbe7a
[  171.667204] RDX: 00000000013a0ba0 RSI: 0000000000002808 RDI: 00007f97ea299000
[  171.667205] RBP: 00007ffea67ce670 R08: 0000000000000003 R09: 0000000000000000
[  171.667206] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000013a1048
[  171.667207] R13: 00000000013a0ba0 R14: 0000000001399930 R15: 00000000013a1030
[  171.667209]  </TASK>
[  171.667210] ---[ end trace f5dbd6afa41e60aa ]---

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:04:44 -08:00
Eric Dumazet
4e66934eaa lib: add reference counting tracking infrastructure
It can be hard to track where references are taken and released.

In networking, we have annoying issues at device or netns dismantles,
and we had various proposals to ease root causing them.

This patch adds new infrastructure pairing refcount increases
and decreases. This makes the code self-documenting, because
programmers have to explicitly pair each increment with a decrement.

This is controlled by CONFIG_REF_TRACKER, which can be selected
by users of this feature.

This adds both cpu and memory costs, and thus should probably be
used with care.
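
A hedged usage sketch of the new API (as declared in
include/linux/ref_tracker.h):

	static struct ref_tracker_dir dir;

	static int example(void)
	{
		struct ref_tracker *tracker;

		ref_tracker_dir_init(&dir, 16);	/* quarantine depth */

		/* every successful alloc must be paired with a free */
		ref_tracker_alloc(&dir, &tracker, GFP_KERNEL);
		ref_tracker_free(&dir, &tracker);

		ref_tracker_dir_exit(&dir); /* reports leaked references */
		return 0;
	}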

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:04:44 -08:00
Christophe JAILLET
52e68cd60d vsprintf: Use non-atomic bitmap API when applicable
The 'set' bitmap is local to this function. No concurrent access to it is
possible.
So prefer the non-atomic '__[set|clear]_bit()' functions to save a few
cycles.
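
A hedged illustration of the difference:

	DECLARE_BITMAP(set, 256);

	set_bit(nr, set);	/* atomic RMW, locked on SMP */
	__set_bit(nr, set);	/* non-atomic, fine for a local bitmap */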

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/1abf81a5e509d372393bd22041eed4ebc07ef9f7.1638023178.git.christophe.jaillet@wanadoo.fr
2021-12-06 13:35:28 +01:00
Sebastian Andrzej Siewior
9a75bd0c52 lockdep/selftests: Adapt ww-tests for PREEMPT_RT
The ww-mutex selftest operates directly on ww_mutex::base and assumes
its type is struct mutex. This isn't true on PREEMPT_RT which turns the
mutex into an rtmutex.

Add a ww_mutex_base_ abstraction which maps to the relevant mutex_ or
rt_mutex_ function.
Change the CONFIG_DEBUG_MUTEXES ifdef to DEBUG_WW_MUTEXES. The latter is
true for the MUTEX and RTMUTEX implementation of WW-MUTEX. The
assignment is required in order to pass the tests.
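
A hedged sketch of the ww_mutex_base_ mapping (illustrative; the real
selftest may differ in detail):

	#ifdef CONFIG_PREEMPT_RT
	# define ww_mutex_base_lock(b)		rt_mutex_lock(b)
	# define ww_mutex_base_unlock(b)	rt_mutex_unlock(b)
	#else
	# define ww_mutex_base_lock(b)		mutex_lock(b)
	# define ww_mutex_base_unlock(b)	mutex_unlock(b)
	#endif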

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211129174654.668506-10-bigeasy@linutronix.de
2021-12-04 10:56:24 +01:00
Sebastian Andrzej Siewior
a529f8db89 lockdep/selftests: Skip the softirq related tests on PREEMPT_RT
The softirq context on PREEMPT_RT is different compared to !PREEMPT_RT.
As such, lockdep_softirq_enter() is a nop and all the "softirq safe"
tests fail on PREEMPT_RT because there is no difference.

Skip the softirq context tests on PREEMPT_RT.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211129174654.668506-9-bigeasy@linutronix.de
2021-12-04 10:56:24 +01:00
Sebastian Andrzej Siewior
512bf713cb lockdep/selftests: Unbalanced migrate_disable() & rcu_read_lock().
The tests with unbalanced lock() + unlock() operation leave a modified
preemption counter behind which is then reset to its original value
after the test.

The spin_lock() function on PREEMPT_RT does not include a
preempt_disable() statement but migrate_disable() and rcu_read_lock().
As a consequence, both counters never get back to their original value
and the system explodes later after the selftest.  In the
double-unlock case on PREEMPT_RT, the migrate_disable() and RCU code
will trigger a warning which should be avoided. These counters should
not be decremented below their initial value.

Save both counters and bring them back to their original value after
the test.  In the double-unlock case, increment both counters in
advance so they become balanced after the double unlock.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211129174654.668506-8-bigeasy@linutronix.de
2021-12-04 10:56:24 +01:00
Sebastian Andrzej Siewior
fc78dd08e6 lockdep/selftests: Avoid using local_lock_{acquire|release}().
The local_lock related functions
  local_lock_acquire()
  local_lock_release()

are part of the internal implementation and should be avoided.
Define the lock as DEFINE_PER_CPU so the normal local_lock() function
can be used.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211129174654.668506-7-bigeasy@linutronix.de
2021-12-04 10:56:24 +01:00
Kumar Kartikeya Dwivedi
d9847eb8be bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL
Vinicius Costa Gomes reported [0] that build fails when
CONFIG_DEBUG_INFO_BTF is enabled and CONFIG_BPF_SYSCALL is disabled.
This leads to btf.c not being compiled, and then no symbol being present
in vmlinux for the declarations in btf.h. Since BTF is not useful
without enabling the BPF subsystem, disallow this combination.

However, theoretically disabling both now could still fail, as the
symbol for kfunc_btf_id_list variables is not available. This isn't a
problem as the compiler usually optimizes away the whole
register/unregister call, but at lower optimization levels it can fail
the build in the linking stage.

Fix that by adding dummy variables so that modules taking address of
them still work, but the whole thing is a noop.
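
A hedged sketch of the Kconfig side of the change (surrounding
dependencies elided):

	config DEBUG_INFO_BTF
		bool "Generate BTF typeinfo"
		depends on BPF_SYSCALL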

  [0]: https://lore.kernel.org/bpf/20211110205418.332403-1-vinicius.gomes@intel.com

Fixes: 14f267d95fe4 ("bpf: btf: Introduce helpers for dynamic BTF set registration")
Reported-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211122144742.477787-2-memxor@gmail.com
2021-12-02 13:39:46 -08:00
Arnd Bergmann
f7e5b9bfa6 siphash: use _unaligned version by default
On ARM v6 and later, we define CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
because the ordinary load/store instructions (ldr, ldrh, ldrb) can
tolerate any misalignment of the memory address. However, load/store
double and load/store multiple instructions (ldrd, ldm) may still only
be used on memory addresses that are 32-bit aligned, and so we have to
use the CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS macro with care, or we
may end up with a severe performance hit due to alignment traps that
require fixups by the kernel. Testing shows that this currently happens
with clang-13 but not gcc-11. In theory, any compiler version can
produce this bug or other problems, as we are dealing with undefined
behavior in C99 even on architectures that support this in hardware,
see also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100363.

Fortunately, the get_unaligned() accessors do the right thing: when
building for ARMv6 or later, the compiler will emit unaligned accesses
using the ordinary load/store instructions (but avoid the ones that
require 32-bit alignment). When building for older ARM, those accessors
will emit the appropriate sequence of ldrb/mov/orr instructions. And on
architectures that can truly tolerate any kind of misalignment, the
get_unaligned() accessors resolve to the leXX_to_cpup accessors that
operate on aligned addresses.

Since the compiler will in fact emit ldrd or ldm instructions when
building this code for ARM v6 or later, the solution is to use the
unaligned accessors unconditionally on architectures where this is
known to be fast. The _aligned version of the hash function is
however still needed to get the best performance on architectures
that cannot do any unaligned access in hardware.

This new version avoids the undefined behavior and should produce
the fastest hash on all architectures we support.
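
A hedged illustration of the accessor pattern (not a hunk from this
patch):

	#include <asm/unaligned.h>

	static u64 load_word(const u8 *data)
	{
		/* compiles to a plain load where the CPU tolerates
		 * misalignment, and to byte loads elsewhere */
		return get_unaligned_le64(data);
	}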

Link: https://lore.kernel.org/linux-arm-kernel/20181008211554.5355-4-ard.biesheuvel@linaro.org/
Link: https://lore.kernel.org/linux-crypto/CAK8P3a2KfmmGDbVHULWevB0hv71P2oi2ZCHEAqT=8dQfa0=cqQ@mail.gmail.com/
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Fixes: 2c956a60778c ("siphash: add cryptographically secure PRF")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-29 19:50:50 -08:00
Linus Walleij
2448eab440 Linux 5.16-rc2
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmGavnseHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGcl4H/jyFVlHDSa+utMA5
 7PEQX0AarkBtSvKUgK/SivZxX06nYp2UU5L4Jn70O/mccXWo0ru82eDVO3nSImDR
 Mi668IqzbYfGqVL6CMztDku+XbyT3Yr/i9QILFbLWV5DhCM422GXXN8PFBibDHdI
 6Oyt1WoUh404yjVIHOCNwprfLObxREV6ARhFsIsmCRa8Hf+RkKOY5Twua6j5emm5
 aamiq6SYLtf2H5+DwkR5TnPkie6I2o8oLtA7JYiJpKh5KK75qjlpzFd3S3OWsi1H
 0g752g12r7tLh4ac3Xfgwf36pQ2CdiZ7NUOkJhZWT4aHPqPh+MVheQfpR41f5Sgc
 pvFslTo=
 =QdMf
 -----END PGP SIGNATURE-----

Merge tag 'v5.16-rc2' into devel

Linux 5.16-rc2 is needed because nonurgent fixes headed
for next are strongly textually dependent on a fix that
was applied for rc2.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2021-11-27 00:54:16 +01:00
Helge Deller
8d192bec53 parisc: Increase FRAME_WARN to 2048 bytes on parisc
PA-RISC uses a much bigger frame size for functions than other
architectures. So increase it to 2048 for 32- and 64-bit kernels.
This fixes e.g. a warning in lib/xxhash.c.
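
A hedged sketch of the resulting default in lib/Kconfig.debug (other
architecture defaults elided):

	config FRAME_WARN
		int "Warn for stack frames larger than"
		default 2048 if PARISC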

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2021-11-22 07:37:31 +01:00
Kees Cook
cab71f7495 kasan: test: silence intentional read overflow warnings
As done in commit d73dad4eb5ad ("kasan: test: bypass __alloc_size
checks") for __write_overflow warnings, also silence some more cases
that trip the __read_overflow warnings seen in 5.16-rc1[1]:

  In file included from include/linux/string.h:253,
                   from include/linux/bitmap.h:10,
                   from include/linux/cpumask.h:12,
                   from include/linux/mm_types_task.h:14,
                   from include/linux/mm_types.h:5,
                   from include/linux/page-flags.h:13,
                   from arch/arm64/include/asm/mte.h:14,
                   from arch/arm64/include/asm/pgtable.h:12,
                   from include/linux/pgtable.h:6,
                   from include/linux/kasan.h:29,
                   from lib/test_kasan.c:10:
  In function 'memcmp',
      inlined from 'kasan_memcmp' at lib/test_kasan.c:897:2:
  include/linux/fortify-string.h:263:25: error: call to '__read_overflow' declared with attribute error: detected read beyond size of object (1st parameter)
    263 |                         __read_overflow();
        |                         ^~~~~~~~~~~~~~~~~
  In function 'memchr',
      inlined from 'kasan_memchr' at lib/test_kasan.c:872:2:
  include/linux/fortify-string.h:277:17: error: call to '__read_overflow' declared with attribute error: detected read beyond size of object (1st parameter)
    277 |                 __read_overflow();
        |                 ^~~~~~~~~~~~~~~~~

[1] http://kisskb.ellerman.id.au/kisskb/buildresult/14660585/log/
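
A hedged sketch of the silencing pattern, assuming the same
OPTIMIZER_HIDE_VAR approach as the __write_overflow fix referenced
above (names as in lib/test_kasan.c):

	size_t size = 1;

	/* hide the constant from the compiler so the fortified memcmp()
	 * cannot prove the overflow at compile time */
	OPTIMIZER_HIDE_VAR(size);
	KUNIT_EXPECT_KASAN_FAIL(test,
		kasan_int_result = memcmp(ptr, arr, size + 1));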

Link: https://lkml.kernel.org/r/20211116004111.3171781-1-keescook@chromium.org
Fixes: d73dad4eb5ad ("kasan: test: bypass __alloc_size checks")
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
Acked-by: Marco Elver <elver@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-20 10:35:54 -08:00
Linus Torvalds
4c388a8e74 zstd fixes for v5.16-rc1
Fix stack usage on parisc & improve code size bloat
 
 This PR contains 3 commits:
 
 1. Fixes a minor unused variable warning reported by Kernel test robot [0].
 2. Improves the reported code bloat (-88KB / 374KB) [1] by outlining
    some functions that are unlikely to be used in performance sensitive
    workloads.
 3. Fixes the reported excess stack usage on parisc [2] by removing -O3
    from zstd's compilation flags. -O3 triggered bugs in the hppa-linux-gnu
    gcc-8 compiler. -O2 performance is acceptable: neutral compression,
    about -1% decompression speed. We also reduce code bloat
    (-105KB / 374KB).
 
 After this commit our code bloat is cut from 374KB to 105KB with gcc-11.
 If we wanted to cut the remaining 105KB we'd likely have to trade
 significant performance, so I want to say that this is enough for now.
 
 We should be able to get further gains without sacrificing speed, but
 that will take some significant optimization effort, and isn't suitable
 for a quick fix. I've opened an upstream issue [3] to track the code size,
 and try to avoid future regressions, and improve it in the long term.
 
 [0] https://lore.kernel.org/linux-mm/202111120312.833wII4i-lkp@intel.com/T/
 [1] https://lkml.org/lkml/2021/11/15/710
 [2] https://lkml.org/lkml/2021/11/14/189
 [3] https://github.com/facebook/zstd/issues/2867
 
 Link: https://lore.kernel.org/r/20211117014949.1169186-1-nickrterrell@gmail.com/
 Link: https://lore.kernel.org/r/20211117201459.1194876-1-nickrterrell@gmail.com/
 
 Signed-off-by: Nick Terrell <terrelln@fb.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEmIwAqlFIzbQodPwyuzRpqaNEqPUFAmGWw4AACgkQuzRpqaNE
 qPUXfQ/5AXp+7Ip+YD25QUa/je10OZkdGNi5/MNh1m7f6gwlOab7Pnn65mpN8qsW
 1OJbje5PAiTkC+BzJgGw6zr8JCcvgXCVVtAoPEV73uT9QLOoeEE3E2Jf4OQQxroB
 cKC+lZaxeDgqV60koIhsVBMgs4pny57ohTm4fK8yqrIi7ZV21a/FJoVxwyNLCnbU
 uRJKzN9xa3lBYESnMzlV4dF0WhKfprgI+3YXenLBjHHDhhz0nyPT7jt0sr/CoblI
 2QMq8RItlnMleV1La1v1S38ROu1E4MXvIy/MrFyu7ebBX3jDgMYtRdZxuAL/I2+1
 TfN3LfEcwjyB4ft6Ty76kk0gwEihnEORhTeRVrhqxXx8FPWgEB+tgWHo+zLd8wPp
 khqfO6gf4PZJnf6kDOlyEYF2yTuNlWNR6J41+bLW0bA104zLYjeUhejDgyh2aRR2
 WYo/xwzs2FbI4Da/rJ4iTKy4hK++AZ/Sba9b3t29Ca+TiQZJHSUp5KnjNbIW5XCr
 0jknMki6bASlG9nrg+d2EC3fIQop8nJhywNrLZV1uJYx/H5DBmIcLPmhCb4oBOSt
 AP3d/rj5EnO0+bOGGDg00qndsnuDuko7fOsAM3D9l2HoaOly7++RQtIzZqu8Y3EX
 F8L90qvg/vIWFOppnvJX+nXaWz2J55P4iooKlBKz+JQpBff7lDA=
 =kBgl
 -----END PGP SIGNATURE-----

Merge tag 'zstd-for-linus-5.16-rc1' of git://github.com/terrelln/linux

Pull zstd fixes from Nick Terrell:
 "Fix stack usage on parisc & improve code size bloat

  This contains three commits:

   1. Fixes a minor unused variable warning reported by Kernel test
      robot [0].

   2. Improves the reported code bloat (-88KB / 374KB) [1] by outlining
      some functions that are unlikely to be used in performance
      sensitive workloads.

   3. Fixes the reported excess stack usage on parisc [2] by removing
      -O3 from zstd's compilation flags. -O3 triggered bugs in the
      hppa-linux-gnu gcc-8 compiler. -O2 performance is acceptable:
      neutral compression, about -1% decompression speed. We also reduce
      code bloat (-105KB / 374KB).

  After this our code bloat is cut from 374KB to 105KB with gcc-11. If
  we wanted to cut the remaining 105KB we'd likely have to trade
  significant performance, so I want to say that this is enough for now.

  We should be able to get further gains without sacrificing speed, but
  that will take some significant optimization effort, and isn't
  suitable for a quick fix. I've opened an upstream issue [3] to track
  the code size, and try to avoid future regressions, and improve it in
  the long term"

Link: https://lore.kernel.org/linux-mm/202111120312.833wII4i-lkp@intel.com/T/ [0]
Link: https://lkml.org/lkml/2021/11/15/710 [1]
Link: https://lkml.org/lkml/2021/11/14/189 [2]
Link: https://github.com/facebook/zstd/issues/2867 [3]
Link: https://lore.kernel.org/r/20211117014949.1169186-1-nickrterrell@gmail.com/
Link: https://lore.kernel.org/r/20211117201459.1194876-1-nickrterrell@gmail.com/

* tag 'zstd-for-linus-5.16-rc1' of git://github.com/terrelln/linux:
  lib: zstd: Don't add -O3 to cflags
  lib: zstd: Don't inline functions in zstd_opt.c
  lib: zstd: Fix unused variable warning
2021-11-18 17:09:05 -08:00
Nick Terrell
7416cdc9b9 lib: zstd: Don't add -O3 to cflags
After the update to zstd-1.4.10 passing -O3 is no longer necessary to
get good performance from zstd. Using the default optimization level -O2
is sufficient to get good performance.

I've measured no significant change to compression speed, and a ~1%
decompression speed loss, which is acceptable.

This fixes the reported parisc -Wframe-larger-than=1536 errors [0]. The
gcc-8-hppa-linux-gnu compiler performed very poorly with -O3, generating
stacks that are ~3KB. With -O2 these same functions generate stacks in
the < 100B range, completely fixing the problem. Function size deltas are
listed below:

ZSTD_compressBlock_fast_extDict_generic: 3800 -> 68
ZSTD_compressBlock_fast: 2216 -> 40
ZSTD_compressBlock_fast_dictMatchState: 1848 ->  64
ZSTD_compressBlock_doubleFast_extDict_generic: 3744 -> 76
ZSTD_fillDoubleHashTable: 3252 -> 0
ZSTD_compressBlock_doubleFast: 5856 -> 36
ZSTD_compressBlock_doubleFast_dictMatchState: 5380 -> 84
ZSTD_compressBlock_lazy2: 2420 -> 72

Additionally, this improves the reported code bloat [1]. With gcc-11
bloat-o-meter shows an 80KB code size improvement:

```
> ../scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 31/8 grow/shrink: 24/155 up/down: 25734/-107924 (-82190)
Total: Before=6418562, After=6336372, chg -1.28%
```

Compared to before the zstd-1.4.10 update we see a total code size
regression of 105KB, down from 374KB at v5.16-rc1:

```
> ../scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 292/62 grow/shrink: 56/88 up/down: 235009/-127487 (107522)
Total: Before=6228850, After=6336372, chg +1.73%
```

[0] https://lkml.org/lkml/2021/11/15/710
[1] https://lkml.org/lkml/2021/11/14/189

Link: https://lore.kernel.org/r/20211117014949.1169186-4-nickrterrell@gmail.com/
Link: https://lore.kernel.org/r/20211117201459.1194876-4-nickrterrell@gmail.com/

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Nick Terrell <terrelln@fb.com>
2021-11-18 13:16:22 -08:00
Nick Terrell
1974990cca lib: zstd: Don't inline functions in zstd_opt.c
`zstd_opt.c` contains the match finder for the highest compression
levels. These levels are already very slow, and are unlikely to be used
in the kernel. If they are used, they shouldn't be used in latency
sensitive workloads, so slowing them down shouldn't be a big deal.

This saves 188 KB of the 288 KB regression reported by Geert Uytterhoeven [0].
I've also opened an issue upstream [1] so that we can properly tackle
the code size issue in `zstd_opt.c` for all users, and can hopefully
remove this hack in the next zstd version we import.

Bloat-o-meter output on x86-64:

```
> ../scripts/bloat-o-meter vmlinux.old vmlinux
add/remove: 6/5 grow/shrink: 1/9 up/down: 16673/-209939 (-193266)
Function                                     old     new   delta
ZSTD_compressBlock_opt_generic.constprop       -    7559   +7559
ZSTD_insertBtAndGetAllMatches                  -    6304   +6304
ZSTD_insertBt1                                 -    1731   +1731
ZSTD_storeSeq                                  -     693    +693
ZSTD_BtGetAllMatches                           -     255    +255
ZSTD_updateRep                                 -     128    +128
ZSTD_updateTree                               96      99      +3
ZSTD_insertAndFindFirstIndexHash3             81       -     -81
ZSTD_setBasePrices.constprop                  98       -     -98
ZSTD_litLengthPrice.constprop                138       -    -138
ZSTD_count                                   362     181    -181
ZSTD_count_2segments                        1407     938    -469
ZSTD_insertBt1.constprop                    2689       -   -2689
ZSTD_compressBlock_btultra2                19990     423  -19567
ZSTD_compressBlock_btultra                 19633      15  -19618
ZSTD_initStats_ultra                       19825       -  -19825
ZSTD_compressBlock_btopt                   20374      12  -20362
ZSTD_compressBlock_btopt_extDict           29984      12  -29972
ZSTD_compressBlock_btultra_extDict         30718      15  -30703
ZSTD_compressBlock_btopt_dictMatchState    32689      12  -32677
ZSTD_compressBlock_btultra_dictMatchState   33574      15  -33559
Total: Before=6611828, After=6418562, chg -2.92%
```

[0] https://lkml.org/lkml/2021/11/14/189
[1] https://github.com/facebook/zstd/issues/2862

Link: https://lore.kernel.org/r/20211117014949.1169186-3-nickrterrell@gmail.com/
Link: https://lore.kernel.org/r/20211117201459.1194876-3-nickrterrell@gmail.com/

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Nick Terrell <terrelln@fb.com>
2021-11-18 13:15:33 -08:00
Nick Terrell
ae8d67b211 lib: zstd: Fix unused variable warning
The variable `litLengthSum` is only used by an `assert()`, so when
asserts are disabled the compiler doesn't see any usage and warns.

This issue is already fixed upstream by PR #2838 [0]. It was reported
by the Kernel test robot in [1].

Another approach would be to change zstd's disabled `assert()`
definition to use the argument in a disabled branch, instead of
ignoring the argument. I've avoided this approach because there are
some small changes necessary to get zstd to build, and I would
want to thoroughly re-test for performance, since that is slightly
changing the code in every function in zstd. It seems like a
trivial change, but some functions are pretty sensitive to small
changes. However, I think it is a valid approach that I would
like to see upstream take, so I've opened Issue #2868 to attempt
this upstream.
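
As a hedged sketch of that alternative (not what this patch does), the
disabled assert() would reference its argument in a dead expression:

```
/* current style: the argument is dropped, so variables used only
 * in assertions look unused */
#define assert(condition) ((void)0)

/* alternative: keep a never-evaluated reference to the argument:
 * #define assert(condition) ((void)sizeof(condition))
 */
```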

Lastly, I've chosen not to use __maybe_unused because all code
in lib/zstd/ must eventually be upstreamed. Upstream zstd can't
use __maybe_unused because it isn't portable across all compilers.

[0] https://github.com/facebook/zstd/pull/2838
[1] https://lore.kernel.org/linux-mm/202111120312.833wII4i-lkp@intel.com/T/
[2] https://github.com/facebook/zstd/issues/2868

Link: https://lore.kernel.org/r/20211117014949.1169186-2-nickrterrell@gmail.com/
Link: https://lore.kernel.org/r/20211117201459.1194876-2-nickrterrell@gmail.com/

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Nick Terrell <terrelln@fb.com>
2021-11-18 13:12:26 -08:00
Linus Torvalds
7d5775d49e printk fixup for 5.16
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEESH4wyp42V4tXvYsjUqAMR0iAlPIFAmGWF6YACgkQUqAMR0iA
 lPJJ+RAAm9pi/EElKKl+lOlBl+ehJlKuNLnPQWFmmaRc9xd0ruUipp1nsaktLJ8f
 R/PkSwR/YWpBWlF8P4o+x9sOFyTNyLasoHtqsinEcAJI4lb7d1KOrPliTXyr15Ai
 A303djwJmwCw5KxAPOjkG/nMBlpMIAQRee9GDWs1ykfSlIsI4jp7isVbCFNCQNVF
 auHYq1bfJ5MJYPjxIDZUt+NF7kg7dD4k4g+VCVjaH1u8pGeaCUCtnNjMFOk1XfU8
 yFQnaDcrAu4zJPq3d74z4eN9Bk+su8+DhnfrAEFjuFxGTgYc2MyRt0gGFeiUtNs4
 rvST/eHBO4zeuL18S8G+fLcig/9ZYE73xzjdOCzRzLDjn0VQr9t06ez1QqJOb4D6
 A4SSufwek5NIqYKMlhV/az2EceQYK8Wv3KAz8w98KDfwvVVhUSgE23MbTCO0hvQU
 PWF35d3hQ+9oH0ZGYRumb8OpXtKJ+2KmzyN8Z0xhivHFBIKlcW6IBGhWRANclJO8
 jNAM3jiwi8fRDVM2wI1fmgeEmMhG+WuTI3dJVu3tu4vI923FW5GdY6ev5EvH0Ts0
 khTwIjtmCHUJGSeWajy3Gi9irdyhPyPNRMqgal4GS+gGpVU2mMMKTG+NzxxtCRKR
 BUgfCjFDoDJWrNWIzzOwNqgF0Y+V9GgCZOkb73u/y+xVx0Rmc6U=
 =wbBy
 -----END PGP SIGNATURE-----

Merge tag 'printk-for-5.16-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux

Pull printk fixes from Petr Mladek:

 - Try to flush backtraces from other CPUs also on the local one. This
   was a regression caused by printk_safe buffers removal.

 - Remove header dependency warning.

* tag 'printk-for-5.16-fixup' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux:
  printk: Remove printk.h inclusion in percpu.h
  printk: restore flushing of NMI buffers on remote CPUs after NMI backtraces
2021-11-18 10:50:45 -08:00
Andy Shevchenko
acdb89b6c8 lib/string_helpers: Introduce managed variant of kasprintf_strarray()
Some of the users want an easy way to allocate an array of strings
that will be automatically cleaned up when the associated device is gone.

Introduce managed variant of kasprintf_strarray() for such use cases.
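
A hedged usage sketch (assuming the devm variant returns an ERR_PTR on
allocation failure):

	char **names = devm_kasprintf_strarray(dev, "irq", 4);

	if (IS_ERR(names))
		return PTR_ERR(names);
	/* names[0..3] are "irq-0".."irq-3", freed when dev goes away */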

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2021-11-18 18:40:08 +02:00
Andy Shevchenko
418e0a3551 lib/string_helpers: Introduce kasprintf_strarray()
We have a few users already that basically want an array of
sequential strings to be allocated and filled.

Provide a helper for them (basically adjusted version from gpio-mockup.c).
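
A hedged usage sketch (behavior as in lib/string_helpers.c):

	char **names = kasprintf_strarray(GFP_KERNEL, "foo", 3);

	/* on success, names is { "foo-0", "foo-1", "foo-2", NULL } */
	kfree_strarray(names, 3);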

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
2021-11-18 18:40:08 +02:00
Petr Mladek
bf6d0d1e1a Merge branch 'rework/printk_safe-removal' into for-linus 2021-11-18 10:03:47 +01:00
Tiezhu Yang
ebf7f6f0a6 bpf: Change value of MAX_TAIL_CALL_CNT from 32 to 33
In the current code, the actual max tail call count is 33 which is greater
than MAX_TAIL_CALL_CNT (defined as 32). The actual limit is not consistent
with the meaning of MAX_TAIL_CALL_CNT and thus confusing at first glance.
We can see the historical evolution from commit 04fd61ab36ec ("bpf: allow
bpf programs to tail-call other bpf programs") and commit f9dabe016b63
("bpf: Undo off-by-one in interpreter tail call count limit"). In order
to avoid changing existing behavior, the actual limit remains 33, which
is reasonable.

After commit 874be05f525e ("bpf, tests: Add tail call test suite"), we can
see there is a failing test case.

On all archs when CONFIG_BPF_JIT_ALWAYS_ON is not set:
 # echo 0 > /proc/sys/net/core/bpf_jit_enable
 # modprobe test_bpf
 # dmesg | grep -w FAIL
 Tail call error path, max count reached jited:0 ret 34 != 33 FAIL

On some archs:
 # echo 1 > /proc/sys/net/core/bpf_jit_enable
 # modprobe test_bpf
 # dmesg | grep -w FAIL
 Tail call error path, max count reached jited:1 ret 34 != 33 FAIL

Although the above failed testcase has been fixed in commit 18935a72eb25
("bpf/tests: Fix error in tail call limit tests"), it would still be good
to change the value of MAX_TAIL_CALL_CNT from 32 to 33 to make the code
more readable.

The 32-bit x86 JIT was using a limit of 32, so fix the wrong comments and
the limit to 33 tail calls to match the updated MAX_TAIL_CALL_CNT. For the
mips64 JIT, use "ori" instead of "addiu" as suggested by Johan Almbladh.
For the riscv JIT, use RV_REG_TCC directly to save one register move as
suggested by Björn Töpel. The other implementations need no functional
changes: the current limit of 33 is unchanged, the new value of
MAX_TAIL_CALL_CNT reflects the actual max tail call count, and the related
tail call testcases in the test_bpf module and selftests work for both the
interpreter and the JIT.
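
As a hedged sketch (illustrative, not the exact hunk), the
interpreter-side check in kernel/bpf/core.c now reads roughly:

	/* allow exactly MAX_TAIL_CALL_CNT (33) tail calls */
	if (unlikely(tail_call_cnt >= MAX_TAIL_CALL_CNT))
		goto out;
	tail_call_cnt++;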

Here are the test results on x86_64:

 # uname -m
 x86_64
 # echo 0 > /proc/sys/net/core/bpf_jit_enable
 # modprobe test_bpf test_suite=test_tail_calls
 # dmesg | tail -1
 test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [0/8 JIT'ed]
 # rmmod test_bpf
 # echo 1 > /proc/sys/net/core/bpf_jit_enable
 # modprobe test_bpf test_suite=test_tail_calls
 # dmesg | tail -1
 test_bpf: test_tail_calls: Summary: 8 PASSED, 0 FAILED, [8/8 JIT'ed]
 # rmmod test_bpf
 # ./test_progs -t tailcalls
 #142 tailcalls:OK
 Summary: 1/11 PASSED, 0 SKIPPED, 0 FAILED

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Acked-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Johan Almbladh <johan.almbladh@anyfinetworks.com>
Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/bpf/1636075800-3264-1-git-send-email-yangtiezhu@loongson.cn
2021-11-16 14:03:15 +01:00
Linus Torvalds
c8c109546a Update to zstd-1.4.10
This PR includes 5 commits that update the zstd library version:
 
 1. Adds a new kernel-style wrapper around zstd. This wrapper API
    is functionally equivalent to the subset of the current zstd API that is
    currently used. The wrapper API changes to be kernel style so that the symbols
    don't collide with zstd's symbols. The update to zstd-1.4.10 maintains the same
    API and preserves the semantics, so that none of the callers need to be
    updated. All callers are updated in the commit, because there are zero
    functional changes.
 2. Adds an indirection for `lib/decompress_unzstd.c` so it
    doesn't depend on the layout of `lib/zstd/` to include every source file.
    This allows the next patch to be automatically generated.
 3. Imports the zstd-1.4.10 source code. This commit is automatically generated
    from upstream zstd (https://github.com/facebook/zstd).
 4. Adds me (terrelln@fb.com) as the maintainer of `lib/zstd`.
 5. Fixes a newly added build warning for clang.
 
 The discussion around this patchset has been pretty long, so I've included a
 FAQ-style summary of the history of the patchset, and why we are taking this
 approach.
 
 Why do we need to update?
 -------------------------
 
 The zstd version in the kernel is based off of zstd-1.3.1, which was released
 August 20, 2017. Since then zstd has seen many bug fixes and performance
 improvements. And, importantly, upstream zstd is continuously fuzzed by OSS-Fuzz,
 and bug fixes aren't backported to older versions. So the only way to sanely get
 these fixes is to keep up to date with upstream zstd. There are no known security
 issues that affect the kernel, but we need to be able to update in case there
 are. And while there are no known security issues, there are relevant bug fixes.
 For example the problem with large kernel decompression has been fixed upstream
 for over 2 years https://lkml.org/lkml/2020/9/29/27.
 
 Additionally the performance improvements for kernel use cases are significant.
 Measured for x86_64 on my Intel i9-9900k @ 3.6 GHz:
 
 - BtrFS zstd compression at levels 1 and 3 is 5% faster
 - BtrFS zstd decompression+read is 15% faster
 - SquashFS zstd decompression+read is 15% faster
 - F2FS zstd compression+write at level 3 is 8% faster
 - F2FS zstd decompression+read is 20% faster
 - ZRAM decompression+read is 30% faster
 - Kernel zstd decompression is 35% faster
 - Initramfs zstd decompression+build is 5% faster
 
 On top of this, there are significant performance improvements coming down the
 line in the next zstd release, and the new automated update patch generation
 will allow us to pull them easily.
 
 How is the update patch generated?
 ----------------------------------
 
 The first two patches are preparation for updating the zstd version. Then the
 3rd patch in the series imports upstream zstd into the kernel. This patch is
 automatically generated from upstream. A script makes the necessary changes and
 imports it into the kernel. The changes are:
 
 - Replace all libc dependencies with kernel replacements and rewrite includes.
 - Remove unnecessary portability macros like: #if defined(_MSC_VER).
 - Use the kernel xxhash instead of bundling it.
 
 This automation gets tested on every commit by upstream's continuous integration.
 When we cut a new zstd release, we will submit a patch to the kernel to update
 the zstd version in the kernel.
 
 The automated process makes it easy to keep the kernel version of zstd up to
 date. The current zstd in the kernel shares the guts of the code, but has a lot
 of API and minor changes to work in the kernel. This is because at the time
 upstream zstd was not ready to be used in the kernel environment as-is. But,
 since then upstream zstd has evolved to support being used in the kernel as-is.
 
 Why are we updating in one big patch?
 -------------------------------------
 
 The 3rd patch in the series is very large. This is because it is restructuring
 the code, so it both deletes the existing zstd, and re-adds the new structure.
 Future updates will be directly proportional to the changes in upstream zstd
 since the last import. They will admittedly be large, as zstd is an actively
 developed project, and has hundreds of commits between every release. However,
 there is no other great alternative.
 
 One option ruled out is to replay every upstream zstd commit. This is not feasible
 for several reasons:
 - There are over 3500 upstream commits since the zstd version in the kernel.
 - The automation to generate the kernel update was only added recently,
   so older commits cannot easily be imported.
 - Not every upstream zstd commit builds.
 - Only zstd releases are "supported", and individual commits may have bugs that were
   fixed before a release.
 
 Another option to reduce the patch size would be to first reorganize to the new
 file structure, and then apply the patch. However, the current kernel zstd is formatted
 with clang-format to be more "kernel-like". But, the new method imports zstd as-is,
 without additional formatting, to allow for closer correlation with upstream, and
 easier debugging. So the patch wouldn't be any smaller.
 
 It also doesn't make sense to import upstream zstd commit by commit going
 forward. Upstream zstd doesn't support production use cases running off the
 development branch. We have a lot of post-commit fuzzing that catches many bugs,
 so individual commits may be buggy, but fixed before a release. So going forward,
 I intend to import every (important) zstd release into the Kernel.
 
 So, while it isn't ideal, updating in one big patch is the only path forward I see.
 
 Who is responsible for this code?
 ---------------------------------
 
 I am. This patchset adds me as the maintainer for zstd. Previously, there was no tree
 for zstd patches. Because of that, there were several patches that either got ignored,
 or took a long time to merge, since it wasn't clear which tree should pick them up.
 I'm officially stepping up as maintainer, and setting up my tree as the path through
 which zstd patches get merged. I'll make sure that patches to the kernel zstd get
 ported upstream, so they aren't erased when the next version update happens.
 
 How is this code tested?
 ------------------------
 
 I tested every caller of zstd on x86_64 (BtrFS, ZRAM, SquashFS, F2FS, Kernel,
 InitRAMFS). I also tested Kernel & InitRAMFS on i386 and aarch64. I checked both
 performance and correctness.
 
 Also, thanks to many people in the community who have tested these patches locally.
 If you have tested the patches, please reply with a Tested-By so I can collect them
 for the PR I will send to Linus.
 
 Lastly, this code will bake in linux-next before being merged into v5.16.
 
 Why update to zstd-1.4.10 when zstd-1.5.0 has been released?
 ------------------------------------------------------------
 
 This patchset has been outstanding since 2020, and zstd-1.4.10 was the latest
 release when it was created. Since the update patch is automatically generated
 from upstream, I could generate it from zstd-1.5.0. However, there were some
 large stack usage regressions in zstd-1.5.0, which are only fixed in the latest
 development branch. And the latest development branch contains some new code that
 needs to bake in the fuzzer before I would feel comfortable releasing to the
 kernel.
 
 Once this patchset has been merged, and we've released zstd-1.5.1, we can update
 the kernel to zstd-1.5.1, and exercise the update process.
 
 You may notice that zstd-1.4.10 doesn't exist upstream. This release is an
 artificial release based off of zstd-1.4.9, with some fixes for the kernel
 backported from the development branch. I will tag the zstd-1.4.10 release after
 this patchset is merged, so the Linux Kernel is running a known version of zstd
 that can be debugged upstream.
 
 Why was a wrapper API added?
 ----------------------------
 
 The first versions of this patchset migrated the kernel to the upstream zstd
 API. It first added a shim API that supported the new upstream API with the old
 code, then updated callers to use the new shim API, then transitioned to the
 new code and deleted the shim API. However, Christoph Hellwig suggested that we
 transition to a kernel style API, and hide zstd's upstream API behind that.
 This is because zstd's upstream API supports many other use cases, and does
 not follow the kernel style guide, while the kernel API is focused on the
 kernel's use cases, and follows the kernel style guide.
 
 Where is the previous discussion?
 ---------------------------------
 
 Links to the discussions of the previous versions of the patch set are below.
 The largest changes in the design of the patchset are driven by the discussions
 in V11, V5, and V1. Sorry for the mix of links; I couldn't find most of the
 threads on lkml.org.
 
 V12: https://www.spinics.net/lists/linux-crypto/msg58189.html
 V11: https://lore.kernel.org/linux-btrfs/20210430013157.747152-1-nickrterrell@gmail.com/
 V10: https://lore.kernel.org/lkml/20210426234621.870684-2-nickrterrell@gmail.com/
 V9: https://lore.kernel.org/linux-btrfs/20210330225112.496213-1-nickrterrell@gmail.com/
 V8: https://lore.kernel.org/linux-f2fs-devel/20210326191859.1542272-1-nickrterrell@gmail.com/
 V7: https://lkml.org/lkml/2020/12/3/1195
 V6: https://lkml.org/lkml/2020/12/2/1245
 V5: https://lore.kernel.org/linux-btrfs/20200916034307.2092020-1-nickrterrell@gmail.com/
 V4: https://www.spinics.net/lists/linux-btrfs/msg105783.html
 V3: https://lkml.org/lkml/2020/9/23/1074
 V2: https://www.spinics.net/lists/linux-btrfs/msg105505.html
 V1: https://lore.kernel.org/linux-btrfs/20200916034307.2092020-1-nickrterrell@gmail.com/
 
 Signed-off-by: Nick Terrell <terrelln@fb.com>
 Tested-by: Paul Jones <paul@pauljones.id.au>
 Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
 Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v13.0.0 on x86-64
 Tested-by: Jean-Denis Girard <jd.girard@sysnux.pf>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEmIwAqlFIzbQodPwyuzRpqaNEqPUFAmGJyKIACgkQuzRpqaNE
 qPXnmw/+PKyCn6LvRQqNfdpF5f59j/B1Fab15tkpVyz3UWnCw+EKaPZOoTfIsjRf
 7TMUVm4iGsm+6xBO/YrGdRl4IxocNgXzsgnJ1lTGDbvfRC1tG+YNwuv+EEXwKYq5
 Yz3DRwDotgsrV0Kg05b+VIgkmAuY3ukmu2n09LnAdKkxoIgmHw3MIDCdVZW2Br4c
 sjJmYI+fiJd7nAlbDa42VOrdTiLzkl/2BsjWBqTv6zbiQ5uuJGsKb7P3kpcybWzD
 5C118pyE3qlVyvFz+UFu8WbN0NSf47DP22KV/3IrhNX7CVQxYBe+9/oVuPWTgRx0
 4Vl0G6u7rzh4wDZuGqTC3LYWwH9GfycI0fnVC0URP2XMOcGfPlGd3L0PEmmAeTmR
 fEbaGAN4dr0jNO3lmbyAGe/G8tvtXQx/4ZjS9Pa3TlQP24GARU/f78/blbKR87Vz
 BGMndmSi92AscgXb9buO3bCwAY1YtH5WiFaZT1XVk42cj4MiOLvPTvP4UMzDDxcZ
 56ahmAP/84kd6H+cv9LmgEMqcIBmxdUcO1nuAItJ4wdrMUgw3+lrbxwFkH9xPV7I
 okC1K0TIVEobADbxbdMylxClAylbuW+37Pko97NmAlnzNCPNE38f3s3gtXRrUTaR
 IP8jv5UQ7q3dFiWnNLLodx5KM6s32GVBKRLRnn/6SJB7QzlyHXU=
 =Xb18
 -----END PGP SIGNATURE-----

Merge tag 'zstd-for-linus-v5.16' of git://github.com/terrelln/linux

Pull zstd update from Nick Terrell:
 "Update to zstd-1.4.10.

  Add myself as the maintainer of zstd and update the zstd version in
  the kernel, which is now 4 years out of date, to a much more recent
  zstd release. This includes bug fixes, much more extensive fuzzing,
  and performance improvements. It also generates the kernel zstd
  automatically from upstream zstd, so it is easier to keep the zstd
  version up to date, and we don't fall so far out of date again.

  This includes 5 commits that update the zstd library version:

   - Adds a new kernel-style wrapper around zstd.

     This wrapper API is functionally equivalent to the subset of the
     current zstd API that is currently used. The wrapper API changes to
     be kernel style so that the symbols don't collide with zstd's
     symbols. The update to zstd-1.4.10 maintains the same API and
     preserves the semantics, so that none of the callers need to be
     updated. All callers are updated in the commit, because there are
     zero functional changes.

   - Adds an indirection for `lib/decompress_unzstd.c` so it doesn't
     depend on the layout of `lib/zstd/` to include every source file.
     This allows the next patch to be automatically generated.

   - Imports the zstd-1.4.10 source code. This commit is automatically
     generated from upstream zstd (https://github.com/facebook/zstd).

   - Adds me (terrelln@fb.com) as the maintainer of `lib/zstd`.

   - Fixes a newly added build warning for clang.

  The discussion around this patchset has been pretty long, so I've
  included a FAQ-style summary of the history of the patchset, and why
  we are taking this approach.

  Why do we need to update?
  -------------------------

  The zstd version in the kernel is based off of zstd-1.3.1, which was
  released August 20, 2017. Since then zstd has seen many bug fixes
  and performance improvements. And, importantly, upstream zstd is
  continuously fuzzed by OSS-Fuzz, and bug fixes aren't backported to
  older versions. So the only way to sanely get these fixes is to keep
  up to date with upstream zstd.

  There are no known security issues that affect the kernel, but we need
  to be able to update in case there are. And while there are no known
  security issues, there are relevant bug fixes. For example, the
  problem with large kernel decompression has been fixed upstream for
  over 2 years [1].

  Additionally the performance improvements for kernel use cases are
  significant. Measured for x86_64 on my Intel i9-9900k @ 3.6 GHz:

   - BtrFS zstd compression at levels 1 and 3 is 5% faster

   - BtrFS zstd decompression+read is 15% faster

   - SquashFS zstd decompression+read is 15% faster

   - F2FS zstd compression+write at level 3 is 8% faster

   - F2FS zstd decompression+read is 20% faster

   - ZRAM decompression+read is 30% faster

   - Kernel zstd decompression is 35% faster

   - Initramfs zstd decompression+build is 5% faster

  On top of this, there are significant performance improvements coming
  down the line in the next zstd release, and the new automated update
  patch generation will allow us to pull them easily.

  How is the update patch generated?
  ----------------------------------

  The first two patches are preparation for updating the zstd version.
  Then the 3rd patch in the series imports upstream zstd into the
  kernel. This patch is automatically generated from upstream. A script
  makes the necessary changes and imports it into the kernel. The
  changes are:

   - Replace all libc dependencies with kernel replacements and rewrite
     includes.

   - Remove unnecessary portability macros like #if defined(_MSC_VER).

   - Use the kernel xxhash instead of bundling it.

  This automation gets tested every commit by upstream's continuous
  integration. When we cut a new zstd release, we will submit a patch to
  the kernel to update the zstd version in the kernel.

  The automated process makes it easy to keep the kernel version of zstd
  up to date. The current zstd in the kernel shares the guts of the
  code, but has a lot of API and minor changes to work in the kernel.
  This is because at the time upstream zstd was not ready to be used in
  the kernel environment as-is. But, since then upstream zstd has
  evolved to support being used in the kernel as-is.

  Why are we updating in one big patch?
  -------------------------------------

  The 3rd patch in the series is very large. This is because it is
  restructuring the code, so it both deletes the existing zstd, and
  re-adds the new structure. Future updates will be directly
  proportional to the changes in upstream zstd since the last import.
  They will admittedly be large, as zstd is an actively developed
  project, and has hundreds of commits between every release. However,
  there is no other great alternative.

  One option ruled out is to replay every upstream zstd commit. This is
  not feasible for several reasons:

   - There are over 3500 upstream commits since the zstd version in the
     kernel.

   - The automation to generate the kernel update was only added
     recently, so older commits cannot easily be imported.

   - Not every upstream zstd commit builds.

   - Only zstd releases are "supported", and individual commits may have
     bugs that were fixed before a release.

  Another option to reduce the patch size would be to first reorganize
  to the new file structure, and then apply the patch. However, the
  current kernel zstd is formatted with clang-format to be more
  "kernel-like". But, the new method imports zstd as-is, without
  additional formatting, to allow for closer correlation with upstream,
  and easier debugging. So the patch wouldn't be any smaller.

  It also doesn't make sense to import upstream zstd commit by commit
  going forward. Upstream zstd doesn't support production use cases
  running off the development branch. We have a lot of post-commit
  fuzzing that catches many bugs, so individual commits may be buggy,
  but fixed before a release. So going forward, I intend to import every
  (important) zstd release into the Kernel.

  So, while it isn't ideal, updating in one big patch is the only path
  forward I see.

  Who is responsible for this code?
  ---------------------------------

  I am. This patchset adds me as the maintainer for zstd. Previously,
  there was no tree for zstd patches. Because of that, there were
  several patches that either got ignored, or took a long time to merge,
  since it wasn't clear which tree should pick them up. I'm officially
  stepping up as maintainer, and setting up my tree as the path through
  which zstd patches get merged. I'll make sure that patches to the
  kernel zstd get ported upstream, so they aren't erased when the next
  version update happens.

  How is this code tested?
  ------------------------

  I tested every caller of zstd on x86_64 (BtrFS, ZRAM, SquashFS, F2FS,
  Kernel, InitRAMFS). I also tested Kernel & InitRAMFS on i386 and
  aarch64. I checked both performance and correctness.

  Also, thanks to many people in the community who have tested these
  patches locally.

  Lastly, this code will bake in linux-next before being merged into
  v5.16.

  Why update to zstd-1.4.10 when zstd-1.5.0 has been released?
  ------------------------------------------------------------

  This patchset has been outstanding since 2020, and zstd-1.4.10 was the
  latest release when it was created. Since the update patch is
  automatically generated from upstream, I could generate it from
  zstd-1.5.0.

  However, there were some large stack usage regressions in zstd-1.5.0,
  which are only fixed in the latest development branch. And the latest
  development branch contains some new code that needs to bake in the
  fuzzer before I would feel comfortable releasing to the kernel.

  Once this patchset has been merged, and we've released zstd-1.5.1, we
  can update the kernel to zstd-1.5.1, and exercise the update process.

  You may notice that zstd-1.4.10 doesn't exist upstream. This release
  is an artificial release based off of zstd-1.4.9, with some fixes for
  the kernel backported from the development branch. I will tag the
  zstd-1.4.10 release after this patchset is merged, so the Linux Kernel
  is running a known version of zstd that can be debugged upstream.

  Why was a wrapper API added?
  ----------------------------

  The first versions of this patchset migrated the kernel to the
  upstream zstd API. It first added a shim API that supported the new
  upstream API with the old code, then updated callers to use the new
  shim API, then transitioned to the new code and deleted the shim API.
  However, Christoph Hellwig suggested that we transition to a kernel
  style API, and hide zstd's upstream API behind that. This is because
  zstd's upstream API supports many other use cases, and does not
  follow the kernel style guide, while the kernel API is focused on the
  kernel's use cases, and follows the kernel style guide.

  Where is the previous discussion?
  ---------------------------------

  Links to the discussions of the previous versions of the patch set
  are below. The largest changes in the design of the patchset are
  driven by the discussions in v11, v5, and v1. Sorry for the mix of
  links; I couldn't find most of the threads on lkml.org"

Link: https://lkml.org/lkml/2020/9/29/27 [1]
Link: https://www.spinics.net/lists/linux-crypto/msg58189.html [v12]
Link: https://lore.kernel.org/linux-btrfs/20210430013157.747152-1-nickrterrell@gmail.com/ [v11]
Link: https://lore.kernel.org/lkml/20210426234621.870684-2-nickrterrell@gmail.com/ [v10]
Link: https://lore.kernel.org/linux-btrfs/20210330225112.496213-1-nickrterrell@gmail.com/ [v9]
Link: https://lore.kernel.org/linux-f2fs-devel/20210326191859.1542272-1-nickrterrell@gmail.com/ [v8]
Link: https://lkml.org/lkml/2020/12/3/1195 [v7]
Link: https://lkml.org/lkml/2020/12/2/1245 [v6]
Link: https://lore.kernel.org/linux-btrfs/20200916034307.2092020-1-nickrterrell@gmail.com/ [v5]
Link: https://www.spinics.net/lists/linux-btrfs/msg105783.html [v4]
Link: https://lkml.org/lkml/2020/9/23/1074 [v3]
Link: https://www.spinics.net/lists/linux-btrfs/msg105505.html [v2]
Link: https://lore.kernel.org/linux-btrfs/20200916034307.2092020-1-nickrterrell@gmail.com/ [v1]
Signed-off-by: Nick Terrell <terrelln@fb.com>
Tested-by: Paul Jones <paul@pauljones.id.au>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v13.0.0 on x86-64
Tested-by: Jean-Denis Girard <jd.girard@sysnux.pf>

* tag 'zstd-for-linus-v5.16' of git://github.com/terrelln/linux:
  lib: zstd: Add cast to silence clang's -Wbitwise-instead-of-logical
  MAINTAINERS: Add maintainer entry for zstd
  lib: zstd: Upgrade to latest upstream zstd version 1.4.10
  lib: zstd: Add decompress_sources.h for decompress_unzstd
  lib: zstd: Add kernel-specific API
2021-11-13 15:32:30 -08:00
Alistair Popple
ab09243aa9 mm/migrate.c: remove MIGRATE_PFN_LOCKED
MIGRATE_PFN_LOCKED is used to indicate to migrate_vma_prepare() that a
source page was already locked during migrate_vma_collect().  If it
wasn't, then a second attempt is made to lock the page.  However, if
the first attempt failed it's unlikely a second attempt will succeed,
and the retry adds complexity.  So clean this up by removing the retry
and MIGRATE_PFN_LOCKED flag.

Destination pages are also meant to have the MIGRATE_PFN_LOCKED flag
set, but nothing actually checks that.

Link: https://lkml.kernel.org/r/20211025041608.289017-1-apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-11 09:34:35 -08:00
Nicholas Piggin
5d5e4522a7 printk: restore flushing of NMI buffers on remote CPUs after NMI backtraces
printk from NMI context relies on irq work being raised on the local CPU
to print to console. This can be a problem if the NMI was raised by a
lockup detector to print lockup stack and regs, because the CPU may not
enable irqs (because it is locked up).

Introduce printk_trigger_flush(), which can be called from another CPU to
try to get those messages to the console, and call it where
printk_safe_flush() was previously called.
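
A hedged sketch of roughly what the helper does, assuming it reuses the
existing irq_work deferral inside printk (defer_console_output() as the
mechanism is my assumption, and details are simplified):

	/* kernel/printk/printk.c (sketch) */
	void printk_trigger_flush(void)
	{
		/* Raise irq_work on this (still responsive) CPU so the
		 * pending printk buffers are flushed to the console, rather
		 * than waiting on the locked-up CPU that printed from NMI. */
		defer_console_output();
	}

Callers then invoke printk_trigger_flush() from the CPU requesting the
backtrace, in the spots where printk_safe_flush() used to be called.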

Fixes: 93d102f094be ("printk: remove safe buffers")
Cc: stable@vger.kernel.org # 5.15
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Reviewed-by: John Ogness <john.ogness@linutronix.de>
Signed-off-by: Petr Mladek <pmladek@suse.com>
Link: https://lore.kernel.org/r/20211107045116.1754411-1-npiggin@gmail.com
2021-11-10 16:12:00 +01:00
Linus Torvalds
59a2ceeef6 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "87 patches.

  Subsystems affected by this patch series: mm (pagecache and hugetlb),
  procfs, misc, MAINTAINERS, lib, checkpatch, binfmt, kallsyms, ramfs,
  init, codafs, nilfs2, hfs, crash_dump, signals, seq_file, fork,
  sysvfs, kcov, gdb, resource, selftests, and ipc"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (87 commits)
  ipc/ipc_sysctl.c: remove fallback for !CONFIG_PROC_SYSCTL
  ipc: check checkpoint_restore_ns_capable() to modify C/R proc files
  selftests/kselftest/runner/run_one(): allow running non-executable files
  virtio-mem: disallow mapping virtio-mem memory via /dev/mem
  kernel/resource: disallow access to exclusive system RAM regions
  kernel/resource: clean up and optimize iomem_is_exclusive()
  scripts/gdb: handle split debug for vmlinux
  kcov: replace local_irq_save() with a local_lock_t
  kcov: avoid enable+disable interrupts if !in_task()
  kcov: allocate per-CPU memory on the relevant node
  Documentation/kcov: define `ip' in the example
  Documentation/kcov: include types.h in the example
  sysv: use BUILD_BUG_ON instead of runtime check
  kernel/fork.c: unshare(): use swap() to make code cleaner
  seq_file: fix passing wrong private data
  seq_file: move seq_escape() to a header
  signal: remove duplicate include in signal.h
  crash_dump: remove duplicate include in crash_dump.h
  crash_dump: fix boolreturn.cocci warning
  hfs/hfsplus: use WARN_ON for sanity check
  ...
2021-11-09 10:11:53 -08:00
Thomas Gleixner
723aca2085 mm/scatterlist: replace the !preemptible warning in sg_miter_stop()
sg_miter_stop() checks for disabled preemption before unmapping a page
via kunmap_atomic().  The kernel doc mentions under context that
preemption must be disabled if SG_MITER_ATOMIC is set.

There is no active requirement for the caller to have preemption
disabled before invoking sg_miter_stop().  The sg_miter_*()
implementation itself has no such requirement.

In fact, preemption is disabled by kmap_atomic() as part of
sg_miter_next() and remains disabled as long as there is an active
SG_MITER_ATOMIC mapping.  This is a consequence of kmap_atomic() and not
a requirement for sg_miter_*() itself.

The user chooses SG_MITER_ATOMIC either because the API is used in a
context where blocking is not possible, or because blocking is possible
but a lower-weight, CPU-local mapping is preferred, as it might need
less overhead to set up, at the price of disabled preemption.

The kmap_atomic() implementation on PREEMPT_RT does not disable
preemption.  It simply disables CPU migration to ensure that the task
remains on the same CPU while the caller remains preemptible.  This in
turn triggers the warning in sg_miter_stop() because preemption is
allowed.

The PREEMPT_RT and !PREEMPT_RT implementations of kmap_atomic() disable
pagefaults as a requirement.  It is sufficient to check for this instead
of disabled preemption.

Check for a disabled pagefault handler in the SG_MITER_ATOMIC case.
Remove the "preemption disabled" part from the kernel doc as the
sg_miter_*() implementation does not care.
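
A hedged sketch of the changed check, excerpted and simplified from what
sg_miter_stop() might look like after this patch (the helper name here
is a hypothetical stand-in for the relevant tail of the real function):

	#include <linux/scatterlist.h>
	#include <linux/highmem.h>
	#include <linux/uaccess.h>	/* pagefault_disabled() */

	static void sg_miter_stop_tail(struct sg_mapping_iter *miter)
	{
		if (miter->__flags & SG_MITER_ATOMIC) {
			/* kmap_atomic() disables page faults on both
			 * PREEMPT_RT and !PREEMPT_RT, so assert that rather
			 * than disabled preemption. */
			WARN_ON_ONCE(!pagefault_disabled());
			kunmap_atomic(miter->addr);
		}
	}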

[bigeasy@linutronix.de: commit description]

Link: https://lkml.kernel.org/r/20211015211409.cqopacv3pxdwn2ty@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-09 10:02:50 -08:00
Alexey Dobriyan
839b395eb9 lib: uninline simple_strntoull() as well
Codegen became bloated again after the introduction of simple_strntoull():

	add/remove: 0/0 grow/shrink: 0/4 up/down: 0/-224 (-224)
	Function                                     old     new   delta
	simple_strtoul                                 5       2      -3
	simple_strtol                                 23      20      -3
	simple_strtoull                              119      15    -104
	simple_strtoll                               155      41    -114
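
One way to get this effect, as a hedged sketch (stub body and simplified
signature, not the actual vsprintf.c code): annotate the shared worker
noinline so its body is emitted once instead of being duplicated into
each thin wrapper.

	#include <linux/compiler.h>
	#include <linux/limits.h>
	#include <linux/types.h>

	/* Shared parser, emitted exactly once thanks to noinline. */
	static noinline unsigned long long simple_strntoull(const char *cp,
			size_t max_chars, char **endp, unsigned int base)
	{
		return 0; /* real digit-parsing logic elided */
	}

	/* Thin wrapper: shrinks back to a tail call once the worker is
	 * no longer inlined into it. */
	unsigned long long simple_strtoull(const char *cp, char **endp,
			unsigned int base)
	{
		return simple_strntoull(cp, INT_MAX, endp, base);
	}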

Link: https://lkml.kernel.org/r/YVmlB9yY4lvbNKYt@localhost.localdomain
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Richard Fitzgerald <rf@opensource.cirrus.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-09 10:02:50 -08:00
Imran Khan
0f68d45ef4 lib, stackdepot: add helper to print stack entries into buffer
To print stack entries into a buffer, users of stackdepot first get a
list of stack entries using stack_depot_fetch() and then print this list
into a buffer using stack_trace_snprint().  Provide a helper in stackdepot
for this purpose.  Also change the above-mentioned users to use this helper.
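
A hedged sketch of such a helper, composed from the two existing calls
described above (signatures abbreviated from my reading of the stackdepot
and stacktrace headers):

	#include <linux/stackdepot.h>
	#include <linux/stacktrace.h>

	int stack_depot_snprint(depot_stack_handle_t handle, char *buf,
				size_t size, int spaces)
	{
		unsigned long *entries;
		unsigned int nr_entries;

		/* Fetch the stored entries for this handle, then print them
		 * into the caller's buffer in a single call. */
		nr_entries = stack_depot_fetch(handle, &entries);
		return nr_entries ? stack_trace_snprint(buf, size, entries,
							nr_entries, spaces)
				  : 0;
	}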

[imran.f.khan@oracle.com: fix build error]
  Link: https://lkml.kernel.org/r/20210915175321.3472770-4-imran.f.khan@oracle.com
[imran.f.khan@oracle.com: export stack_depot_snprint() to modules]
  Link: https://lkml.kernel.org/r/20210916133535.3592491-4-imran.f.khan@oracle.com

Link: https://lkml.kernel.org/r/20210915014806.3206938-4-imran.f.khan@oracle.com
Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Jani Nikula <jani.nikula@intel.com>	[i915]
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-09 10:02:50 -08:00
Imran Khan
505be48165 lib, stackdepot: add helper to print stack entries
To print stack entries, users of stackdepot first use stack_depot_fetch
to get a list of stack entries and then use stack_trace_print to print
this list.  Provide a helper in stackdepot to print stack entries based
on a stackdepot handle.  Also change the above-mentioned users to use
this helper.

Link: https://lkml.kernel.org/r/20210915014806.3206938-3-imran.f.khan@oracle.com
Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: David Airlie <airlied@linux.ie>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-09 10:02:50 -08:00
Imran Khan
4d4712c1a4 lib, stackdepot: check stackdepot handle before accessing slabs
Patch series "lib, stackdepot: check stackdepot handle before accessing slabs", v2.

PATCH-1: Checks validity of a stackdepot handle before proceeding to
access stackdepot slab/objects.

PATCH-2: Adds a helper in stackdepot, to allow users to print stack
entries just by specifying the stackdepot handle.  It also changes such
users to use this new interface.

PATCH-3: Adds a helper in stackdepot, to allow users to print stack
entries into buffers just by specifying the stackdepot handle and
destination buffer.  It also changes such users to use this new interface.

This patch (of 3):

stack_depot_save allocates slabs that will be used for storing objects in
the future.  If this slab allocation fails, we may get to a situation where
space allocation for a new stack_record fails, causing stack_depot_save to
return 0 as the handle.  If the user of this handle ends up invoking
stack_depot_fetch with this handle value, the current implementation of
stack_depot_fetch will end up using a slab from the wrong index.  To avoid
this, check the handle value at the beginning.
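
A minimal sketch of the guard, assuming the rest of the fetch path stays
as-is (the slab lookup itself is elided):

	#include <linux/stackdepot.h>

	unsigned int stack_depot_fetch(depot_stack_handle_t handle,
				       unsigned long **entries)
	{
		*entries = NULL;
		/* stack_depot_save() returns 0 when slab allocation failed,
		 * so a zero handle must never be decoded into a slab index. */
		if (!handle)
			return 0;

		/* ... existing lookup: decode the handle into a slab index
		 * and offset, then return the stored entries (elided) ... */
		return 0;
	}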

Link: https://lkml.kernel.org/r/20210915175321.3472770-1-imran.f.khan@oracle.com
Link: https://lkml.kernel.org/r/20210915014806.3206938-1-imran.f.khan@oracle.com
Link: https://lkml.kernel.org/r/20210915014806.3206938-2-imran.f.khan@oracle.com
Signed-off-by: Imran Khan <imran.f.khan@oracle.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-11-09 10:02:50 -08:00
Nathan Chancellor
0a8ea23583 lib: zstd: Add cast to silence clang's -Wbitwise-instead-of-logical
A new clang warning flags an instance where boolean expressions are
being used with bitwise operators instead of logical ones:

lib/zstd/decompress/huf_decompress.c:890:25: warning: use of bitwise '&' with boolean operands [-Wbitwise-instead-of-logical]
                       (BIT_reloadDStreamFast(&bitD1) == BIT_DStream_unfinished)
                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

zstd does this frequently to help with performance, as logical operators
have branches whereas bitwise ones do not.

To fix this warning in other cases, the expressions were placed on
separate lines with the '&=' operator; however, this particular instance
was moved away from that so that it could be surrounded by LIKELY, which
is a macro for __builtin_expect(), to help with a performance
regression, according to upstream zstd pull #1973.

Aside from switching to logical operators, which is likely undesirable
in this instance, or disabling the warning outright, the solution is
casting one of the expressions to an integer type to make it clear to
clang that the author knows what they are doing. Add a cast to U32 to
silence the warning. The first U32 cast is to silence an instance of
-Wshorten-64-to-32 because __builtin_expect() returns long so it cannot
be moved.
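
A standalone, hedged illustration of the warning and the workaround; the
condition below is a stand-in, not the actual zstd expression:

	#define LIKELY(x) (__builtin_expect(!!(x), 1))
	typedef unsigned int U32;

	static int stream_unfinished(int s) { return s == 0; }

	static int keep_decoding(int s1, int s2)
	{
		/* 'stream_unfinished(s1) & stream_unfinished(s2)' would trip
		 * -Wbitwise-instead-of-logical; zstd wants the branch-free
		 * '&'. Casting to U32 tells clang the bitwise use is
		 * intentional, and the outer cast narrows the long returned
		 * by __builtin_expect(). */
		return (U32)LIKELY((U32)stream_unfinished(s1)
				   & (U32)stream_unfinished(s2));
	}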

Link: https://github.com/ClangBuiltLinux/linux/issues/1486
Link: https://github.com/facebook/zstd/pull/1973
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nick Terrell <terrelln@fb.com>
2021-11-08 16:55:38 -08:00
Nick Terrell
e0c1b49f5b lib: zstd: Upgrade to latest upstream zstd version 1.4.10
Upgrade to the latest upstream zstd version 1.4.10.

This patch is 100% generated from upstream zstd commit 20821a46f412 [0].

This patch is very large because it is transitioning from the custom
kernel zstd to using upstream directly. The new zstd follows upstreams
file structure which is different. Future update patches will be much
smaller because they will only contain the changes from one upstream
zstd release.

As an aid for review I've created a commit [1] that shows the diff
between upstream zstd as-is (which doesn't compile), and the zstd
code imported in this patch. The version of zstd in this patch is
generated from upstream with changes applied by automation to replace
upstreams libc dependencies, remove unnecessary portability macros,
replace `/**` comments with `/*` comments, and use the kernel's xxhash
instead of bundling it.

The benefits of this patch are as follows:
1. Using upstream directly with an automated script to generate kernel
   code. This allows us to update the kernel every upstream release, so
   the kernel gets the latest bug fixes and performance improvements,
   and doesn't get 3 years out of date again. The automation and the
   translated code are tested on every upstream commit to ensure they
   continue to work.
2. Upgrades from a custom zstd based on 1.3.1 to 1.4.10, getting 3 years
   of performance improvements and bug fixes. On x86_64 I've measured
   15% faster BtrFS and SquashFS decompression+read speeds, 35% faster
   kernel decompression, and 30% faster ZRAM decompression+read speeds.
3. Zstd-1.4.10 supports negative compression levels, which allow zstd to
   match or subsume lzo's performance.
4. Maintains the same kernel-specific wrapper API, so no callers have to
   be modified with zstd version updates.

One concern that was brought up was stack usage. Upstream zstd had
already removed most of its heavy stack usage functions, but I just
removed the last functions that allocate arrays on the stack. I've
measured the high water mark for both compression and decompression
before and after this patch. Decompression is approximately neutral,
using about 1.2KB of stack space. Compression levels up to 3 regressed
from 1.4KB -> 1.6KB, and higher compression levels regressed from 1.5KB
-> 2KB. We've added unit tests upstream to prevent further regression.
I believe that this is a reasonable increase, and if it does end up
causing problems, this commit can be cleanly reverted, because it only
touches zstd.

I chose the bulk update instead of replaying upstream commits because
there have been ~3500 upstream commits since the 1.3.1 release, zstd
wasn't ready to be used in the kernel as-is before a month ago, and not
all upstream zstd commits build. The bulk update preserves bisectability
because bugs can be bisected to the zstd version update. At that point
the update can be reverted, and we can work with upstream to find and
fix the bug.

Note that upstream zstd release 1.4.10 doesn't exist yet. I have cut a
staging branch at 20821a46f412 [0] and will apply any changes requested
to the staging branch. Once we're ready to merge this update I will cut
a zstd release at the commit we merge, so we have a known zstd release
in the kernel.

The implementation of the kernel API is contained in
zstd_compress_module.c and zstd_decompress_module.c.

[0] 20821a46f4
[1] e0fa481d0e

Signed-off-by: Nick Terrell <terrelln@fb.com>
Tested-by: Paul Jones <paul@pauljones.id.au>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v13.0.0 on x86-64
Tested-by: Jean-Denis Girard <jd.girard@sysnux.pf>
2021-11-08 16:55:32 -08:00
Nick Terrell
2479b52389 lib: zstd: Add decompress_sources.h for decompress_unzstd
Adds decompress_sources.h which includes every .c file necessary for
zstd decompression. This is used in decompress_unzstd.c so the internal
structure of the library isn't exposed.

This allows us to upgrade the zstd library version without modifying any
callers. Instead we just need to update decompress_sources.h.
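
A hedged sketch of the pattern; the file list below is illustrative,
while the real header names every .c file the decompressor needs:

	/* lib/zstd/decompress_sources.h (sketch) */
	#include "common/debug.c"
	#include "common/entropy_common.c"
	#include "common/error_private.c"
	#include "common/fse_decompress.c"
	#include "common/zstd_common.c"
	#include "decompress/huf_decompress.c"
	#include "decompress/zstd_ddict.c"
	#include "decompress/zstd_decompress.c"
	#include "decompress/zstd_decompress_block.c"

decompress_unzstd.c then includes only this one header, so future layout
changes inside lib/zstd/ stay invisible to it.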

Signed-off-by: Nick Terrell <terrelln@fb.com>
Tested-by: Paul Jones <paul@pauljones.id.au>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v13.0.0 on x86-64
Tested-by: Jean-Denis Girard <jd.girard@sysnux.pf>
2021-11-08 16:55:26 -08:00
Nick Terrell
cf30f6a5f0 lib: zstd: Add kernel-specific API
This patch:
- Moves `include/linux/zstd.h` -> `include/linux/zstd_lib.h`
- Updates modified zstd headers to yearless copyright
- Adds a new API in `include/linux/zstd.h` that is functionally
  equivalent to the in-use subset of the current API. Functions are
  renamed to avoid symbol collisions with zstd, to make it clear it is
  not the upstream zstd API, and to follow the kernel style guide.
- Updates all callers to use the new API.

There are no functional changes in this patch. Since there are no
functional changes, I felt it was okay to update all the callers in a
single patch. Once the API is approved, the callers are mechanically
changed.

This patch is preparing for the 3rd patch in this series, which updates
zstd to version 1.4.10. Since the upstream zstd API is no longer exposed
to callers, the update can happen transparently.
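
As a hedged sketch of the renaming pattern, two representative entry
points (the real header wraps the full in-use subset, and the exact
kernel signatures may differ):

	/* include/linux/zstd.h (sketch): kernel-style zstd_* names that
	 * wrap the upstream ZSTD_* symbols, so callers never touch the
	 * upstream API directly. */
	#include <linux/zstd_lib.h>

	static inline size_t zstd_compress_bound(size_t src_size)
	{
		return ZSTD_compressBound(src_size);
	}

	static inline unsigned int zstd_is_error(size_t code)
	{
		return ZSTD_isError(code);
	}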

Signed-off-by: Nick Terrell <terrelln@fb.com>
Tested-by: Paul Jones <paul@pauljones.id.au>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # LLVM/Clang v13.0.0 on x86-64
Tested-by: Jean-Denis Girard <jd.girard@sysnux.pf>
2021-11-08 16:55:21 -08:00