1060376 Commits

Author SHA1 Message Date
Eric Dumazet
c0e5e11af1 vrf: use dev_replace_track() for better tracking
vrf_rt6_release() and vrf_rtable_release() changes dst->dev

Instead of

dev_hold(ndev);
dev_put(odev);

We should use

dev_replace_track(odev, ndev, &dst->dev_tracker, GFP_KERNEL);

If we do not transfer dst->dev_tracker to the new device,
we will get warnings from ref_tracker_dir_exit() when odev
is finally dismantled.

Fixes: 9038c320001d ("net: dst: add net device refcount tracking to dst_entry")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20211207055603.1926372-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07 10:41:54 -08:00
Dan Carpenter
d17b9737c2 net/qla3xxx: fix an error code in ql_adapter_up()
The ql_wait_for_drvr_lock() fails and returns false, then this
function should return an error code instead of returning success.

The other problem is that the success path prints an error message
netdev_err(ndev, "Releasing driver lock\n");  Delete that and
re-order the code a little to make it more clear.

Fixes: 5a4faa873782 ("[PATCH] qla3xxx NIC driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20211207082416.GA16110@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07 10:37:10 -08:00
Jakub Kicinski
2a62df3692 linux-can-fixes-for-5.16-20211207
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEK3kIWJt9yTYMP3ehqclaivrt76kFAmGvM24THG1rbEBwZW5n
 dXRyb25peC5kZQAKCRCpyVqK+u3vqZL8CACmYXCIKdLxumCmfa7z61r6Y839grFl
 weBo1dgrg/FIIyxro0EEmGi3ZRGrVcQNyQOQKl9xtt8FFmkrl5yCioQ5N+ib/Qt7
 6BhMN0kdaWQDzgyIT5BT8Ba/13S4Hpjb7baDBf+Rqw7WemeX2hni8Dx4WxflfMbo
 lpxuRyDtUvndUHVzATkbB8TLsmB50wdTinzZkY3IV8bAhLcznQ2vYvV4HblbeNlA
 BtEEtAjsR7zFfyqDmxOIOdqMD4m4vjUnaOoT6KQznNIy1EYxFgX7VFNAp3DeGaX3
 bM4CTEbVm980hPgm8tjFL0p6BExWWR9q6lR/x41O/P2cQ2PKNDSuBnuo
 =4e0K
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-fixes-for-5.16-20211207' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
can 2021-12-07

The 1st patch is by Vincent Mailhol and fixes a use after free in the
pch_can driver.

Dan Carpenter fixes a use after free in the ems_pcmcia sja1000 driver.

The remaining 7 patches target the m_can driver. Brian Silverman
contributes a patch to disable and ignore the ELO interrupt, which is
currently not handled in the driver and may lead to an interrupt
storm. Vincent Mailhol's patch fixes a memory leak in the error path
of the m_can_read_fifo() function. The remaining patches are
contributed by Matthias Schiffer, first a iomap_read_fifo() and
iomap_write_fifo() functions are fixed in the PCI glue driver, then
the clock rate for the Intel Ekhart Lake platform is fixed, the last 3
patches add support for the custom bit timings on the Elkhart Lake
platform.

* tag 'linux-can-fixes-for-5.16-20211207' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: m_can: pci: use custom bit timings for Elkhart Lake
  can: m_can: make custom bittiming fields const
  Revert "can: m_can: remove support for custom bit timing"
  can: m_can: pci: fix incorrect reference clock rate
  can: m_can: pci: fix iomap_read_fifo() and iomap_write_fifo()
  can: m_can: m_can_read_fifo: fix memory leak in error branch
  can: m_can: Disable and ignore ELO interrupt
  can: sja1000: fix use after free in ems_pcmcia_add_card()
  can: pch_can: pch_can_rx_normal: fix use after free
====================

Link: https://lore.kernel.org/r/20211207102420.120131-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07 10:32:05 -08:00
Linus Torvalds
957232439c platform-drivers-x86 for v5.16-3
Various bug-fixes and hardware-id additions.
 
 The following is an automated git shortlog grouped by driver:
 
 amd-pmc:
  -  Fix s2idle failures on certain AMD laptops
 
 lg-laptop:
  -  Recognize more models
 
 platform/x86/intel:
  -  hid: add quirk to support Surface Go 3
 
 thinkpad_acpi:
  -  Add lid_logo_dot to the list of safe LEDs
  -  Restore missing hotkey_tablet_mode and hotkey_radio_sw sysfs-attr
 
 touchscreen_dmi:
  -  Add TrekStor SurfTab duo W1 touchscreen info
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmGvjMIUHGhkZWdvZWRl
 QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9wGywf/XJebHMeSVLe8udb4RMatwHpEsyVK
 CvrV4G0FXI5O3GNdaDOU8nlt71zeEaDUkOj3w/cbD9Q/NhrDEgmg3XfZpFZFwNMX
 JELy4VkcVypFWbiAfzh8bCdBJF22gRDc4e8nTsvj31MHC+qOuzHliH15KEMHlz2S
 IbOkc5qHaL4m5oEPnUFj7O/DsSxlSEhO8PVpfN7fOYf0aBm5cuC9yZD/0bw93wHk
 DtnVKzIVXrbItxnw/NJGD0eoBtHemY8u37JDR0TMyR9ekWqQCmdWUf1DibuVCDRl
 XjitIfC5oYMrJwbJKd0kKn8oVvXV07zTFRLPXw9kAXHAqMYXKU6wnt1j4g==
 =T9dH
 -----END PGP SIGNATURE-----

Merge tag 'platform-drivers-x86-v5.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fixes from Hans de Goede:
 "Various bug-fixes and hardware-id additions"

* tag 'platform-drivers-x86-v5.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86/intel: hid: add quirk to support Surface Go 3
  platform/x86: amd-pmc: Fix s2idle failures on certain AMD laptops
  platform/x86: touchscreen_dmi: Add TrekStor SurfTab duo W1 touchscreen info
  platform/x86: lg-laptop: Recognize more models
  platform/x86: thinkpad_acpi: Add lid_logo_dot to the list of safe LEDs
  platform/x86: thinkpad_acpi: Restore missing hotkey_tablet_mode and hotkey_radio_sw sysfs-attr
2021-12-07 10:10:20 -08:00
Jeffle Xu
3cfef1b612 netfs: fix parameter of cleanup()
The order of these two parameters is just reversed. gcc didn't warn on
that, probably because 'void *' can be converted from or to other
pointer types without warning.

Cc: stable@vger.kernel.org
Fixes: 3d3c95046742 ("netfs: Provide readahead and readpage netfs helpers")
Fixes: e1b1240c1ff5 ("netfs: Add write_begin helper")
Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Link: https://lore.kernel.org/r/20211207031449.100510-1-jefflexu@linux.alibaba.com/ # v1
2021-12-07 15:47:09 +00:00
David Howells
598ad0bd09 netfs: Fix lockdep warning from taking sb_writers whilst holding mmap_lock
Taking sb_writers whilst holding mmap_lock isn't allowed and will result in
a lockdep warning like that below.  The problem comes from cachefiles
needing to take the sb_writers lock in order to do a write to the cache,
but being asked to do this by netfslib called from readpage, readahead or
write_begin[1].

Fix this by always offloading the write to the cache off to a worker
thread.  The main thread doesn't need to wait for it, so deadlock can be
avoided.

This can be tested by running the quick xfstests on something like afs or
ceph with lockdep enabled.

WARNING: possible circular locking dependency detected
5.15.0-rc1-build2+ #292 Not tainted
------------------------------------------------------
holetest/65517 is trying to acquire lock:
ffff88810c81d730 (mapping.invalidate_lock#3){.+.+}-{3:3}, at: filemap_fault+0x276/0x7a5

but task is already holding lock:
ffff8881595b53e8 (&mm->mmap_lock#2){++++}-{3:3}, at: do_user_addr_fault+0x28d/0x59c

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (&mm->mmap_lock#2){++++}-{3:3}:
       validate_chain+0x3c4/0x4a8
       __lock_acquire+0x89d/0x949
       lock_acquire+0x2dc/0x34b
       __might_fault+0x87/0xb1
       strncpy_from_user+0x25/0x18c
       removexattr+0x7c/0xe5
       __do_sys_fremovexattr+0x73/0x96
       do_syscall_64+0x67/0x7a
       entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #1 (sb_writers#10){.+.+}-{0:0}:
       validate_chain+0x3c4/0x4a8
       __lock_acquire+0x89d/0x949
       lock_acquire+0x2dc/0x34b
       cachefiles_write+0x2b3/0x4bb
       netfs_rreq_do_write_to_cache+0x3b5/0x432
       netfs_readpage+0x2de/0x39d
       filemap_read_page+0x51/0x94
       filemap_get_pages+0x26f/0x413
       filemap_read+0x182/0x427
       new_sync_read+0xf0/0x161
       vfs_read+0x118/0x16e
       ksys_read+0xb8/0x12e
       do_syscall_64+0x67/0x7a
       entry_SYSCALL_64_after_hwframe+0x44/0xae

-> #0 (mapping.invalidate_lock#3){.+.+}-{3:3}:
       check_noncircular+0xe4/0x129
       check_prev_add+0x16b/0x3a4
       validate_chain+0x3c4/0x4a8
       __lock_acquire+0x89d/0x949
       lock_acquire+0x2dc/0x34b
       down_read+0x40/0x4a
       filemap_fault+0x276/0x7a5
       __do_fault+0x96/0xbf
       do_fault+0x262/0x35a
       __handle_mm_fault+0x171/0x1b5
       handle_mm_fault+0x12a/0x233
       do_user_addr_fault+0x3d2/0x59c
       exc_page_fault+0x85/0xa5
       asm_exc_page_fault+0x1e/0x30

other info that might help us debug this:

Chain exists of:
  mapping.invalidate_lock#3 --> sb_writers#10 --> &mm->mmap_lock#2

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&mm->mmap_lock#2);
                               lock(sb_writers#10);
                               lock(&mm->mmap_lock#2);
  lock(mapping.invalidate_lock#3);

 *** DEADLOCK ***

1 lock held by holetest/65517:
 #0: ffff8881595b53e8 (&mm->mmap_lock#2){++++}-{3:3}, at: do_user_addr_fault+0x28d/0x59c

stack backtrace:
CPU: 0 PID: 65517 Comm: holetest Not tainted 5.15.0-rc1-build2+ #292
Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014
Call Trace:
 dump_stack_lvl+0x45/0x59
 check_noncircular+0xe4/0x129
 ? print_circular_bug+0x207/0x207
 ? validate_chain+0x461/0x4a8
 ? add_chain_block+0x88/0xd9
 ? hlist_add_head_rcu+0x49/0x53
 check_prev_add+0x16b/0x3a4
 validate_chain+0x3c4/0x4a8
 ? check_prev_add+0x3a4/0x3a4
 ? mark_lock+0xa5/0x1c6
 __lock_acquire+0x89d/0x949
 lock_acquire+0x2dc/0x34b
 ? filemap_fault+0x276/0x7a5
 ? rcu_read_unlock+0x59/0x59
 ? add_to_page_cache_lru+0x13c/0x13c
 ? lock_is_held_type+0x7b/0xd3
 down_read+0x40/0x4a
 ? filemap_fault+0x276/0x7a5
 filemap_fault+0x276/0x7a5
 ? pagecache_get_page+0x2dd/0x2dd
 ? __lock_acquire+0x8bc/0x949
 ? pte_offset_kernel.isra.0+0x6d/0xc3
 __do_fault+0x96/0xbf
 ? do_fault+0x124/0x35a
 do_fault+0x262/0x35a
 ? handle_pte_fault+0x1c1/0x20d
 __handle_mm_fault+0x171/0x1b5
 ? handle_pte_fault+0x20d/0x20d
 ? __lock_release+0x151/0x254
 ? mark_held_locks+0x1f/0x78
 ? rcu_read_unlock+0x3a/0x59
 handle_mm_fault+0x12a/0x233
 do_user_addr_fault+0x3d2/0x59c
 ? pgtable_bad+0x70/0x70
 ? rcu_read_lock_bh_held+0xab/0xab
 exc_page_fault+0x85/0xa5
 ? asm_exc_page_fault+0x8/0x30
 asm_exc_page_fault+0x1e/0x30
RIP: 0033:0x40192f
Code: ff 48 89 c3 48 8b 05 50 28 00 00 48 85 ed 7e 23 31 d2 4b 8d 0c 2f eb 0a 0f 1f 00 48 8b 05 39 28 00 00 48 0f af c2 48 83 c2 01 <48> 89 1c 01 48 39 d5 7f e8 8b 0d f2 27 00 00 31 c0 85 c9 74 0e 8b
RSP: 002b:00007f9931867eb0 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 00007f9931868700 RCX: 00007f993206ac00
RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00007ffc13e06ee0
RBP: 0000000000000100 R08: 0000000000000000 R09: 00007f9931868700
R10: 00007f99318689d0 R11: 0000000000000202 R12: 00007ffc13e06ee0
R13: 0000000000000c00 R14: 00007ffc13e06e00 R15: 00007f993206a000

Fixes: 726218fdc22c ("netfs: Define an interface to talk to a cache")
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Jeff Layton <jlayton@kernel.org>
cc: Jan Kara <jack@suse.cz>
cc: linux-cachefs@redhat.com
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20210922110420.GA21576@quack2.suse.cz/ [1]
Link: https://lore.kernel.org/r/163887597541.1596626.2668163316598972956.stgit@warthog.procyon.org.uk/ # v1
2021-12-07 13:39:55 +00:00
Matthias Schiffer
ea4c178768 can: m_can: pci: use custom bit timings for Elkhart Lake
The relevant datasheet [1] specifies nonstandard limits for the bit timing
parameters. While it is unclear what the exact effect of violating these
limits is, it seems like a good idea to adhere to the documentation.

[1] Intel Atom® x6000E Series, and Intel® Pentium® and Celeron® N and J
    Series Processors for IoT Applications Datasheet,
    Volume 2 (Book 3 of 3), July 2021, Revision 001

Fixes: cab7ffc0324f ("can: m_can: add PCI glue driver for Intel Elkhart Lake")
Link: https://lore.kernel.org/all/9eba5d7c05a48ead4024ffa6e5926f191d8c6b38.1636967198.git.matthias.schiffer@ew.tq-group.com
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-07 09:51:41 +01:00
Matthias Schiffer
ea22ba40de can: m_can: make custom bittiming fields const
The assigned timing structs will be defined a const anyway, so we can
avoid a few casts by declaring the struct fields as const as well.

Link: https://lore.kernel.org/all/4508fa4e639164b2584c49a065d90c78a91fa568.1636967198.git.matthias.schiffer@ew.tq-group.com
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-07 09:51:41 +01:00
Matthias Schiffer
ea768b2ffe Revert "can: m_can: remove support for custom bit timing"
The timing limits specified by the Elkhart Lake CPU datasheets do not
match the defaults. Let's reintroduce the support for custom bit timings.

This reverts commit 0ddd83fbebbc5537f9d180d31f659db3564be708.

Link: https://lore.kernel.org/all/00c9e2596b1a548906921a574d4ef7a03c0dace0.1636967198.git.matthias.schiffer@ew.tq-group.com
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-07 09:51:41 +01:00
Matthias Schiffer
8c03b8bff7 can: m_can: pci: fix incorrect reference clock rate
When testing the CAN controller on our Ekhart Lake hardware, we
determined that all communication was running with twice the configured
bitrate. Changing the reference clock rate from 100MHz to 200MHz fixed
this. Intel's support has confirmed to us that 200MHz is indeed the
correct clock rate.

Fixes: cab7ffc0324f ("can: m_can: add PCI glue driver for Intel Elkhart Lake")
Link: https://lore.kernel.org/all/c9cf3995f45c363e432b3ae8eb1275e54f009fc8.1636967198.git.matthias.schiffer@ew.tq-group.com
Cc: stable@vger.kernel.org
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Reviewed-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-07 09:51:41 +01:00
Matthias Schiffer
d737de2d7c can: m_can: pci: fix iomap_read_fifo() and iomap_write_fifo()
The same fix that was previously done in m_can_platform in commit
99d173fbe894 ("can: m_can: fix iomap_read_fifo() and iomap_write_fifo()")
is required in m_can_pci as well to make iomap_read_fifo() and
iomap_write_fifo() work for val_count > 1.

Fixes: 812270e5445b ("can: m_can: Batch FIFO writes during CAN transmit")
Fixes: 1aa6772f64b4 ("can: m_can: Batch FIFO reads during CAN receive")
Link: https://lore.kernel.org/all/20211118144011.10921-1-matthias.schiffer@ew.tq-group.com
Cc: stable@vger.kernel.org
Cc: Matt Kline <matt@bitbashing.io>
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Tested-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-07 09:51:36 +01:00
Vincent Mailhol
31cb32a590 can: m_can: m_can_read_fifo: fix memory leak in error branch
In m_can_read_fifo(), if the second call to m_can_fifo_read() fails,
the function jump to the out_fail label and returns without calling
m_can_receive_skb(). This means that the skb previously allocated by
alloc_can_skb() is not freed. In other terms, this is a memory leak.

This patch adds a goto label to destroy the skb if an error occurs.

Issue was found with GCC -fanalyzer, please follow the link below for
details.

Fixes: e39381770ec9 ("can: m_can: Disable IRQs on FIFO bus errors")
Link: https://lore.kernel.org/all/20211107050755.70655-1-mailhol.vincent@wanadoo.fr
Cc: stable@vger.kernel.org
Cc: Matt Kline <matt@bitbashing.io>
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-07 09:09:06 +01:00
Brian Silverman
f58ac1adc7 can: m_can: Disable and ignore ELO interrupt
With the design of this driver, this condition is often triggered.
However, the counter that this interrupt indicates an overflow is never
read either, so overflowing is harmless.

On my system, when a CAN bus starts flapping up and down, this locks up
the whole system with lots of interrupts and printks.

Specifically, this interrupt indicates the CEL field of ECR has
overflowed. All reads of ECR mask out CEL.

Fixes: e0d1f4816f2a ("can: m_can: add Bosch M_CAN controller support")
Link: https://lore.kernel.org/all/20211129222628.7490-1-brian.silverman@bluerivertech.com
Cc: stable@vger.kernel.org
Signed-off-by: Brian Silverman <brian.silverman@bluerivertech.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-07 09:09:02 +01:00
Dan Carpenter
3ec6ca6b1a can: sja1000: fix use after free in ems_pcmcia_add_card()
If the last channel is not available then "dev" is freed.  Fortunately,
we can just use "pdev->irq" instead.

Also we should check if at least one channel was set up.

Fixes: fd734c6f25ae ("can/sja1000: add driver for EMS PCMCIA card")
Link: https://lore.kernel.org/all/20211124145041.GB13656@kili
Cc: stable@vger.kernel.org
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-07 09:08:58 +01:00
Vincent Mailhol
94cddf1e92 can: pch_can: pch_can_rx_normal: fix use after free
After calling netif_receive_skb(skb), dereferencing skb is unsafe.
Especially, the can_frame cf which aliases skb memory is dereferenced
just after the call netif_receive_skb(skb).

Reordering the lines solves the issue.

Fixes: b21d18b51b31 ("can: Topcliff: Add PCH_CAN driver.")
Link: https://lore.kernel.org/all/20211123111654.621610-1-mailhol.vincent@wanadoo.fr
Cc: stable@vger.kernel.org
Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-07 09:08:36 +01:00
Andrii Nakryiko
fda684fb5e Merge branch 'samples: bpf: fix build issues with Clang/LLVM'
Alexander Lobakin says:

====================

Samples, at least XDP ones, can be built only with the compiler used
to build the kernel itself.
However, XDP sample infra introduced in Aug'21 was probably tested
with GCC/Binutils only as it's not really compilable for now with
Clang/LLVM.
These two are trivial fixes addressing this.
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2021-12-06 19:33:45 -08:00
Alexander Lobakin
6f670d06e4 samples: bpf: Fix 'unknown warning group' build warning on Clang
Clang doesn't have 'stringop-truncation' group like GCC does, and
complains about it when building samples which use xdp_sample_user
infra:

 samples/bpf/xdp_sample_user.h:48:32: warning: unknown warning group '-Wstringop-truncation', ignored [-Wunknown-warning-option]
 #pragma GCC diagnostic ignored "-Wstringop-truncation"
                                ^
[ repeat ]

Those are harmless, but avoidable when guarding it with ifdef.
I could guard push/pop as well, but this would require one more
ifdef cruft around a single line which I don't think is reasonable.

Fixes: 156f886cf697 ("samples: bpf: Add basic infrastructure for XDP samples")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/bpf/20211203195004.5803-3-alexandr.lobakin@intel.com
2021-12-06 19:33:44 -08:00
Alexander Lobakin
e64fbcaa7a samples: bpf: Fix xdp_sample_user.o linking with Clang
Clang (13) doesn't get the jokes about specifying libraries to link in
cclags of individual .o objects:

clang-13: warning: -lm: 'linker' input unused [-Wunused-command-line-argument]
[ ... ]
  LD  samples/bpf/xdp_redirect_cpu
  LD  samples/bpf/xdp_redirect_map_multi
  LD  samples/bpf/xdp_redirect_map
  LD  samples/bpf/xdp_redirect
  LD  samples/bpf/xdp_monitor
/usr/bin/ld: samples/bpf/xdp_sample_user.o: in function `sample_summary_print':
xdp_sample_user.c:(.text+0x84c): undefined reference to `floor'
/usr/bin/ld: xdp_sample_user.c:(.text+0x870): undefined reference to `ceil'
/usr/bin/ld: xdp_sample_user.c:(.text+0x8cf): undefined reference to `floor'
/usr/bin/ld: xdp_sample_user.c:(.text+0x8f3): undefined reference to `ceil'
[ more ]

Specify '-lm' as ldflags for all xdp_sample_user.o users in the main
Makefile and remove it from ccflags of ^ in Makefile.target -- just
like it's done for all other samples. This works with all compilers.

Fixes: 6e1051a54e31 ("samples: bpf: Convert xdp_monitor to XDP samples helper")
Fixes: b926c55d856c ("samples: bpf: Convert xdp_redirect to XDP samples helper")
Fixes: e531a220cc59 ("samples: bpf: Convert xdp_redirect_cpu to XDP samples helper")
Fixes: bbe65865aa05 ("samples: bpf: Convert xdp_redirect_map to XDP samples helper")
Fixes: 594a116b2aa1 ("samples: bpf: Convert xdp_redirect_map_multi to XDP samples helper")
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/bpf/20211203195004.5803-2-alexandr.lobakin@intel.com
2021-12-06 19:33:44 -08:00
Alexei Starovoitov
29f2e5bd94 bpf: Silence purge_cand_cache build warning.
When CONFIG_DEBUG_INFO_BTF_MODULES is not set
the following warning can be seen:
kernel/bpf/btf.c:6588:13: warning: 'purge_cand_cache' defined but not used [-Wunused-function]
Fix it.

Fixes: 1e89106da253 ("bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn().")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211207014839.6976-1-alexei.starovoitov@gmail.com
2021-12-06 18:24:34 -08:00
Tony Lu
1c5526968e net/smc: Clear memory when release and reuse buffer
Currently, buffers are cleared when smc connections are created and
buffers are reused. This slows down the speed of establishing new
connections. In most cases, the applications want to establish
connections as quickly as possible.

This patch moves memset() from connection creation path to release and
buffer unuse path, this trades off between speed of establishing and
release.

Test environments:
- CPU Intel Xeon Platinum 8 core, mem 32 GiB, nic Mellanox CX4
- socket sndbuf / rcvbuf: 16384 / 131072 bytes
- w/o first round, 5 rounds, avg, 100 conns batch per round
- smc_buf_create() use bpftrace kprobe, introduces extra latency

Latency benchmarks for smc_buf_create():
  w/o patch : 19040.0 ns
  w/  patch :  1932.6 ns
  ratio :        10.2% (-89.8%)

Latency benchmarks for socket create and connect:
  w/o patch :   143.3 us
  w/  patch :   102.2 us
  ratio :        71.3% (-28.7%)

The latency of establishing connections is reduced by 28.7%.

Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Link: https://lore.kernel.org/r/20211203113331.2818873-1-kgraul@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 17:01:28 -08:00
Song Liu
5a897531e0 perf bpf_skel: Do not use typedef to avoid error on old clang
When building bpf_skel with clang-10, typedef causes confusions like:

  libbpf: map 'prev_readings': unexpected def kind var.

Fix this by removing the typedef.

Fixes: 7fac83aaf2eecc9e ("perf stat: Introduce 'bperf' to share hardware PMCs with BPF")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Song Liu <songliubraving@fb.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/BEF5C312-4331-4A60-AEC0-AD7617CB2BC4@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Song Liu
f7c4e85bcc perf bpf: Fix building perf with BUILD_BPF_SKEL=1 by default in more distros
Arnaldo reported that building all his containers with BUILD_BPF_SKEL=1
to then make this the default he found problems in some distros where
the system linux/bpf.h file was being used and lacked this:

   util/bpf_skel/bperf_leader.bpf.c:13:20: error: use of undeclared identifier 'BPF_F_PRESERVE_ELEMS'
           __uint(map_flags, BPF_F_PRESERVE_ELEMS);

So use instead the vmlinux.h file generated by bpftool from BTF info.

This fixed these as well, getting the build back working on debian:11,
debian:experimental and ubuntu:21.10:

  In file included from In file included from util/bpf_skel/bperf_leader.bpf.cutil/bpf_skel/bpf_prog_profiler.bpf.c::33:
  :
  In file included from In file included from /usr/include/linux/bpf.h/usr/include/linux/bpf.h::1111:
  :
  /usr/include/linux/types.h/usr/include/linux/types.h::55::1010:: In file included from  util/bpf_skel/bperf_follower.bpf.c:3fatal errorfatal error:
  : : In file included from /usr/include/linux/bpf.h:'asm/types.h' file not found11'asm/types.h' file not found:

  /usr/include/linux/types.h:5:10: fatal error: 'asm/types.h' file not found
  #include <asm/types.h>#include <asm/types.h>

           ^~~~~~~~~~~~~         ^~~~~~~~~~~~~

  #include <asm/types.h>
           ^~~~~~~~~~~~~
  1 error generated.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Song Liu <song@kernel.org>
Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/CF175681-8101-43D1-ABDB-449E644BE986@fb.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Ian Rogers
4747395082 perf header: Fix memory leaks when processing feature headers
These leaks were found with leak sanitizer running "perf pipe recording
and injection test".

In pipe mode feat_fd may hold onto an events struct that needs freeing.

When string features are processed they may overwrite an already created
string, so free this before the overwrite.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211118201730.2302927-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Ian Rogers
1aa79e5773 perf test: Reset shadow counts before loading
Otherwise load counting is an average. Without this change
duration_time in test_memory_bandwidth will alter its value if an
earlier test contains duration_time.

This patch fixes an issue that's introduced in the proposed patch:
https://lore.kernel.org/lkml/20211124015226.3317994-1-irogers@google.com/
in perf test "Parse and process metrics".

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211128085810.4027314-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Thomas Richter
6c481031c9 perf test: Fix 'Simple expression parser' test on arch without CPU die topology info
Some platforms do not have CPU die support, for example s390.

Commit
Cc: Ian Rogers <irogers@google.com>
Fixes: fdf1e29b6118c18f ("perf expr: Add metric literals for topology.")
fails on s390:

  # perf test -Fv 7
    ...
  # FAILED tests/expr.c:173 #num_dies >= #num_packages
    ---- end ----
    Simple expression parser: FAILED!
  #

Investigating this issue leads to these functions:

 build_cpu_topology()
   +--> has_die_topology(void)
        {
           struct utsname uts;

           if (uname(&uts) < 0)
                  return false;
           if (strncmp(uts.machine, "x86_64", 6))
                  return false;
           ....
        }

which always returns false on s390. The caller build_cpu_topology()
checks has_die_topology() return value. On false the the struct
cpu_topology::die_cpu_list is not contructed and has zero entries. This
leads to the failing comparison: #num_dies >= #num_packages.  s390 of
course has a positive number of packages.

Fix this and check if the function build_cpu_topology() did build up
a die_cpus_list. The number of entries in this list should be larger
than 0. If the number of list element is zero, the die_cpus_list has
not been created and the check in function test__expr():

    TEST_ASSERT_VAL("#num_dies >= #num_packages", \
		    num_dies >= num_packages)

always fails.

Output after:

  # perf test -Fv 7
   7: Simple expression parser                                        :
   --- start ---
   division by zero
   syntax error
   ---- end ----
   Simple expression parser: Ok
  #

Fixes: fdf1e29b6118c18f ("perf expr: Add metric literals for topology.")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20211129112339.3003036-1-tmricht@linux.ibm.com
[ Added comment in the added 'if (num_dies)' line about architectures not having die topology ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Arnaldo Carvalho de Melo
3d1d57debe tools build: Remove needless libpython-version feature check that breaks test-all fast path
Since 66dfdff03d196e51 ("perf tools: Add Python 3 support") we don't use
the tools/build/feature/test-libpython-version.c version in any Makefile
feature check:

  $ find tools/ -type f | xargs grep feature-libpython-version
  $

The only place where this was used was removed in 66dfdff03d196e51:

  -        ifneq ($(feature-libpython-version), 1)
  -          $(warning Python 3 is not yet supported; please set)
  -          $(warning PYTHON and/or PYTHON_CONFIG appropriately.)
  -          $(warning If you also have Python 2 installed, then)
  -          $(warning try something like:)
  -          $(warning $(and ,))
  -          $(warning $(and ,)  make PYTHON=python2)
  -          $(warning $(and ,))
  -          $(warning Otherwise, disable Python support entirely:)
  -          $(warning $(and ,))
  -          $(warning $(and ,)  make NO_LIBPYTHON=1)
  -          $(warning $(and ,))
  -          $(error   $(and ,))
  -        else
  -          LDFLAGS += $(PYTHON_EMBED_LDFLAGS)
  -          EXTLIBS += $(PYTHON_EMBED_LIBADD)
  -          LANG_BINDINGS += $(obj-perf)python/perf.so
  -          $(call detected,CONFIG_LIBPYTHON)
  -        endif

And nowadays we either build with PYTHON=python3 or just install the
python3 devel packages and perf will build against it.

But the leftover feature-libpython-version check made the fast path
feature detection to break in all cases except when python2 devel files
were installed:

  $ rpm -qa | grep python.*devel
  python3-devel-3.9.7-1.fc34.x86_64
  $ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ;
  $ make -C tools/perf O=/tmp/build/perf install-bin
  make: Entering directory '/var/home/acme/git/perf/tools/perf'
    BUILD:   Doing 'make -j32' parallel build
    HOSTCC  /tmp/build/perf/fixdep.o
  <SNIP>
  $ cat /tmp/build/perf/feature/test-all.make.output
  In file included from test-all.c:18:
  test-libpython-version.c:5:10: error: #error
      5 |         #error
        |          ^~~~~
  $ ldd ~/bin/perf | grep python
	libpython3.9.so.1.0 => /lib64/libpython3.9.so.1.0 (0x00007fda6dbcf000)
  $

As python3 is the norm these days, fix this by just removing the unused
feature-libpython-version feature check, making the test-all fast path
to work with the common case.

With this:

  $ rm -rf /tmp/build/perf ; mkdir -p /tmp/build/perf ;
  $ make -C tools/perf O=/tmp/build/perf install-bin |& head
  make: Entering directory '/var/home/acme/git/perf/tools/perf'
    BUILD:   Doing 'make -j32' parallel build
    HOSTCC  /tmp/build/perf/fixdep.o
    HOSTLD  /tmp/build/perf/fixdep-in.o
    LINK    /tmp/build/perf/fixdep

  Auto-detecting system features:
  ...                         dwarf: [ on  ]
  ...            dwarf_getlocations: [ on  ]
  ...                         glibc: [ on  ]
  $ ldd ~/bin/perf | grep python
	libpython3.9.so.1.0 => /lib64/libpython3.9.so.1.0 (0x00007f58800b0000)
  $ cat /tmp/build/perf/feature/test-all.make.output
  $

Reviewed-by: James Clark <james.clark@arm.com>
Fixes: 66dfdff03d196e51 ("perf tools: Add Python 3 support")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jaroslav Škarvada <jskarvad@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/YaYmeeC6CS2b8OSz@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Ian Rogers
4ffbe87e2d perf tools: Fix SMT detection fast read path
sysfs__read_int() returns 0 on success, and so the fast read path was
always failing.

Fixes: bb629484d924118e ("perf tools: Simplify checking if SMT is active.")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211124001231.3277836-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Arnaldo Carvalho de Melo
cba43fcf7a tools headers UAPI: Sync powerpc syscall table file changed by new futex_waitv syscall
To pick the changes in this cset:

  a0eb2da92b715d0c ("futex: Wireup futex_waitv syscall")

That add support for this new syscall in tools such as 'perf trace'.

For instance, this is now possible (adapted from the x86_64 test output):

  # perf trace -e futex_waitv
  ^C#
  # perf trace -v -e futex_waitv
  event qualifier tracepoint filter: (common_pid != 807333 && common_pid != 3564) && (id == 449)
  ^C#
  # perf trace -v -e futex* --max-events 10
  event qualifier tracepoint filter: (common_pid != 812168 && common_pid != 3564) && (id == 221 || id == 449)
  mmap size 528384B
           ? (         ): Timer/219310  ... [continued]: futex())                                            = -1 ETIMEDOUT (Connection timed out)
       0.012 ( 0.002 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
       0.024 ( 0.060 ms): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) = 0
       0.086 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
       0.088 (         ): Timer/219310 futex(uaddr: 0x7fd0b152d424, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ...
       0.075 ( 0.005 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d420, op: WAKE|PRIVATE_FLAG, val: 1)     = 1
       0.169 ( 0.004 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d424, op: WAKE|PRIVATE_FLAG, val: 1)     = 1
       0.088 ( 0.089 ms): Timer/219310  ... [continued]: futex())                                            = 0
       0.179 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
       0.181 (         ): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ...
  #

That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.

  $ grep futex tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
  221	32	futex				sys_futex_time32
  221	64	futex				sys_futex
  221	spu	futex				sys_futex
  422	32	futex_time64			sys_futex			sys_futex
  449	common  futex_waitv                     sys_futex_waitv
  $

This addresses this perf build warnings:

  Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
  diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl

Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>,
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/YZ%2F1OU9mJuyS2HMa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:53 -03:00
Adrian Hunter
c29d979260 perf inject: Fix itrace space allowed for new attributes
The space allowed for new attributes can be too small if existing header
information is large. That can happen, for example, if there are very
many CPUs, due to having an event ID per CPU per event being stored in the
header information.

Fix by adding the existing header.data_offset. Also increase the extra
space allowed to 8KiB and align to a 4KiB boundary for neatness.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20211125071457.2066863-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:52 -03:00
Arnaldo Carvalho de Melo
71a16df164 tools headers UAPI: Sync s390 syscall table file changed by new futex_waitv syscall
To pick the changes in these csets:

  6c122360cf2f4c5a ("s390: wire up sys_futex_waitv system call")

That add support for this new syscall in tools such as 'perf trace'.

For instance, this is now possible (adapted from the x86_64 test output):

  # perf trace -e futex_waitv
  ^C#
  # perf trace -v -e futex_waitv
  event qualifier tracepoint filter: (common_pid != 807333 && common_pid != 3564) && (id == 449)
  ^C#
  # perf trace -v -e futex* --max-events 10
  event qualifier tracepoint filter: (common_pid != 812168 && common_pid != 3564) && (id == 238 || id == 449)
           ? (         ): Timer/219310  ... [continued]: futex())                                            = -1 ETIMEDOUT (Connection timed out)
       0.012 ( 0.002 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
       0.024 ( 0.060 ms): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) = 0
       0.086 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
       0.088 (         ): Timer/219310 futex(uaddr: 0x7fd0b152d424, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ...
       0.075 ( 0.005 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d420, op: WAKE|PRIVATE_FLAG, val: 1)     = 1
       0.169 ( 0.004 ms): Web Content/219299 futex(uaddr: 0x7fd0b152d424, op: WAKE|PRIVATE_FLAG, val: 1)     = 1
       0.088 ( 0.089 ms): Timer/219310  ... [continued]: futex())                                            = 0
       0.179 ( 0.001 ms): Timer/219310 futex(uaddr: 0x7fd0b152d3c8, op: WAKE|PRIVATE_FLAG, val: 1)           = 0
       0.181 (         ): Timer/219310 futex(uaddr: 0x7fd0b152d420, op: WAIT_BITSET|PRIVATE_FLAG, utime: 0x7fd0b1657840, val3: MATCH_ANY) ...
  #

That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.

  $ grep futex tools/perf/arch/s390/entry/syscalls/syscall.tbl
  238  common	futex			sys_futex			sys_futex_time32
  422	32	futex_time64		-				sys_futex
  449  common	futex_waitv		sys_futex_waitv			sys_futex_waitv
  $

This addresses this perf build warnings:

  Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl'
  diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl

Acked-by: Heiko Carstens <hca@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>,
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/lkml/YZ%2F2qRW%2FTScYTP1U@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:52 -03:00
Jiri Olsa
3f8d657716 Revert "perf bench: Fix two memory leaks detected with ASan"
This: This reverts commit 92723ea0f11d92496687db8c9725248e9d1e5e1d.

  # perf test 91
  91: perf stat --bpf-counters test           :RRRRRRRRRRRRR FAILED!
  # perf test 91
  91: perf stat --bpf-counters test           :RRRRRRRRRRRRR FAILED!
  # perf test 91
  91: perf stat --bpf-counters test           :RRRRRRRRRRRR FAILED!
  # perf test 91
  91: perf stat --bpf-counters test           :RRRRRRRRRRRRRRRRRR Ok
  # perf test 91
  91: perf stat --bpf-counters test           :RRRRRRRRR FAILED!
  # perf test 91
  91: perf stat --bpf-counters test           :RRRRRRRRRRR Ok
  # perf test 91
  91: perf stat --bpf-counters test           :RRRRRRRRRRRRRRR Ok

yep, it seems the perf bench is broken so the counts won't correlated if
I revert this one:

  92723ea0f11d perf bench: Fix two memory leaks detected with ASan

it works for me again.. it seems to break -t option

   [root@dell-r440-01 perf]# ./perf bench sched messaging -g 1 -l 100 -t
   # Running 'sched/messaging' benchmark:
   RRRperf: CLIENT: ready write: Bad file descriptor
   Rperf: SENDER: write: Bad file descriptor

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/lkml/YZev7KClb%2Fud43Lc@krava/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-12-06 21:57:52 -03:00
Eric Dumazet
4dbb0dad8e devlink: fix netns refcount leak in devlink_nl_cmd_reload()
While preparing my patch series adding netns refcount tracking,
I spotted bugs in devlink_nl_cmd_reload()

Some error paths forgot to release a refcount on a netns.

To fix this, we can reduce the scope of get_net()/put_net()
section around the call to devlink_reload().

Fixes: ccdf07219da6 ("devlink: Add reload action option to devlink reload command")
Fixes: dc64cc7c6310 ("devlink: Add devlink reload limit option")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Moshe Shemesh <moshe@mellanox.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20211205192822.1741045-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:56:32 -08:00
Antoine Tenart
dde91ccfa2 ethtool: do not perform operations on net devices being unregistered
There is a short period between a net device starts to be unregistered
and when it is actually gone. In that time frame ethtool operations
could still be performed, which might end up in unwanted or undefined
behaviours[1].

Do not allow ethtool operations after a net device starts its
unregistration. This patch targets the netlink part as the ioctl one
isn't affected: the reference to the net device is taken and the
operation is executed within an rtnl lock section and the net device
won't be found after unregister.

[1] For example adding Tx queues after unregister ends up in NULL
    pointer exceptions and UaFs, such as:

      BUG: KASAN: use-after-free in kobject_get+0x14/0x90
      Read of size 1 at addr ffff88801961248c by task ethtool/755

      CPU: 0 PID: 755 Comm: ethtool Not tainted 5.15.0-rc6+ #778
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/014
      Call Trace:
       dump_stack_lvl+0x57/0x72
       print_address_description.constprop.0+0x1f/0x140
       kasan_report.cold+0x7f/0x11b
       kobject_get+0x14/0x90
       kobject_add_internal+0x3d1/0x450
       kobject_init_and_add+0xba/0xf0
       netdev_queue_update_kobjects+0xcf/0x200
       netif_set_real_num_tx_queues+0xb4/0x310
       veth_set_channels+0x1c3/0x550
       ethnl_set_channels+0x524/0x610

Fixes: 041b1c5d4a53 ("ethtool: helper functions for netlink interface")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Link: https://lore.kernel.org/r/20211203101318.435618-1-atenart@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:53:32 -08:00
Salvatore Bonaccorso
cd8c917a56 Makefile: Do not quote value for CONFIG_CC_IMPLICIT_FALLTHROUGH
Andreas reported that a specific build environment for an external
module, being a bit broken, does pass CC_IMPLICIT_FALLTHROUGH quoted as
argument to gcc, causing an error

  gcc-11: error: "-Wimplicit-fallthrough=5": linker input file not found: No such file or directory

Until this is more generally fixed as outlined in [1], by fixing
scripts/link-vmlinux.sh, scripts/gen_autoksyms.sh, etc to not directly
include the include/config/auto.conf, and in a second step, change
Kconfig to generate the auto.conf without "", workaround the issue by
explicitly unquoting CC_IMPLICIT_FALLTHROUGH.

Reported-by: Andreas Beckmann <anbe@debian.org>
Link: https://bugs.debian.org/1001083
Link: https://lore.kernel.org/linux-kbuild/CAK7LNAR-VXwHFEJqCcrFDZj+_4+Xd6oynbj_0eS8N504_ydmyw@mail.gmail.com/ [1]
Signed-off-by: Salvatore Bonaccorso <carnil@debian.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-12-06 16:42:47 -08:00
Guangbin Huang
364d470d54 Revert "net: hns3: add void before function which don't receive ret"
This reverts commit 5ac4f180bd07116c1e57858bc3f6741adbca3eb6.

Sorry for taking no notice that the function devlink_register() has been
already declared as void, so it is needs to revert this patch.

Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Link: https://lore.kernel.org/r/20211204012448.51360-1-huangguangbin2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:30:12 -08:00
José Expósito
01081be1ea net: prestera: replace zero-length array with flexible-array member
One-element and zero-length arrays are deprecated and should be
replaced with flexible-array members:
https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays

Replace zero-length array with flexible-array member and make use
of the struct_size() helper.

Link: https://github.com/KSPP/linux/issues/78
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Reviewed-by: Volodymyr Mytnyk <vmytnyk@marvell.com>
Tested-by: Volodymyr Mytnyk <vmytnyk@marvell.com>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20211204171349.22776-1-jose.exposito89@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:29:27 -08:00
Arnd Bergmann
5382911f5d net: wwan: iosm: select CONFIG_RELAY
The iosm driver started using relayfs, but is missing the Kconfig
logic to ensure it's built into the kernel:

x86_64-linux-ld: drivers/net/wwan/iosm/iosm_ipc_trace.o: in function `ipc_trace_create_buf_file_handler':
iosm_ipc_trace.c:(.text+0x16): undefined reference to `relay_file_operations'
x86_64-linux-ld: drivers/net/wwan/iosm/iosm_ipc_trace.o: in function `ipc_trace_subbuf_start_handler':
iosm_ipc_trace.c:(.text+0x31): undefined reference to `relay_buf_full'
x86_64-linux-ld: drivers/net/wwan/iosm/iosm_ipc_trace.o: in function `ipc_trace_ctrl_file_write':
iosm_ipc_trace.c:(.text+0xd5): undefined reference to `relay_flush'
x86_64-linux-ld: drivers/net/wwan/iosm/iosm_ipc_trace.o: in function `ipc_trace_port_rx':

Fixes: 00ef32565b9b ("net: wwan: iosm: device trace collection using relayfs")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Reviewed-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com>
Link: https://lore.kernel.org/r/20211204174033.950528-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:29:10 -08:00
Eric Dumazet
45cac67545 net: fix recent csum changes
Vladimir reported csum issues after my recent change in skb_postpull_rcsum()

Issue here is the following:

initial skb->csum is the csum of

[part to be pulled][rest of packet]

Old code:
 skb->csum = csum_sub(skb->csum, csum_partial(pull, pull_length, 0));

New code:
 skb->csum = ~csum_partial(pull, pull_length, ~skb->csum);

This is broken if the csum of [pulled part]
happens to be equal to skb->csum, because end
result of skb->csum is 0 in new code, instead
of being 0xffffffff

David Laight suggested to use

skb->csum = -csum_partial(pull, pull_length, -skb->csum);

I based my patches on existing code present in include/net/seg6.h,
update_csum_diff4() and update_csum_diff16() which might need
a similar fix.

I guess that my tests, mostly pulling 40 bytes of IPv6 header
were not providing enough entropy to hit this bug.

v2: added wsum_negate() to make sparse happy.

Fixes: 29c3002644bd ("net: optimize skb_postpull_rcsum()")
Fixes: 0bd28476f636 ("gro: optimize skb_gro_postpull_rcsum()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Cc: David Lebrun <dlebrun@google.com>
Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20211204045356.3659278-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:26:46 -08:00
Jakub Kicinski
4c375272fb Merge branch 'net-add-preliminary-netdev-refcount-tracking'
Eric Dumazet says:

====================
net: add preliminary netdev refcount tracking

Two first patches add a generic infrastructure, that will be used
to get tracking of refcount increments/decrements.

The general idea is to be able to precisely pair each decrement with
a corresponding prior increment. Both share a cookie, basically
a pointer to private data storing stack traces.

The third patch adds dev_hold_track() and dev_put_track() helpers
(CONFIG_NET_DEV_REFCNT_TRACKER)

Then a series of 20 patches converts some dev_hold()/dev_put()
pairs to new hepers : dev_hold_track() and dev_put_track().

Hopefully this will be used by developpers and syzbot to
root cause bugs that cause netdevice dismantles freezes.

With CONFIG_PCPU_DEV_REFCNT=n option, we were able to detect
some class of bugs, but too late (when too many dev_put()
were happening).

Another series will be sent after this one is merged.

v3: moved NET_DEV_REFCNT_TRACKER to net/Kconfig.debug
    added "depends on DEBUG_KERNEL && STACKTRACE_SUPPORT"
    to hopefully get rid of kbuild reports for ARCH=nios2
    Reworded patch 3 changelog.
    Added missing htmldocs (Jakub)

v2: added four additional patches,
    added netdev_tracker_alloc() and netdev_tracker_free()
    addressed build error (kernel bots),
    use GFP_ATOMIC in test_ref_tracker_timer_func()
====================

Link: https://lore.kernel.org/r/20211205042217.982127-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:06:04 -08:00
Eric Dumazet
5fa5ae6058 netpoll: add net device refcount tracker to struct netpoll
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:06:02 -08:00
Eric Dumazet
42120a8643 ipmr, ip6mr: add net device refcount tracker to struct vif_device
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:06:02 -08:00
Eric Dumazet
095e200f17 net: failover: add net device refcount tracker
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:06:02 -08:00
Eric Dumazet
63f13937cb net: linkwatch: add net device refcount tracker
Add a netdevice_tracker inside struct net_device, to track
the self reference when a device is in lweventlist.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:05:44 -08:00
Eric Dumazet
606509f27f net/sched: add net device refcount tracker to struct Qdisc
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:05:12 -08:00
Eric Dumazet
c04438f58d ipv4: add net device refcount tracker to struct in_device
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:05:11 -08:00
Eric Dumazet
8c727003c4 ipv6: add net device refcount tracker to struct inet6_dev
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:05:11 -08:00
Eric Dumazet
f77159a348 net: add net device refcount tracker to struct netdev_adjacent
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:05:11 -08:00
Eric Dumazet
08d622568e net: add net device refcount tracker to struct neigh_parms
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:05:11 -08:00
Eric Dumazet
77a23b1f95 net: add net device refcount tracker to struct pneigh_entry
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:05:11 -08:00
Eric Dumazet
85662c9f8c net: add net device refcount tracker to struct neighbour
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06 16:05:11 -08:00