IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
66e4c8d95008 ("net: warn if transport header was not set") added
a check that triggers a warning in r8169, see [0].
The commit referenced in the Fixes tag refers to the change from
which the patch applies cleanly, there's nothing wrong with this
commit. It seems the actual issue (not bug, because the warning
is harmless here) was introduced with bdfa4ed68187
("r8169: use Giant Send").
[0] https://bugzilla.kernel.org/show_bug.cgi?id=216157
Fixes: 8d520b4de3ed ("r8169: work around RTL8125 UDP hw bug")
Reported-by: Erhard F. <erhard_f@mailbox.org>
Tested-by: Erhard F. <erhard_f@mailbox.org>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/1b2c2b29-3dc0-f7b6-5694-97ec526d51a0@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Remove unnecessary spi_set_drvdata() in b53_spi_remove(), the
driver_data will be set to NULL in device_unbind_cleanup() after
calling ->remove().
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20220705131733.351962-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There are UAF bugs caused by rose_t0timer_expiry(). The
root cause is that del_timer() could not stop the timer
handler that is running and there is no synchronization.
One of the race conditions is shown below:
(thread 1) | (thread 2)
| rose_device_event
| rose_rt_device_down
| rose_remove_neigh
rose_t0timer_expiry | rose_stop_t0timer(rose_neigh)
... | del_timer(&neigh->t0timer)
| kfree(rose_neigh) //[1]FREE
neigh->dce_mode //[2]USE |
The rose_neigh is deallocated in position [1] and use in
position [2].
The crash trace triggered by POC is like below:
BUG: KASAN: use-after-free in expire_timers+0x144/0x320
Write of size 8 at addr ffff888009b19658 by task swapper/0/0
...
Call Trace:
<IRQ>
dump_stack_lvl+0xbf/0xee
print_address_description+0x7b/0x440
print_report+0x101/0x230
? expire_timers+0x144/0x320
kasan_report+0xed/0x120
? expire_timers+0x144/0x320
expire_timers+0x144/0x320
__run_timers+0x3ff/0x4d0
run_timer_softirq+0x41/0x80
__do_softirq+0x233/0x544
...
This patch changes rose_stop_ftimer() and rose_stop_t0timer()
in rose_remove_neigh() to del_timer_sync() in order that the
timer handler could be finished before the resources such as
rose_neigh and so on are deallocated. As a result, the UAF
bugs could be mitigated.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Link: https://lore.kernel.org/r/20220705125610.77971-1-duoming@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
usbnet_write_cmd_async() mixed up which buffers
need to be freed in which error case.
v2: add Fixes tag
v3: fix uninitialized buf pointer
Fixes: 877bd862f32b8 ("usbnet: introduce usbnet 3 command helpers")
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Link: https://lore.kernel.org/r/20220705125351.17309-1-oneukum@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This reverts commit 2ef8e39f58f08589ab035223c2687830c0eba30f, reversing
changes made to e7ce9fc9ad38773b660ef663ae98df4f93cb6a37.
There are build warnings here which break the normal
build due to -Werror. Ratheesh was nice enough to quickly
follow up with fixes but didn't hit all the warnings I
see on GCC 12 so to unlock net-next from taking patches
let get this series out for now.
Link: https://lore.kernel.org/r/20220707013201.1372433-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fixups for OpenRISC found during recent testing:
- An OpenRISC irqchip fix to stop acking level interrupts which was
causing issues on SMP platforms.
- A comment typo fix in our unwinder code.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE2cRzVK74bBA6Je/xw7McLV5mJ+QFAmLFiUMACgkQw7McLV5m
J+RIvRAAg3Aru9TPDSqL5/NS2AF+sJAJLBuG0FXNSvVl5xSKsZZ/cVcFHx2vK0U3
UMoVnmKTuNIFF47HnkOZ6yCseM06OYWzQREyS203zvVqUjjR1tVfz0uX9p9gGxio
wFai3YNBH923dfLE9M2BpGvpKVWbD5zrZlCu+3Br1kfFxZ+B0PQECqZlXhrqIKx0
7IRrV3N/lPfvZY45/cROofQNaGG71Z/DhdHy6XL/CMCeAyRcSJCM5kcWmjPKGefO
ZYjaypid78EQwf0hDGMrP/BmBmErYxIrbUoqKslPhLifYfdK0EaNE7nQqKENu99G
igXNhn5akoBIxczakTTlK69R9DecYsu4gzRfEaqsBwpJXS4Ndq2MdN1cozZTB0Zp
lAkFFXR0EqwxSaF8qF7FhStRj+luT+dl3BKJqwQPYoohNEC4Z8VCm9VZJJpkLt2N
hX+SD/wLUg1Yay7orSdAD/R7M9lypFgV3fpjtjtJcZR5EE70R69Rp6h7cci7XaM+
pe9lMeTRQsqiNq2QZHPFN7RVLMs+pgwEJdUOmXsrEC+arpQCEMnREsnHOeSQxckx
DZf7wMkx2c4VcSRaxmkIKfz6iWnqscOkkBkzrakcPBoBMs5kv1VXRqBlOxwtkt00
m8seJg+n9agg7kynIqARqffEJESTy2RQhjuvQVw3nkKGQvsmays=
=Uug/
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of https://github.com/openrisc/linux
Pull OpenRISC fixes from Stafford Horne:
"Fixups for OpenRISC found during recent testing:
- An OpenRISC irqchip fix to stop acking level interrupts which was
causing issues on SMP platforms
- A comment typo fix in our unwinder code"
* tag 'for-linus' of https://github.com/openrisc/linux:
openrisc: unwinder: Fix grammar issue in comment
irqchip: or1k-pic: Undefine mask_ack for level triggered hardware
This became largish as it includes the pending ASoC fixes.
Almost all changes are device-specific small fixes, while many of
them are coverage for mixer issues that were detected by selftest.
In addition, usual suspects for HD/USB-audio are there.
-----BEGIN PGP SIGNATURE-----
iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAmLEV6wOHHRpd2FpQHN1
c2UuZGUACgkQLtJE4w1nLE8Z9Q/6AwD3QAU78BQLKVp7I2poW35BGSKkidhBRQkP
hWA4TKh/4itD1xnoKqKVDFZCkB6Had8GduHwCq7IymA6n13Zn27GwnoTOstkbwBF
VyYnMEFA0nXVQTjdgoGPtyphOd3Oqh345/NGzwaaAh3f0IQUr4mqmekxNF/GVff6
QPgQsx0UAVmX31vj0jFxVC/J9QyRT4SHFNtY9m8/OG/t9QbGew/EiSAgtdoVGRK8
tvJq8mqaJBO0RbeYInsND5WjpbuAOPrXQ/XT56/J13YaamK+nxr13cIMXAERzrEn
TR3m2ZZrUp5zidbyZsj9Kck8Cxk6UH6W5QvkENC+3UTHZmg9ovlq9x7RuaeIHBjF
5Q4QR9Aw7L6IEOi0QmDWBhpsShKtB0g2ImTOyUu5Xi9Wo4XLSFXBJ5bCulrBl34X
io3SMZVv3F4uR+7mgm8W8VhOlJlqlUd7cIu84cWBdsPx8FVtNEe+CKFgvdYOMOV4
Pej/dSlUjwELbCB0yP2Vk4OrsBd/Gu1LGh21+ZLKE5i1h/IkNHTyRumVqCN2Dibt
r6jMwPqyCJB6eDg1MRiBZlnnIrb3zytt6Gm7SOoA4aLGY4FkPlOUE+bjFkVANpfp
TR+V2uR0pm3uh2jomSk+sP/3HItf54eyEwollr/aPC8jHn38x1Dm6BLZMvN7MITI
C/syO54=
=v73C
-----END PGP SIGNATURE-----
Merge tag 'sound-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"This became largish as it includes the pending ASoC fixes.
Almost all changes are device-specific small fixes, while many of them
are coverage for mixer issues that were detected by selftest. In
addition, usual suspects for HD/USB-audio are there"
* tag 'sound-5.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (43 commits)
ALSA: cs46xx: Fix missing snd_card_free() call at probe error
ALSA: usb-audio: Add quirk for Fiero SC-01 (fw v1.0.0)
ALSA: usb-audio: Add quirk for Fiero SC-01
ALSA: hda/realtek: Add quirk for Clevo L140PU
ALSA: usb-audio: Add quirks for MacroSilicon MS2100/MS2106 devices
ASoC: madera: Fix event generation for rate controls
ASoC: madera: Fix event generation for OUT1 demux
ASoC: cs47l15: Fix event generation for low power mux control
ASoC: cs35l41: Add ASP TX3/4 source to register patch
ASoC: dapm: Initialise kcontrol data for mux/demux controls
ASoC: rt711-sdca: fix kernel NULL pointer dereference when IO error
ASoC: cs35l41: Correct some control names
ASoC: wm5110: Fix DRE control
ASoC: wm_adsp: Fix event for preloader
MAINTAINERS: update ASoC Qualcomm maintainer email-id
ASoC: rockchip: i2s: switch BCLK to GPIO
ASoC: SOF: Intel: disable IMR boot when resuming from ACPI S4 and S5 states
ASoC: SOF: pm: add definitions for S4 and S5 states
ASoC: SOF: pm: add explicit behavior for ACPI S1 and S2
ASoC: SOF: Intel: hda: Fix compressed stream position tracking
...
Coverity detected that usdt_rel_ip is unconditionally overwritten
anyways, so there is no need to unnecessarily initialize it with unused
value. Clean this up.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20220705224818.4026623-4-andrii@kernel.org
When compiling with -O2, GCC detects few problems with selftests/bpf, so
fix all of them. Two are real issues (uninitialized err and nums
out-of-bounds access), but two other uninitialized variables warnings
are due to GCC not being able to prove that variables are indeed
initialized under conditions under which they are used.
Fix all 4 cases, though.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20220705224818.4026623-3-andrii@kernel.org
When compiling selftests/bpf in optimized mode (-O2), GCC erroneously
complains about uninitialized token variable:
In file included from network_helpers.c:22:
network_helpers.c: In function ‘open_netns’:
test_progs.h:355:22: error: ‘token’ may be used uninitialized [-Werror=maybe-uninitialized]
355 | int ___err = libbpf_get_error(___res); \
| ^~~~~~~~~~~~~~~~~~~~~~~~
network_helpers.c:440:14: note: in expansion of macro ‘ASSERT_OK_PTR’
440 | if (!ASSERT_OK_PTR(token, "malloc token"))
| ^~~~~~~~~~~~~
In file included from /data/users/andriin/linux/tools/testing/selftests/bpf/tools/include/bpf/libbpf.h:21,
from bpf_util.h:9,
from network_helpers.c:20:
/data/users/andriin/linux/tools/testing/selftests/bpf/tools/include/bpf/libbpf_legacy.h:113:17: note: by argument 1 of type ‘const void *’ to ‘libbpf_get_error’ declared here
113 | LIBBPF_API long libbpf_get_error(const void *ptr);
| ^~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
make: *** [Makefile:522: /data/users/andriin/linux/tools/testing/selftests/bpf/network_helpers.o] Error 1
This is completely bogus becuase libbpf_get_error() doesn't dereference
pointer, but the only easy way to silence this is to allocate initialized
memory with calloc().
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20220705224818.4026623-2-andrii@kernel.org
This reverts commit 284b4d93daee56dff3e10029ddf2e03227f50dbf.
When using TLS device offload and coming from tls_device_reencrypt()
flow, -EBADMSG error in tls_do_decryption() should not be counted
towards the TLSTlsDecryptError counter.
Move the counter increase back to the decrypt_internal() call site in
decrypt_skb_update().
This also fixes an issue where:
if (n_sgin < 1)
return -EBADMSG;
Errors in decrypt_internal() were not counted after the cited patch.
Fixes: 284b4d93daee ("tls: rx: move counting TlsDecryptErrors for sync")
Cc: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Qiao Ma says:
====================
net: hinic: fix bugs about dev_get_stats
These patches fixes 2 bugs of hinic driver:
- fix bug that ethtool get wrong stats because of hinic_{txq|rxq}_clean_stats() is called
- avoid kernel hung in hinic_get_stats64()
See every patch for more information.
Changes in v4:
- removed meaningless u64_stats_sync protection in hinic_{txq|rxq}_get_stats
- merged the third patch in v2 into first one
Changes in v3:
- fixes a compile warning reported by kernel test robot <lkp@intel.com>
Changes in v2:
- fixes another 2 bugs. (v1 is a single patch, see: https://lore.kernel.org/all/07736c2b7019b6883076a06129e06e8f7c5f7154.1656487154.git.mqaio@linux.alibaba.com/).
- to fix extra bugs, hinic_dev.tx_stats/rx_stats is removed, so there is no need to use spinlock or semaphore now.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When using hinic device as a bond slave device, and reading device stats
of master bond device, the kernel may hung.
The kernel panic calltrace as follows:
Kernel panic - not syncing: softlockup: hung tasks
Call trace:
native_queued_spin_lock_slowpath+0x1ec/0x31c
dev_get_stats+0x60/0xcc
dev_seq_printf_stats+0x40/0x120
dev_seq_show+0x1c/0x40
seq_read_iter+0x3c8/0x4dc
seq_read+0xe0/0x130
proc_reg_read+0xa8/0xe0
vfs_read+0xb0/0x1d4
ksys_read+0x70/0xfc
__arm64_sys_read+0x20/0x30
el0_svc_common+0x88/0x234
do_el0_svc+0x2c/0x90
el0_svc+0x1c/0x30
el0_sync_handler+0xa8/0xb0
el0_sync+0x148/0x180
And the calltrace of task that actually caused kernel hungs as follows:
__switch_to+124
__schedule+548
schedule+72
schedule_timeout+348
__down_common+188
__down+24
down+104
hinic_get_stats64+44 [hinic]
dev_get_stats+92
bond_get_stats+172 [bonding]
dev_get_stats+92
dev_seq_printf_stats+60
dev_seq_show+24
seq_read_iter+964
seq_read+220
proc_reg_read+164
vfs_read+172
ksys_read+108
__arm64_sys_read+28
el0_svc_common+132
do_el0_svc+40
el0_svc+24
el0_sync_handler+164
el0_sync+324
When getting device stats from bond, kernel will call bond_get_stats().
It first holds the spinlock bond->stats_lock, and then call
hinic_get_stats64() to collect hinic device's stats.
However, hinic_get_stats64() calls `down(&nic_dev->mgmt_lock)` to
protect its critical section, which may schedule current task out.
And if system is under high pressure, the task cannot be woken up
immediately, which eventually triggers kernel hung panic.
Since previous patch has replaced hinic_dev.tx_stats/rx_stats with local
variable in hinic_get_stats64(), there is nothing need to be protected
by lock, so just removing down()/up() is ok.
Fixes: edd384f682cc ("net-next/hinic: Add ethtool and stats")
Signed-off-by: Qiao Ma <mqaio@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Function hinic_get_stats64() will do two operations:
1. reads stats from every hinic_rxq/txq and accumulates them
2. calls hinic_rxq/txq_clean_stats() to clean every rxq/txq's stats
For hinic_get_stats64(), it could get right data, because it sums all
data to nic_dev->rx_stats/tx_stats.
But it is wrong for get_drv_queue_stats(), this function will read
hinic_rxq's stats, which have been cleared to zero by hinic_get_stats64().
I have observed hinic's cleanup operation by using such command:
> watch -n 1 "cat ethtool -S eth4 | tail -40"
Result before:
...
rxq7_pkts: 1
rxq7_bytes: 90
rxq7_errors: 0
rxq7_csum_errors: 0
rxq7_other_errors: 0
...
rxq9_pkts: 11
rxq9_bytes: 726
rxq9_errors: 0
rxq9_csum_errors: 0
rxq9_other_errors: 0
...
rxq11_pkts: 0
rxq11_bytes: 0
rxq11_errors: 0
rxq11_csum_errors: 0
rxq11_other_errors: 0
Result after a few seconds:
...
rxq7_pkts: 0
rxq7_bytes: 0
rxq7_errors: 0
rxq7_csum_errors: 0
rxq7_other_errors: 0
...
rxq9_pkts: 2
rxq9_bytes: 132
rxq9_errors: 0
rxq9_csum_errors: 0
rxq9_other_errors: 0
...
rxq11_pkts: 1
rxq11_bytes: 170
rxq11_errors: 0
rxq11_csum_errors: 0
rxq11_other_errors: 0
To solve this problem, we just keep every queue's total stats in their own
queue (aka hinic_{rxq|txq}), and simply sum all per-queue stats every time
calling hinic_get_stats64().
With that solution, there is no need to clean per-queue stats now,
and there is no need to maintain global hinic_dev.{tx|rx}_stats, too.
Fixes: edd384f682cc ("net-next/hinic: Add ethtool and stats")
Signed-off-by: Qiao Ma <mqaio@linux.alibaba.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski says:
====================
tls: rx: nopad and backlog flushing
This small series contains the two changes I've been working
towards in the previous ~50 patches a couple of months ago.
The first major change is the optional "nopad" optimization.
Currently TLS 1.3 Rx performs quite poorly because it does
not support the "zero-copy" or rather direct decrypt to a user
space buffer. Because of TLS 1.3 record padding we don't
know if a record contains data or a control message until
we decrypt it. Most records will contain data, tho, so the
optimization is to try the decryption hoping its data and
retry if it wasn't.
The performance gain from doing that is significant (~40%)
but if I'm completely honest the major reason is that we
call skb_cow_data() on the non-"zc" path. The next series
will remove the CoW, dropping the gain to only ~10%.
The second change is to flush the backlog every 128kB.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We continuously hold the socket lock during large reads and writes.
This may inflate RTT and negatively impact TCP performance.
Flush the backlog periodically. I tried to pick a flush period (128kB)
which gives significant benefit but the max Bps rate is not yet visibly
impacted.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since optimisitic decrypt may add extra load in case of retries
require socket owner to explicitly opt-in.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We currently don't support decrypt to user buffer with TLS 1.3
because we don't know the record type and how much padding
record contains before decryption. In practice data records
are by far most common and padding gets used rarely so
we can assume data record, no padding, and if we find out
that wasn't the case - retry the crypto in place (decrypt
to skb).
To safeguard from user overwriting content type and padding
before we can check it attach a 1B sg entry where last byte
of the record will land.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
To make future patches easier to review make data_len
contain the length of the data, without the tail.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mat Martineau says:
====================
mptcp: Path manager fixes for 5.19
The MPTCP userspace path manager is new in 5.19, and these patches fix
some issues in that new code.
Patches 1-3 fix path manager locking issues.
Patches 4 and 5 allow userspace path managers to change priority of
established subflows using the existing MPTCP_PM_CMD_SET_FLAGS generic
netlink command. Includes corresponding self test update.
Patches 6 and 7 fix accounting of available endpoint IDs and the
MPTCP_MIB_RMSUBFLOW counter.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch increases MPTCP_MIB_RMSUBFLOW mib counter in userspace pm
destroy subflow function mptcp_nl_cmd_sf_destroy() when removing subflow.
Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment")
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In mptcp_pm_nl_rm_addr_or_subflow() we always mark as available
the id corresponding to the just removed address.
The used bitmap actually tracks only the local IDs: we must
restrict the operation when a (local) subflow is removed.
Fixes: a88c9e496937 ("mptcp: do not block subflows creation on errors")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change updates the testing sample (pm_nl_ctl) to exercise
the updated MPTCP_PM_CMD_SET_FLAGS command for userspace PMs to
issue MP_PRIO signals over the selected subflow.
E.g. ./pm_nl_ctl set 10.0.1.2 port 47234 flags backup token 823274047 rip 10.0.1.1 rport 50003
userspace_pm.sh has a new selftest that invokes this command.
Fixes: 259a834fadda ("selftests: mptcp: functional tests for the userspace PM type")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change updates MPTCP_PM_CMD_SET_FLAGS to allow userspace PMs
to issue MP_PRIO signals over a specific subflow selected by
the connection token, local and remote address+port.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/286
Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When setting up a subflow's flags for sending MP_PRIO MPTCP options, the
subflow socket lock was not held while reading and modifying several
struct members that are also read and modified in mptcp_write_options().
Acquire the subflow socket lock earlier and send the MP_PRIO ACK with
that lock already acquired. Add a new variant of the
mptcp_subflow_send_ack() helper to use with the subflow lock held.
Fixes: 067065422fcd ("mptcp: add the outgoing MP_PRIO support")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The in-kernel path manager code for changing subflow flags acquired both
the msk socket lock and the PM lock when possibly changing the "backup"
and "fullmesh" flags. mptcp_pm_nl_mp_prio_send_ack() does not access
anything protected by the PM lock, and it must release and reacquire
the PM lock.
By pushing the PM lock to where it is needed in mptcp_pm_nl_fullmesh(),
the lock is only acquired when the fullmesh flag is changed and the
backup flag code no longer has to release and reacquire the PM lock. The
change in locking context requires the MIB update to be modified - move
that to a better location instead.
This change also makes it possible to call
mptcp_pm_nl_mp_prio_send_ack() for the userspace PM commands without
manipulating the in-kernel PM lock.
Fixes: 0f9f696a502e ("mptcp: add set_flags command in PM netlink")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The user-space PM subflow removal path uses a couple of helpers
that must be called under the msk socket lock and the current
code lacks such requirement.
Change the existing lock scope so that the relevant code is under
its protection.
Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment")
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/287
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vlad Buslov says:
====================
net: Fix police 'continue' action offload
TC act_police with 'continue' action had been supported by mlx5 matchall
classifier offload implementation for some time. However, 'continue' was
assumed implicitly and recently got broken in multiple places. Fix it in
both TC hardware offload validation code and mlx5 driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Referenced commit prepared the code for upcoming extension that allows mlx5
to offload police action attached to flower classifier. However, with
regard to existing matchall classifier offload validation should be
reversed as FLOW_ACTION_CONTINUE is the only supported notexceed police
action type. Fix the problem by allowing FLOW_ACTION_CONTINUE for police
action and extend scan_tc_matchall_fdb_actions() to only allow such actions
with matchall classifier.
Fixes: d97b4b105ce7 ("flow_offload: reject offload for all drivers with invalid police parameters")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Acked-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Offloading police with action TC_ACT_UNSPEC was erroneously disabled even
though it was supported by mlx5 matchall offload implementation, which
didn't verify the action type but instead assumed that any single police
action attached to matchall classifier is a 'continue' action. Lack of
action type check made it non-obvious what mlx5 matchall implementation
actually supports and caused implementers and reviewers of referenced
commits to disallow it as a part of improved validation code.
Fixes: b8cd5831c61c ("net: flow_offload: add tc police action parameters")
Fixes: b50e462bc22d ("net/sched: act_police: Add extack messages for offload failure")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ratheesh Kannoth says:
====================
octeontx2: *** Exact Match Table and Field hash ***
*** Exact match table and Field hash support for CN10KB silicon ***
Ratheesh Kannoth (11):
These patch series enables exact match table in CN10KB silicon. Legacy
silicon used NPC mcam to do packet fields/channel matching for NPC rules.
NPC mcam resources exahausted as customer use case increased.
Supporting many DMAC filter becomes a challenge, as RPM based filter
count is less. Exact match table has 4way 2K entry table and a 32 entry
fully associative cam table. Second table is to handle hash
table collision overflows in 4way 2K entry table. Enabling exact match table
results in KEX key to be appended with Hit/Miss status. This can be used
to match in NPC mcam for a more generic rule and drop those packets than
having DMAC drop rules for each DMAC entry in NPC mcam.
octeontx2-af: Exact match support
octeontx2-af: Exact match scan from kex profile
octeontx2-af: devlink configuration support
octeontx2-af: FLR handler for exact match table.
octeontx2-af: Drop rules for NPC MCAM
octeontx2-af: Debugsfs support for exact match.
octeontx2: Modify mbox request and response structures
octeontx2-af: Wrapper functions for mac addr add/del/update/reset
octeontx2-af: Invoke exact match functions if supported
octeontx2-pf: Add support for exact match table.
octeontx2-af: Enable Exact match flag in kex profile
Suman Ghosh (1):
CN10KB variant of CN10K series of silicons supports
a new feature where in a large protocol field
(eg 128bit IPv6 DIP) can be condensed into a small
hashed 32bit data. This saves a lot of space in MCAM key
and allows user to add more protocol fields into the filter.
A max of two such protocol data can be hashed.
This patch adds support for hashing IPv6 SIP and/or DIP.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Enabled EXACT match flag in Kex default profile. Since
there is no space in key, NPC_PARSE_NIBBLE_ERRCODE
is removed
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NPC exact match table can support more entries than RPM
dmac filters. This requires field size of DMAC filter count
and index to be increased.
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If exact match table is suppoted, call functions to add/del/update
entries in exact match table instead of RPM dmac filters
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These functions are wrappers for mac add/addr/del/update in
exact match table. These will be invoked from mbox handler routines
if exact matct table is supported and enabled.
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Exact match table modification requires wider fields as it has
more number of slots to fill in. Modifying an entry in exact match
table may cause hash collision and may be required to delete entry
from 4-way 2K table and add to fully associative 32 entry CAM table.
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There debugfs files created.
1. General information on exact match table
2. Exact match table entries.
3. NPC mcam drop on hit count stats.
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
NPC exact match table installs drop on hit rules in
NPC mcam for each channel. This rule has broadcast and multicast
bits cleared. Exact match bit cleared and channel bits
set. If exact match table hit bit is 0, corresponding NPC mcam
drop rule will be hit for the packet and will be dropped.
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
FLR handler should remove/free all exact match table resources
corresponding to each interface.
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CN10KB silicon supports Exact match feature. This feature can be disabled
through devlink configuration. Devlink command fails if DMAC filter rules
are already present. Once disabled, legacy RPM based DMAC filters will be
configured.
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CN10KB silicon supports exact match table. Scanning KEX
profile should check for exact match feature is enabled
and then set profile masks properly.
These kex profile masks are required to configure NPC
MCAM drop rules. If there is a miss in exact match table,
these drop rules will drop those packets.
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CN10KB silicon has support for exact match table. This table
can be used to match maimum 64 bit value of KPU parsed output.
Hit/non hit in exact match table can be used as a KEX key to
NPC mcam.
This patch makes use of Exact match table to increase number of
DMAC filters supported. NPC mcam is no more need for each of these
DMAC entries as will be populated in Exact match table.
This patch implements following
1. Initialization of exact match table only for CN10KB.
2. Add/del/update interface function for exact match table.
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
CN10KB variant of CN10K series of silicons supports
a new feature where in a large protocol field
(eg 128bit IPv6 DIP) can be condensed into a small
hashed 32bit data. This saves a lot of space in MCAM key
and allows user to add more protocol fields into the filter.
A max of two such protocol data can be hashed.
This patch adds support for hashing IPv6 SIP and/or DIP.
Signed-off-by: Suman Ghosh <sumang@marvell.com>
Signed-off-by: Ratheesh Kannoth <rkannoth@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge branch 'nfp-tso'
Simon Horman says:
====================
nfp: enable TSO by default
this short series enables TSO by default on all NICs supported by the NFP
driver.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We can benefit from TSO when the host CPU is not powerful enough,
so enable it by default now.
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Packets with metadata prepended can be correctly handled in
firmware when TSO is enabled, now remove the error path and
related comments. Since there's no existing firmware that
uses prepended metadata, no need to add compatibility check
here.
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The feature test to detect the availability of zlib in bpftool's
Makefile does not bring much. The library is not optional: it may or may
not be required along libbfd for disassembling instructions, but in any
case it is necessary to build feature.o or even libbpf, on which bpftool
depends.
If we remove the feature test, we lose the nicely formatted error
message, but we get a compiler error about "zlib.h: No such file or
directory", which is equally informative. Let's get rid of the test.
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220705200456.285943-1-quentin@isovalent.com
Chuang Wang says:
====================
A potential scenario, when an error is returned after
add_uprobe_event_legacy() in perf_event_uprobe_open_legacy(), or
bpf_program__attach_perf_event_opts() in
bpf_program__attach_uprobe_opts() returns an error, the uprobe_event
that was previously created is not cleaned.
At the same time, the legacy kprobe_event also have similar problems.
With these patches, whenever an error is returned, it ensures that
the created kprobe_event/uprobe_event is cleaned.
V1 -> v3:
- add detail commits
- call remove_kprobe_event_legacy() on failed bpf_program__attach_perf_event_opts()
v3 -> v4:
- cleanup the legacy kprobe_event on failed add/attach_event
====================
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
A potential scenario, when an error is returned after
add_uprobe_event_legacy() in perf_event_uprobe_open_legacy(), or
bpf_program__attach_perf_event_opts() in
bpf_program__attach_uprobe_opts() returns an error, the uprobe_event
that was previously created is not cleaned.
So, with this patch, when an error is returned, fix this by adding
remove_uprobe_event_legacy()
Signed-off-by: Chuang Wang <nashuiliang@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220629151848.65587-4-nashuiliang@gmail.com