While searching for a possible refactor of napi_schedule_prep and
__napi_schedule, it was noticed that the mtk eth driver disables the
interrupt for rx and tx AFTER napi is scheduled.
While this is a very hard case to reproduce, it might result in a
situation where the interrupt is disabled and never enabled again,
because the napi completes and re-enables the interrupt before the
scheduling path disables it.
This is caused by the fact that an interrupt-driven napi expects the
following logic:
1. interrupt received. napi prepared -> interrupt disabled -> napi
scheduled
2. napi triggered. ring cleared -> interrupt enabled -> wait for new
interrupt
To prevent this case, disable the interrupt BEFORE the napi is
scheduled.
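A minimal sketch of the corrected ordering in an rx interrupt handler
(hedged: the helper names mirror the mtk driver but are illustrative
here, not a quote of the actual patch):

	static irqreturn_t mtk_handle_irq_rx(int irq, void *priv)
	{
		struct mtk_eth *eth = priv;

		if (likely(napi_schedule_prep(&eth->rx_napi))) {
			/* disable BEFORE scheduling, so a completing napi
			 * cannot re-enable the interrupt underneath us
			 */
			mtk_rx_irq_disable(eth, MTK_RX_DONE_INT);
			__napi_schedule(&eth->rx_napi);
		}

		return IRQ_HANDLED;
	}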
Fixes: 656e705243fd ("net-next: mediatek: add support for MT7623 ethernet")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Link: https://lore.kernel.org/r/20231002140805.568-1-ansuelsmth@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Before Google adopted FQ for its production servers,
we had to ensure AF4 packets would get a higher share
than BE1 ones.
As discussed this week in Netconf 2023 in Paris, it is time
to upstream this for public use.
After this patch FQ can replace pfifo_fast, with the following
differences:
- FQ uses WRR instead of strict prio, to avoid starvation of
low priority packets.
- We make sure each band/prio tracks its own usage against sch->limit.
This was done to make sure a flood of low priority packets would not
prevent AF4 packets from being queued. Contributed by Willem.
- priomap can be changed, if needed (the default values are the ones
coming from pfifo_fast).
In this patch, we set default band weights so that:
- high prio (band=0) packets get 90% of the bandwidth
if they compete with low prio (band=2) packets.
- high prio packets get 75% of the bandwidth
if they compete with medium prio (band=1) packets.
The following patch in this series adds the possibility to tune
the per-band weights.
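As a sanity check, the percentages above are consistent with per-band
weights in a 9:3:1 ratio (illustrative numbers only; the actual default
constants are defined in the patch itself):

	/* band 0 vs band 2: 9 / (9 + 1) = 90% of the bandwidth
	 * band 0 vs band 1: 9 / (9 + 3) = 75% of the bandwidth
	 */
	static const unsigned int example_band_weights[3] = { 9, 3, 1 };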
As we added many fields in 'struct fq_sched_data', we had
to make sure to have the first cache line read-mostly, and
avoid wasting precious cache lines.
More optimizations are possible but will be sent separately.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
pfifo_fast prio2band[] is renamed to sch_default_prio2band[]
and exported because we want to share it in FQ.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Now that both enqueue() and dequeue() need to use ktime_get_ns(),
there is no point wasting 8 bytes in struct fq_sched_data.
This makes room for future fields. ;)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Handle the case when the GSO SKB linear length is too large.
The MANA NIC requires GSO packets to put only the header part in SGE0;
otherwise the TX queue may stop at the HW level.
So, use two SGEs for the skb linear part when it contains more than the
packet header.
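A hedged sketch of the resulting SGE layout (fill_sge() and header_len
are illustrative placeholders, not the actual mana driver helpers):

	if (skb_is_gso(skb) && skb_headlen(skb) > header_len) {
		/* SGE0: only the protocol headers, as the HW requires */
		fill_sge(&sges[0], skb->data, header_len);
		/* SGE1: the rest of the skb linear data */
		fill_sge(&sges[1], skb->data + header_len,
			 skb_headlen(skb) - header_len);
	}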
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
sizeof(struct hop_jumbo_hdr) is not part of tso_bytes, so remove
the subtraction from header size.
Cc: stable@vger.kernel.org
Fixes: bd7fc6e1957c ("net: mana: Add new MANA VF performance counters for easier troubleshooting")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
For an unknown TX CQE error type (probably from a newer hardware),
still free the SKB, update the queue tail, etc., otherwise the
accounting will be wrong.
Also, TX errors can be triggered by injecting corrupted packets, so
replace the WARN_ONCE to ratelimited error logging.
Cc: stable@vger.kernel.org
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Since '!(0x5ea42b & 0xffff0000)' is always false, remove the unreachable
block in 'rtl92d_dm_check_edca_turbo()' and convert the EDCA limits to
constants. Compile tested only.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
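To see why the condition is always false, here is a standalone check
(not driver code):

	#include <stdio.h>

	int main(void)
	{
		/* the mask keeps the upper 16 bits; 0x5ea42b has 0x005e
		 * there, so the AND yields 0x5e0000, which is nonzero,
		 * and !nonzero is always 0
		 */
		unsigned int v = 0x5ea42b & 0xffff0000;

		printf("0x%x -> %d\n", v, !v); /* prints 0x5e0000 -> 0 */
		return 0;
	}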
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Acked-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231003043318.11370-1-dmantipov@yandex.ru
Since the current TX power code is for ax chips, add a suffix `_ax` to
it. Then, when requested to show the txpwr table, select the table
according to chip generation first.
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231003015446.14658-7-pkshih@realtek.com
Wi-Fi 6 chips and Wi-Fi 7 chips have different register designs for TX
power RU limit. We rename the original setting code with a suffix `_ax`,
concentrate the related enum declarations in phy.h, and implement the
setting flow for Wi-Fi 7 chips. Then, we set the TX power RU limit
according to chip generation.
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231003015446.14658-6-pkshih@realtek.com
Wi-Fi 6 chips and Wi-Fi 7 chips have different register designs for
TX power limit. We rename the original setting code with a suffix `_ax`,
concentrate the related enum declarations in phy.h, and implement the
setting flow for Wi-Fi 7 chips. Then, we set the TX power limit
according to chip generation.
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231003015446.14658-5-pkshih@realtek.com
We have a register that controls the TX power of each rate section by
increasing or decreasing an offset. But Wi-Fi 6 chips and Wi-Fi 7 chips
have a different address and format for this control register. We rename
the original setting code with a suffix `_ax` and implement the setting
flow for Wi-Fi 7 chips. Then, we set the TX power offset according to
chip generation.
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231003015446.14658-4-pkshih@realtek.com
Wi-Fi 6 chips and Wi-Fi 7 chips have different register designs for
TX power by rate. We rename the original setting code with a suffix
`_ax` and implement the setting flow for Wi-Fi 7 chips. Then, we set TX
power by rate according to chip generation.
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231003015446.14658-3-pkshih@realtek.com
There are two differences between Wi-Fi 6 and Wi-Fi 7 chips:
1. Address range of the TX power control registers
2. Checking code to get a TX power control register
So, separate their implementations, access them according to
chip generation, and rename the original ones with a suffix `_ax`.
Signed-off-by: Zong-Zhe Yang <kevin_yang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/20231003015446.14658-2-pkshih@realtek.com
Merge tag 'rtla-v6.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bristot/linux
Pull rtla fixes from Daniel Bristot de Oliveira:
"rtla (Real-Time Linux Analysis) tool fixes.
Timerlat auto-analysis:
- Timerlat is reporting thread interference time without the
occurrence of thread noise events. This was caused by the thread
interference variable not being reset after the analysis of a
timerlat activation that did not hit the threshold.
- The IRQ handler delay is estimated from the delta of the IRQ
latency reported by timerlat and the timestamp of the IRQ handler
start event. If the delta is near zero, drift between the external
clock and the trace event and/or the overhead can cause the value
to be negative. If the value is negative, print a zero delay.
- IRQ handlers happening after the timerlat thread event but before
tracing stops were being reported as IRQs that happened before
the *current* IRQ occurrence. Ignore previous IRQ noise in this
condition because it is valid only for the *next* timerlat
activation.
Timerlat user-space:
- Timerlat is stopping all user-space threads if a CPU becomes
offline. Do not stop the entire tool if a CPU is or becomes
offline, but only the thread of the unavailable CPU. Stop the tool
only if all threads exit because the CPUs are offline.
man-pages:
- Fix command line example in timerlat hist man page"
* tag 'rtla-v6.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bristot/linux:
rtla: fix a example in rtla-timerlat-hist.rst
rtla/timerlat: Do not stop user-space if a cpu is offline
rtla/timerlat_aa: Fix previous IRQ delay for IRQs that happens after thread sample
rtla/timerlat_aa: Fix negative IRQ delay
rtla/timerlat_aa: Zero thread sum after every sample analysis
Jakub Kicinski says:
====================
ynl Makefile cleanup
While catching up on recent changes I noticed unexpected
changes to Makefiles in YNL. Indeed, they were not working
as intended, but the fixes put in place were not what I had
in mind :)
====================
Link: https://lore.kernel.org/r/20231003153416.2479808-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Makefile.deps provides direct includes in CFLAGS_$(obj).
We just need to rewrite the rules to make use of the extra
flags, no need to hard-include all of tools/include/uapi.
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20231003153416.2479808-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As far as I can tell, the normal Makefile dependency tracking
works: generated files get re-generated if the YAML was updated.
Let make do its job; don't force the re-generation.
make hardclean can be used to force regeneration.
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20231003153416.2479808-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
enum-as-flags can be used when an enum declares bit positions but
we want to carry a bitmask in an attribute. If the definition
is already provided as flags, there's no need to indicate
the flag-iness of the attribute.
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20231003153416.2479808-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
1) Exit early if the list is empty.
2) Splice the list into a local list,
so that we block hard irqs only once.
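A minimal sketch of the pattern with hypothetical names (the actual
change is in the networking deferred-free path):

	static void process_deferred(struct defer_queue *q)
	{
		LIST_HEAD(local);
		unsigned long flags;

		/* 1) lock-free early exit when there is nothing to do */
		if (list_empty(&q->head))
			return;

		/* 2) block hard irqs only once: move everything onto a
		 * local list, then process with irqs enabled again
		 */
		spin_lock_irqsave(&q->lock, flags);
		list_splice_init(&q->head, &local);
		spin_unlock_irqrestore(&q->lock, flags);

		/* walk @local without the lock held */
	}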
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231003181920.3280453-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently, when hb_interval is changed by users, it won't take effect
until the next expiry of the hb timer. As the default value is 30s,
users may have to wait up to 30s for the hb_interval update to take
effect. This becomes pretty bad in containers, where a much smaller
value is usually set for hb_interval. This patch improves it by
resetting the hb timer immediately once the value of hb_interval is
updated by users.
Note that we don't address the already existing 'problem' when sending
a heartbeat 'on demand' if one hb has just been sent (from the timer),
mentioned in:
https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg590224.html
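A hedged sketch of the idea (the helper names follow net/sctp
conventions, but treat the exact call site as an assumption):

	/* after storing the new interval on the transport ... */
	trans->hbinterval = msecs_to_jiffies(new_hb_interval_ms);

	/* ... restart a pending heartbeat timer so the new value takes
	 * effect immediately instead of after the old (up to 30s) expiry
	 */
	if (timer_pending(&trans->hb_timer))
		sctp_transport_reset_hb_timer(trans);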
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Link: https://lore.kernel.org/r/75465785f8ee5df2fb3acdca9b8fafdc18984098.1696172660.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
During the 4-way handshake, the transport's state is set to ACTIVE in
sctp_process_init() when processing INIT_ACK chunk on client or
COOKIE_ECHO chunk on server.
In the collision scenario below:
192.168.1.2 > 192.168.1.1: sctp (1) [INIT] [init tag: 3922216408]
192.168.1.1 > 192.168.1.2: sctp (1) [INIT] [init tag: 144230885]
192.168.1.2 > 192.168.1.1: sctp (1) [INIT ACK] [init tag: 3922216408]
192.168.1.1 > 192.168.1.2: sctp (1) [COOKIE ECHO]
192.168.1.2 > 192.168.1.1: sctp (1) [COOKIE ACK]
192.168.1.1 > 192.168.1.2: sctp (1) [INIT ACK] [init tag: 3914796021]
when processing COOKIE_ECHO on 192.168.1.2, as it's in COOKIE_WAIT state,
sctp_sf_do_dupcook_b() is called by sctp_sf_do_5_2_4_dupcook(), where it
creates a new association, sets its transport to ACTIVE, and then applies
the update to the old association in sctp_assoc_update().
However, in sctp_assoc_update(), it will skip the transport update if it
finds a transport with the same ipaddr already existing in the old asoc,
and this causes the old asoc's transport state not to move to ACTIVE
after the handshake.
This means if DATA retransmission happens at this moment, it won't be able
to enter PF state because of the check 'transport->state == SCTP_ACTIVE'
in sctp_do_8_2_transport_strike().
This patch fixes it by updating the transport in sctp_assoc_update() with
sctp_assoc_add_peer(), which updates the transport state if a transport
with the same ipaddr already exists in the old asoc.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Link: https://lore.kernel.org/r/fd17356abe49713ded425250cc1ae51e9f5846c6.1696172325.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-10-03 (i40e, iavf)
This series contains updates to i40e and iavf drivers.
Yajun Deng aligns reporting of buffer exhaustion statistics to follow
documentation for i40e.
Jake removes undesired 'inline' from functions in iavf.
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
iavf: remove "inline" functions from iavf_txrx.c
i40e: Add rx_missed_errors for buffer exhaustion
====================
Link: https://lore.kernel.org/r/20231003223610.2004976-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Nathan Chancellor says:
====================
Fix a couple of recent instances of -Wincompatible-function-pointer-types-strict from ->mode_get() implementations
This series fixes a couple of instances of
-Wincompatible-function-pointer-types-strict that were introduced by a
recent series that added a new type of ops, struct dpll_device_ops,
along with implementations of the callback ->mode_get() that had a
mismatched mode type.
This warning is not currently enabled for any build, but I am planning on
submitting a patch to add it to W=1 to prevent new instances of the
warning from popping up while we try to fix the existing instances in
other drivers.
This series is based on current net-next but if they need to go into
individual maintainer trees, please feel free to take the patches
individually.
====================
Link: https://lore.kernel.org/r/20231002-net-wifpts-dpll_mode_get-v1-0-a356a16413cf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When building with -Wincompatible-function-pointer-types-strict, a
warning designed to catch potential kCFI failures at build time rather
than run time due to incorrect function pointer types, there is a
warning due to a mismatch between the type of the mode parameter in
mlx5_dpll_device_mode_get() vs. what the function pointer prototype for
->mode_get() in 'struct dpll_device_ops' expects.
drivers/net/ethernet/mellanox/mlx5/core/dpll.c:141:14: error: incompatible function pointer types initializing 'int (*)(const struct dpll_device *, void *, enum dpll_mode *, struct netlink_ext_ack *)' with an expression of type 'int (const struct dpll_device *, void *, u32 *, struct netlink_ext_ack *)' (aka 'int (const struct dpll_device *, void *, unsigned int *, struct netlink_ext_ack *)') [-Werror,-Wincompatible-function-pointer-types-strict]
141 | .mode_get = mlx5_dpll_device_mode_get,
| ^~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
Change the type of the mode parameter in mlx5_dpll_device_mode_get() to
clear up the warning and avoid kCFI failures at run time.
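A simplified sketch of the fix pattern (the body is illustrative; only
the parameter type change matters for kCFI, which verifies the exact
function type on every indirect call):

	static int mlx5_dpll_device_mode_get(const struct dpll_device *dpll,
					     void *priv,
					     enum dpll_mode *mode, /* was: u32 *mode */
					     struct netlink_ext_ack *extack)
	{
		*mode = DPLL_MODE_MANUAL; /* illustrative return value */
		return 0;
	}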
Fixes: 496fd0a26bbf ("mlx5: Implement SyncE support using DPLL infrastructure")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231002-net-wifpts-dpll_mode_get-v1-2-a356a16413cf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When building with -Wincompatible-function-pointer-types-strict, a
warning designed to catch potential kCFI failures at build time rather
than run time due to incorrect function pointer types, there is a
warning due to a mismatch between the type of the mode parameter in
ptp_ocp_dpll_mode_get() vs. what the function pointer prototype for
->mode_get() in 'struct dpll_device_ops' expects.
drivers/ptp/ptp_ocp.c:4353:14: error: incompatible function pointer types initializing 'int (*)(const struct dpll_device *, void *, enum dpll_mode *, struct netlink_ext_ack *)' with an expression of type 'int (const struct dpll_device *, void *, u32 *, struct netlink_ext_ack *)' (aka 'int (const struct dpll_device *, void *, unsigned int *, struct netlink_ext_ack *)') [-Werror,-Wincompatible-function-pointer-types-strict]
4353 | .mode_get = ptp_ocp_dpll_mode_get,
| ^~~~~~~~~~~~~~~~~~~~~
1 error generated.
Change the type of the mode parameter in ptp_ocp_dpll_mode_get() to
clear up the warning and avoid kCFI failures at run time.
Fixes: 09eeb3aecc6c ("ptp_ocp: implement DPLL ops")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20231002-net-wifpts-dpll_mode_get-v1-1-a356a16413cf@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Hayes Wang says:
====================
r8152: modify rx_bottom
v3:
For patch #1, this patch is replaced. The new patch only breaks the loop,
and keeps the driver queuing the rx packets.
For patch #2, modify the code depending on patch #1. For work_done < budget,
napi_get_frags() and napi_gro_frags() would be used. For the others,
nothing is changed.
v2:
For patch #1, add comment, update commit message, and add Fixes tag.
v1:
These patches are used to improve rx_bottom().
====================
Link: https://lore.kernel.org/r/20230926111714.9448-432-nic_swsd@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
A USB bulk transfer may contain many packets. And the total
number of packets in the bulk transfer may be more than the budget.
Originally, only budget packets would be handled by napi_gro_receive(),
and the other packets would be queued in the driver for the next
schedule.
This patch breaks the loop that gets the next bulk transfer when
the budget is exhausted. That is, only the current bulk transfer would
be handled, and the other bulk transfers would be queued for the next
schedule. Besides, the packets in the current bulk transfer which exceed
the budget would still be queued in the driver, as in the original
method.
In addition, a bulk transfer wouldn't contain more than 400 packets, so
the check of the queue length is unnecessary. Therefore, replace it with
a WARN_ON_ONCE().
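A rough sketch of the control-flow change (structure and names are
hypothetical; the real code is rx_bottom() in drivers/net/usb/r8152.c):

	list_for_each_entry_safe(agg, agg_next, &rx_queue, list) {
		/* hand up to the remaining budget of packets from this
		 * bulk transfer; extra packets stay queued, as before
		 */
		work_done += handle_bulk_transfer(agg, budget - work_done);

		/* new: stop fetching further bulk transfers once the
		 * budget is exhausted; they stay queued for the next
		 * NAPI schedule
		 */
		if (work_done >= budget)
			break;
	}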
Fixes: cf74eb5a5bc8 ("eth: r8152: try to use a normal budget")
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Link: https://lore.kernel.org/r/20230926111714.9448-433-nic_swsd@realtek.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kees Cook says:
====================
chelsio: Annotate structs with __counted_by
This annotates several chelsio structures with the coming __counted_by
attribute for bounds checking of flexible arrays at run-time. For more details,
see commit dd06e72e68bc ("Compiler Attributes: Add __counted_by macro").
====================
Link: https://lore.kernel.org/r/20230929181042.work.990-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run time via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).
As found with Coccinelle[1], add __counted_by for struct smt_data.
[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
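For illustration, the annotation looks roughly like this (the
surrounding fields are an assumption about struct smt_data, not a quote
of the driver source):

	struct smt_data {
		unsigned int smt_size;	/* number of entries in smtab[] */
		rwlock_t lock;
		struct smt_entry smtab[] __counted_by(smt_size);
	};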
Cc: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230929181149.3006432-5-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run time via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).
As found with Coccinelle[1], add __counted_by for struct sched_table.
[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Cc: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230929181149.3006432-4-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run time via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).
As found with Coccinelle[1], add __counted_by for struct cxgb4_tc_u32_table.
[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Cc: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230929181149.3006432-3-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run time via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).
As found with Coccinelle[1], add __counted_by for struct clip_tbl.
[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Cc: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230929181149.3006432-2-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run time via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).
As found with Coccinelle[1], add __counted_by for struct l2t_data.
[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci
Cc: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20230929181149.3006432-1-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This commit fixes poor delayed ACK behavior that can cause poor TCP
latency in a particular boundary condition: when an application makes
a TCP socket write that is an exact multiple of the MSS size.
The problem is that there is painful boundary discontinuity in the
current delayed ACK behavior. With the current delayed ACK behavior,
we have:
(1) If an app reads data when > 1*MSS is unacknowledged, then
tcp_cleanup_rbuf() ACKs immediately because of:
tp->rcv_nxt - tp->rcv_wup > icsk->icsk_ack.rcv_mss ||
(2) If an app reads all received data, and the packets were < 1*MSS,
and either (a) the app is not ping-pong or (b) we received two
packets < 1*MSS, then tcp_cleanup_rbuf() ACKs immediately because
of:
((icsk->icsk_ack.pending & ICSK_ACK_PUSHED2) ||
((icsk->icsk_ack.pending & ICSK_ACK_PUSHED) &&
!inet_csk_in_pingpong_mode(sk))) &&
(3) *However*: if an app reads exactly 1*MSS of data,
tcp_cleanup_rbuf() does not send an immediate ACK. This is true
even if the app is not ping-pong and the 1*MSS of data had the PSH
bit set, suggesting the sending application completed an
application write.
Thus if the app is not ping-pong, we have this painful case where
>1*MSS gets an immediate ACK, and <1*MSS gets an immediate ACK, but a
write whose last skb is an exact multiple of 1*MSS can get a 40ms
delayed ACK. This means that any app that transfers data in one
direction and takes care to align write size or packet size with MSS
can suffer this problem. With receive zero copy making 4KB MSS values
more common, it is becoming more common to have application writes
naturally align with MSS, and more applications are likely to
encounter this delayed ACK problem.
The fix in this commit is to refine the delayed ACK heuristics with a
simple check: immediately ACK a received 1*MSS skb with PSH bit set if
the app reads all data. Why? If an skb has a len of exactly 1*MSS and
has the PSH bit set then it is likely the end of an application
write. So more data may not be arriving soon, and yet the data sender
may be waiting for an ACK if cwnd-bound or using TX zero copy. Thus we
set ICSK_ACK_PUSHED in this case so that tcp_cleanup_rbuf() will send
an ACK immediately if the app reads all of the data and is not
ping-pong. Note that this logic is also executed for the case where
len > MSS, but in that case this logic does not matter (and does not
hurt) because tcp_cleanup_rbuf() will always ACK immediately if the
app reads data and there is more than an MSS of unACKed data.
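A condensed sketch of the heuristic (simplified; treat the exact
placement in the receive path as an assumption):

	/* skb len equals rcv_mss and PSH is set: likely the end of an
	 * application write, so make tcp_cleanup_rbuf() ACK immediately
	 * if the app drains the receive queue
	 */
	if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_PSH)
		icsk->icsk_ack.pending |= ICSK_ACK_PUSHED;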
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Cc: Xin Guo <guoxin0309@gmail.com>
Link: https://lore.kernel.org/r/20231001151239.1866845-2-ncardwell.sw@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This commit fixes quick-ack counting so that it only considers that a
quick-ack has been provided if we are sending an ACK that newly
acknowledges data.
The code was erroneously using the number of data segments in outgoing
skbs when deciding how many quick-ack credits to remove. This logic
does not make sense, and could cause poor performance in
request-response workloads, like RPC traffic, where requests or
responses can be multi-segment skbs.
When a TCP connection decides to send N quick-acks, that is to
accelerate the cwnd growth of the congestion control module
controlling the remote endpoint of the TCP connection. That quick-ack
decision is purely about the incoming data and outgoing ACKs. It has
nothing to do with the outgoing data or the size of outgoing data.
And in particular, an ACK only serves the intended purpose of allowing
the remote congestion control to grow the congestion window quickly if
the ACK is ACKing or SACKing new data.
The fix is simple: only count packets as serving the goal of the
quickack mechanism if they are ACKing/SACKing new data. We can tell
whether this is the case by checking inet_csk_ack_scheduled(), since
we schedule an ACK exactly when we are ACKing/SACKing new data.
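A sketch of the corrected accounting (simplified; the credit is consumed
per transmitted segment that carries a scheduled ACK, not per data
segment):

	/* only count this transmission against the quick-ack budget if
	 * it ACKs/SACKs new data, i.e. an ACK was actually scheduled
	 */
	if (inet_csk_ack_scheduled(sk))
		tcp_dec_quickack_mode(sk);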
Fixes: fc6415bcb0f5 ("[TCP]: Fix quick-ack decrementing with TSO.")
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20231001151239.1866845-1-ncardwell.sw@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Merge tag 'nf-23-10-04' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Florian Westphal says:
====================
netfilter patches for net
The first patch resolves a regression with vlan header matching; this
was broken since the 6.5 release. From myself.
The second patch fixes an ancient problem with sctp connection tracking
in case INIT_ACK packets are delayed. This comes with a selftest; both
patches from Xin Long.
Patch 4 extends the existing nftables audit selftest, from
Phil Sutter.
Patch 5, also from Phil, avoids a situation where nftables
would emit an audit record twice. This was broken since 5.13 days.
Patch 6, from myself, avoids a spurious insertion failure if we
encounter an overlapping but expired range during element insertion
with the 'nft_set_rbtree' backend. This problem has existed since 6.2.
* tag 'nf-23-10-04' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_tables: nft_set_rbtree: fix spurious insertion failure
netfilter: nf_tables: Deduplicate nft_register_obj audit logs
selftests: netfilter: Extend nft_audit.sh
selftests: netfilter: test for sctp collision processing in nf_conntrack
netfilter: handle the connecting collision properly in nf_conntrack_proto_sctp
netfilter: nft_payload: rebuild vlan header on h_proto access
====================
Link: https://lore.kernel.org/r/20231004141405.28749-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Merge tag 'nf-next-23-09-28' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Florian Westphal says:
====================
netfilter updates for net-next
The first patch, from myself, is a bug fix. The issue (connect timeout)
is ancient, so I think it's safe to give this more soak time given the
esoteric conditions needed to trigger this.
Also updates the existing selftest to cover this.
Add netlink extacks when an update references a non-existent
table/chain/set. This allows userspace to provide much better
errors to the user, from Pablo Neira Ayuso.
Last patch adds more policy checks to nf_tables as a better
alternative to the existing runtime checks, from Phil Sutter.
* tag 'nf-next-23-09-28' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: nf_tables: Utilize NLA_POLICY_NESTED_ARRAY
netfilter: nf_tables: missing extended netlink error in lookup functions
selftests: netfilter: test nat source port clash resolution interaction with tcp early demux
netfilter: nf_nat: undo erroneous tcp edemux lookup after port clash
====================
Link: https://lore.kernel.org/r/20230928144916.18339-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
It seems that tipc_crypto_key_revoke() could be invoked by the
workqueue tipc_crypto_work_rx() under process context and by a
timer/rx callback under softirq context; thus the lock acquisition
on &tx->lock had better use spin_lock_bh() to prevent a possible
deadlock.
This flaw was found by an experimental static analysis tool I am
developing for irq-related deadlocks.
tipc_crypto_work_rx() <workqueue>
--> tipc_crypto_key_distr()
--> tipc_bcast_xmit()
--> tipc_bcbase_xmit()
--> tipc_bearer_bc_xmit()
--> tipc_crypto_xmit()
--> tipc_ehdr_build()
--> tipc_crypto_key_revoke()
--> spin_lock(&tx->lock)
<timer interrupt>
--> tipc_disc_timeout()
--> tipc_bearer_xmit_skb()
--> tipc_crypto_xmit()
--> tipc_ehdr_build()
--> tipc_crypto_key_revoke()
--> spin_lock(&tx->lock) <deadlock here>
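A minimal sketch of the fix (simplified; the real function is
tipc_crypto_key_revoke() in net/tipc/crypto.c). The _bh variant disables
bottom halves, so the timer path shown above can no longer interrupt the
critical section on the same CPU:

	static void tipc_crypto_key_revoke(struct net *net, u8 tx_key)
	{
		struct tipc_crypto *tx = tipc_net(net)->crypto_tx;

		spin_lock_bh(&tx->lock);	/* was: spin_lock(&tx->lock) */
		/* ... revoke and withdraw the key ... */
		spin_unlock_bh(&tx->lock);	/* was: spin_unlock(&tx->lock) */
	}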
Signed-off-by: Chengfeng Ye <dg573847474@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication")
Link: https://lore.kernel.org/r/20230927181414.59928-1-dg573847474@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The STM32MP1 keeps clk_rx enabled during suspend, and therefore the
driver does not enable the clock in stm32_dwmac_init() if the device was
suspended. The problem is that this same code runs on STM32 MCUs, which
do disable clk_rx during suspend, causing the clock to never be
re-enabled on resume.
This patch adds a variant flag to indicate that clk_rx remains enabled
during suspend, and uses this to decide whether to enable the clock in
stm32_dwmac_init() if the device was suspended.
This approach fixes this specific bug with limited opportunity for
unintended side-effects, but I have a follow up patch that will refactor
the clock configuration and hopefully make it less error prone.
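A sketch of the intended check (the flag and field names are
assumptions, not the exact driver symbols):

	/* decide whether stm32_dwmac_init() must enable clk_rx */
	static bool stm32_dwmac_need_clk_rx(struct stm32_dwmac *dwmac,
					    bool resuming)
	{
		/* variants that keep clk_rx running across suspend (MP1)
		 * must not re-enable it on resume; MCU variants gate it
		 * during suspend, so resume must re-enable it
		 */
		return !dwmac->variant_keeps_clk_rx_in_suspend || !resuming;
	}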
Fixes: 6528e02cc9ff ("net: ethernet: stmmac: add adaptation for stm32mp157c.")
Signed-off-by: Ben Wolsieffer <ben.wolsieffer@hefring.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230927175749.1419774-1-ben.wolsieffer@hefring.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There is a copy-and-paste error, so this uses a valid pointer instead of
an error pointer.
Fixes: 09eeb3aecc6c ("ptp_ocp: implement DPLL ops")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://lore.kernel.org/r/5c581336-0641-48bd-88f7-51984c3b1f79@moroto.mountain
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Remove the mt753x_phylink_pcs_link_up() function for two reasons:
1) priv->pcs[i].pcs.neg_mode is set to true, meaning it no longer takes
MLO_AN_FIXED, but one of PHYLINK_PCS_NEG_*. However, this
is inconsequential due to...
2) priv->pcs[port].pcs.ops is always initialised to point at
mt7530_pcs_ops, which does not have a pcs_link_up() member.
So, let's remove mt753x_phylink_pcs_link_up() entirely.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/E1qlTQS-008BWe-Va@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>