linux

iv/linux

Author	SHA1	Message	Date
Jakub Kicinski	d3a4706339	Merge branch 'rocker-two-small-changes' Ido Schimmel says: ==================== rocker: Two small changes Patch #1 avoids allocating and scheduling a work item when it is not doing any work. Patch #2 aligns rocker with other switchdev drivers to explicitly mark FDB entries as offloaded. Needed for upcoming MAB offload [1]. [1] https://lore.kernel.org/netdev/20221025100024.1287157-1-idosch@nvidia.com/ ==================== Link: https://lore.kernel.org/r/20221101123936.1900453-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-02 20:45:27 -07:00
Ido Schimmel	386b417482	rocker: Explicitly mark learned FDB entries as offloaded Currently, FDB entries that are notified to the bridge driver via 'SWITCHDEV_FDB_ADD_TO_BRIDGE' are always marked as offloaded by the bridge. With MAB enabled, this will no longer be universally true. Device drivers will report locked FDB entries to the bridge to let it know that the corresponding hosts required authorization, but it does not mean that these entries are necessarily programmed in the underlying hardware. We would like to solve it by having the bridge driver determine the offload indication based of the 'offloaded' bit in the FDB notification [1]. Prepare for that change by having rocker explicitly mark learned FDB entries as offloaded. This is consistent with all the other switchdev drivers. [1] https://lore.kernel.org/netdev/20221025100024.1287157-4-idosch@nvidia.com/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-02 20:45:23 -07:00
Ido Schimmel	42e51de97c	rocker: Avoid unnecessary scheduling of work item The work item function ofdpa_port_fdb_learn_work() does not do anything when 'OFDPA_OP_FLAG_LEARNED' is not set in the work item's flags. Therefore, do not allocate and do not schedule the work item when the flag is not set. Suggested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-02 20:45:23 -07:00
Jiri Slaby (SUSE)	3319dbb3e7	sfc (gcc13): synchronize ef100_enqueue_skb()'s return type ef100_enqueue_skb() generates a valid warning with gcc-13: drivers/net/ethernet/sfc/ef100_tx.c:370:5: error: conflicting types for 'ef100_enqueue_skb' due to enum/integer mismatch; have 'int(struct efx_tx_queue , struct sk_buff )' drivers/net/ethernet/sfc/ef100_tx.h:25:13: note: previous declaration of 'ef100_enqueue_skb' with type 'netdev_tx_t(struct efx_tx_queue , struct sk_buff )' I.e. the type of the ef100_enqueue_skb()'s return value in the declaration is int, while the definition spells enum netdev_tx_t. Synchronize them to the latter. Cc: Martin Liska <mliska@suse.cz> Cc: Edward Cree <ecree.xilinx@gmail.com> Cc: Martin Habets <habetsm.xilinx@gmail.com> Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Link: https://lore.kernel.org/r/20221031114440.10461-1-jirislaby@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-02 20:38:17 -07:00
Jiri Slaby (SUSE)	777fa87c76	bonding (gcc13): synchronize bond_{a,t}lb_xmit() types Both bond_alb_xmit() and bond_tlb_xmit() produce a valid warning with gcc-13: drivers/net/bonding/bond_alb.c:1409:13: error: conflicting types for 'bond_tlb_xmit' due to enum/integer mismatch; have 'netdev_tx_t(struct sk_buff , struct net_device )' ... include/net/bond_alb.h:160:5: note: previous declaration of 'bond_tlb_xmit' with type 'int(struct sk_buff , struct net_device )' drivers/net/bonding/bond_alb.c:1523:13: error: conflicting types for 'bond_alb_xmit' due to enum/integer mismatch; have 'netdev_tx_t(struct sk_buff , struct net_device )' ... include/net/bond_alb.h:159:5: note: previous declaration of 'bond_alb_xmit' with type 'int(struct sk_buff , struct net_device )' I.e. the return type of the declaration is int, while the definitions spell netdev_tx_t. Synchronize both of them to the latter. Cc: Martin Liska <mliska@suse.cz> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Cc: Veaceslav Falico <vfalico@gmail.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Link: https://lore.kernel.org/r/20221031114409.10417-1-jirislaby@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-02 20:38:13 -07:00
Jiri Slaby (SUSE)	7d84118229	qed (gcc13): use u16 for fid to be big enough gcc 13 correctly reports overflow in qed_grc_dump_addr_range(): In file included from drivers/net/ethernet/qlogic/qed/qed.h:23, from drivers/net/ethernet/qlogic/qed/qed_debug.c:10: drivers/net/ethernet/qlogic/qed/qed_debug.c: In function 'qed_grc_dump_addr_range': include/linux/qed/qed_if.h:1217:9: error: overflow in conversion from 'int' to 'u8' {aka 'unsigned char'} changes value from '(int)vf_id << 8 \| 128' to '128' [-Werror=overflow] We do: u8 fid; ... fid = vf_id << 8 \| 128; Since fid is 16bit (and the stored value above too), fid should be u16, not u8. Fix that. Cc: Martin Liska <mliska@suse.cz> Cc: Ariel Elior <aelior@marvell.com> Cc: Manish Chopra <manishc@marvell.com> Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Link: https://lore.kernel.org/r/20221031114354.10398-1-jirislaby@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-02 20:38:08 -07:00
Rafał Miłecki	471ef777ec	net: broadcom: bcm4908_enet: report queued and transmitted bytes This allows BQL to operate avoiding buffer bloat and reducing latency. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Link: https://lore.kernel.org/r/20221031104856.32388-1-zajec5@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-02 20:38:04 -07:00
Suman Ghosh	2cee6401c4	octeontx2-af: Allow mkex profile without DMAC and add L2M/L2B header extraction support 1. It is possible to have custom mkex profiles which do not extract DMAC at all into the key. Hence allow mkex profiles which do not have DMAC to be loaded into MCAM hardware. This patch also adds debugging prints needed to identify profiles with wrong configuration. 2. If a mkex profile set "l2l3mb" field for Rx interface, then Rx multicast and broadcast entry should be configured. Signed-off-by: Suman Ghosh <sumang@marvell.com> Link: https://lore.kernel.org/r/20221031090856.1404303-1-sumang@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-02 20:27:04 -07:00
Jakub Kicinski	b54a0d4094	bpf-next-for-netdev -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCY2GuKgAKCRDbK58LschI gy32AP9PI0e/bUGDExKJ8g97PeeEtnpj4TTI6g+XKILtYnyXlgD/Rk4j2D/f3IBF Ha9TmqYvAUim+U/g50vUrNuoNLNJ5w8= =OKC1 -----END PGP SIGNATURE----- Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== bpf-next 2022-11-02 We've added 70 non-merge commits during the last 14 day(s) which contain a total of 96 files changed, 3203 insertions(+), 640 deletions(-). The main changes are: 1) Make cgroup local storage available to non-cgroup attached BPF programs such as tc BPF ones, from Yonghong Song. 2) Avoid unnecessary deadlock detection and failures wrt BPF task storage helpers, from Martin KaFai Lau. 3) Add LLVM disassembler as default library for dumping JITed code in bpftool, from Quentin Monnet. 4) Various kprobe_multi_link fixes related to kernel modules, from Jiri Olsa. 5) Optimize x86-64 JIT with emitting BMI2-based shift instructions, from Jie Meng. 6) Improve BPF verifier's memory type compatibility for map key/value arguments, from Dave Marchevsky. 7) Only create mmap-able data section maps in libbpf when data is exposed via skeletons, from Andrii Nakryiko. 8) Add an autoattach option for bpftool to load all object assets, from Wang Yufen. 9) Various memory handling fixes for libbpf and BPF selftests, from Xu Kuohai. 10) Initial support for BPF selftest's vmtest.sh on arm64, from Manu Bretelle. 11) Improve libbpf's BTF handling to dedup identical structs, from Alan Maguire. 12) Add BPF CI and denylist documentation for BPF selftests, from Daniel Müller. 13) Check BPF cpumap max_entries before doing allocation work, from Florian Lehner. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (70 commits) samples/bpf: Fix typo in README bpf: Remove the obsolte u64_stats_fetch_*_irq() users. bpf: check max_entries before allocating memory bpf: Fix a typo in comment for DFS algorithm bpftool: Fix spelling mistake "disasembler" -> "disassembler" selftests/bpf: Fix bpftool synctypes checking failure selftests/bpf: Panic on hard/soft lockup docs/bpf: Add documentation for new cgroup local storage selftests/bpf: Add test cgrp_local_storage to DENYLIST.s390x selftests/bpf: Add selftests for new cgroup local storage selftests/bpf: Fix test test_libbpf_str/bpf_map_type_str bpftool: Support new cgroup local storage libbpf: Support new cgroup local storage bpf: Implement cgroup storage available to non-cgroup-attached bpf progs bpf: Refactor some inode/task/sk storage functions for reuse bpf: Make struct cgroup btf id global selftests/bpf: Tracing prog can still do lookup under busy lock selftests/bpf: Ensure no task storage failure for bpf_lsm.s prog due to deadlock detection bpf: Add new bpf_task_storage_delete proto with no deadlock detection bpf: bpf_task_storage_delete_recur does lookup first before the deadlock check ... ==================== Link: https://lore.kernel.org/r/20221102062120.5724-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-02 08:18:27 -07:00
David S. Miller	ef2dd61af7	Merge branch 'renesas-eswitch' Yoshihiro Shimoda says: ==================== net: ethernet: renesas: Add support for "Ethernet Switch" This patch series is based on next-20221027. Add initial support for Renesas "Ethernet Switch" device of R-Car S4-8. The hardware has features about forwarding for an ethernet switch device. But, for now, it acts as ethernet controllers so that any forwarding offload features are not supported. So, any switchdev header files and DSA framework are not used. Notes that this driver requires some special settings on marvell10g, Especially host mactype and host speed. And, I need further investigation to modify the marvell10g driver for upstream. But, the special settings are applied, this rswitch driver can work correcfly without any changes of this rswitch driver. So, I believe the rswitch driver can go for upstream. Changes from v6: https://lore.kernel.org/all/20221028065458.2417293-1-yoshihiro.shimoda.uh@renesas.com/ - Add Reviewed-by tag in the patch [1/3]. - Fix ordering of initialization because NFS root can start mounting the filesystem before register_netdev() even returns Changes from v5: https://lore.kernel.org/all/20221027134034.2343230-1-yoshihiro.shimoda.uh@renesas.com/ - Add maxItems for the ethernet-port/port/reg property. Changes from v4: https://lore.kernel.org/all/20221019083518.933070-1-yoshihiro.shimoda.uh@renesas.com/ - Rebased on next-20221027. - Drop some unneeded properties on the dt-bindings doc. - Change the subject and commit descriptions on the patch [2/3]. - Use phylink instead of phylib. - Modify struct rswitch_*_desc to remove similar functions ([gs]et_dptr). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-02 12:38:53 +00:00
Yoshihiro Shimoda	6c6fa1a00a	net: ethernet: renesas: rswitch: Add R-Car Gen4 gPTP support Add R-Car Gen4 gPTP support into the rswitch driver. Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-02 12:38:53 +00:00
Yoshihiro Shimoda	3590918b5d	net: ethernet: renesas: Add support for "Ethernet Switch" Add initial support for Renesas "Ethernet Switch" device of R-Car S4-8. The hardware has features about forwarding for an ethernet switch device. But, for now, it acts as ethernet controllers so that any forwarding offload features are not supported. So, any switchdev header files and DSA framework are not used. Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-02 12:38:53 +00:00
Yoshihiro Shimoda	f9edd82774	dt-bindings: net: renesas: Document Renesas Ethernet Switch Document Renesas Etherent Switch for R-Car S4-8 (r8a779f0). Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-02 12:38:52 +00:00
David S. Miller	d312bad437	Merge branch 'txgbe' Mengyuan Lou says: ==================== net: WangXun txgbe/ngbe ethernet driver This patch series adds support for WangXun NICS, to initialize interface from software to firmware. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-02 12:31:24 +00:00
Mengyuan Lou	02338c484a	net: ngbe: Initialize sw info and register netdev Initialize ngbe mac/phy type. Check whether the firmware is initialized. Initialize ngbe hw and register netdev. Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-02 12:31:23 +00:00
Jiawen Wu	049fe53653	net: txgbe: Add operations to interact with firmware Add firmware interaction to get EEPROM information. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-02 12:31:23 +00:00
Jiawen Wu	1efa9bfe58	net: libwx: Implement interaction with firmware Add mailbox commands to interact with firmware. Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-02 12:31:23 +00:00
Heng Qi	2e0de6366a	veth: Avoid drop packets when xdp_redirect performs In the current processing logic, when xdp_redirect occurs, it transmits the xdp frame based on napi. If napi of the peer veth is not ready, the veth will drop the packets. This doesn't meet our expectations. In this context, we enable napi of the peer veth automatically when the veth loads the xdp. Then if the veth unloads the xdp, we need to correctly judge whether to disable napi of the peer veth, because the peer veth may have loaded xdp, or even the user has enabled GRO. Signed-off-by: Heng Qi <henqqi@linux.alibaba.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-02 12:07:10 +00:00
Dmitry Vyukov	7e8cdc9714	nfc: Add KCOV annotations Add remote KCOV annotations for NFC processing that is done in background threads. This enables efficient coverage-guided fuzzing of the NFC subsystem. The intention is to add annotations to background threads that process skb's that were allocated in syscall context (thus have a KCOV handle associated with the current fuzz test). This includes nci_recv_frame() that is called by the virtual nci driver in the syscall context. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Cc: Bongsu Jeon <bongsu.jeon@samsung.com> Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-02 11:58:13 +00:00
Shailend Chand	82fd151d38	gve: Reduce alloc and copy costs in the GQ rx path Previously, even if just one of the many fragments of a 9k packet required a copy, we'd copy the whole packet into a freshly-allocated 9k-sized linear SKB, and this led to performance issues. By having a pool of pages to copy into, each fragment can be independently handled, leading to a reduced incidence of allocation and copy. Signed-off-by: Shailend Chand <shailend@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-02 11:52:51 +00:00
Shane Parslow	d08b0f8f46	net: wwan: iosm: add rpc interface for xmm modems Add a new iosm wwan port that connects to the modem rpc interface. This interface provides a configuration channel, and in the case of the 7360, is the only way to configure the modem (as it does not support mbim). The new interface is compatible with existing software, such as open_xdatachannel.py from the xmm7360-pci project [1]. [1] https://github.com/xmm7360/xmm7360-pci Signed-off-by: Shane Parslow <shaneparslow808@gmail.com> Reviewed-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-02 11:51:03 +00:00
M Chetan Kumar	3349e4a48a	net: wwan: t7xx: Add port for modem logging The Modem Logging (MDL) port provides an interface to collect modem logs for debugging purposes. MDL is supported by the relay interface, and the mtk_t7xx port infrastructure. MDL allows user-space apps to control logging via mbim command and to collect logs via the relay interface, while port infrastructure facilitates communication between the driver and the modem. Signed-off-by: Moises Veleta <moises.veleta@linux.intel.com> Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Signed-off-by: Devegowda Chandrashekar <chandrashekar.devegowda@intel.com> Acked-by: Ricardo Martinez <ricardo.martinez@linux.intel.com> Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-02 11:39:49 +00:00
M Chetan Kumar	fece7a8c65	net: wwan: t7xx: use union to group port type specific data Use union inside t7xx_port to group port type specific data members. Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-11-02 11:39:49 +00:00
Eric Dumazet	b0e01253a7	tcp: refine tcp_prune_ofo_queue() logic After commits 36a6503fedda ("tcp: refine tcp_prune_ofo_queue() to not drop all packets") and 72cd43ba64fc1 ("tcp: free batches of packets in tcp_prune_ofo_queue()") tcp_prune_ofo_queue() drops a fraction of ooo queue, to make room for incoming packet. However it makes no sense to drop packets that are before the incoming packet, in sequence space. In order to recover from packet losses faster, it makes more sense to only drop ooo packets which are after the incoming packet. Tested: packetdrill test: 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 setsockopt(3, SOL_SOCKET, SO_RCVBUF, [3800], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 +0 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7> +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 0> +.1 < . 1:1(0) ack 1 win 1024 +0 accept(3, ..., ...) = 4 +.01 < . 200:300(100) ack 1 win 1024 +0 > . 1:1(0) ack 1 <nop,nop, sack 200:300> +.01 < . 400:500(100) ack 1 win 1024 +0 > . 1:1(0) ack 1 <nop,nop, sack 400:500 200:300> +.01 < . 600:700(100) ack 1 win 1024 +0 > . 1:1(0) ack 1 <nop,nop, sack 600:700 400:500 200:300> +.01 < . 800:900(100) ack 1 win 1024 +0 > . 1:1(0) ack 1 <nop,nop, sack 800:900 600:700 400:500 200:300> +.01 < . 1000:1100(100) ack 1 win 1024 +0 > . 1:1(0) ack 1 <nop,nop, sack 1000:1100 800:900 600:700 400:500> +.01 < . 1200:1300(100) ack 1 win 1024 +0 > . 1:1(0) ack 1 <nop,nop, sack 1200:1300 1000:1100 800:900 600:700> // this packet is dropped because we have no room left. +.01 < . 1400:1500(100) ack 1 win 1024 +.01 < . 1:200(199) ack 1 win 1024 // Make sure kernel did not drop 200:300 sequence +0 > . 1:1(0) ack 300 <nop,nop, sack 1200:1300 1000:1100 800:900 600:700> // Make room, since our RCVBUF is very small +0 read(4, ..., 299) = 299 +.01 < . 300:400(100) ack 1 win 1024 +0 > . 1:1(0) ack 500 <nop,nop, sack 1200:1300 1000:1100 800:900 600:700> +.01 < . 500:600(100) ack 1 win 1024 +0 > . 1:1(0) ack 700 <nop,nop, sack 1200:1300 1000:1100 800:900> +0 read(4, ..., 400) = 400 +.01 < . 700:800(100) ack 1 win 1024 +0 > . 1:1(0) ack 900 <nop,nop, sack 1200:1300 1000:1100> +.01 < . 900:1000(100) ack 1 win 1024 +0 > . 1:1(0) ack 1100 <nop,nop, sack 1200:1300> +.01 < . 1100:1200(100) ack 1 win 1024 // This checks that 1200:1300 has not been removed from ooo queue +0 > . 1:1(0) ack 1300 Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20221101035234.3910189-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-01 21:19:58 -07:00
Dr. David Alan Gilbert	44827016be	net: core: inet[46]_pton strlen len types inet[46]_pton check the input length against a sane length limit (INET[6]_ADDRSTRLEN), but the strlen value gets truncated due to being stored in an int, so there's a theoretical potential for a >4G string to pass the limit test. Use size_t since that's what strlen actually returns. I've had a hunt for callers that could hit this, but I've not managed to find anything that doesn't get checked with some other limit first; but it's possible that I've missed something in the depth of the storage target paths. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://lore.kernel.org/r/20221029014604.114024-1-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-11-01 21:14:39 -07:00
Kang Minchul	3a07dcf8f5	samples/bpf: Fix typo in README Fix 'cofiguration' typo in BPF samples README. Signed-off-by: Kang Minchul <tegongkang@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20221030180254.34138-1-tegongkang@gmail.com	2022-11-01 15:25:21 +01:00
Jakub Kicinski	6f1a298b2e	Merge branch 'inet-add-drop-monitor-support' Eric Dumazet says: ==================== inet: add drop monitor support I recently tried to analyse flakes in ip_defrag selftest. This failed miserably. IPv4 and IPv6 reassembly units are causing false kfree_skb() notifications. It is time to deal with this issue. First two patches are changing core networking to better deal with eventual skb frag_list chains, in respect of kfree_skb/consume_skb status. Last three patches are adding three new drop reasons, and make sure skbs that have been reassembled into a large datagram are no longer viewed as dropped ones. After this, understanding why ip_defrag selftest is flaky is possible using standard drop monitoring tools. ==================== Link: https://lore.kernel.org/r/20221029154520.2747444-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-10-31 20:14:30 -07:00
Eric Dumazet	3bdfb04f13	net: dropreason: add SKB_DROP_REASON_FRAG_TOO_FAR IPv4 reassembly unit can decide to drop frags based on /proc/sys/net/ipv4/ipfrag_max_dist sysctl. Add a specific drop reason to track this specific and weird case. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-10-31 20:14:27 -07:00
Eric Dumazet	77adfd3a1d	net: dropreason: add SKB_DROP_REASON_FRAG_REASM_TIMEOUT Used to track skbs freed after a timeout happened in a reassmbly unit. Passing a @reason argument to inet_frag_rbtree_purge() allows to use correct consumed status for frags that have been successfully re-assembled. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-10-31 20:14:27 -07:00
Eric Dumazet	4ecbb1c27c	net: dropreason: add SKB_DROP_REASON_DUP_FRAG This is used to track when a duplicate segment received by various reassembly units is dropped. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-10-31 20:14:26 -07:00
Eric Dumazet	511a3eda2f	net: dropreason: propagate drop_reason to skb_release_data() When an skb with a frag list is consumed, we currently pretend all skbs in the frag list were dropped. In order to fix this, add a @reason argument to skb_release_data() and skb_release_all(). Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-10-31 20:14:26 -07:00
Eric Dumazet	0e84afe8eb	net: dropreason: add SKB_CONSUMED reason This will allow to simply use in the future: kfree_skb_reason(skb, reason); Instead of repeating sequences like: if (dropped) kfree_skb_reason(skb, reason); else consume_skb(skb); For instance, following patch in the series is adding @reason to skb_release_data() and skb_release_all(), so that we can propagate a meaningful @reason whenever consume_skb()/kfree_skb() have to take care of a potential frag_list. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-10-31 20:14:26 -07:00
Florian Fainelli	b98deb2f98	net: systemport: Add support for RDMA overflow statistic counter RDMA overflows can happen if the Ethernet controller does not have enough bandwidth allocated at the memory controller level, report RDMA overflows and deal with saturation, similar to the RBUF overflow counter. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20221028222141.3208429-1-f.fainelli@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-10-31 20:05:03 -07:00
Sergei Antonov	37c8489012	net: ftmac100: allow increasing MTU to make most use of single-segment buffers If the FTMAC100 is used as a DSA master, then it is expected that frames which are MTU sized on the wire facing the external switch port (1500 octets in L2 payload, plus L2 header) also get a DSA tag when seen by the host port. This extra tag increases the length of the packet as the host port sees it, and the FTMAC100 is not prepared to handle frames whose length exceeds 1518 octets (including FCS) at all. Only a minimal rework is needed to support this configuration. Since MTU-sized DSA-tagged frames still fit within a single buffer (RX_BUF_SIZE), we just need to optimize the resource management rather than implement multi buffer RX. In ndo_change_mtu(), we toggle the FTMAC100_MACCR_RX_FTL bit to tell the hardware to drop (or not) frames with an L2 payload length larger than 1500. We need to replicate the MACCR configuration in ftmac100_start_hw() as well, since there is a hardware reset there which clears previous settings. The advantage of dynamically changing FTMAC100_MACCR_RX_FTL is that when dev->mtu is at the default value of 1500, large frames are automatically dropped in hardware and we do not spend CPU cycles dropping them. Suggested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Sergei Antonov <saproj@gmail.com> Link: https://lore.kernel.org/r/20221028183220.155948-3-saproj@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-10-31 20:02:57 -07:00
Vladimir Oltean	30f837b7b9	net: ftmac100: report the correct maximum MTU of 1500 The driver uses the MAX_PKT_SIZE (1518) for both MTU reporting and for TX. However, the 2 places do not measure the same thing. On TX, skb->len measures the entire L2 packet length (without FCS, which software does not possess). So the comparison against 1518 there is correct. What is not correct is the reporting of dev->max_mtu as 1518. Since MTU measures L2 payload length (excluding L2 overhead) and not total L2 packet length, it means that the correct max_mtu supported by this device is the standard 1500. Anything higher than that will be dropped on RX currently. To fix this, subtract VLAN_ETH_HLEN from MAX_PKT_SIZE when reporting the max_mtu, since that is the difference between L2 payload length and total L2 length as seen by software. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Sergei Antonov <saproj@gmail.com> Link: https://lore.kernel.org/r/20221028183220.155948-2-saproj@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-10-31 20:02:57 -07:00
Vladimir Oltean	55f6f3dbcf	net: ftmac100: prepare data path for receiving single segment packets > 1514 Eliminate one check in the data path and move it elsewhere, to where our real limitation is. We'll want to start processing "too long" frames in the driver (currently there is a hardware MAC setting which drops theses). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Sergei Antonov <saproj@gmail.com> Link: https://lore.kernel.org/r/20221028183220.155948-1-saproj@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-10-31 20:02:57 -07:00
Steffen Bätz	91e87045a5	net: dsa: mv88e6xxx: Add RGMII delay to 88E6320 Currently, the .port_set_rgmii_delay hook is missing for the 88E6320 family, which causes failure to retrieve an IP address via DHCP. Add mv88e6320_port_set_rgmii_delay() that allows applying the RGMII delay for ports 2, 5, and 6, which are the only ports that can be used in RGMII mode. Tested on a custom i.MX8MN board connected to an 88E6320 switch. This change also applies safely to the 88E6321 variant. The only difference between 88E6320 versus 88E6321 is the temperature grade and pinout. They share exactly the same MDIO register map for ports 2, 5, and 6, which are the only ports that can be used in RGMII mode. Signed-off-by: Steffen Bätz <steffen@innosonix.de> [fabio: Improved commit log and extended it to mv88e6321_ops] Signed-off-by: Fabio Estevam <festevam@denx.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20221028163158.198108-1-festevam@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-10-31 20:00:20 -07:00
Jakub Kicinski	eff1744e62	Merge branch 'rtnetlink-honour-nlm_f_echo-flag-in-rtnl_-new-del-link' Hangbin Liu says: ==================== rtnetlink: Honour NLM_F_ECHO flag in rtnl_{new, del}link Netlink messages are used for communicating between user and kernel space. When user space configures the kernel with netlink messages, it can set the NLM_F_ECHO flag to request the kernel to send the applied configuration back to the caller. This allows user space to retrieve configuration information that are filled by the kernel (either because these parameters can only be set by the kernel or because user space let the kernel choose a default value). The kernel has support this feature in some places like RTM_{NEW, DEL}ADDR, RTM_{NEW, DEL}ROUTE. This patch set handles NLM_F_ECHO flag and send link info back after rtnl_{new, del}link. ==================== Link: https://lore.kernel.org/r/20221028084224.3509611-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-10-31 18:10:25 -07:00
Hangbin Liu	f3a63cce1b	rtnetlink: Honour NLM_F_ECHO flag in rtnl_delete_link This patch use the new helper unregister_netdevice_many_notify() for rtnl_delete_link(), so that the kernel could reply unicast when userspace set NLM_F_ECHO flag to request the new created interface info. At the same time, the parameters of rtnl_delete_link() need to be updated since we need nlmsghdr and portid info. Suggested-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-10-31 18:10:21 -07:00
Hangbin Liu	d88e136cab	rtnetlink: Honour NLM_F_ECHO flag in rtnl_newlink_create This patch pass the netlink header message in rtnl_newlink_create() to the new updated rtnl_configure_link(), so that the kernel could reply unicast when userspace set NLM_F_ECHO flag to request the new created interface info. Suggested-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-10-31 18:10:21 -07:00
Hangbin Liu	77f4aa9a2a	net: add new helper unregister_netdevice_many_notify Add new helper unregister_netdevice_many_notify(), pass netlink message header and portid, which could be used to notify userspace when flag NLM_F_ECHO is set. Make the unregister_netdevice_many() as a wrapper of new function unregister_netdevice_many_notify(). Suggested-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-10-31 18:10:21 -07:00
Hangbin Liu	1d997f1013	rtnetlink: pass netlink message header and portid to rtnl_configure_link() This patch pass netlink message header and portid to rtnl_configure_link() All the functions in this call chain need to add the parameters so we can use them in the last call rtnl_notify(), and notify the userspace about the new link info if NLM_F_ECHO flag is set. - rtnl_configure_link() - __dev_notify_flags() - rtmsg_ifinfo() - rtmsg_ifinfo_event() - rtmsg_ifinfo_build_skb() - rtmsg_ifinfo_send() - rtnl_notify() Also move __dev_notify_flags() declaration to net/core/dev.h, as Jakub suggested. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-10-31 18:10:21 -07:00
Sean Anderson	37fe9b9816	net: dpaa2: Add some debug prints on deferred probe When this device is deferred, there is often no way to determine what the cause was. Add some debug prints to make it easier to figure out what is blocking the probe. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Link: https://lore.kernel.org/r/20221027190005.400839-1-sean.anderson@seco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2022-10-31 17:35:24 -07:00
Thomas Gleixner	97c4090bad	bpf: Remove the obsolte u64_stats_fetch_*_irq() users. Now that the 32bit UP oddity is gone and 32bit uses always a sequence count, there is no need for the fetch_irq() variants anymore. Convert to the regular interface. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/bpf/20221026123110.331690-1-bigeasy@linutronix.de	2022-10-31 23:40:24 +01:00
Colin Ian King	0cf9deb300	net: mvneta: Remove unused variable i Variable i is just being incremented and it's never used anywhere else. The variable and the increment are redundant so remove it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-10-31 11:47:31 +00:00
David S. Miller	5565dbd01e	Merge branch 'ptp-adjfine' Jacob Keller says: ==================== ptp: convert drivers to .adjfine Many drivers implementing PTP have not yet migrated to the new .adjfine frequency adjustment implementation. A handful of these drivers use hardware with a simple increment value which is adjusted by multiplying by the adjustment factor and then dividing by 1 billion. This calculation is very easy to convert to .adjfine, by simply updating the divisor. Introduce new helper functions, diff_by_scaled_ppm and adjust_by_scaled_ppm which perform the most common calculations used by drivers for this purpose. The adjust_by_scaled_ppm takes the base increment and scaled PPM value, and calculates the new increment to use. A few drivers need the difference and direction rather than a raw increment value. The diff_by_scaled_ppm calculates the difference and returns true if it should be a subtraction, false otherwise. This most closely aligns with existing driver implementations. I previously submitted v1 of this series at [1], and got some feedback only on a handful of drivers. In the interest of merging the changes which have received feedback, I've dropped the following drivers out of this send: * ptp_phc * ptp_ipx46x * tg3 * hclge * stmac * cpts I plan to submit those drivers changes again at a later date. As before, there are some drivers which are not trivial to convert to the new helper functions. While they may be able to work, their implementation is different and I lack the hardware or datasheets to determine what the correct implementation would be. * drivers/net/ethernet/broadcom/bnx2x * drivers/net/ethernet/broadcom/bnxt * drivers/net/ethernet/cavium/liquidio * drivers/net/ethernet/chelsio/cxgb4 * drivers/net/ethernet/freescale * drivers/net/ethernet/qlogic/qed * drivers/net/ethernet/qlogic/qede * drivers/net/ethernet/sfc * drivers/net/ethernet/sfc/siena * drivers/net/ethernet/ti/am65-cpts.c * drivers/ptp/ptp_dte.c My end goal is to drop the .adjfreq implementation entirely, and to that end I plan on modifying these drivers in the future to directly use scaled_ppm_to_ppb as the simplest method to convert them. Changes since v2: * Rebased to allow landing in 6.2 * Added Richard's Acked-by Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Siva Reddy Kallam <siva.kallam@broadcom.com> Cc: Prashant Sreedharan <prashant@broadcom.com> Cc: Michael Chan <mchan@broadcom.com> Cc: Yisen Zhuang <yisen.zhuang@huawei.com> Cc: Salil Mehta <salil.mehta@huawei.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Tony Nguyen <anthony.l.nguyen@intel.com> Cc: Tariq Toukan <tariqt@nvidia.com> Cc: Saeed Mahameed <saeedm@nvidia.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: Bryan Whitehead <bryan.whitehead@microchip.com> Cc: Sergey Shtylyov <s.shtylyov@omp.ru> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Jose Abreu <joabreu@synopsys.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Vivek Thampi <vithampi@vmware.com> Cc: VMware PV-Drivers Reviewers <pv-drivers@vmware.com> Cc: Jie Wang <wangjie125@huawei.com> Cc: Jacob Keller <jacob.e.keller@intel.com> Cc: Guangbin Huang <huangguangbin2@huawei.com> Cc: Eran Ben Elisha <eranbe@nvidia.com> Cc: Aya Levin <ayal@nvidia.com> Cc: Cai Huoqing <cai.huoqing@linux.dev> Cc: Biju Das <biju.das.jz@bp.renesas.com> Cc: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Cc: Phil Edworthy <phil.edworthy@renesas.com> Cc: Jiasheng Jiang <jiasheng@iscas.ac.cn> Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Wan Jiabing <wanjiabing@vivo.com> Cc: Lv Ruyi <lv.ruyi@zte.com.cn> Cc: Arnd Bergmann <arnd@arndb.de> ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2022-10-31 11:14:16 +00:00
Jacob Keller	337ffae0e4	ptp: xgbe: convert to .adjfine and adjust_by_scaled_ppm The xgbe implementation of .adjfreq is implemented in terms of a straight forward "base * ppb / 1 billion" calculation. Convert this driver to .adjfine and use adjust_by_scaled_ppm to calculate the new addend value. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2022-10-31 11:14:16 +00:00
Jacob Keller	673dd2c788	ptp: ravb: convert to .adjfine and adjust_by_scaled_ppm The ravb implementation of .adjfreq is implemented in terms of a straight forward "base * ppb / 1 billion" calculation. Convert this driver to .adjfine and use the adjust_by_scaled_ppm helper function to calculate the new addend. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Cc: Sergey Shtylyov <s.shtylyov@omp.ru> Cc: Biju Das <biju.das.jz@bp.renesas.com> Cc: Phil Edworthy <phil.edworthy@renesas.com> Cc: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com> Cc: linux-renesas-soc@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>	2022-10-31 11:14:16 +00:00
Jacob Keller	8bc900cbff	ptp: lan743x: use diff_by_scaled_ppm in .adjfine implementation Update the lan743x driver to use the recently added diff_by_scaled_ppm helper function. This reduces the amount of code required in lan743x_ptp.c driver file. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Cc: Bryan Whitehead <bryan.whitehead@microchip.com> Cc: UNGLinuxDriver@microchip.com Signed-off-by: David S. Miller <davem@davemloft.net>	2022-10-31 11:14:16 +00:00
Jacob Keller	c56dff6a9a	ptp: lan743x: remove .adjfreq implementation The lan743x driver implements both .adjfreq and .adjfine, but the core PTP subsystem prefers .adjfine if implemented. There is no reason to carry a .adjfreq implementation, so we can remove it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Cc: Bryan Whitehead <bryan.whitehead@microchip.com> Cc: UNGLinuxDriver@microchip.com Signed-off-by: David S. Miller <davem@davemloft.net>	2022-10-31 11:14:16 +00:00

1 2 3 4 5 ...

1137192 Commits