linux

iv/linux

Author	SHA1	Message	Date
Vladimir Oltean	65d0cbb414	net: enetc: move up enetc_reuse_page and enetc_page_reusable For XDP_TX, we need to call enetc_reuse_page from enetc_clean_tx_ring, so we need to avoid a forward declaration. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-31 14:57:44 -07:00
Vladimir Oltean	1ee8d6f3be	net: enetc: clean the TX software BD on the TX confirmation path With the future introduction of some new fields into enetc_tx_swbd such as is_xdp_tx, is_xdp_redirect etc, we need not only to set these bits to true from the XDP_TX/XDP_REDIRECT code path, but also to false from the old code paths. This is because TX software buffer descriptors are kept in a ring that is shadow of the hardware TX ring, so these structures keep getting reused, and there is always the possibility that when a software BD is reused (after we ran a full circle through the TX ring), the old user of the tx_swbd had set is_xdp_tx = true, and now we are sending a regular skb, which would need to set is_xdp_tx = false. To be minimally invasive to the old code paths, let's just scrub the software TX BD in the TX confirmation path (enetc_clean_tx_ring), once we know that nobody uses this software TX BD (tx_ring->next_to_clean hasn't yet been updated, and the TX paths check enetc_bd_unused which tells them if there's any more space in the TX ring for a new enqueue). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-31 14:57:44 -07:00
Vladimir Oltean	d504498d2e	net: enetc: add a dedicated is_eof bit in the TX software BD In the transmit path, if we have a scatter/gather frame, it is put into multiple software buffer descriptors, the last of which has the skb pointer populated (which is necessary for rearming the TX MSI vector and for collecting the two-step TX timestamp from the TX confirmation path). At the moment, this is sufficient, but with XDP_TX, we'll need to service TX software buffer descriptors that don't have an skb pointer, however they might be final nonetheless. So add a dedicated bit for final software BDs that we populate and check explicitly. Also, we keep looking just for an skb when doing TX timestamping, because we don't want/need that for XDP. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-31 14:57:44 -07:00
Vladimir Oltean	a800abd3ec	net: enetc: move skb creation into enetc_build_skb We need to build an skb from two code paths now: from the plain RX data path and from the XDP data path when the verdict is XDP_PASS. Create a new enetc_build_skb function which contains the essential steps for building an skb based on the first and last positions of buffer descriptors within the RX ring. We also squash the enetc_process_skb function into enetc_build_skb, because what that function did wasn't very meaningful on its own. The "rx_frm_cnt++" instruction has been moved around napi_gro_receive for cosmetic reasons, to be in the same spot as rx_byte_cnt++, which itself must be before napi_gro_receive, because that's when we lose ownership of the skb. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-31 14:57:44 -07:00
Vladimir Oltean	2fa423f5f0	net: enetc: consume the error RX buffer descriptors in a dedicated function We can and should check the RX BD errors before starting to build the skb. The only apparent reason why things are done in this backwards order is to spare one call to enetc_rxbd_next. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-31 14:57:43 -07:00
Ioana Ciornei	bc96781a89	dpaa2-switch: setup learning state on STP state change Depending on what STP state a port is in, the learning on that port should be enabled or disabled. When the STP state is DISABLED, BLOCKING or LISTENING no learning should be happening irrespective of what the bridge previously requested. The learning state is changed to be the one setup by the bridge when the STP state is LEARNING or FORWARDING. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 17:18:26 -07:00
Ioana Ciornei	1a64ed129c	dpaa2-switch: trap STP frames to the CPU Add an ACL entry in each port's ACL table to redirect any frame that has the destination MAC address equal to the STP dmac to the control interface. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 17:18:26 -07:00
Ioana Ciornei	62734c7405	dpaa2-switch: keep track of the current learning state per port Keep track of the current learning state per port so that we can reference it in the next patches when setting up a STP state. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 17:18:26 -07:00
Ioana Ciornei	90f0710235	dpaa2-switch: create and assign an ACL table per port In order to trap frames to the CPU, the DPAA2 switch uses the ACL table. At probe time, create an ACL table for each switch port so that in the next patches we can use this to trap STP frames and redirect them to the control interface. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 17:18:26 -07:00
Ioana Ciornei	6aa6791d1a	dpaa2-switch: fix the translation between the bridge and dpsw STP states The numerical values used for STP states are different between the bridge and the MC ABI therefore, the direct usage of the BR_STATE_* macros directly in the structures passed to the firmware is incorrect. Create a separate function that translates between the bridge STP states and the enum that holds the STP state as seen by the Management Complex. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-30 17:18:26 -07:00
Claudiu Manoil	bff5b62585	gianfar: Handle error code at MAC address change Handle return error code of eth_mac_addr(); Fixes: 3d23a05c75c7 ("gianfar: Enable changing mac addr when if up") Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-29 13:45:41 -07:00
David S. Miller	241949e488	Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2021-03-24 The following pull-request contains BPF updates for your net-next tree. We've added 37 non-merge commits during the last 15 day(s) which contain a total of 65 files changed, 3200 insertions(+), 738 deletions(-). The main changes are: 1) Static linking of multiple BPF ELF files, from Andrii. 2) Move drop error path to devmap for XDP_REDIRECT, from Lorenzo. 3) Spelling fixes from various folks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-25 16:30:46 -07:00
Vladimir Oltean	e366a39208	net: enetc: don't depend on system endianness in enetc_set_mac_ht_flt When enetc runs out of exact match entries for unicast address filtering, it switches to an approach based on hash tables, where multiple MAC addresses might end up in the same bucket. However, the enetc_set_mac_ht_flt function currently depends on the system endianness, because it interprets the 64-bit hash value as an array of two u32 elements. Modify this to use lower_32_bits and upper_32_bits. Tested by forcing enetc to go into hash table mode by creating two macvlan upper interfaces: ip link add link eno0 address 00:01:02:03:00:00 eno0.0 type macvlan && ip link set eno0.0 up ip link add link eno0 address 00:01:02:03:00:01 eno0.1 type macvlan && ip link set eno0.1 up and verified that the same bit values are written to the registers before and after: enetc_sync_mac_filters: addr 00:00:80:00:40:10 exact match 0 enetc_sync_mac_filters: addr 00:00:00:00:80:00 exact match 0 enetc_set_mac_ht_flt: hash 0x80008000000000 UMHFR0 0x0 UMHFR1 0x800080 Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 16:28:59 -07:00
Vladimir Oltean	110eccdb24	net: enetc: don't depend on system endianness in enetc_set_vlan_ht_filter ENETC has a 64-entry hash table for VLAN RX filtering per Station Interface, which is accessed through two 32-bit registers: VHFR0 holding the low portion, and VHFR1 holding the high portion. The enetc_set_vlan_ht_filter function looks at the pf->vlan_ht_filter bitmap, which is fundamentally an unsigned long variable, and casts it to a u32 array of two elements. It puts the first u32 element into VHFR0 and the second u32 element into VHFR1. It is easy to imagine that this will not work on big endian systems (although, yes, we have bigger problems, because currently enetc assumes that the CPU endianness is equal to the controller endianness, aka little endian - but let's assume that we could add a cpu_to_le32 in enetc_wd_reg and a le32_to_cpu in enetc_rd_reg). Let's use lower_32_bits and upper_32_bits which are designed to work regardless of endianness. Tested that both the old and the new method produce the same results: $ ethtool -K eth1 rx-vlan-filter on $ ip link add link eth1 name eth1.100 type vlan id 100 enetc_set_vlan_ht_filter: method 1: si_idx 0 VHFR0 0x0 VHFR1 0x20 enetc_set_vlan_ht_filter: method 2: si_idx 0 VHFR0 0x0 VHFR1 0x20 $ ip link add link eth1 name eth1.101 type vlan id 101 enetc_set_vlan_ht_filter: method 1: si_idx 0 VHFR0 0x0 VHFR1 0x30 enetc_set_vlan_ht_filter: method 2: si_idx 0 VHFR0 0x0 VHFR1 0x30 $ ip link add link eth1 name eth1.34 type vlan id 34 enetc_set_vlan_ht_filter: method 1: si_idx 0 VHFR0 0x0 VHFR1 0x34 enetc_set_vlan_ht_filter: method 2: si_idx 0 VHFR0 0x0 VHFR1 0x34 $ ip link add link eth1 name eth1.1024 type vlan id 1024 enetc_set_vlan_ht_filter: method 1: si_idx 0 VHFR0 0x1 VHFR1 0x34 enetc_set_vlan_ht_filter: method 2: si_idx 0 VHFR0 0x1 VHFR1 0x34 Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-24 16:28:59 -07:00
Ioana Ciornei	b175dfd7e6	dpaa2-switch: mark skbs with offload_fwd_mark If a switch port is under a bridge, the offload_fwd_mark should be setup before sending the skb towards the stack so that the bridge does not try to flood the packet on the other switch ports. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-22 16:37:45 -07:00
Ioana Ciornei	6253d5e39c	dpaa2-switch: add support for configuring per port unknown flooding Add support for configuring per port unknown flooding by accepting both BR_FLOOD and BR_MCAST_FLOOD as offloadable bridge port flags. The DPAA2 switch does not support at the moment configuration of unknown multicast flooding independently of unknown unicast flooding, therefore check that both BR_FLOOD and BR_MCAST_FLOOD have the same state. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-22 16:37:45 -07:00
Ioana Ciornei	b54eb093f5	dpaa2-switch: add support for configuring per port broadcast flooding The BR_BCAST_FLOOD bridge port flag is now accepted by the driver and a change in its state will determine a reconfiguration of the broadcast egress flooding list on the FDB associated with the port. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-22 16:37:45 -07:00
Ioana Ciornei	1e7cbabfdb	dpaa2-switch: add support for configuring learning state per port Add support for configuring the learning state of a switch port. When the user requests the HW learning to be disabled, a fast-age procedure on that specific port is run so that previously learnt addresses do not linger. At device probe as well as on a bridge leave action, the ports are configured with HW learning disabled since they are basically a standalone port. At the same time, at bridge join we inherit the bridge port BR_LEARNING flag state and configure it on the switch port. There were already some MC firmware ABI functions for changing the learning state, but those were per FDB (bridging domain) and not per port so we need to adjust those to use the new MC fw command which is per port. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-22 16:37:44 -07:00
Ioana Ciornei	f054e3e217	dpaa2-switch: refactor the egress flooding domain setup Extract the code that determines the list of egress flood interfaces for a specific flood type into a new function - dpaa2_switch_fdb_get_flood_cfg(). This will help us to not duplicate code when the broadcast and unknown ucast/mcast flooding domains will be individually configurable. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-22 16:37:44 -07:00
Ioana Ciornei	c7e856c859	dpaa2-switch: move the dpaa2_switch_fdb_set_egress_flood function In order to avoid a forward declaration in the next patches, move the dpaa2_switch_fdb_set_egress_flood() function to the top of the file. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-22 16:37:44 -07:00
Vladimir Oltean	c54f042dcc	net: enetc: teardown CBDR during PF/VF unbind Michael reports that after the blamed patch, unbinding a VF would cause these transactions to remain pending, and trigger some warnings with the DMA API debug: $ echo 1 > /sys/bus/pci/devices/0000\:00\:00.0/sriov_numvfs pci 0000:00:01.0: [1957:ef00] type 00 class 0x020001 fsl_enetc_vf 0000:00:01.0: Adding to iommu group 19 fsl_enetc_vf 0000:00:01.0: enabling device (0000 -> 0002) fsl_enetc_vf 0000:00:01.0 eno0vf0: renamed from eth0 $ echo 0 > /sys/bus/pci/devices/0000\:00\:00.0/sriov_numvfs DMA-API: pci 0000:00:01.0: device driver has pending DMA allocations while released from device [count=1] One of leaked entries details: [size=2048 bytes] [mapped with DMA_BIDIRECTIONAL] [mapped as coherent] WARNING: CPU: 0 PID: 2547 at kernel/dma/debug.c:853 dma_debug_device_change+0x174/0x1c8 (...) Call trace: dma_debug_device_change+0x174/0x1c8 blocking_notifier_call_chain+0x74/0xa8 device_release_driver_internal+0x18c/0x1f0 device_release_driver+0x20/0x30 pci_stop_bus_device+0x8c/0xe8 pci_stop_and_remove_bus_device+0x20/0x38 pci_iov_remove_virtfn+0xb8/0x128 sriov_disable+0x3c/0x110 pci_disable_sriov+0x24/0x30 enetc_sriov_configure+0x4c/0x108 sriov_numvfs_store+0x11c/0x198 (...) DMA-API: Mapped at: dma_entry_alloc+0xa4/0x130 debug_dma_alloc_coherent+0xbc/0x138 dma_alloc_attrs+0xa4/0x108 enetc_setup_cbdr+0x4c/0x1d0 enetc_vf_probe+0x11c/0x250 pci 0000:00:01.0: Removing from iommu group 19 This happens because stupid me moved enetc_teardown_cbdr outside of enetc_free_si_resources, but did not bother to keep calling enetc_teardown_cbdr from all the places where enetc_free_si_resources was called. In particular, now it is no longer called from the main unbind function, just from the probe error path. Fixes: 4b47c0b81ffd ("net: enetc: don't initialize unused ports from a separate code path") Reported-by: Michael Walle <michael@walle.cc> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Michael Walle <michael@walle.cc> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-19 12:13:11 -07:00
Lorenzo Bianconi	fdc13979f9	bpf, devmap: Move drop error path to devmap for XDP_REDIRECT We want to change the current ndo_xdp_xmit drop semantics because it will allow us to implement better queue overflow handling. This is working towards the larger goal of a XDP TX queue-hook. Move XDP_REDIRECT error path handling from each XDP ethernet driver to devmap code. According to the new APIs, the driver running the ndo_xdp_xmit pointer, will break tx loop whenever the hw reports a tx error and it will just return to devmap caller the number of successfully transmitted frames. It will be devmap responsibility to free dropped frames. Move each XDP ndo_xdp_xmit capable driver to the new APIs: - veth - virtio-net - mvneta - mvpp2 - socionext - amazon ena - bnxt - freescale (dpaa2, dpaa) - xen-frontend - qede - ice - igb - ixgbe - i40e - mlx5 - ti (cpsw, cpsw-new) - tun - sfc Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Reviewed-by: Camelia Groza <camelia.groza@nxp.com> Acked-by: Edward Cree <ecree.xilinx@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Shay Agroskin <shayagr@amazon.com> Link: https://lore.kernel.org/bpf/ed670de24f951cfd77590decf0229a0ad7fd12f6.1615201152.git.lorenzo@kernel.org	2021-03-18 16:38:51 +01:00
Ioana Ciornei	4fe72de61e	dpaa2-eth: fixup kdoc warnings Running kernel-doc over the dpaa2-eth driver generates a bunch of warnings. Fix them up by removing code comments for macros which are self-explanatory, respecting the kdoc format for macro documentation and other small changes like describing the expected return values of functions. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-16 15:29:49 -07:00
Ioana Ciornei	5ac2d25438	dpaa2-switch: fit the function declaration on the same line Multiple ABI function declarations are split unnecessarry on multiple lines. Fix this so that we have a consistent coding style. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-16 15:29:49 -07:00
Ioana Ciornei	2b7e3f7d1b	dpaa2-switch: reduce the size of the if_id bitmap to 64 bits The maximum number of DPAA2 switch interfaces, including the control interface, is 64. Even though this restriction existed from the first place, the command structures which use an interface id bitmap were poorly described and even though a single uint64_t is enough, all of them used an array of 4 uint64_t's. Fix this by reducing the size of the interface id field to a single uint64_t. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-16 15:29:49 -07:00
Ioana Ciornei	05b363608b	dpaa2-switch: fix kdoc warnings Running kernel-doc over the dpaa2-switch driver generates a bunch of warnings. Fix them up by removing code comments for macros which are self-explanatory and adding a bit more context for the dpsw_if_get_port_mac_addr() function and the fields of the dpsw_vlan_if_cfg structure. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-16 15:29:48 -07:00
Ioana Ciornei	cba0445633	dpaa2-switch: remove unused ABI functions Cleanup the dpaa2-switch driver a bit by removing any unused MC firmware ABI definitions. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-16 15:29:48 -07:00
Baowen Zheng	6a56e19902	flow_offload: reject configuration of packet-per-second policing in offload drivers A follow-up patch will allow users to configures packet-per-second policing in the software datapath. In preparation for this, teach all drivers that support offload of the policer action to reject such configuration as currently none of them support it. Signed-off-by: Baowen Zheng <baowen.zheng@corigine.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Louis Peens <louis.peens@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-13 14:18:09 -08:00
Ioana Ciornei	f48298d3fb	staging: dpaa2-switch: move the driver out of staging Now that the dpaa2-switch driver has basic I/O capabilities on the switch port net_devices and multiple bridging domains are supported, move the driver out of staging. The dpaa2-switch driver is placed right next to the dpaa2-eth driver since, in the near future, they will be sharing most of the data path. I didn't implement code reuse in this patch series because I wanted to keep it as small as possible. Also, the README is removed from staging with the intention to add proper rst documentation afterwards to actually match was is supported by the driver. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 13:30:36 -08:00
Vladimir Oltean	7a5222cb7a	net: enetc: make enetc_refill_rx_ring update the consumer index Since commit fd5736bf9f23 ("enetc: Workaround for MDIO register access issue"), enetc_refill_rx_ring no longer updates the RX BD ring's consumer index, that is left to be done by the caller. This has led to bugs such as the ones found in 96a5223b918c ("net: enetc: remove bogus write to SIRXIDR from enetc_setup_rxbdr") and 3a5d12c9be6f ("net: enetc: keep RX ring consumer index in sync with hardware"), so it is desirable that we move back the update of the consumer index into enetc_refill_rx_ring. The trouble with that is the different MDIO locking context for the two callers of enetc_refill_rx_ring: - enetc_clean_rx_ring runs under enetc_lock_mdio() - enetc_setup_rxbdr runs outside enetc_lock_mdio() Simplify the callers of enetc_refill_rx_ring by making enetc_setup_rxbdr explicitly take enetc_lock_mdio() around the call. It will be the only place in need of ensuring the hot accessors can be used. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 13:14:15 -08:00
Vladimir Oltean	0486185ee2	net: enetc: remove forward declaration for enetc_map_tx_buffs There is no other reason why this forward declaration exists rather than poor ordering of the functions. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 13:14:15 -08:00
Vladimir Oltean	8580b3c3d7	net: enetc: remove forward-declarations of enetc_clean_{rx,tx}_ring This patch moves the NAPI enetc_poll after enetc_clean_rx_ring such that we can delete the forward declarations. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 13:14:15 -08:00
Vladimir Oltean	7f071a450b	net: enetc: use enum enetc_active_offloads The active_offloads variable of enetc_ndev_priv has an enum type, use it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 13:14:15 -08:00
Vladimir Oltean	c027aa9201	net: enetc: simplify callers of enetc_rxbd_next When we iterate through the BDs in the RX ring, the software producer index (which is already passed by value to enetc_rxbd_next) lags behind, and we end up with this funny looking "++i == rx_ring->bd_count" check so that we drag it after us. Let's pass the software producer index "i" by reference, so that enetc_rxbd_next can increment it by itself (mod rx_ring->bd_count), especially since enetc_rxbd_next has to increment the index anyway. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 13:14:15 -08:00
Vladimir Oltean	4b47c0b81f	net: enetc: don't initialize unused ports from a separate code path Since commit 3222b5b613db ("net: enetc: initialize RFS/RSS memories for unused ports too") there is a requirement to initialize the memories of unused PFs too, which has left the probe path in a bit of a rough shape, because we basically have a minimal initialization path for unused PFs which is separate from the main initialization path. Now that initializing a control BD ring is as simple as calling enetc_setup_cbdr, let's move that outside of enetc_alloc_si_resources (unused PFs don't need classification rules, so no point in allocating them just to free them later). But enetc_alloc_si_resources is called both for PFs and for VFs, so now that enetc_setup_cbdr is no longer called from this common function, it means that the VF probe path needs to explicitly call enetc_setup_cbdr too. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 13:14:15 -08:00
Vladimir Oltean	5b4daa7f12	net: enetc: pass bd_count as an argument to enetc_setup_cbdr It makes no sense from an API perspective to first initialize some portion of struct enetc_cbdr outside enetc_setup_cbdr, then leave that function to initialize the rest. enetc_setup_cbdr should be able to perform all initialization given a zero-initialized struct enetc_cbdr. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 13:14:15 -08:00
Vladimir Oltean	0bfde022b3	net: enetc: squash clear_cbdr and free_cbdr into teardown_cbdr All call sites call enetc_clear_cbdr and enetc_free_cbdr one after another, so let's combine the two functions into a single method named enetc_teardown_cbdr which does both, and in the same order. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 13:14:15 -08:00
Vladimir Oltean	27f9025d49	net: enetc: save the mode register address inside struct enetc_cbdr enetc_clear_cbdr depends on struct enetc_hw because it must disable the ring through a register write. We'd like to remove that dependency, so let's do what's already done with the producer and consumer indices, which is to save the iomem address in a variable kept in struct enetc_cbdr. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 13:14:15 -08:00
Vladimir Oltean	24be14e326	net: enetc: squash enetc_alloc_cbdr and enetc_setup_cbdr enetc_alloc_cbdr and enetc_setup_cbdr are always called one after another, so we can simplify the callers and make enetc_setup_cbdr do everything that's needed. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 13:14:15 -08:00
Vladimir Oltean	01121ab739	net: enetc: save the DMA device for enetc_free_cbdr We shouldn't need to pass the struct device *dev to enetc CBDR APIs over and over again, so save this inside struct enetc_cbdr::dma_dev and avoid calling it from the enetc_free_cbdr functions. This breaks the dependency of the cbdr API from struct enetc_si (the station interface). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 13:14:15 -08:00
Vladimir Oltean	176769d10f	net: enetc: move the CBDR API to enetc_cbdr.c Since there is a dedicated file in this driver for interacting with control BD rings, it makes sense to move these functions there. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 13:14:14 -08:00
Vladimir Oltean	847cbfc014	net: add a helper to avoid issues with HW TX timestamping and SO_TXTIME As explained in commit 29d98f54a4fe ("net: enetc: allow hardware timestamping on TX queues with tc-etf enabled"), hardware TX timestamping requires an skb with skb->tstamp = 0. When a packet is sent with SO_TXTIME, the skb->skb_mstamp_ns corrupts the value of skb->tstamp, so the drivers need to explicitly reset skb->tstamp to zero after consuming the TX time. Create a helper named skb_txtime_consumed() which does just that. All drivers which offload TC_SETUP_QDISC_ETF should implement it, and it would make it easier to assess during review whether they do the right thing in order to be compatible with hardware timestamping or not. Suggested-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-10 12:45:16 -08:00
Vladimir Oltean	29d98f54a4	net: enetc: allow hardware timestamping on TX queues with tc-etf enabled The txtime is passed to the driver in skb->skb_mstamp_ns, which is actually in a union with skb->tstamp (the place where software timestamps are kept). Since commit b50a5c70ffa4 ("net: allow simultaneous SW and HW transmit timestamping"), __sock_recv_timestamp has some logic for making sure that the two calls to skb_tstamp_tx: skb_tx_timestamp(skb) # Software timestamp in the driver -> skb_tstamp_tx(skb, NULL) and skb_tstamp_tx(skb, &shhwtstamps) # Hardware timestamp in the driver will both do the right thing and in a race-free manner, meaning that skb_tx_timestamp will deliver a cmsg with the software timestamp only, and skb_tstamp_tx with a non-NULL hwtstamps argument will deliver a cmsg with the hardware timestamp only. Why are races even possible? Well, because although the software timestamp skb->tstamp is private per skb, the hardware timestamp skb_hwtstamps(skb) lives in skb_shinfo(skb), an area which is shared between skbs and their clones. And skb_tstamp_tx works by cloning the packets when timestamping them, therefore attempting to perform hardware timestamping on an skb's clone will also change the hardware timestamp of the original skb. And the original skb might have been yet again cloned for software timestamping, at an earlier stage. So the logic in __sock_recv_timestamp can't be as simple as saying "does this skb have a hardware timestamp? if yes I'll send the hardware timestamp to the socket, otherwise I'll send the software timestamp", precisely because the hardware timestamp is shared. Instead, it's quite the other way around: __sock_recv_timestamp says "does this skb have a software timestamp? if yes, I'll send the software timestamp, otherwise the hardware one". This works because the software timestamp is not shared with clones. But that means we have a problem when we attempt hardware timestamping with skbs that don't have the skb->tstamp == 0. __sock_recv_timestamp will say "oh, yeah, this must be some sort of odd clone" and will not deliver the hardware timestamp to the socket. And this is exactly what is happening when we have txtime enabled on the socket: as mentioned, that is put in a union with skb->tstamp, so it is quite easy to mistake it. Do what other drivers do (intel igb/igc) and write zero to skb->tstamp before taking the hardware timestamp. It's of no use to us now (we're already on the TX confirmation path). Fixes: 0d08c9ec7d6e ("enetc: add support time specific departure base on the qos etf") Cc: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-08 12:03:42 -08:00
Alex Marginean	1b2395dfff	net: enetc: set MAC RX FIFO to recommended value On LS1028A, the MAC RX FIFO defaults to the value 2, which is too high and may lead to RX lock-up under traffic at a rate higher than 6 Gbps. Set it to 1 instead, as recommended by the hardware design team and by later versions of the ENETC block guide. Signed-off-by: Alex Marginean <alexandru.marginean@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Reviewed-by: Jason Liu <jason.hui.liu@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-08 12:03:42 -08:00
Michael Braun	d8861bab48	gianfar: fix jumbo packets+napi+rx overrun crash When using jumbo packets and overrunning rx queue with napi enabled, the following sequence is observed in gfar_add_rx_frag: \| lstatus \| \| skb \| t \| lstatus, size, flags \| first \| len, data_len, *ptr \| ---+--------------------------------------+-------+-----------------------+ 13 \| 18002348, 9032, INTERRUPT LAST \| 0 \| 9600, 8000, f554c12e \| 12 \| 10000640, 1600, INTERRUPT \| 0 \| 8000, 6400, f554c12e \| 11 \| 10000640, 1600, INTERRUPT \| 0 \| 6400, 4800, f554c12e \| 10 \| 10000640, 1600, INTERRUPT \| 0 \| 4800, 3200, f554c12e \| 09 \| 10000640, 1600, INTERRUPT \| 0 \| 3200, 1600, f554c12e \| 08 \| 14000640, 1600, INTERRUPT FIRST \| 0 \| 1600, 0, f554c12e \| 07 \| 14000640, 1600, INTERRUPT FIRST \| 1 \| 0, 0, f554c12e \| 06 \| 1c000080, 128, INTERRUPT LAST FIRST \| 1 \| 0, 0, abf3bd6e \| 05 \| 18002348, 9032, INTERRUPT LAST \| 0 \| 8000, 6400, c5a57780 \| 04 \| 10000640, 1600, INTERRUPT \| 0 \| 6400, 4800, c5a57780 \| 03 \| 10000640, 1600, INTERRUPT \| 0 \| 4800, 3200, c5a57780 \| 02 \| 10000640, 1600, INTERRUPT \| 0 \| 3200, 1600, c5a57780 \| 01 \| 10000640, 1600, INTERRUPT \| 0 \| 1600, 0, c5a57780 \| 00 \| 14000640, 1600, INTERRUPT FIRST \| 1 \| 0, 0, c5a57780 \| So at t=7 a new packets is started but not finished, probably due to rx overrun - but rx overrun is not indicated in the flags. Instead a new packets starts at t=8. This results in skb->len to exceed size for the LAST fragment at t=13 and thus a negative fragment size added to the skb. This then crashes: kernel BUG at include/linux/skbuff.h:2277! Oops: Exception in kernel mode, sig: 5 [#1] ... NIP [c04689f4] skb_pull+0x2c/0x48 LR [c03f62ac] gfar_clean_rx_ring+0x2e4/0x844 Call Trace: [ec4bfd38] [c06a84c4] _raw_spin_unlock_irqrestore+0x60/0x7c (unreliable) [ec4bfda8] [c03f6a44] gfar_poll_rx_sq+0x48/0xe4 [ec4bfdc8] [c048d504] __napi_poll+0x54/0x26c [ec4bfdf8] [c048d908] net_rx_action+0x138/0x2c0 [ec4bfe68] [c06a8f34] __do_softirq+0x3a4/0x4fc [ec4bfed8] [c0040150] run_ksoftirqd+0x58/0x70 [ec4bfee8] [c0066ecc] smpboot_thread_fn+0x184/0x1cc [ec4bff08] [c0062718] kthread+0x140/0x144 [ec4bff38] [c0012350] ret_from_kernel_thread+0x14/0x1c This patch fixes this by checking for computed LAST fragment size, so a negative sized fragment is never added. In order to prevent the newer rx frame from getting corrupted, the FIRST flag is checked to discard the incomplete older frame. Signed-off-by: Michael Braun <michael-dev@fami-braun.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-05 13:13:32 -08:00
Vladimir Oltean	3a5d12c9be	net: enetc: keep RX ring consumer index in sync with hardware The RX rings have a producer index owned by hardware, where newly received frame buffers are placed, and a consumer index owned by software, where newly allocated buffers are placed, in expectation of hardware being able to place frame data in them. Hardware increments the producer index when a frame is received, however it is not allowed to increment the producer index to match the consumer index (RBCIR) since the ring can hold at most RBLENR[LENGTH]-1 received BDs. Whenever the producer index matches the value of the consumer index, the ring has no unprocessed received frames and all BDs in the ring have been initialized/prepared by software, i.e. hardware owns all BDs in the ring. The code uses the next_to_clean variable to keep track of the producer index, and the next_to_use variable to keep track of the consumer index. The RX rings are seeded from enetc_refill_rx_ring, which is called from two places: 1. initially the ring is seeded until full with enetc_bd_unused(rx_ring), i.e. with 511 buffers. This will make next_to_clean=0 and next_to_use=511: .ndo_open -> enetc_open -> enetc_setup_bdrs -> enetc_setup_rxbdr -> enetc_refill_rx_ring 2. then during the data path processing, it is refilled with 16 buffers at a time: enetc_msix -> napi_schedule -> enetc_poll -> enetc_clean_rx_ring -> enetc_refill_rx_ring There is just one problem: the initial seeding done during .ndo_open updates just the producer index (ENETC_RBPIR) with 0, and the software next_to_clean and next_to_use variables. Notably, it will not update the consumer index to make the hardware aware of the newly added buffers. Wait, what? So how does it work? Well, the reset values of the producer index and of the consumer index of a ring are both zero. As per the description in the second paragraph, it means that the ring is full of buffers waiting for hardware to put frames in them, which by coincidence is almost true, because we have in fact seeded 511 buffers into the ring. But will the hardware attempt to access the 512th entry of the ring, which has an invalid BD in it? Well, no, because in order to do that, it would have to first populate the first 511 entries, and the NAPI enetc_poll will kick in by then. Eventually, after 16 processed slots have become available in the RX ring, enetc_clean_rx_ring will call enetc_refill_rx_ring and then will [ finally ] update the consumer index with the new software next_to_use variable. From now on, the next_to_clean and next_to_use variables are in sync with the producer and consumer ring indices. So the day is saved, right? Well, not quite. Freeing the memory allocated for the rings is done in: enetc_close -> enetc_clear_bdrs -> enetc_clear_rxbdr -> this just disables the ring -> enetc_free_rxtx_rings -> enetc_free_rx_ring -> sets next_to_clean and next_to_use to 0 but again, nothing is committed to the hardware producer and consumer indices (yay!). The assumption is that the ring is disabled, so the indices don't matter anyway, and it's the responsibility of the "open" code path to set those up. .. Except that the "open" code path does not set those up properly. While initially, things almost work, during subsequent enetc_close -> enetc_open sequences, we have problems. To be precise, the enetc_open that is subsequent to enetc_close will again refill the ring with 511 entries, but it will leave the consumer index untouched. Untouched means, of course, equal to the value it had before disabling the ring and draining the old buffers in enetc_close. But as mentioned, enetc_setup_rxbdr will at least update the producer index though, through this line of code: enetc_rxbdr_wr(hw, idx, ENETC_RBPIR, 0); so at this stage we'll have: next_to_clean=0 (in hardware 0) next_to_use=511 (in hardware we'll have the refill index prior to enetc_close) Again, the next_to_clean and producer index are in sync and set to correct values, so the driver manages to limp on. Eventually, 16 ring entries will be consumed by enetc_poll, and the savior enetc_clean_rx_ring will come and call enetc_refill_rx_ring, and then update the hardware consumer ring based upon the new next_to_use. So.. it works? Well, by coincidence, it almost does, but there's a circumstance where enetc_clean_rx_ring won't be there to save us. If the previous value of the consumer index was 15, there's a problem, because the NAPI poll sequence will only issue a refill when 16 or more buffers have been consumed. It's easiest to illustrate this with an example: ip link set eno0 up ip addr add 192.168.100.1/24 dev eno0 ping 192.168.100.1 -c 20 # ping this port from another board ip link set eno0 down ip link set eno0 up ping 192.168.100.1 -c 20 # ping it again from the same other board One by one: 1. ip link set eno0 up -> calls enetc_setup_rxbdr: -> calls enetc_refill_rx_ring(511 buffers) -> next_to_clean=0 (in hw 0) -> next_to_use=511 (in hw 0) 2. ping 192.168.100.1 -c 20 # ping this port from another board enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=1 next_to_clean 0 (in hw 1) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=2 next_to_clean 1 (in hw 2) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=3 next_to_clean 2 (in hw 3) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=4 next_to_clean 3 (in hw 4) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=5 next_to_clean 4 (in hw 5) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=6 next_to_clean 5 (in hw 6) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=7 next_to_clean 6 (in hw 7) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=8 next_to_clean 7 (in hw 8) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=9 next_to_clean 8 (in hw 9) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=10 next_to_clean 9 (in hw 10) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=11 next_to_clean 10 (in hw 11) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=12 next_to_clean 11 (in hw 12) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=13 next_to_clean 12 (in hw 13) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=14 next_to_clean 13 (in hw 14) next_to_use 511 (in hw 0) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=15 next_to_clean 14 (in hw 15) next_to_use 511 (in hw 0) enetc_clean_rx_ring: enetc_refill_rx_ring(16) increments next_to_use by 16 (mod 512) and writes it to hw enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=0 next_to_clean 15 (in hw 16) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=1 next_to_clean 16 (in hw 17) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=2 next_to_clean 17 (in hw 18) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=3 next_to_clean 18 (in hw 19) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=4 next_to_clean 19 (in hw 20) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=5 next_to_clean 20 (in hw 21) next_to_use 15 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=6 next_to_clean 21 (in hw 22) next_to_use 15 (in hw 15) 20 packets transmitted, 20 packets received, 0% packet loss 3. ip link set eno0 down enetc_free_rx_ring: next_to_clean 0 (in hw 22), next_to_use 0 (in hw 15) 4. ip link set eno0 up -> calls enetc_setup_rxbdr: -> calls enetc_refill_rx_ring(511 buffers) -> next_to_clean=0 (in hw 0) -> next_to_use=511 (in hw 15) 5. ping 192.168.100.1 -c 20 # ping it again from the same other board enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=1 next_to_clean 0 (in hw 1) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=2 next_to_clean 1 (in hw 2) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=3 next_to_clean 2 (in hw 3) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=4 next_to_clean 3 (in hw 4) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=5 next_to_clean 4 (in hw 5) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=6 next_to_clean 5 (in hw 6) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=7 next_to_clean 6 (in hw 7) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=8 next_to_clean 7 (in hw 8) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=9 next_to_clean 8 (in hw 9) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=10 next_to_clean 9 (in hw 10) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=11 next_to_clean 10 (in hw 11) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=12 next_to_clean 11 (in hw 12) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=13 next_to_clean 12 (in hw 13) next_to_use 511 (in hw 15) enetc_clean_rx_ring: rx_frm_cnt=1 cleaned_cnt=14 next_to_clean 13 (in hw 14) next_to_use 511 (in hw 15) 20 packets transmitted, 12 packets received, 40% packet loss And there it dies. No enetc_refill_rx_ring (because cleaned_cnt must be equal to 15 for that to happen), no nothing. The hardware enters the condition where the producer (14) + 1 is equal to the consumer (15) index, which makes it believe it has no more free buffers to put packets in, so it starts discarding them: ip netns exec ns0 ethtool -S eno0 \| grep -v ': 0' NIC statistics: Rx ring 0 discarded frames: 8 Summarized, if the interface receives between 16 and 32 (mod 512) frames and then there is a link flap, then the port will eventually die with no way to recover. If it receives less than 16 (mod 512) frames, then the initial NAPI poll [ before the link flap ] will not update the consumer index in hardware (it will remain zero) which will be ok when the buffers are later reinitialized. If more than 32 (mod 512) frames are received, the initial NAPI poll has the chance to refill the ring twice, updating the consumer index to at least 32. So after the link flap, the consumer index is still wrong, but the post-flap NAPI poll gets a chance to refill the ring once (because it passes through cleaned_cnt=15) and makes the consumer index be again back in sync with next_to_use. The solution to this problem is actually simple, we just need to write next_to_use into the hardware consumer index at enetc_open time, which always brings it back in sync after an initial buffer seeding process. The simpler thing would be to put the write to the consumer index into enetc_refill_rx_ring directly, but there are issues with the MDIO locking: in the NAPI poll code we have the enetc_lock_mdio() taken from top-level and we use the unlocked enetc_wr_reg_hot, whereas in enetc_open, the enetc_lock_mdio() is not taken at the top level, but instead by each individual enetc_wr_reg, so we are forced to put an additional enetc_wr_reg in enetc_setup_rxbdr. Better organization of the code is left as a refactoring exercise. Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:34:47 -08:00
Vladimir Oltean	96a5223b91	net: enetc: remove bogus write to SIRXIDR from enetc_setup_rxbdr The Station Interface Receive Interrupt Detect Register (SIRXIDR) contains a 16-bit wide mask of 'interrupt detected' events for each ring associated with a port. Bit i is write-1-to-clean for RX ring i. I have no explanation whatsoever how this line of code came to be inserted in the blamed commit. I checked the downstream versions of that patch and none of them have it. The somewhat comical aspect of it is that we're writing a binary number to the SIRXIDR register, which is derived from enetc_bd_unused(rx_ring). Since the RX rings have 512 buffer descriptors, we end up writing 511 to this register, which is 0x1ff, so we are effectively clearing the 'interrupt detected' event for rings 0-8. This register is not what is used for interrupt handling though - it only provides a summary for the entire SI. The hardware provides one separate Interrupt Detect Register per RX ring, which auto-clears upon read. So there doesn't seem to be any adverse effect caused by this bogus write. There is, however, one reason why this should be handled as a bugfix: next_to_clean _should_ be committed to hardware, just not to that register, and this was obscuring the fact that it wasn't. This is fixed in the next patch, and removing the bogus line now allows the fix patch to be backported beyond that point. Fixes: fd5736bf9f23 ("enetc: Workaround for MDIO register access issue") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:34:47 -08:00
Vladimir Oltean	c76a97218d	net: enetc: force the RGMII speed and duplex instead of operating in inband mode The ENETC port 0 MAC supports in-band status signaling coming from a PHY when operating in RGMII mode, and this feature is enabled by default. It has been reported that RGMII is broken in fixed-link, and that is not surprising considering the fact that no PHY is attached to the MAC in that case, but a switch. This brings us to the topic of the patch: the enetc driver should have not enabled the optional in-band status signaling for RGMII unconditionally, but should have forced the speed and duplex to what was resolved by phylink. Note that phylink does not accept the RGMII modes as valid for in-band signaling, and these operate a bit differently than 1000base-x and SGMII (notably there is no clause 37 state machine so no ACK required from the MAC, instead the PHY sends extra code words on RXD[3:0] whenever it is not transmitting something else, so it should be safe to leave a PHY with this option unconditionally enabled even if we ignore it). The spec talks about this here: https://e2e.ti.com/cfs-file/__key/communityserver-discussions-components-files/138/RGMIIv1_5F00_3.pdf Fixes: 71b77a7a27a3 ("enetc: Migrate to PHYLINK and PCS_LYNX") Cc: Florian Fainelli <f.fainelli@gmail.com> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:34:47 -08:00
Vladimir Oltean	a74dbce9d4	net: enetc: don't disable VLAN filtering in IFF_PROMISC mode Quoting from the blamed commit: In promiscuous mode, it is more intuitive that all traffic is received, including VLAN tagged traffic. It appears that it is necessary to set the flag in PSIPVMR for that to be the case, so VLAN promiscuous mode is also temporarily enabled. On exit from promiscuous mode, the setting made by ethtool is restored. Intuitive or not, there isn't any definition issued by a standards body which says that promiscuity has anything to do with VLAN filtering - it only has to do with accepting packets regardless of destination MAC address. In fact people are already trying to use this misunderstanding/bug of the enetc driver as a justification to transform promiscuity into something it never was about: accepting every packet (maybe that would be the "rx-all" netdev feature?): https://lore.kernel.org/netdev/20201110153958.ci5ekor3o2ekg3ky@ipetronik.com/ This is relevant because there are use cases in the kernel (such as tc-flower rules with the protocol 802.1Q and a vlan_id key) which do not (yet) use the vlan_vid_add API to be compatible with VLAN-filtering NICs such as enetc, so for those, disabling rx-vlan-filter is currently the only right solution to make these setups work: https://lore.kernel.org/netdev/CA+h21hoxwRdhq4y+w8Kwgm74d4cA0xLeiHTrmT-VpSaM7obhkg@mail.gmail.com/ The blamed patch has unintentionally introduced one more way for this to work, which is to enable IFF_PROMISC, however this is non-portable because port promiscuity is not meant to disable VLAN filtering. Therefore, it could invite people to write broken scripts for enetc, and then wonder why they are broken when migrating to other drivers that don't handle promiscuity in the same way. Fixes: 7070eea5e95a ("enetc: permit configuration of rx-vlan-filter with ethtool") Cc: Markus Blöchl <Markus.Bloechl@ipetronik.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:34:47 -08:00
Vladimir Oltean	827b6fd046	net: enetc: fix incorrect TPID when receiving 802.1ad tagged packets When the enetc ports have rx-vlan-offload enabled, they report a TPID of ETH_P_8021Q regardless of what was actually in the packet. When rx-vlan-offload is disabled, packets have the proper TPID. Fix this inconsistency by finishing the TODO left in the code. Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-03-01 13:34:47 -08:00

... 2 3 4 5 6 ...

1656 Commits