linux

iv/linux

Author	SHA1	Message	Date
Mark Gray	076999e460	openvswitch: fix sparse warning incorrect type fix incorrect type in argument 1 (different address spaces) ../net/openvswitch/datapath.c:169:17: warning: incorrect type in argument 1 (different address spaces) ../net/openvswitch/datapath.c:169:17: expected void const * ../net/openvswitch/datapath.c:169:17: got struct dp_nlsk_pids [noderef] __rcu *upcall_portids Found at: https://patchwork.kernel.org/project/netdevbpf/patch/20210630095350.817785-1-mark.d.gray@redhat.com/#24285159 Signed-off-by: Mark Gray <mark.d.gray@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-27 11:48:43 +01:00
Mark Gray	784dcfa56e	openvswitch: fix alignment issues Signed-off-by: Mark Gray <mark.d.gray@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-27 11:48:42 +01:00
Mark Gray	e4252cb666	openvswitch: update kdoc OVS_DP_ATTR_PER_CPU_PIDS Signed-off-by: Mark Gray <mark.d.gray@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-27 11:48:42 +01:00
Yajun Deng	f9b282b36d	net: netlink: add the case when nlh is NULL Add the case when nlh is NULL in nlmsg_report(), so that the caller doesn't need to deal with this case. Signed-off-by: Yajun Deng <yajun.deng@linux.dev> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-27 11:43:50 +01:00
Vladimir Oltean	b0e8181762	net: build all switchdev drivers as modules when the bridge is a module Currently, all drivers depend on the bool CONFIG_NET_SWITCHDEV, but only the drivers that call some sort of function exported by the bridge, like br_vlan_enabled() or whatever, have an extra dependency on CONFIG_BRIDGE. Since the blamed commit, all switchdev drivers have a functional dependency upon switchdev_bridge_port_{,un}offload(), which is a pair of functions exported by the bridge module and not by the bridge-independent part of CONFIG_NET_SWITCHDEV. Problems appear when we have: CONFIG_BRIDGE=m CONFIG_NET_SWITCHDEV=y CONFIG_TI_CPSW_SWITCHDEV=y because cpsw, am65_cpsw and sparx5 will then be built-in but they will call a symbol exported by a loadable module. This is not possible and will result in the following build error: drivers/net/ethernet/ti/cpsw_new.o: in function `cpsw_netdevice_event': drivers/net/ethernet/ti/cpsw_new.c:1520: undefined reference to `switchdev_bridge_port_offload' drivers/net/ethernet/ti/cpsw_new.c:1537: undefined reference to `switchdev_bridge_port_unoffload' As mentioned, the other switchdev drivers don't suffer from this because switchdev_bridge_port_offload() is not the first symbol exported by the bridge that they are calling, so they already needed to deal with this in the same way. Fixes: 2f5dc00f7a3e ("net: bridge: switchdev: let drivers inform which bridge ports are offloaded") Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-27 11:12:57 +01:00
Saeed Mahameed	9b29a161ef	ethtool: Fix rxnfc copy to user buffer overflow In the cited commit, copy_to_user() got called with the wrong pointer, instead of passing the actual buffer ptr to copy from, a pointer to the pointer got passed, which causes a buffer overflow calltrace to pop up when executing "ethtool -x ethX". Fix ethtool_rxnfc_copy_to_user() to use the rxnfc pointer as passed to the function, instead of a pointer to it. This fixes below call trace: [ 15.533533] ------------[ cut here ]------------ [ 15.539007] Buffer overflow detected (8 < 192)! [ 15.544110] WARNING: CPU: 3 PID: 1801 at include/linux/thread_info.h:200 copy_overflow+0x15/0x20 [ 15.549308] Modules linked in: [ 15.551449] CPU: 3 PID: 1801 Comm: ethtool Not tainted 5.14.0-rc2+ #1058 [ 15.553919] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 15.558378] RIP: 0010:copy_overflow+0x15/0x20 [ 15.560648] Code: e9 7c ff ff ff b8 a1 ff ff ff eb c4 66 0f 1f 84 00 00 00 00 00 55 48 89 f2 89 fe 48 c7 c7 88 55 78 8a 48 89 e5 e8 06 5c 1e 00 <0f> 0b 5d c3 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 [ 15.565114] RSP: 0018:ffffad49c0523bd0 EFLAGS: 00010286 [ 15.566231] RAX: 0000000000000000 RBX: 00000000000000c0 RCX: 0000000000000000 [ 15.567616] RDX: 0000000000000001 RSI: ffffffff8a7912e7 RDI: 00000000ffffffff [ 15.569050] RBP: ffffad49c0523bd0 R08: ffffffff8ab2ae28 R09: 00000000ffffdfff [ 15.570534] R10: ffffffff8aa4ae40 R11: ffffffff8aa4ae40 R12: 0000000000000000 [ 15.571899] R13: 00007ffd4cc2a230 R14: ffffad49c0523c00 R15: 0000000000000000 [ 15.573584] FS: 00007f538112f740(0000) GS:ffff96d5bdd80000(0000) knlGS:0000000000000000 [ 15.575639] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 15.577092] CR2: 00007f5381226d40 CR3: 0000000013542000 CR4: 00000000001506e0 [ 15.578929] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 15.580695] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 15.582441] Call Trace: [ 15.582970] ethtool_rxnfc_copy_to_user+0x30/0x46 [ 15.583815] ethtool_get_rxnfc.cold+0x23/0x2b [ 15.584584] dev_ethtool+0x29c/0x25f0 [ 15.585286] ? security_netlbl_sid_to_secattr+0x77/0xd0 [ 15.586728] ? do_set_pte+0xc4/0x110 [ 15.587349] ? _raw_spin_unlock+0x18/0x30 [ 15.588118] ? __might_sleep+0x49/0x80 [ 15.588956] dev_ioctl+0x2c1/0x490 [ 15.589616] sock_ioctl+0x18e/0x330 [ 15.591143] __x64_sys_ioctl+0x41c/0x990 [ 15.591823] ? irqentry_exit_to_user_mode+0x9/0x20 [ 15.592657] ? irqentry_exit+0x33/0x40 [ 15.593308] ? exc_page_fault+0x32f/0x770 [ 15.593877] ? exit_to_user_mode_prepare+0x3c/0x130 [ 15.594775] do_syscall_64+0x35/0x80 [ 15.595397] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 15.596037] RIP: 0033:0x7f5381226d4b [ 15.596492] Code: 0f 1e fa 48 8b 05 3d b1 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 0d b1 0c 00 f7 d8 64 89 01 48 [ 15.598743] RSP: 002b:00007ffd4cc2a1f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ 15.599804] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f5381226d4b [ 15.600795] RDX: 00007ffd4cc2a350 RSI: 0000000000008946 RDI: 0000000000000003 [ 15.601712] RBP: 00007ffd4cc2a340 R08: 00007ffd4cc2a350 R09: 0000000000000001 [ 15.602751] R10: 00007f538128a990 R11: 0000000000000246 R12: 0000000000000000 [ 15.603882] R13: 00007ffd4cc2a350 R14: 00007ffd4cc2a4b0 R15: 0000000000000000 [ 15.605042] ---[ end trace 325cf185e2795048 ]--- Fixes: dd98d2895de6 ("ethtool: improve compat ioctl handling") Reported-by: Shannon Nelson <snelson@pensando.io> CC: Arnd Bergmann <arnd@arndb.de> CC: Christoph Hellwig <hch@lst.de> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Tested-by: Shannon Nelson <snelson@pensando.io> Acked-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-27 11:09:43 +01:00
David S. Miller	268ca4129d	Merge branch 'ipa-clock' Alex Elder says: ==================== net: ipa: defer taking uC proxy clock This series rearranges some of the IPA initialization code. The first patch gets rid of two trivial setup and teardown functions, open-coding them in their callers instead. The second patch has memory regions get configured before endpoints. IPA interrupts do not depend on GSI being initialized. Therefore they can be initialized in the config phase rather than waiting for setup. The third patch moves this initialization earlier; memory regions must already be defined, so it's done after memory config. The microcontroller also has no dependency on GSI, though it does require IPA interrupts to be configured. The fourth patch moves microcontroller initialization so it too happens during the config phase rather than setup. Finally, we currently take a "proxy clock" for the microcontroller during the config phase, dropping it only after we learn the microcontroller is initialized. But microcontroller initialization is started by the modem, so there's no point in taking that clock reference before we know the modem has booted. So the last patch arranges to wait to take the "proxy clock" for the microcontroller until we know the modem is about to boot. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 23:09:31 +01:00
Alex Elder	e2f154e6b6	net: ipa: introduce ipa_uc_clock() The first time it's booted, the modem loads and starts the IPA-resident microcontroller. Once the microcontroller has completed its initialization, it notifies the AP it's "ready" by sending an INIT_COMPLETED response message. Until it receives that microcontroller message, the AP must ensure the IPA core clock remains operational. Currently, a "proxy" clock reference is taken in ipa_uc_config(), dropping it again once the message is received. However there could be a long delay between when ipa_config() completes and when modem actually starts. And because the microcontroller gets loaded by the modem, there's no need to get the modem "proxy clock" until the first time it starts. Create a new function ipa_uc_clock() which takes the "proxy" clock reference for the microcontroller. Call it when we get remoteproc SSR notification that the modem is about to start. Keep an additional flag to record whether this proxy clock reference needs to be dropped at shutdown time, and issue a warning if we get the microcontroller message either before the clock reference is taken, or after it has already been dropped. Drop the nearby use of "hh" length modifiers, which are no longer encouraged in the kernel. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 23:09:18 +01:00
Alex Elder	dc8f7e3924	net: ipa: set up the microcontroller earlier Initializing up the IPA-resident microcontroller requires the IPA clock, and sets up two IPA interrupt handlers, but this does not require GSI access. The interrupt handlers also require the clock to be enabled, and require the IPA memory regions to be configured, but neither requires GSI access. As a result, the microcontroller can be initialized during the "config" rather than "setup" phase of IPA initialization. Initialize the microcontroller in ipa_config() rather than ipa_setup(), and rename the called function ipa_uc_config(). Do the inverse in ipa_deconfig() rather than ipa_teardown(), and rename the function for that case ipa_uc_deconfig(). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 23:09:18 +01:00
Alex Elder	1118a14710	net: ipa: set up IPA interrupts earlier Initialization of the IPA driver has several phases: - "init" phase can be done without any access to IPA hardware - "config" phase requires the IPA hardware to be clocked - "setup" phase requires the GSI layer to be functional Currently, initialization for the IPA interrupt handling code occurs in the setup phase. It requires access to the IPA hardware but does not need GSI, so it can be moved to the config phase instead. Call the interrupt configuration function early in ipa_config() rather than from ipa_setup(). Rename ipa_interrupt_setup() to be ipa_interrupt_config(), and ipa_interrupt_teardown() to be ipa_interupt_deconfig(), so their names properly indicate when they get called. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 23:09:18 +01:00
Alex Elder	07e1f6897f	net: ipa: configure memory regions early IPA-resident memory is one of the most primitive resources that needs initialization, so call init_mem_config() early in ipa_config(). This is in preparation for initializing the IPA-resident microcontroller earlier. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 23:09:18 +01:00
Alex Elder	63961f544e	net: ipa: kill ipa_modem_setup() The functions ipa_modem_setup() and ipa_modem_teardown() are trivial wrappers that call ipa_qmi_setup() and ipa_qmi_teardown(). Just call the QMI functions directly, and get rid of the wrappers. Improve the documentation of what setting up QMI does. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 23:09:18 +01:00
Gustavo A. R. Silva	323e0cb473	flow_dissector: Fix out-of-bounds warnings Fix the following out-of-bounds warnings: net/core/flow_dissector.c: In function '__skb_flow_dissect': >> net/core/flow_dissector.c:1104:4: warning: 'memcpy' offset [24, 39] from the object at '<unknown>' is out of the bounds of referenced subobject 'saddr' with type 'struct in6_addr' at offset 8 [-Warray-bounds] 1104 \| memcpy(&key_addrs->v6addrs, &iph->saddr, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1105 \| sizeof(key_addrs->v6addrs)); \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from include/linux/ipv6.h:5, from net/core/flow_dissector.c:6: include/uapi/linux/ipv6.h:133:18: note: subobject 'saddr' declared here 133 \| struct in6_addr saddr; \| ^~~~~ >> net/core/flow_dissector.c:1059:4: warning: 'memcpy' offset [16, 19] from the object at '<unknown>' is out of the bounds of referenced subobject 'saddr' with type 'unsigned int' at offset 12 [-Warray-bounds] 1059 \| memcpy(&key_addrs->v4addrs, &iph->saddr, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1060 \| sizeof(key_addrs->v4addrs)); \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from include/linux/ip.h:17, from net/core/flow_dissector.c:5: include/uapi/linux/ip.h:103:9: note: subobject 'saddr' declared here 103 \| __be32 saddr; \| ^~~~~ The problem is that the original code is trying to copy data into a couple of struct members adjacent to each other in a single call to memcpy(). So, the compiler legitimately complains about it. As these are just a couple of members, fix this by copying each one of them in separate calls to memcpy(). This helps with the ongoing efforts to globally enable -Warray-bounds and get us closer to being able to tighten the FORTIFY_SOURCE routines on memcpy(). Link: https://github.com/KSPP/linux/issues/109 Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/lkml/d5ae2e65-1f18-2577-246f-bada7eee6ccd@intel.com/ Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 23:02:59 +01:00
Gustavo A. R. Silva	6321c7acb8	ipv4: ip_output.c: Fix out-of-bounds warning in ip_copy_addrs() Fix the following out-of-bounds warning: In function 'ip_copy_addrs', inlined from '__ip_queue_xmit' at net/ipv4/ip_output.c:517:2: net/ipv4/ip_output.c:449:2: warning: 'memcpy' offset [40, 43] from the object at 'fl' is out of the bounds of referenced subobject 'saddr' with type 'unsigned int' at offset 36 [-Warray-bounds] 449 \| memcpy(&iph->saddr, &fl4->saddr, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 450 \| sizeof(fl4->saddr) + sizeof(fl4->daddr)); \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The problem is that the original code is trying to copy data into a couple of struct members adjacent to each other in a single call to memcpy(). This causes a legitimate compiler warning because memcpy() overruns the length of &iph->saddr and &fl4->saddr. As these are just a couple of struct members, fix this by using direct assignments, instead of memcpy(). This helps with the ongoing efforts to globally enable -Warray-bounds and get us closer to being able to tighten the FORTIFY_SOURCE routines on memcpy(). Link: https://github.com/KSPP/linux/issues/109 Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/lkml/d5ae2e65-1f18-2577-246f-bada7eee6ccd@intel.com/ Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 23:02:17 +01:00
Alex Elder	22171146f8	net: ipa: enable inline checksum offload for IPA v4.5+ The RMNet and IPA drivers both support inline checksum offload now. So enable it for the TX and RX modem endoints for IPA version 4.5+. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 22:56:34 +01:00
David S. Miller	2739bd76fc	Merge branch 'ipa-kill-validation' Alex Elder says: ==================== net: ipa: kill IPA_VALIDATION A few months ago I proposed cleaning up some code that validates certain things conditionally, arguing that doing so once is enough, thus doing so always should not be necessary. https://lore.kernel.org/netdev/20210320141729.1956732-1-elder@linaro.org/ Leon Romanovsky felt strongly that this was a mistake, and in the end I agreed to change my plans. This series finally completes what I said I would do about this, ultimately eliminating the IPA_VALIDATION symbol and conditional code entirely. The first patch both extends and simplifies some validation done for IPA immediate commands, and performs those tests unconditionally. The second patch fixes a bug that wasn't normally exposed because of the conditional compilation (a reason Leon was right about this). It makes filter and routing table validation occur unconditionally. The third eliminates the remaining conditionally-defined code and removes the line in the Makefile used to enable validation. And the fourth removes all comments containing ipa_assert() statements, replacing most of them with WARN_ON() calls. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 22:38:24 +01:00
Alex Elder	5bc5588466	net: ipa: use WARN_ON() rather than assertions I've added commented assertions to record certain properties that can be assumed to hold in certain places in the IPA code. Convert these into real WARN_ON() calls so the assertions are actually checked, using the standard WARN_ON() mechanism. Where errors can be returned, return an error if a warning is triggered. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 22:38:11 +01:00
Alex Elder	442d68ebf0	net: ipa: kill the remaining conditional validation code There are only a few remaining spots that validate IPA code conditional on whether a symbol is defined at compile time. The checks are not expensive, so just build them always. This completes the removal of all CONFIG_VALIDATE/CONFIG_VALIDATION IPA code. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 22:38:11 +01:00
Alex Elder	546948bf36	net: ipa: always validate filter and route tables All checks in ipa_table_validate_build() are computed at build time, so build that unconditionally. In ipa_table_valid() calls to ipa_table_valid_one() are missing the IPA pointer parameter is missing in (a bug that shows up only when IPA_VALIDATE is defined). Don't bother checking whether hashed table memory regions are valid if hashed tables are not supported. With those things fixed, have these table validation functions built unconditionally (not dependent on IPA_VALIDATE). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 22:38:11 +01:00
Alex Elder	f2c1dac0ab	net: ipa: fix ipa_cmd_table_valid() Stop supporting different sizes for hashed and non-hashed filter or route tables. Add BUILD_BUG_ON() calls to verify the sizes of the fields in the filter/route table initialization immediate command are the same. Add a check to ipa_cmd_table_valid() to ensure the size of the memory region being checked fits within the immediate command field that must hold it. Remove two Boolean parameters used only for error reporting. This actually fixes a bug that would only show up if IPA_VALIDATE were defined. Define ipa_cmd_table_valid() unconditionally (no longer dependent on IPA_VALIDATE). Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 22:38:11 +01:00
David S. Miller	beeee08ca1	Merge branch 'sja1105-bridge-port-traffic-termination' Vladimir Oltean says: ==================== Traffic termination for sja1105 ports under VLAN-aware bridge This set of patches updates the sja1105 DSA driver to be able to send and receive network stack packets on behalf of a VLAN-aware upper bridge interface. The reasons why this has traditionally been a problem are explained in the "Traffic support" section of Documentation/networking/dsa/sja1105.rst. (the entire documentation will be revised in a separate patch series). The limitations that have prevented us from doing this so far have now been partially lifted by the bridge's ability to send a packet with skb->offload_fwd_mark = true, which means that the accelerator is allowed to look up its hardware FDB when sending a packet and deliver it to those destination ports. Basically skb->dev is now just a conduit to the switchdev driver's ndo_start_xmit(), and does not guarantee that the packet will really be transmitted on that port (but it will be transmitted where it should, nonetheless). Apart from the ability to perform IP termination on VLAN-aware bridges on top of sja1105 interfaces, we also gain the following features: - VLAN-aware software bridging between sja1105 ports and "foreign" (non-DSA) interfaces - software bridging between sja1105 bridge ports, and software LAG uppers of sja1105 ports (as long as the bridge is VLAN-aware) The only things that don't work are: 1. to create an AF_PACKET socket on top of a sja1105 port that is under a VLAN-aware bridge. This is because the "imprecise RX" procedure selects an RX port for data plane* packets based on the assumption that the packet will land in the bridge's data path. If ebtables rules are added to remove some packets from the bridge's data path, that assumption will be broken. Nonetheless, this is not a limitation that negatively impacts the known use cases with this switch. If there was a way to impose user space restrictions against creating AF_PACKET sockets on this particular configuration, I could be interested in adding those restrictions, but I think there are other known broken configs already which are not checked by the kernel today (like for example that the bridge's rx_handler steals packets anyway from AF_PACKET sockets with exact-match ptype handlers, as opposed to ptype_all which are processed earlier; this is precisely the reason why ebtables rules are generally needed to avoid that). 2. to send traffic on behalf of an 8021q upper of a standalone interface, while other sja1105 ports are part of a VLAN-aware bridge. This is because sja1105 sets ds->vlan_filtering_is_global = true, so we cannot make the standalone port ignore the VLAN header from the packet on RX, so we cannot make tag_8021q enforce its own pvid for the packets belonging to that port's 8021q upper. So we cannot determine in the first place that packets come from that port, unless we iterate through all 8021q uppers of all ports, and enforce uniqueness of VLAN IDs. I am not sure if this is what I want / if it is worth it, so currently all 8021q uppers are denied, regardless of whether the switch has ports under a VLAN-aware bridge or not (otherwise it becomes complicated even to track the state). Nonetheless, the VID uniqueness of all 8021q uppers does raise another question: what to do with VID 0, which has no 8021q upper, but the 8021q module adds it to our RX filter with vlan_vid_add(). I am honestly not sure what to do. The best I can do is enable a hardware bit in sja1105 which reclassifies VID 0 frames to the PVID, and they will be sent on the CPU port using either the tag_8021q pvid of standalone ports, or the bridge pvid of VLAN-aware ports. So at the very least, those packets are still 'kinda' processed as if they were untagged, but the VID 0 is lost, though. In my defence, Marvell appears to do the same thing with reclassifying VID 0 frames, see commit b8b79c414eca ("net: dsa: mv88e6xxx: Fix adding vlan 0"). *Control packets (currently hardcoded in sja1105 as link-local packets for MAC DA ranges 01-80-c2-xx-xx-xx and 01-1b-19-xx-xx-xx) are received based on packet traps and their precise source port is always known. I have taken one patch from Colin because my work conflicts with his, and integrating it all through the same series avoids that. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 22:35:33 +01:00
Vladimir Oltean	edac6f6332	Revert "net: dsa: Allow drivers to filter packets they can decode source port from" This reverts commit cc1939e4b3aaf534fb2f3706820012036825731c. Currently 2 classes of DSA drivers are able to send/receive packets directly through the DSA master: - drivers with DSA_TAG_PROTO_NONE - sja1105 Now that sja1105 has gained the ability to perform traffic termination even under the tricky case (VLAN-aware bridge), and that is much more functional (we can perform VLAN-aware bridging with foreign interfaces), there is no reason to keep this code in the receive path of the network core. So delete it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 22:35:22 +01:00
Vladimir Oltean	b6ad86e6ad	net: dsa: sja1105: add bridge TX data plane offload based on tag_8021q The main desire for having this feature in sja1105 is to support network stack termination for traffic coming from a VLAN-aware bridge. For sja1105, offloading the bridge data plane means sending packets as-is, with the proper VLAN tag, to the chip. The chip will look up its FDB and forward them to the correct destination port. But we support bridge data plane offload even for VLAN-unaware bridges, and the implementation there is different. In fact, VLAN-unaware bridging is governed by tag_8021q, so it makes sense to have the .bridge_fwd_offload_add() implementation fully within tag_8021q. The key difference is that we only support 1 VLAN-aware bridge, but we support multiple VLAN-unaware bridges. So we need to make sure that the forwarding domain is not crossed by packets injected from the stack. For this, we introduce the concept of a tag_8021q TX VLAN for bridge forwarding offload. As opposed to the regular TX VLANs which contain only 2 ports (the user port and the CPU port), a bridge data plane TX VLAN is "multicast" (or "imprecise"): it contains all the ports that are part of a certain bridge, and the hardware will select where the packet goes within this "imprecise" forwarding domain. Each VLAN-unaware bridge has its own "imprecise" TX VLAN, so we make use of the unique "bridge_num" provided by DSA for the data plane offload. We use the same 3 bits from the tag_8021q VLAN ID format to encode this bridge number. Note that these 3 bit positions have been used before for sub-VLANs in best-effort VLAN filtering mode. The difference is that for best-effort, the sub-VLANs were only valid on RX (and it was documented that the sub-VLAN field needed to be transmitted as zero). Whereas for the bridge data plane offload, these 3 bits are only valid on TX. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 22:35:22 +01:00
Vladimir Oltean	884be12f85	net: dsa: sja1105: add support for imprecise RX This is already common knowledge by now, but the sja1105 does not have hardware support for DSA tagging for data plane packets, and tag_8021q sets up a unique pvid per port, transmitted as VLAN-tagged towards the CPU, for the source port to be decoded nonetheless. When the port is part of a VLAN-aware bridge, the pvid committed to hardware is taken from the bridge and not from tag_8021q, so we need to work with that the best we can. Configure the switches to send all packets to the CPU as VLAN-tagged (even ones that were originally untagged on the wire) and make use of dsa_untag_bridge_pvid() to get rid of it before we send those packets up the network stack. With the classified VLAN used by hardware known to the tagger, we first peek at the VID in an attempt to figure out if the packet was received from a VLAN-unaware port (standalone or under a VLAN-unaware bridge), case in which we can continue to call dsa_8021q_rcv(). If that is not the case, the packet probably came from a VLAN-aware bridge. So we call the DSA helper that finds for us a "designated bridge port" - one that is a member of the VLAN ID from the packet, and is in the proper STP state - basically these are all checks performed by br_handle_frame() in the software RX data path. The bridge will accept the packet as valid even if the source port was maybe wrong. So it will maybe learn the MAC SA of the packet on the wrong port, and its software FDB will be out of sync with the hardware FDB. So replies towards this same MAC DA will not work, because the bridge will send towards a different netdev. This is where the bridge data plane offload ("imprecise TX") added by the next patch comes in handy. The software FDB is wrong, true, but the hardware FDB isn't, and by offloading the bridge forwarding plane we have a chance to right a wrong, and have the hardware look up the FDB for us for the reply packet. So it all cancels out. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 22:35:22 +01:00
Vladimir Oltean	19fa937a39	net: dsa: sja1105: deny more than one VLAN-aware bridge With tag_sja1105.c's only ability being to perform an imprecise RX procedure and identify whether a packet comes from a VLAN-aware bridge or not, we have no way to determine whether a packet with VLAN ID 5 comes from, say, br0 or br1. Actually we could, but it would mean that we need to restrict all VLANs from br0 to be different from all VLANs from br1, and this includes the default_pvid, which makes a setup with 2 VLAN-aware bridges highly imprectical. The fact of the matter is that this isn't even that big of a practical limitation, since even with a single VLAN-aware bridge we can pretty much enforce forwarding isolation based on the VLAN port membership. So in the end, tell the user that they need to model their setup using a single VLAN-aware bridge. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 22:35:22 +01:00
Vladimir Oltean	4fbc08bd36	net: dsa: sja1105: deny 8021q uppers on ports Now that best-effort VLAN filtering is gone and we are left with the imprecise RX and imprecise TX based in VLAN-aware mode, where the tagger just guesses the source port based on plausibility of the VLAN ID, 8021q uppers installed on top of a standalone port, while other ports of that switch are under a VLAN-aware bridge don't quite "just work". In fact it could be possible to restrict the VLAN IDs used by the 8021q uppers to not be shared with VLAN IDs used by that VLAN-aware bridge, but then the tagger needs to be patched to search for 8021q uppers too, not just for the "designated bridge port" which will be introduced in a later patch. I haven't given a possible implementation full thought, it seems maybe possible but not worth the effort right now. The only certain thing is that currently the tagger won't be able to figure out the source port for these packets because they will come with the VLAN ID of the 8021q upper and are no longer retagged to a tag_8021q sub-VLAN like the best effort VLAN filtering code used to do. So just deny these for the moment. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 22:35:22 +01:00
Vladimir Oltean	6dfd23d35e	net: dsa: sja1105: delete vlan delta save/restore logic With the best_effort_vlan_filtering mode now gone, the driver does not have 3 operating modes anymore (VLAN-unaware, VLAN-aware and best effort), but only 2. The idea is that we will gain support for network stack I/O through a VLAN-aware bridge, using the data plane offload framework (imprecise RX, imprecise TX). So the VLAN-aware use case will be more functional. But standalone ports that are part of the same switch when some other ports are under a VLAN-aware bridge should work too. Termination on those should work through the tag_8021q RX VLAN and TX VLAN. This was not possible using the old logic, because: - in VLAN-unaware mode, only the tag_8021q VLANs were committed to hw - in VLAN-aware mode, only the bridge VLANs were committed to hw - in best-effort VLAN mode, both the tag_8021q and bridge VLANs were committed to hw The strategy for the new VLAN-aware mode is to allow the bridge and the tag_8021q VLANs to coexist in the VLAN table at the same time. [ yes, we need to make sure that the bridge cannot install a tag_8021q VLAN, but ] This means that the save/restore logic introduced by commit ec5ae61076d0 ("net: dsa: sja1105: save/restore VLANs using a delta commit method") does not serve a purpose any longer. We can delete it and restore the old code that simply adds a VLAN to the VLAN table and calls it a day. Note that we keep the sja1105_commit_pvid() function from those days, but adapt it slightly. Ports that are under a VLAN-aware bridge use the bridge's pvid, ports that are standalone or under a VLAN-unaware bridge use the tag_8021q pvid, for local termination or VLAN-unaware forwarding. Now, when the vlan_filtering property is toggled for the bridge, the pvid of the ports beneath it is the only thing that's changing, we no longer delete some VLANs and restore others. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 22:35:22 +01:00
Colin Ian King	d63f8877c4	net: dsa: sja1105: remove redundant re-assignment of pointer table The pointer table is being re-assigned with a value that is never read. The assignment is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 22:35:22 +01:00
Vladimir Oltean	ee80dd2e89	net: bridge: add a helper for retrieving port VLANs from the data path Introduce a brother of br_vlan_get_info() which is protected by the RCU mechanism, as opposed to br_vlan_get_info() which relies on taking the write-side rtnl_mutex. This is needed for drivers which need to find out whether a bridge port has a VLAN configured or not. For example, certain DSA switches might not offer complete source port identification to the CPU on RX, just the VLAN in which the packet was received. Based on this VLAN, we cannot set an accurate skb->dev ingress port, but at least we can configure one that behaves the same as the correct one would (this is possible because DSA sets skb->offload_fwd_mark = 1). When we look at the bridge RX handler (br_handle_frame), we see that what matters regarding skb->dev is the VLAN ID and the port STP state. So we need to select an skb->dev that has the same bridge VLAN as the packet we're receiving, and is in the LEARNING or FORWARDING STP state. The latter is easy, but for the former, we should somehow keep a shadow list of the bridge VLANs on each port, and a lookup table between VLAN ID and the 'designated port for imprecise RX'. That is rather complicated to keep in sync properly (the designated port per VLAN needs to be updated on the addition and removal of a VLAN, as well as on the join/leave events of the bridge on that port). So, to avoid all that complexity, let's just iterate through our finite number of ports and ask the bridge, for each packet: "do you have this VLAN configured on this port?". Cc: Roopa Prabhu <roopa@nvidia.com> Cc: Nikolay Aleksandrov <nikolay@nvidia.com> Cc: Ido Schimmel <idosch@nvidia.com> Cc: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 22:35:22 +01:00
Vladimir Oltean	f7cdb3ecc9	net: bridge: update BROPT_VLAN_ENABLED before notifying switchdev in br_vlan_filter_toggle SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING is notified by the bridge from two places: - nbp_vlan_init(), during bridge port creation - br_vlan_filter_toggle(), during a netlink/sysfs/ioctl change requested by user space If a switchdev driver uses br_vlan_enabled(br_dev) inside its handler for the SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING attribute notifier, different things will be seen depending on whether the bridge calls from the first path or the second: - in nbp_vlan_init(), br_vlan_enabled() reflects the current state of the bridge - in br_vlan_filter_toggle(), br_vlan_enabled() reflects the past state of the bridge This can lead in some cases to complications in driver implementation, which can be avoided if these could reliably use br_vlan_enabled(). Nothing seems to depend on this behavior, and it seems overall more straightforward for br_vlan_enabled() to return the proper value even during the SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING notifier, so temporarily enable the bridge option, then revert it if the switchdev notifier failed. Cc: Roopa Prabhu <roopa@nvidia.com> Cc: Nikolay Aleksandrov <nikolay@nvidia.com> Cc: Ido Schimmel <idosch@nvidia.com> Cc: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 22:35:22 +01:00
David S. Miller	9bff668419	mlx5-updates-2021-07-24 This series aims to reduce coupling in mlx5e, particularly between RX resources (TIRs, RQTs) and numerous code units that use them. This refactoring is required for upcoming features, ADQ and TX lag hashing. The issue with the current code is that TIRs and RQTs are unmanaged, different places all over the driver create, destroy, track and configure them, often in an uncoordinated way. The responsibilities of different units become vague, leading to a lot of hidden dependencies between numerous units and tight coupling between them, which is prone to bugs and hard to maintain. The result of this refactoring is: 1. Creating a manager for RX resources, that controls their lifecycle and provides a clear API, which restricts the set of actions that other units can do. 2. Using object-oriented approach for TIRs, RQTs and RX resource manager (struct mlx5e_rx_res). 3. Fixing a few bugs and misbehaviors found during the refactoring. 4. Reducing the amount of dependencies, removing hidden dependencies, making them one-directional and organizing the code in clear abstraction layers. 5. Explicitly exposing the remaining weird dependencies. 6. Simplifying and organizing code that creates and modifies TIRs and RQTs. -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmD+5+EACgkQSD+KveBX +j6l/Qf9FHU8xRCBW0Bn4Ja0CnS/C3MJMIBTRbe+Qq2noj6qr8y6ut8Z7jV3hGaw dvcA6bVxpWPsqrY5CX4errfo2T4QiAsYhIeXrE2liCZsMUeYeZLusAYquks0oo57 ai2FnihMvL5Xf9U2r+aAeT4dxpQBJ7ANON+b+ZRCFjwRl8CZkd7i4pnifLYpptye d6de6QSZkvxc818homiALdu0VH8HOBZALzfMI9y2Oh9peW+irc2UhEoThukD5VCD Turm6WBQ4xteQq3uNKebbqg0cEevyQjHoUmBg799xD0tHzFbSp+kV3ivHKxsksUF sOBdEumQjDc6J22uTnA086dIStN7sA== =jxI/ -----END PGP SIGNATURE----- Merge tag 'mlx5-updates-2021-07-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux mlx5-updates-2021-07-24 This series aims to reduce coupling in mlx5e, particularly between RX resources (TIRs, RQTs) and numerous code units that use them. This refactoring is required for upcoming features, ADQ and TX lag hashing. The issue with the current code is that TIRs and RQTs are unmanaged, different places all over the driver create, destroy, track and configure them, often in an uncoordinated way. The responsibilities of different units become vague, leading to a lot of hidden dependencies between numerous units and tight coupling between them, which is prone to bugs and hard to maintain. The result of this refactoring is: 1. Creating a manager for RX resources, that controls their lifecycle and provides a clear API, which restricts the set of actions that other units can do. 2. Using object-oriented approach for TIRs, RQTs and RX resource manager (struct mlx5e_rx_res). 3. Fixing a few bugs and misbehaviors found during the refactoring. 4. Reducing the amount of dependencies, removing hidden dependencies, making them one-directional and organizing the code in clear abstraction layers. 5. Explicitly exposing the remaining weird dependencies. 6. Simplifying and organizing code that creates and modifies TIRs and RQTs. Saeed Mahameed says: ==================== mlx5 updates 2021-07-24 This series provides some refactoring to mlx5e RX resource management, it is required for upcoming ADQ and TX lag hashing features. The first two patches in this series : net/mlx5e: Prohibit inner indir TIRs in IPoIB net/mlx5e: Block LRO if firmware asks for tunneled LRO Were supposed to go to net, but due to dependency and timing they were included here. I would appreciate it if you'd apply them to net and mark for -stable. For more information please see tag log below. Please pull and let me know if there is any problem. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 22:32:24 +01:00
David S. Miller	d20e5880fe	linux-can-next-for-5.15-20210725 -----BEGIN PGP SIGNATURE----- iQFHBAABCgAxFiEEK3kIWJt9yTYMP3ehqclaivrt76kFAmD9M0sTHG1rbEBwZW5n dXRyb25peC5kZQAKCRCpyVqK+u3vqT6SB/9S0h403FVv8IMCX2Vs3lD1ydQag8Uu B7Pnb2eI5lQ7sRyaCTBxQvZIwcClx1DKdRDqeyvHQ3GqMA9mhysyovhFfXt6OmaD VyPIOBxquothlelZwMz743dXAta8WVyMHxIQkn4PJuEHoaFRLXI1TcXtlN/rPfSr VRt/QTN8Z4xYVx34YNQcK0jxkgVkkWY0Bv0hjHO0+XIW27zeX6JeAsJZZxQAQpmx bT4pZq9oKYSq9WTIdh8uq2HeBIbcA8wlVHV7ixdVfe0hWrBW5INfB5hsQ+THjJ51 7XhO3cVaOuQZnZEta69RSFBvZiugevZXN1AJudgle16+317LTVUZ/bTX =/GFB -----END PGP SIGNATURE----- Merge tag 'linux-can-next-for-5.15-20210725' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next linux-can-next-for-5.15-20210725 Marc Kleine-Budde says: ==================== pull-request: can-next 2021-07-25 this is a pull request of 46 patches for net-next/master. The first 6 patches target the CAN J1939 protocol. One is from gushengxian, fixing a grammatical error, 5 are by me fixing a checkpatch warning, make use of the fallthrough pseudo-keyword, and use consistent variable naming. The next 3 patches target the rx-offload helper, are by me and improve the performance and fix the local softirq work pending error, when napi_schedule() is called from threaded IRQ context. The next 3 patches are by Vincent Mailhol and me update the CAN bittiming and transmitter delay compensation, the documentation for the struct can_tdc is fixed, clear data_bittiming if FD mode is turned off and a redundant check is removed. Followed by 4 patches targeting the m_can driver. Faiz Abbas's patches add support for CAN PHY via the generic phy subsystem. Yang Yingliang converts the driver to use devm_platform_ioremap_resource_byname(). And a patch by me which removes the unused support for custom bit timing. Andy Shevchenko contributes 2 patches for the mcp251xfd driver to prepare the driver for ACPI support. A patch by me adds support for shared IRQ handlers. Zhen Lei contributes 3 patches to convert the esd_usb2, janz-ican3 and the at91_can driver to make use of the DEVICE_ATTR_RO/RW() macros. The next 8 patches are by Peng Li and provide general cleanups for the at91_can driver. The next 7 patches target the peak driver. Frist 2 cleanup patches by me for the peak_pci driver, followed by Stephane Grosjean' patch to print the name and firmware version of the detected hardware. The peak_usb driver gets a cleanup patch, loopback and one-shot mode and an upgrading of the bus state change handling in Stephane Grosjean's patches. Vincent Mailhol provides 6 cleanup patches for the etas_es58x driver. In the last 3 patches Angelo Dureghello add support for the mcf5441x SoC to the flexcan driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 22:07:29 +01:00
Maxim Mikityanskiy	09f8356918	net/mlx5e: Use the new TIR API for kTLS One of the previous commits introduced a dedicated object for a TIR. kTLS code creates a TIR per connection using the low-level mlx5_core API. This commit converts it to the new mlx5e_tir API. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-07-26 09:50:41 -07:00
Maxim Mikityanskiy	65d6b6e5a5	net/mlx5e: Move management of indir traffic types to rx_res This commit moves the responsibility of keeping the RSS configuration for different traffic types to en/rx_res.{c,h}, hiding the implementation details behind the new getters, and abandons all usage of struct mlx5e_tirc_config, which is no longer useful and superseded by struct mlx5e_rss_params_traffic_type. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-07-26 09:50:41 -07:00
Maxim Mikityanskiy	a6696735d6	net/mlx5e: Convert TIR to a dedicated object Code related to TIR is now encapsulated into a dedicated object and put into new files en/tir.{c,h}. All usages are converted. The Builder pattern is used to initialize a TIR. It allows to create a multitude of different configurations, turning on and off some specific features in different combinations, without having long parameter lists, initializers per usage and repeating code in initializers. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-07-26 09:50:40 -07:00
Maxim Mikityanskiy	6fe5ff2c77	net/mlx5e: Create struct mlx5e_rss_params_hash This commit introduces a new struct to store RSS hash parameters: hash function and hash key. The existing usages are changed to use the new struct. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-07-26 09:50:40 -07:00
Maxim Mikityanskiy	4b3e42eecb	net/mlx5e: Remove mdev from mlx5e_build_indir_tir_ctx_common() In order to drop a dependency to mdev and make the function more universal, stop passing mdev to mlx5e_build_indir_tir_ctx_common() and pass transport domain directly instead. It also prepares this function to be used in other contexts that need a custom transport domain, such as hairpin. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-07-26 09:50:40 -07:00
Maxim Mikityanskiy	a402e3a747	net/mlx5e: Remove lro_param from mlx5e_build_indir_tir_ctx_common() In order to reduce the list of parameters and to define clearer responsibility for mlx5e_build_indir_tir_ctx_common(), stop passing lro_param and instead call mlx5e_build_tir_ctx_lro() directly where needed. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-07-26 09:50:39 -07:00
Maxim Mikityanskiy	983c9da2b1	net/mlx5e: Remove mlx5e_priv usage from mlx5e_build_tir_ctx() The functions that build TIR context for TIR create and modify commands used to depend on struct mlx5e_priv and fetch some values directly from different places. It increased coupling of code and the chance of weird misbehavior due to hidden complex dependencies. As the first step, this commit removes the priv parameter from these functions. Instead, the necessary values are passed directly. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-07-26 09:50:39 -07:00
Maxim Mikityanskiy	093d4bc173	net/mlx5e: Use mlx5e_rqt_get_rqtn to access RQT hardware id In order to abstract from implementation details of mlx5e_rqt, use the mlx5e_rqt_get_rqtn getter instead of accessing the field directly. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-07-26 09:50:39 -07:00
Maxim Mikityanskiy	0570c1c958	net/mlx5e: Take RQT out of TIR and group RX resources RQT is not part of TIR, as multiple TIRs may point to the same RQT, as it happens with indir_tir and inner_indir_tir. These instances of a TIR don't use the embedded RQT. This commit takes RQT out of TIR, making them independent. The RQTs are placed into struct mlx5e_rx_res, and items in that struct are regrouped by functionality: RSS, channels and PTP. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-07-26 09:50:39 -07:00
Maxim Mikityanskiy	3f22d6c77b	net/mlx5e: Move RX resources to a separate struct This commit moves RQTs and TIRs to a separate struct that is allocated dynamically in profiles that support these RX resources (all profiles, except IPoIB PKey). It also allows to remove rqt_enabled flags, as RQTs are always enabled in profiles that support RX resources. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-07-26 09:50:38 -07:00
Maxim Mikityanskiy	4ad3184977	net/mlx5e: Move mlx5e_build_rss_params() call to init_rx RSS params belong to the RX side initialization. Move them from profile->init to profile->init_rx stage to allow the next commit to move rss_params out of priv to a dynamically-allocated struct. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-07-26 09:50:38 -07:00
Maxim Mikityanskiy	06e9f13ac5	net/mlx5e: Convert RQT to a dedicated object Code related to RQT is now encapsulated into a dedicated object and put into new files en/rqt.{c,h}. All usages are converted. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-07-26 09:50:37 -07:00
Maxim Mikityanskiy	bc5506a166	net/mlx5e: Check if inner FT is supported outside of create/destroy functions Move the mlx5e_tunnel_inner_ft_supported() check for inner flow tables support outside of mlx5e_create_inner_ttc_table() and mlx5e_destroy_inner_ttc_table(). It allows to avoid accessing invalid TIRNs of inner indirect TIRs. In a later commit these accesses will be replaced by getters that will WARN if inner indirect TIRs don't exist. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-07-26 09:50:37 -07:00
Maxim Mikityanskiy	69994ef3da	net/mlx5: Take TIR destruction out of the TIR list lock res->td.list_lock protects the list of TIRs. There is no point to call mlx5_core_destroy_tir() and invoke a firmware command under this lock. This commit moves this call outside of the lock and puts it after deleting the TIR from the list to ensure that TIRs are always alive while in the list. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-07-26 09:50:37 -07:00
Maxim Mikityanskiy	26ab7b3845	net/mlx5e: Block LRO if firmware asks for tunneled LRO This commit does a cleanup in LRO configuration. LRO is a parameter of an RQ, but its state is changed by modifying a TIR related to the RQ. The current status: LRO for tunneled packets is not supported in the driver, inner TIRs may enable LRO on creation, but LRO status of inner TIRs isn't changed in mlx5e_modify_tirs_lro(). This is inconsistent, but as long as the firmware doesn't declare support for tunneled LRO, it works, because the same RQs are shared between the inner and outer TIRs. This commit does two fixes: 1. If the firmware has the tunneled LRO capability, LRO is blocked altogether, because it's not possible to block it for inner TIRs only, when the same RQs are shared between inner and outer TIRs, and the driver won't be able to handle tunneled LRO traffic. 2. mlx5e_modify_tirs_lro() is patched to modify LRO state for all TIRs, including inner ones, because all TIRs related to an RQ should agree on their LRO state. Fixes: 7b3722fa9ef6 ("net/mlx5e: Support RSS for GRE tunneled packets") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-07-26 09:50:37 -07:00
Maxim Mikityanskiy	9c43f3865c	net/mlx5e: Prohibit inner indir TIRs in IPoIB TIR's rx_hash_field_selector_inner can be enabled only when tunneled_offload_en = 1. tunneled_offload_en is filled according to the tunneled_offload_en field in struct mlx5e_params, which is false in the IPoIB profile. On the other hand, the IPoIB profile passes inner_ttc = true to mlx5e_create_indirect_tirs, which potentially allows the latter function to attempt to create inner indirect TIRs without having tunneled_offload_en set. This commit prohibits this behavior by passing inner_ttc = false to mlx5e_create_indirect_tirs. The latter function won't attempt to create inner indirect TIRs. As inner indirect TIRs are not created in the IPoIB profile (this commit blocks it explicitly, and even before they would have failed to be created), the call to mlx5e_create_inner_ttc_table in mlx5i_create_flow_steering is a no-op and can be removed. Fixes: 46dc933cee82 ("net/mlx5e: Provide explicit directive if to create inner indirect tirs") Fixes: 458821c72bd0 ("net/mlx5e: IPoIB, Add inner TTC table to IPoIB flow steering") Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2021-07-26 09:50:36 -07:00
Jason Wang	af996031e1	net: ixp4xx_hss: use dma_pool_zalloc The dma_pool_zalloc combines dma_pool_alloc/memset. Therefore, the dma_pool_alloc/memset can be replaced with dma_pool_zalloc which is more compact. Signed-off-by: Jason Wang <wangborong@cdjrlc.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 14:17:55 +01:00
Yinjun Zhang	9d32e4e7e9	nfp: add support for coalesce adaptive feature Use dynamic interrupt moderation library to implement coalesce adaptive feature for nfp driver. Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com> Signed-off-by: Yu Xiao <yu.xiao@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2021-07-26 12:21:47 +01:00

1 2 3 4 5 ...

1030302 Commits