iv/linux

Author	SHA1	Message	Date
Tariq Toukan	5e0d2eef77	net/mlx5e: XDP, Support Enhanced Multi-Packet TX WQE Add support for the HW feature of multi-packet WQE in XDP xmit flow. The conventional TX descriptor (WQE, Work Queue Element) serves a single packet. Our HW has support for multi-packet WQE (MPWQE) in which a single descriptor serves multiple TX packets. This reduces both the PCI overhead and the CPU cycles wasted on writing them. In this patch we add support for the HW feature, which is supported starting from ConnectX-5. Performance: Tested packet rate for UDP 64Byte multi-stream over ConnectX-5 NICs. CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz XDP_TX: We see a huge gain on single port ConnectX-5, and reach the 100 Mpps milestone. * Single-port HCA: Before: 70 Mpps After: 100 Mpps (+42.8%) * Dual-port HCA: Before: 51.7 Mpps After: 57.3 Mpps (+10.8%) * In both cases we tested traffic on one port. For now, on dual-port HCAs we see only a small gain. We are working to overcome this bottleneck, but for the moment we can reach the numbers seen on single-port HCAs only with experimental firmware on dual-port HCAs. XDP_REDIRECT: Redirect from (A) ConnectX-5 to (B) ConnectX-5. Due to a setup limitation, (A) and (B) are on different NUMA nodes, so absolute performance numbers are not optimal. Note: Below is the transmit rate of (B), not the redirect rate of (A) which is in some cases higher. * (B) is single-port: Before: 77 Mpps After: 90 Mpps (+16.8%) * (B) is dual-port: Before: 61 Mpps After: 72 Mpps (+18%) Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 22:54:19 -08:00
Tariq Toukan	1feeab8007	net/mlx5e: XDP, Add array for WQE info descriptors Each xdp_wqe_info instance describes the number of data-segments and WQEBBs of the WQE. This is useful for a downstream patch that adds support for Multi-Packet TX WQE feature. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 22:54:19 -08:00
Tariq Toukan	fea28dd6a2	net/mlx5e: XDP, Maintain a FIFO structure for xdp_info instances This provides infrastructure to have multiple xdp_info instances for the same consumer index. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 22:54:19 -08:00
Tariq Toukan	b8180392ed	net/mlx5e: XDP, Replace boolean doorbell indication with segment pointer Instead of calculating the control segment to be used upon an XDP xmit doorbell, save it in SQ structure. Nullify when no pending doorbell. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 22:54:18 -08:00
Tariq Toukan	db02a308cd	net/mlx5e: XDP, Warn upon polling an error CQE Do not ignore the CQE opcode. This helps expose issues and debug them. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 22:54:18 -08:00
Tariq Toukan	feb2ff9d74	net/mlx5e: XDP, Change the XDP SQ redirect indication Do not maintain an SQ state bit to indicate whether an XDP SQ serves redirect operations. Instead, rely on the fact that such an XDP SQ doesn't reside in an RQ instance, while the others do. This info is not known to the XDP SQ functions themselves, and they rely on their callers to distinguish between the cases. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 22:54:18 -08:00
Tariq Toukan	4fb2f51618	net/mlx5e: XDP, Precede XDP-related operations in RQ poll by a loaded program check At the end of the RQ polling loop, some XDP-related operations might be required. Before checking them one by one, check if an XDP program is even loaded. Combine all the checks and operations in a single function in xdp files. This saves unnecessary checks for non-XDP flows. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 22:54:17 -08:00
Tariq Toukan	e05b8d4fc3	net/mlx5e: TX, Print opcode in error CQE warning The opcode indicates the reason for the error. Printing it helps in debugging. Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 22:54:17 -08:00
David S. Miller	339bbff2d6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-12-21 The following pull-request contains BPF updates for your net-next tree. There is a merge conflict in test_verifier.c. Result looks as follows: [...] }, { "calls: cross frame pruning", .insns = { [...] .prog_type = BPF_PROG_TYPE_SOCKET_FILTER, .errstr_unpriv = "function calls to other bpf functions are allowed for root only", .result_unpriv = REJECT, .errstr = "!read_ok", .result = REJECT, }, { "jset: functional", .insns = { [...] { "jset: unknown const compare not taken", .insns = { BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_prandom_u32), BPF_JMP_IMM(BPF_JSET, BPF_REG_0, 1, 1), BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0), BPF_EXIT_INSN(), }, .prog_type = BPF_PROG_TYPE_SOCKET_FILTER, .errstr_unpriv = "!read_ok", .result_unpriv = REJECT, .errstr = "!read_ok", .result = REJECT, }, [...] { "jset: range", .insns = { [...] }, .prog_type = BPF_PROG_TYPE_SOCKET_FILTER, .result_unpriv = ACCEPT, .result = ACCEPT, }, The main changes are: 1) Various BTF related improvements in order to get line info working. Meaning, verifier will now annotate the corresponding BPF C code to the error log, from Martin and Yonghong. 2) Implement support for raw BPF tracepoints in modules, from Matt. 3) Add several improvements to verifier state logic, namely speeding up stacksafe check, optimizations for stack state equivalence test and safety checks for liveness analysis, from Alexei. 4) Teach verifier to make use of BPF_JSET instruction, add several test cases to kselftests and remove nfp specific JSET optimization now that verifier has awareness, from Jakub. 5) Improve BPF verifier's slot_type marking logic in order to allow more stack slot sharing, from Jiong. 
6) Add sk_msg->size member for context access and add set of fixes and improvements to make sock_map with kTLS usable with openssl based applications, from John. 7) Several cleanups and documentation updates in bpftool as well as auto-mount of tracefs for "bpftool prog tracelog" command, from Quentin. 8) Include sub-program tags from now on in bpf_prog_info in order to have a reliable way for user space to get all tags of the program e.g. needed for kallsyms correlation, from Song. 9) Add BTF annotations for cgroup_local_storage BPF maps and implement bpf fs pretty print support, from Roman. 10) Fix bpftool in order to allow for cross-compilation, from Ivan. 11) Update of bpftool license to GPLv2-only + BSD-2-Clause in order to be compatible with libbfd and allow for Debian packaging, from Jakub. 12) Remove an obsolete prog->aux sanitation in dump and get rid of version check for prog load, from Daniel. 13) Fix a memory leak in libbpf's line info handling, from Prashant. 14) Fix cpumap's frame alignment for build_skb() so that skb_shared_info does not get unaligned, from Jesper. 15) Fix test_progs kselftest to work with older compilers which are less smart in optimizing (and thus throwing build error), from Stanislav. 16) Cleanup and simplify AF_XDP socket teardown, from Björn. 17) Fix sk lookup in BPF kselftest's test_sock_addr with regards to netns_id argument, from Andrey. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 17:31:36 -08:00
Steen Hegelund	639c1b2625	net: mscc: ocelot: Register poll timeout should be wall time not attempts When doing indirect access in the Ocelot chip, a command is set up, issued and then we need to poll until the result is ready. The polling timeout is specified in milliseconds in the datasheet and not in register access attempts. It is not a bug on the currently supported platform, but we observed that the code does not work properly on other platforms that we want to support as the timing requirements there are different. Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 16:39:56 -08:00
David S. Miller	e716431356	mlx5-updates-2018-12-19 This series adds some misc updates and the support for tunnels over VLAN tc offloads. From Miroslav Lichvar, patches #1,2 1) Update timecounter at least twice per counter overflow 2) Extend PTP gettime function to read system clock From Gavi Teitz, patch #3 3) Increase VF representors' SQ size to 128 From Eli Britstein and Or Gerlitz, patches #4-10 4) Adds the capability to support tunnels over VLAN device. Patch 4 avoids crash for TC flow with egress upper devices Patch 5 refactors tunnel routing devs into a helper function Patch 6 avoids crash for TC encap flows with vlan on underlay Patches 7-8 refactor encap tunnel header preparing code. Patch 9 adds support for building VLAN tagged ETH header. Patch 10 adds support for tunnel routing to VLAN device. From Aviv, patches 11,12 to fix earlier VF lag series 5) Fix query_nic_sys_image_guid() error during init 6) Fix LAG requirement when CONFIG_MLX5_ESWITCH is off -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJcG5O8AAoJEEg/ir3gV/o+v0AH/ja1bJ6PkAlw2bCRMIuI6HK/ 1vh9n8D74tIOAlvUi6QiLOgJ7CLdAgiAVFppDWzJSBUsY2XNAtRdYvLDDt9hGafO QGysBWNVcX2aUVp+pLDCCVEYBWyyIzW416CWHx2IUgdAg9S6cvJK6P/81wd+l4Zp 8YlPstEUANP/JKZxHHjSelMcnY+Bj0JrDzuyyyaQdwmcHo5I7Ht0tNex14yFfFbl gd7YvfPr1PtovPX2w9hMt3y3ml4mommB+jtxc0+59D+A7680hBOpAnH5pONms9rv OKKKV8KqpxL1m32/TGbd7XSG+llLJwsSM6RtD4oUdiPA4iCiVDVdreP5vac52C4= =0QQ5 -----END PGP SIGNATURE----- Merge tag 'mlx5-updates-2018-12-19' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2018-12-19 This series adds some misc updates and the support for tunnels over VLAN tc offloads. 
From Miroslav Lichvar, patches #1,2 1) Update timecounter at least twice per counter overflow 2) Extend PTP gettime function to read system clock From Gavi Teitz, patch #3 3) Increase VF representors' SQ size to 128 From Eli Britstein and Or Gerlitz, patches #4-10 4) Adds the capability to support tunnels over VLAN device. Patch 4 avoids crash for TC flow with egress upper devices Patch 5 refactors tunnel routing devs into a helper function Patch 6 avoids crash for TC encap flows with vlan on underlay Patches 7-8 refactor encap tunnel header preparing code. Patch 9 adds support for building VLAN tagged ETH header. Patch 10 adds support for tunnel routing to VLAN device. From Aviv, patches 11,12 to fix earlier VF lag series 5) Fix query_nic_sys_image_guid() error during init 6) Fix LAG requirement when CONFIG_MLX5_ESWITCH is off ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 15:51:55 -08:00
Ido Schimmel	d8a1f7ab2c	mlxsw: spectrum: Remove limitation regarding VID 1 VID 1 is not reserved anymore, so remove the check that prevented the creation of VLAN devices with this VID over mlxsw ports. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 15:48:54 -08:00
Ido Schimmel	0417d25e7d	mlxsw: spectrum: Switch to VID 4095 as default VID There is no need to abuse VID 1 anymore and we can instead use VID 4095 as the default VLAN, which will be configured on the port throughout its lifetime. The OVS join / leave functions are changed to enable VIDs 1-4094 (inclusive) instead of 2-4095. This is because VID 4095 is now the default VLAN instead of 1. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 15:48:54 -08:00
Ido Schimmel	16f6aceb72	mlxsw: spectrum: Add an helper function to cleanup VLAN entries VLAN entries on a port can be associated with either a bridge VLAN or a router port. Before the VLAN entry is destroyed these associations need to be cleaned up. Currently, this is always invoked from the function which destroys the VLAN entry, but the next patch is going to skip the destruction of the default entry when a port is unlinked from a LAG. The above does not mean that the associations should not be cleaned up, so add a helper that will be invoked from both call sites. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 15:48:54 -08:00
Ido Schimmel	346fca3b58	mlxsw: spectrum: Store pointer to default port VLAN in port struct Subsequent patches will need to access the default port VLAN. Since this VLAN will exist throughout the lifetime of the port, simply store it in the port's struct. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 15:48:54 -08:00
Ido Schimmel	ab6c3b79ec	mlxsw: spectrum: Allow controlling destruction of default port VLAN The function allows flushing all the existing VLAN entries on a port. It is invoked when a port is destroyed and when it is unlinked from a LAG. In the latter case, when moving to the new default VLAN, there will not be a need to destroy the default VLAN entry. Therefore, add an argument that allows to control whether the default port VLAN should be destroyed or not. Currently it is always set to 'true'. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 15:48:54 -08:00
Ido Schimmel	262e1ff91c	mlxsw: spectrum: Set PVID during port initialization Currently, the driver does not set the port's PVID when initializing a new port. This is because the driver is using VID 1 as PVID which is the firmware default. Subsequent patches are going to change the PVID the driver is setting when initializing a new port. Prepare for that by explicitly setting the port's PVID. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 15:48:54 -08:00
Ido Schimmel	a2d2a20553	mlxsw: spectrum: Replace hard-coded default VID with a define Subsequent patches are going to replace the current default VID (1) with VLAN_N_VID - 1 (4095). Prepare for this conversion by replacing the hard-coded '1' with a define. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 15:48:54 -08:00
Ido Schimmel	f40be47a3e	mlxsw: spectrum_router: Do not force specific configuration order In symmetric routing, the only two members in the VLAN corresponding to the L3 VNI are the router port and the VXLAN tunnel. In case the VXLAN device is already enslaved to the bridge and only later the VLAN interface is configured, the tunnel will not be offloaded. The reason for this is that when the router interface (RIF) corresponding to the VLAN interface is configured, it calls the core fid_get() API which does not check if NVE should be enabled on the FID. Instead, call into the bridge code which will check if NVE should be enabled on the FID. This effectively means that the same code path is used to retrieve a FID when either a local port or a router port joins the FID. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Petr Machata <petrm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 15:48:54 -08:00
David S. Miller	6eea2db210	Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2018-12-20 This series contains updates to e100, igb, ixgbe, i40e and ice drivers. I replaced spinlocks with mutex locks to reduce the latency on CPU0 for igb when updating the statistics. This work was based off a patch provided by Jan Jablonsky, which was against an older version of the igb driver. Jesus adjusts the receive packet buffer size from 32K to 30K when running in QAV mode, to stay within 60K for total packet buffer size for igb. Vinicius adds igb kernel documentation regarding the CBS algorithm and its implementation in the i210 family of NICs. YueHaibing from Huawei fixed the e100 driver that was potentially passing a NULL pointer, so use the kernel macro IS_ERR_OR_NULL() instead. Konstantin Khorenko fixes i40e where we were not setting up the neigh_priv_len in our net_device, which caused the driver to read beyond the neighbor entry allocated memory. Miroslav Lichvar extends the PTP gettime() to read the system clock by adding support for PTP_SYS_OFFSET_EXTENDED ioctl in i40e. Young Xiao fixed the ice driver to only enable NAPI on q_vectors that actually have transmit and receive rings. Kai-Heng Feng fixes an igb issue that when placed in suspend mode, the NIC does not wake up when a cable is plugged in. This was due to the driver not setting PME during runtime suspend. Stephen Douthit enables the ixgbe driver to allow DSA devices to use the MII interface to talk to switches. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 15:34:30 -08:00
Steve Douthit	643bae17fd	ixgbe: use mii_bus to handle MII related ioctls Use the mii_bus callbacks to address the entire clause 22/45 address space. Enables userspace to poke switch registers instead of a single PHY address. The ixgbe firmware may be polling PHYs in a way that is not protected by the mii_bus lock. This isn't new behavior, but as Andrew Lunn pointed out there are more addresses available for conflicts. Signed-off-by: Stephen Douthit <stephend@silicom-usa.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-20 12:22:39 -08:00
Steve Douthit	8fa10ef012	ixgbe: register a mdiobus Most dsa devices expect a 'struct mii_bus' pointer to talk to switches via the MII interface. While this works for dsa devices, it will not work safely with Linux PHYs in all configurations since the firmware of the ixgbe device may be polling some PHY addresses in the background. Signed-off-by: Stephen Douthit <stephend@silicom-usa.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-20 12:19:11 -08:00
Kai-Heng Feng	1fb3a7a75e	igb: Fix an issue that PME is not enabled during runtime suspend I210 ethernet card doesn't wake up when a cable gets plugged in. It's because its PME is not set. Since commit `42eca23021` ("PCI: Don't touch card regs after runtime suspend D3"), if the PCI state is saved, pci_pm_runtime_suspend() stops calling pci_finish_runtime_suspend(), which enables the PCI PME. To fix the issue, do not save the PCI state during runtime suspend, so that the PCI subsystem enables PME. Fixes: `42eca23021` ("PCI: Don't touch card regs after runtime suspend D3") Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-20 12:14:23 -08:00
Young Xiao	eec903769b	ice: Do not enable NAPI on q_vectors that have no rings If ice driver has q_vectors w/ active NAPI that has no rings, then this will result in a divide by zero error. To correct it I am updating the driver code so that we only support NAPI on q_vectors that have 1 or more rings allocated to them. See commit `13a8cd191a` ("i40e: Do not enable NAPI on q_vectors that have no rings") for detail. Signed-off-by: Young Xiao <YangX92@hotmail.com> Acked-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-20 12:10:24 -08:00
Miroslav Lichvar	9a2d57a7a0	i40e: extend PTP gettime function to read system clock This adds support for the PTP_SYS_OFFSET_EXTENDED ioctl. Cc: Richard Cochran <richardcochran@gmail.com> Cc: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-20 12:06:35 -08:00
Konstantin Khorenko	31389b53b3	i40e: define proper net_device::neigh_priv_len Out of bound read reported by KASan. i40iw_net_event() reads unconditionally 16 bytes from neigh->primary_key while the memory allocated for "neighbour" struct is evaluated in neigh_alloc() as tbl->entry_size + dev->neigh_priv_len where "dev" is a net_device. But the driver does not set up dev->neigh_priv_len and we read beyond the neigh entry allocated memory, so the patch in the next mail fixes this. Signed-off-by: Konstantin Khorenko <khorenko@virtuozzo.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-20 12:02:26 -08:00
YueHaibing	cd0d465bb6	e100: Fix passing zero to 'PTR_ERR' warning in e100_load_ucode_wait Fix a static code checker warning: drivers/net/ethernet/intel/e100.c:1349 e100_load_ucode_wait() warn: passing zero to 'PTR_ERR' Signed-off-by: YueHaibing <yuehaibing@huawei.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-20 11:54:27 -08:00
David S. Miller	2be09de7d6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Lots of conflicts, but happily all cases of overlapping changes, parallel adds, things of that nature. Thanks to Stephen Rothwell, Saeed Mahameed, and others for their guidance in these resolutions. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 11:53:36 -08:00
Jesus Sanchez-Palencia	6f9ae17530	igb: Change RXPBSIZE size when setting Qav mode Section 4.5.9 of the datasheet says that the total size of all packet buffers combined (TxPB 0 + 1 + 2 + 3 + RxPB + BMC2OS + OS2BMC) must not exceed 60KB. Today we are configuring a total of 62KB, so reduce the RxPB from 32KB to 30KB in order to respect that. The choice of changing RxPBSIZE here is mainly because it seems more correct to give more priority to the transmit packet buffers over the receiver ones when running in Qav mode. Also, the BMC2OS and OS2BMC sizes are already too short. Signed-off-by: Jesus Sanchez-Palencia <jesus.s.palencia@gmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-20 11:45:10 -08:00
Jeff Kirsher	59361316af	igb: reduce CPU0 latency when updating statistics This change is based off of the work and suggestion of Jan Jablonsky <jan.jablonsky@thalesgroup.com>. The Watchdog workqueue in igb driver is scheduled every 2s for each network interface. That includes updating statistics protected by a spinlock, so the function igb_update_stats runs protected against preemption. Given the number of statistics registers (circa 60), processing this function might cause additional CPU load on CPU0. The statistics spinlock can be replaced with a mutex, which reduces latency on CPU0. CC: Bernhard Kaindl <bernhard.kaindl@thalesgroup.com> CC: Jan Jablonsky <jan.jablonsky@thalesgroup.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-12-20 11:02:06 -08:00
Jakub Kicinski	4987eaccd2	nfp: bpf: optimize codegen for JSET with a constant The top word of the constant can only have bits set if sign extension set it to all-1, therefore we don't really have to mask the top half of the register. We can just OR it into the result as is. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 17:28:29 +01:00
Jakub Kicinski	6e774845b3	nfp: bpf: remove the trivial JSET optimization The verifier will now understand the JSET instruction, so don't mark the dead branch in the JIT as noop. We won't generate any code, anyway. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-12-20 17:28:28 +01:00
Michael Chan	0c2ff8d796	bnxt_en: Adjust default RX coalescing ticks to 10 us. For a little better performance on faster machines and faster link speeds. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 08:26:16 -08:00
Venkat Duvvuru	abd43a1352	bnxt_en: Support for 64-bit flow handle. Older firmware only supports 16-bit flow handle, because of which the number of flows that can be offloaded can’t scale beyond a point. Newer firmware supports 64-bit flow handle enabling the host to scale up to millions of flows. With the new 64-bit flow handle support, driver has to query flow stats in a different way compared to the older approach. This patch adds support for 64-bit flow handle and new way to query flow stats. Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com> Reviewed-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 08:26:16 -08:00
Michael Chan	cf6daed098	bnxt_en: Increase context memory allocations on 57500 chips for RDMA. If RDMA is supported on the 57500 chip, increase context memory allocations for the resources used by RDMA. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 08:26:16 -08:00
Michael Chan	08fe9d1816	bnxt_en: Add Level 2 context memory paging support. Add the new functions bnxt_alloc_ctx_pg_tbls()/bnxt_free_ctx_pg_tbls() to allocate and free pages for context memory. The new functions will handle the different levels of paging support and allocate/free the pages accordingly using the existing functions. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 08:26:16 -08:00
Michael Chan	4f49b2b8d4	bnxt_en: Enhance bnxt_alloc_ring()/bnxt_free_ring(). To support level 2 context page memory structures, enhance the bnxt_ring_mem_info structure with a "depth" field to specify the page level and add a flag to specify using full pages for L1 and L2 page tables. This is needed to support RDMA functionality on 57500 chips since RDMA requires more context memory. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 08:26:16 -08:00
Venkat Duvvuru	760b6d3341	bnxt_en: Add support for 2nd firmware message channel. Earlier, some of the firmware commands (ex: CFA_FLOW_*) which are processed by KONG processor were sent to the CHIMP processor from the host. This approach was taken as there was no direct message channel to KONG. CHIMP in turn used to send them to KONG. Newer firmware supports a new message channel through which the host can send messages directly to the KONG processor. This patch adds the driver changes needed to support the direct KONG message channel. This speeds up flow related messages sent to the firmware for CLS_FLOWER offload. Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 08:26:16 -08:00
Venkat Duvvuru	5c209fc821	bnxt_en: Introduce bnxt_get_hwrm_resp_addr & bnxt_get_hwrm_seq_id routines. These routines will be enhanced in the subsequent patch to return the 2nd firmware comm. channel's hwrm response address & sequence id respectively. Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 08:26:16 -08:00
Venkat Duvvuru	89455017fb	bnxt_en: Avoid arithmetic on void * pointer. Typecast hwrm_cmd_resp_addr to (u8 *) from (void *) before doing arithmetic. Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 08:26:15 -08:00
Venkat Duvvuru	2e9ee39877	bnxt_en: Use macros for firmware message doorbell offsets. In preparation for adding a 2nd communication channel to firmware. Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 08:26:15 -08:00
Venkat Duvvuru	fc718bb2d1	bnxt_en: Set hwrm_intr_seq_id value to its inverted value. Set hwrm_intr_seq_id value to its inverted value instead of HWRM_SEQ_INVALID, when an hwrm completion of type CMPL_BASE_TYPE_HWRM_DONE is received. This will enable us to use the complete 16-bit sequence ID space. Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 08:26:15 -08:00
Michael Chan	3322479e6d	bnxt_en: Update firmware interface spec. to 1.10.0.33. The major changes are in the flow offload firmware APIs. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-12-20 08:26:15 -08:00
Aviv Heller	a64917446e	net/mlx5: Fix LAG requirement when CONFIG_MLX5_ESWITCH is off If CONFIG_MLX5_ESWITCH is not defined, test for SR-IOV being disabled, instead of calling e-switch LAG prereq routine. Since LAG with SRIOV is allowed only when switchdev mode is on. Fixes: `eff849b2c6` ("net/mlx5: Allow/disallow LAG according to pre-req only") Signed-off-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 05:06:03 -08:00
Aviv Heller	0a5b589111	net/mlx5: Fix query_nic_sys_image_guid() error during init vport system image guid should be queried using vport nic API for Ethernet ports, and vport hca API for Infiniband ports. Fixes: `fadd59fc50` ("net/mlx5: Introduce inter-device communication mechanism") Signed-off-by: Aviv Heller <avivh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 05:06:03 -08:00
Eli Britstein	e32ee6c78e	net/mlx5e: Support tunnel encap over tagged Ethernet Generate encap header depending on the routed device to support native/tagged Ethernet header. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 05:06:03 -08:00
Eli Britstein	aa331450b8	net/mlx5e: Support VLAN encap ETH header generation Support generation of native or tagged Ethernet header for encap header, depending on provided net device. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 05:06:03 -08:00
Eli Britstein	c7bcb277bd	net/mlx5e: Re-order route and encap header memory allocation Change the order to first route IPv4/6 and return if error. Only after successful route continue to allocate an encap header, with no functional change. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 05:06:02 -08:00
Eli Britstein	05ada1adb6	net/mlx5e: Tunnel encap ETH header helper function In tunnel encap we prepare the encap header for IPv4/6 cases, in two separate functions. For ETH header generation the code is almost duplicated. Move the ETH header generation code from IPv4/6 functions to a helper function, with no functional change. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 05:06:02 -08:00
Eli Britstein	b168cff0b9	net/mlx5e: Fail attempt to offload e-switch TC encap flows with vlan on underlay Currently we don't support nor fail attempts to offload encap flows routed to vlan device on the underlay network. We wrongly consider a vlan underlay device to be on the same e-switch b/c the switchdev ID is retrieved recursively. Add explicit check for that and fail such attempts. Also align to a more strict check for the ingress and the underlay devices to practically be on the same eswitch. Fixes: `ce99f6b97f` ('net/mlx5e: Support SRIOV TC encapsulation offloads for IPv6 tunnels') Fixes: `3e621b19b0` ('net/mlx5e: Support TC encapsulation offloads with upper devices') Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-12-20 05:06:02 -08:00
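
Commit 639c1b2625 (net: mscc: ocelot) in the log above changes a register poll so that its timeout is measured in elapsed time rather than in access attempts. The idea can be sketched in userspace C roughly as follows. This is an illustration under stated assumptions, not the driver's code: poll_until_ready(), ready_fn, fake_dev and demo_poll() are hypothetical names, and CLOCK_MONOTONIC stands in for whatever time source the kernel polling helper uses.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical status callback: returns true once the result is ready. */
typedef bool (*ready_fn)(void *ctx);

/* Current monotonic time in milliseconds. */
static int64_t now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Poll until ready() reports completion or timeout_ms of time has elapsed.
 * The bound is a deadline, so it does not depend on how fast one register
 * access is on a given platform. Returns 0 on success, -1 on timeout.
 * A real driver would normally sleep or relax the CPU between reads.
 */
int poll_until_ready(ready_fn ready, void *ctx, int64_t timeout_ms)
{
	int64_t deadline;

	assert(timeout_ms >= 0);
	deadline = now_ms() + timeout_ms;

	for (;;) {
		if (ready(ctx))
			return 0;
		if (now_ms() >= deadline)
			/* Final check so a late wakeup is not reported as a timeout. */
			return ready(ctx) ? 0 : -1;
	}
}

/* Hypothetical fake device: ready after reads_left status reads,
 * never ready if reads_left is negative. */
struct fake_dev {
	int reads_left;
};

bool fake_dev_ready(void *ctx)
{
	struct fake_dev *dev = ctx;

	if (dev->reads_left > 0)
		dev->reads_left--;
	return dev->reads_left == 0;
}

/* Convenience wrapper used to exercise the poll loop. */
int demo_poll(int reads_until_ready, int64_t timeout_ms)
{
	struct fake_dev dev = { .reads_left = reads_until_ready };

	return poll_until_ready(fake_dev_ready, &dev, timeout_ms);
}
```

With this shape, demo_poll(3, 1000) succeeds on the third status read, while a device that never becomes ready times out after the requested number of milliseconds regardless of how many reads fit into that window.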

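Commit 89455017fb (bnxt_en: Avoid arithmetic on void * pointer) in the log above casts a void pointer to a byte pointer before adding an offset. ISO C does not define arithmetic on void *; GCC accepts it as an extension that treats the pointed-to size as 1. A minimal sketch of the pattern follows. The helper names byte_offset() and byte_at() are hypothetical and not taken from the driver, and uint8_t is used here in place of the kernel's u8.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return a pointer `offset` bytes past `base`.
 * The cast to a byte pointer makes the arithmetic well defined in ISO C.
 */
const void *byte_offset(const void *base, size_t offset)
{
	return (const uint8_t *)base + offset;
}

/* Read the single byte located `offset` bytes past `base`. */
uint8_t byte_at(const void *base, size_t offset)
{
	return *(const uint8_t *)byte_offset(base, offset);
}
```

Writing the cast explicitly keeps the code portable to compilers and warning levels that reject the GNU extension (for example, GCC with -Wpointer-arith reports it).
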
26195 Commits