linux

iv/linux

Author	SHA1	Message	Date
Jakub Kicinski	5c39f26e67	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Trivial conflict in CAN, keep the net-next + the byteswap wrapper. Conflicts: drivers/net/can/usb/gs_usb.c Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-27 18:25:27 -08:00
Ido Schimmel	ff47fa13c9	mlxsw: spectrum_router: Update adjacency index more efficiently The device supports an operation that allows the driver to issue one request to update the adjacency index for all the routes in a given virtual router (VR) from old index and size to new ones. This is useful in case the configuration of a certain nexthop group is updated and its adjacency index changes. Currently, the driver does not use this operation in an efficient manner. It iterates over all the routes using the nexthop group and issues an update request for the VR if it is not the same as the previous VR. Instead, use the VR tracking added in the previous patch to update the adjacency index once for each VR currently using the nexthop group. Example: 8k IPv6 routes were added in an alternating manner to two VRFs. All the routes are using the same nexthop object ('nhid 1'). Before: # perf stat -e devlink:devlink_hwmsg --filter='incoming==0' -- ip nexthop replace id 1 via 2001:db8:1::2 dev swp3 Performance counter stats for 'ip nexthop replace id 1 via 2001:db8:1::2 dev swp3': 16,385 devlink:devlink_hwmsg 4.255933213 seconds time elapsed 0.000000000 seconds user 0.666923000 seconds sys Number of EMAD transactions corresponds to number of routes using the nexthop group. After: # perf stat -e devlink:devlink_hwmsg --filter='incoming==0' -- ip nexthop replace id 1 via 2001:db8:1::2 dev swp3 Performance counter stats for 'ip nexthop replace id 1 via 2001:db8:1::2 dev swp3': 3 devlink:devlink_hwmsg 0.077655094 seconds time elapsed 0.000000000 seconds user 0.076698000 seconds sys Number of EMAD transactions corresponds to number of VRFs / VRs. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-27 17:17:33 -08:00
Ido Schimmel	d2141a42b9	mlxsw: spectrum_router: Track nexthop group virtual router membership For each nexthop group, track in which virtual routers (VRs) the group is used. This is going to be used by the next patch to perform a more efficient adjacency index update whenever the group's adjacency index changes. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-27 17:17:33 -08:00
Ido Schimmel	9a4ab10c74	mlxsw: spectrum_router: Rollback virtual router adjacency pointer update In the rare case where the adjacency pointer cannot be updated for a given virtual router, rollback the operation so that virtual routers that are already using the new index will use the old one again. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-27 17:17:33 -08:00
Ido Schimmel	40e4413d5d	mlxsw: spectrum_router: Pass virtual router parameters directly instead of pointer mlxsw_sp_adj_index_mass_update_vr() only needs the virtual router's identifier and protocol, so pass them directly. In a subsequent patch the caller will not have access to the pointer. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-27 17:17:33 -08:00
Ido Schimmel	1c2c5eb6e1	mlxsw: spectrum_router: Fix error handling issue Return error to the caller instead of suppressing it. Fixes: e3ddfb45bacd ("mlxsw: spectrum_router: Allow returning errors from mlxsw_sp_nexthop_group_refresh()") Addresses-Coverity: ("Error handling issues (CHECKED_RETURN)") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-27 17:17:32 -08:00
Ingo Molnar	a787bdaff8	Merge branch 'linus' into sched/core, to resolve semantic conflict Signed-off-by: Ingo Molnar <mingo@kernel.org>	2020-11-27 11:10:50 +01:00
Parav Pandit	617b860c18	net/mlx5: Treat host PF vport as other (non eswitch manager) vport When eswitch manager is running on ECPF, host PF should be treated as non eswitch manager port, similar to other VF vports. Fail to do so, results in firmware treating PF's vport as ECPF vport for eswitch ACL tables. Non zero check to figure out if a given vport is other vport or not is not sufficient becase PF vport number = 0 on ECPF. Hence, create esw acl tables with an attribute of other vport. Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-11-26 18:45:03 -08:00
Parav Pandit	5bef709d76	net/mlx5: Enable host PF HCA after eswitch is initialized Currently ECPF enables external host PF too early in the initialization sequence for Ethernet links when ECPF is eswitch manager. Due to this, when external host PF driver is loaded, host PF's HCA CAP has inner_ip_version supported by NIC RX flow table. This capability is later updated by firmware after ECPF driver enables ENCAP/DECAP as eswitch manager. This results into a timing race condition, where CREATE_TIR command fails with a below syndrome on host PF. mlx5_cmd_check:775:(pid 510): CREATE_TIR(0x900) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x562b00) Hence, enable the external host PF after necessary eswitch and per vport initialization is completed. Continue to enable host PF when eswitch manager capability is off for a ECPF. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Bodong Wang <bodong@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-11-26 18:45:03 -08:00
Parav Pandit	8a90f2fc67	net/mlx5: Rename peer_pf to host_pf To match the hardware spec, rename peer_pf to host_pf. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Bodong Wang <bodong@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-11-26 18:45:03 -08:00
Eli Cohen	8d2a9d8d64	net/mlx5: Export steering related functions Export mlx5_create_flow_table() mlx5_create_flow_group() mlx5_destroy_flow_group(). These symbols are required by the VDPA implementation to create rules that consume VDPA specific traffic. We do not deal with putting the prototypes in a header file since they already exist in include/linux/mlx5/fs.h. Signed-off-by: Eli Cohen <elic@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-11-26 18:45:02 -08:00
Meir Lichtinger	dd8595eabe	net/mlx5: Update the list of the PCI supported devices Add the upcoming BlueField-3 device ID. Signed-off-by: Meir Lichtinger <meirl@nvidia.com> Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-11-26 18:43:48 -08:00
Parav Pandit	e5dfe6b57e	net/mlx5: Avoid exposing driver internal command helpers mlx5 command init and cleanup routines are internal to mlx5_core driver. Hence, avoid exporting them and move their definition to mlx5_core driver's internal file mlx5_core.h Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-11-26 18:43:48 -08:00
Muhammad Sammar	7da3ad6c26	net/mlx5: Add misc4 to mlx5_ifc_fte_match_param_bits Add misc4 match params to enable matching on prog_sample_fields. Signed-off-by: Muhammad Sammar <muhammads@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-11-26 18:43:47 -08:00
Muhammad Sammar	699d531f55	net/mlx5: Check dr mask size against mlx5_match_param size This is to allow passing misc4 match param from userspace when function like ib_flow_matcher_create is called. Signed-off-by: Muhammad Sammar <muhammads@nvidia.com> Reviewed-by: Alex Vesker <valex@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-11-26 18:43:47 -08:00
Chris Mi	3873063088	net/mlx5: Add sampler destination type The flow sampler object is a new destination type. Add a new member for the flow destination. Signed-off-by: Chris Mi <cmi@nvidia.com> Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>	2020-11-26 18:43:47 -08:00
Jakub Kicinski	594e31bceb	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== 40GbE Intel Wired LAN Driver Updates 2020-11-24 This series contains updates to i40e and igbvf drivers. Marek removes a redundant assignment for i40e. Stefan Assmann corrects reporting of VF link speed for i40e. Karen revises a couple of error messages to warnings for igbvf as they could be misinterpreted as issues when they are not. v2: Dropped PTP patch as it's being updated. * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: igbvf: Refactor traces i40e: report correct VF link speed when link state is set to enable i40e: remove redundant assignment ==================== Link: https://lore.kernel.org/r/20201124165245.2844118-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-25 18:11:25 -08:00
Rohit Maheshwari	cbf3d60329	ch_ktls: lock is not freed Currently lock gets freed only if timeout expires, but missed a case when HW returns failure and goes for cleanup. Fixes: efca3878a5fb ("ch_ktls: Issue if connection offload fails") Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com> Link: https://lore.kernel.org/r/20201125072626.10861-1-rohitm@chelsio.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-25 17:44:42 -08:00
Vladimir Oltean	90cf87d16b	enetc: Let the hardware auto-advance the taprio base-time of 0 The tc-taprio base time indicates the beginning of the tc-taprio schedule, which is cyclic by definition (where the length of the cycle in nanoseconds is called the cycle time). The base time is a 64-bit PTP time in the TAI domain. Logically, the base-time should be a future time. But that imposes some restrictions to user space, which has to retrieve the current PTP time from the NIC first, then calculate a base time that will still be larger than the base time by the time the kernel driver programs this value into the hardware. Actually ensuring that the programmed base time is in the future is still a problem even if the kernel alone deals with this. Luckily, the enetc hardware already advances a base-time that is in the past into a congruent time in the immediate future, according to the same formula that can be found in the software implementation of taprio (in taprio_get_start_time): /* Schedule the start time for the beginning of the next * cycle. / n = div64_s64(ktime_sub_ns(now, base), cycle); start = ktime_add_ns(base, (n + 1) * cycle); There's only one problem: the driver doesn't let the hardware do that. It interferes with the base-time passed from user space, by special-casing the situation when the base-time is zero, and replaces that with the current PTP time. This changes the intended effective base-time of the schedule, which will in the end have a different phase offset than if the base-time of 0.000000000 was to be advanced by an integer multiple of the cycle-time. Fixes: 34c6adf1977b ("enetc: Configure the Time-Aware Scheduler via tc-taprio offload") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20201124220259.3027991-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-25 12:36:27 -08:00
Christian Eggers	37e9d0559a	mlxsw: spectrum_ptp: use PTP wide message type definitions Use recently introduced PTP wide defines instead of a driver internal enumeration. Signed-off-by: Christian Eggers <ceggers@arri.de> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Cc: Petr Machata <petrm@mellanox.com> Cc: Jiri Pirko <jiri@nvidia.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-25 12:23:14 -08:00
Antonio Borneo	12a8fe56c0	net: stmmac: fix incorrect merge of patch upstream Commit 757926247836 ("net: stmmac: add flexible PPS to dwmac 4.10a") was intended to modify the struct dwmac410_ops, but it got somehow badly merged and modified the struct dwmac4_ops. Revert the modification in struct dwmac4_ops and re-apply it properly in struct dwmac410_ops. Fixes: 757926247836 ("net: stmmac: add flexible PPS to dwmac 4.10a") Signed-off-by: Antonio Borneo <antonio.borneo@st.com> Reported-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Link: https://lore.kernel.org/r/20201124223729.886992-1-antonio.borneo@st.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-25 11:23:40 -08:00
Lijun Pan	3ada288150	ibmvnic: enhance resetting status check during module exit Based on the discussion with Sukadev Bhattiprolu and Dany Madden, we believe that checking adapter->resetting bit is preferred since RESETTING state flag is not as strict as resetting bit. RESETTING state flag is removed since it is verbose now. Fixes: 7d7195a026ba ("ibmvnic: Do not process device remove during device reset") Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-24 16:28:34 -08:00
Lijun Pan	0e435befae	ibmvnic: fix NULL pointer dereference in ibmvic_reset_crq crq->msgs could be NULL if the previous reset did not complete after freeing crq->msgs. Check for NULL before dereferencing them. Snippet of call trace: ... ibmvnic 30000003 env3 (unregistering): Releasing sub-CRQ ibmvnic 30000003 env3 (unregistering): Releasing CRQ BUG: Kernel NULL pointer dereference on read at 0x00000000 Faulting instruction address: 0xc0000000000c1a30 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: ibmvnic(E-) rpadlpar_io rpaphp xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables xsk_diag tcp_diag udp_diag tun raw_diag inet_diag unix_diag bridge af_packet_diag netlink_diag stp llc rfkill sunrpc pseries_rng xts vmx_crypto uio_pdrv_genirq uio binfmt_misc ip_tables xfs libcrc32c sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ibmvnic] CPU: 20 PID: 8426 Comm: kworker/20:0 Tainted: G E 5.10.0-rc1+ #12 Workqueue: events __ibmvnic_reset [ibmvnic] NIP: c0000000000c1a30 LR: c008000001b00c18 CTR: 0000000000000400 REGS: c00000000d05b7a0 TRAP: 0380 Tainted: G E (5.10.0-rc1+) MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 44002480 XER: 20040000 CFAR: c0000000000c19ec IRQMASK: 0 GPR00: 0000000000000400 c00000000d05ba30 c008000001b17c00 0000000000000000 GPR04: 0000000000000000 0000000000000000 0000000000000000 00000000000001e2 GPR08: 000000000001f400 ffffffffffffd950 0000000000000000 c008000001b0b280 GPR12: c0000000000c19c8 c00000001ec72e00 c00000000019a778 c00000002647b440 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000006 0000000000000001 0000000000000003 0000000000000002 GPR24: 0000000000001000 c008000001b0d570 0000000000000005 c00000007ab5d550 GPR28: c00000007ab5c000 c000000032fcf848 c00000007ab5cc00 c000000032fcf800 NIP [c0000000000c1a30] memset+0x68/0x104 LR [c008000001b00c18] ibmvnic_reset_crq+0x70/0x110 [ibmvnic] Call Trace: [c00000000d05ba30] [0000000000000800] 0x800 (unreliable) [c00000000d05bab0] [c008000001b0a930] do_reset.isra.40+0x224/0x634 [ibmvnic] [c00000000d05bb80] [c008000001b08574] __ibmvnic_reset+0x17c/0x3c0 [ibmvnic] [c00000000d05bc50] [c00000000018d9ac] process_one_work+0x2cc/0x800 [c00000000d05bd20] [c00000000018df58] worker_thread+0x78/0x520 [c00000000d05bdb0] [c00000000019a934] kthread+0x1c4/0x1d0 [c00000000d05be20] [c00000000000d5d0] ret_from_kernel_thread+0x5c/0x6c Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-24 16:28:34 -08:00
Lijun Pan	a0faaa27c7	ibmvnic: fix NULL pointer dereference in reset_sub_crq_queues adapter->tx_scrq and adapter->rx_scrq could be NULL if the previous reset did not complete after freeing sub crqs. Check for NULL before dereferencing them. Snippet of call trace: ibmvnic 30000006 env6: Releasing sub-CRQ ibmvnic 30000006 env6: Releasing CRQ ... ibmvnic 30000006 env6: Got Control IP offload Response ibmvnic 30000006 env6: Re-setting tx_scrq[0] BUG: Kernel NULL pointer dereference on read at 0x00000000 Faulting instruction address: 0xc008000003dea7cc Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: rpadlpar_io rpaphp xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables xsk_diag tcp_diag udp_diag raw_diag inet_diag unix_diag af_packet_diag netlink_diag tun bridge stp llc rfkill sunrpc pseries_rng xts vmx_crypto uio_pdrv_genirq uio binfmt_misc ip_tables xfs libcrc32c sd_mod t10_pi sg ibmvscsi ibmvnic ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod CPU: 80 PID: 1856 Comm: kworker/80:2 Tainted: G W 5.8.0+ #4 Workqueue: events __ibmvnic_reset [ibmvnic] NIP: c008000003dea7cc LR: c008000003dea7bc CTR: 0000000000000000 REGS: c0000007ef7db860 TRAP: 0380 Tainted: G W (5.8.0+) MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 28002422 XER: 0000000d CFAR: c000000000bd9520 IRQMASK: 0 GPR00: c008000003dea7bc c0000007ef7dbaf0 c008000003df7400 c0000007fa26ec00 GPR04: c0000007fcd0d008 c0000007fcd96350 0000000000000027 c0000007fcd0d010 GPR08: 0000000000000023 0000000000000000 0000000000000000 0000000000000000 GPR12: 0000000000002000 c00000001ec18e00 c0000000001982f8 c0000007bad6e840 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000000000000 0000000000000000 0000000000000000 fffffffffffffef7 GPR24: 0000000000000402 c0000007fa26f3a8 0000000000000003 c00000016f8ec048 GPR28: 0000000000000000 0000000000000000 0000000000000000 c0000007fa26ec00 NIP [c008000003dea7cc] ibmvnic_reset_init+0x15c/0x258 [ibmvnic] LR [c008000003dea7bc] ibmvnic_reset_init+0x14c/0x258 [ibmvnic] Call Trace: [c0000007ef7dbaf0] [c008000003dea7bc] ibmvnic_reset_init+0x14c/0x258 [ibmvnic] (unreliable) [c0000007ef7dbb80] [c008000003de8860] __ibmvnic_reset+0x408/0x970 [ibmvnic] [c0000007ef7dbc50] [c00000000018b7cc] process_one_work+0x2cc/0x800 [c0000007ef7dbd20] [c00000000018bd78] worker_thread+0x78/0x520 [c0000007ef7dbdb0] [c0000000001984c4] kthread+0x1d4/0x1e0 [c0000007ef7dbe20] [c00000000000cea8] ret_from_kernel_thread+0x5c/0x74 Fixes: 57a49436f4e8 ("ibmvnic: Reset sub-crqs during driver reset") Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-24 16:28:34 -08:00
Sven Van Asbroeck	470dfd808a	lan743x: replace polling loop by wait_event_timeout() The driver's ISR sends a 'software interrupt' event to the probe() thread using the following method: - probe(): write 0 to flag, enable s/w interrupt - probe(): poll on flag, relax using usleep_range() - ISR : write 1 to flag Replace with wake_up() / wait_event_timeout(). Besides being easier to get right, this abstraction has better timing and memory consistency properties. Tested-by: Sven Van Asbroeck <thesven73@gmail.com> # lan7430 Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com> Link: https://lore.kernel.org/r/20201123191529.14908-2-TheSven73@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-24 16:14:04 -08:00
Sven Van Asbroeck	c31799bae8	lan743x: clean up software_isr function For no apparent reason, this function reads the INT_STS register, and checks if the software interrupt bit is set. These things have already been carried out by this function's only caller. Clean up by removing the redundant code. Tested-by: Sven Van Asbroeck <thesven73@gmail.com> # lan7430 Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com> Link: https://lore.kernel.org/r/20201123191529.14908-1-TheSven73@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-24 16:14:04 -08:00
Shay Agroskin	1396d3148b	net: ena: fix packet's addresses for rx_offset feature This patch fixes two lines in which the rx_offset received by the device wasn't taken into account: - prefetch function: In our driver the copied data would reside in rx_info->page + rx_headroom + rx_offset so the prefetch function is changed accordingly. - setting page_offset to zero for descriptors > 1: for every descriptor but the first, the rx_offset is zero. Hence the page_offset value should be set to rx_headroom. The previous implementation changed the value of rx_info after the descriptor was added to the SKB (essentially providing wrong page offset). Fixes: 68f236df93a9 ("net: ena: add support for the rx offset feature") Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-24 16:07:13 -08:00
Shay Agroskin	09323b3bca	net: ena: set initial DMA width to avoid intel iommu issue The ENA driver uses the readless mechanism, which uses DMA, to find out what the DMA mask is supposed to be. If DMA is used without setting the dma_mask first, it causes the Intel IOMMU driver to think that ENA is a 32-bit device and therefore disables IOMMU passthrough permanently. This patch sets the dma_mask to be ENA_MAX_PHYS_ADDR_SIZE_BITS=48 before readless initialization in ena_device_init()->ena_com_mmio_reg_read_request_init(), which is large enough to workaround the intel_iommu issue. DMA mask is set again to the correct value after it's received from the device after readless is initialized. The patch also changes the driver to use dma_set_mask_and_coherent() function instead of the two pci_set_dma_mask() and pci_set_consistent_dma_mask() ones. Both methods achieve the same effect. Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Mike Cui <mikecui@amazon.com> Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com> Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-24 16:07:13 -08:00
Shay Agroskin	5b7022cf1d	net: ena: handle bad request id in ena_netdev After request id is checked in validate_rx_req_id() its value is still used in the line rx_ring->free_ids[next_to_clean] = rx_ring->ena_bufs[i].req_id; even if it was found to be out-of-bound for the array free_ids. The patch moves the request id to an earlier stage in the napi routine and makes sure its value isn't used if it's found out-of-bounds. Fixes: 30623e1ed116 ("net: ena: avoid memory access violation by validating req_id properly") Signed-off-by: Ido Segev <idose@amazon.com> Signed-off-by: Shay Agroskin <shayagr@amazon.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-24 16:07:13 -08:00
Lorenzo Bianconi	039fbc47f9	net: mvneta: alloc skb_shared_info on the mvneta_rx_swbm stack Build skb_shared_info on mvneta_rx_swbm stack and sync it to xdp_buff skb_shared_info area only on the last fragment. Leftover cache miss in mvneta_swbm_rx_frame will be addressed introducing mb bit in xdp_buff/xdp_frame struct Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-24 15:09:44 -08:00
Lorenzo Bianconi	eb33f11864	net: mvneta: move skb_shared_info in mvneta_xdp_put_buff caller Pass skb_shared_info pointer from mvneta_xdp_put_buff caller. This is a preliminary patch to reduce accesses to skb_shared_info area and reduce cache misses. Remove napi parameter in mvneta_xdp_put_buff signature since it is always run in NAPI context Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-24 15:09:41 -08:00
Lorenzo Bianconi	05c748f7d0	net: mvneta: avoid unnecessary xdp_buff initialization Explicitly initialize mandatory fields defining xdp_buff struct in mvneta_rx_swbm routine Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-24 15:09:29 -08:00
Stefan Chulski	9a71baf719	net: mvpp2: divide fifo for dts-active ports only Tx/Rx FIFO is a HW resource limited by total size, but shared by all ports of same CP110 and impacting port-performance. Do not divide the FIFO for ports which are not enabled in DTS, so active ports could have more FIFO. No change in FIFO allocation if all 3 ports on the communication processor enabled in DTS. The active port mapping should be done in probe before FIFO-init. Signed-off-by: Stefan Chulski <stefanc@marvell.com> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/1606154073-28267-1-git-send-email-stefanc@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-24 15:05:04 -08:00
Ezequiel Garcia	078eb55cdf	dpaa2-eth: Fix compile error due to missing devlink support The dpaa2 driver depends on devlink, so it should select NET_DEVLINK in order to fix compile errors, such as: drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.o: in function `dpaa2_eth_rx_err': dpaa2-eth.c:(.text+0x3cec): undefined reference to `devlink_trap_report' drivers/net/ethernet/freescale/dpaa2/dpaa2-eth-devlink.o: in function `dpaa2_eth_dl_info_get': dpaa2-eth-devlink.c:(.text+0x160): undefined reference to `devlink_info_driver_name_put' Fixes: ceeb03ad8e22 ("dpaa2-eth: add basic devlink support") Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://lore.kernel.org/r/20201123163553.1666476-1-ciorneiioana@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-24 14:57:55 -08:00
Colin Ian King	be419fcacf	net: hns3: fix spelling mistake "memroy" -> "memory" There are spelling mistakes in two dev_err messages. Fix them. Signed-off-by: Colin Ian King <colin.king@canonical.com> Link: https://lore.kernel.org/r/20201123103452.197708-1-colin.king@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-24 13:20:13 -08:00
Ido Schimmel	37b50e556e	mlxsw: spectrum_trap: Add blackhole_nexthop trap Register with devlink the blackhole_nexthop trap so that mlxsw will be able to report packets dropped due to a blackhole nexthop. The internal trap identifier is "DISCARD_ROUTER3", which traps packets dropped in the adjacency table. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-24 12:14:56 -08:00
Ido Schimmel	68e92ad855	mlxsw: spectrum_router: Add support for blackhole nexthops Add support for blackhole nexthops by programming them to the adjacency table with a discard action. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-24 12:14:56 -08:00
Ido Schimmel	18c4b79d28	mlxsw: spectrum_router: Resolve RIF from nexthop struct instead of neighbour The two are the same, but for blackhole nexthops we will not have an associated neighbour struct, so resolve the RIF from the nexthop struct itself instead. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-24 12:14:56 -08:00
Ido Schimmel	919f6aaa3a	mlxsw: spectrum_router: Use loopback RIF for unresolved nexthops Now that the driver creates a loopback RIF during its initialization, it can be used to program the adjacency entries for unresolved nexthops instead of other RIFs. The loopback RIF is guaranteed to exist for the entire life time of the driver, unlike other RIFs that come and go. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-24 12:14:55 -08:00
Ido Schimmel	52d45575ec	mlxsw: spectrum_router: Use different trap identifier for unresolved nexthops Unresolved nexthops are currently written to the adjacency table with a discard action. Packets hitting such entries are trapped to the CPU via the 'DISCARD_ROUTER3' trap which can be enabled or disabled on demand, but is always enabled in order to ensure the kernel can resolve the unresolved neighbours. This trap will be needed for blackhole nexthops support. Therefore, move unresolved nexthops to explicitly program the adjacency entries with a trap action and a different trap identifier, 'RTR_EGRESS0'. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-24 12:14:55 -08:00
Ido Schimmel	07c78536ef	mlxsw: spectrum_router: Create loopback RIF during initialization Up until now RIFs (router interfaces) were created on demand (e.g., when an IP address was added to a netdev). However, sometimes the device needs to be provided with a RIF when one might not be available. For example, adjacency entries that drop packets need to be programmed with an egress RIF despite the RIF not being used to forward packets. Create such a RIF during initialization so that it could be used later on to support blackhole nexthops. Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-24 12:14:55 -08:00
Lincoln Ramsay	9bd2702d29	aquantia: Remove the build_skb path When performing IPv6 forwarding, there is an expectation that SKBs will have some headroom. When forwarding a packet from the aquantia driver, this does not always happen, triggering a kernel warning. aq_ring.c has this code (edited slightly for brevity): if (buff->is_eop && buff->len <= AQ_CFG_RX_FRAME_MAX - AQ_SKB_ALIGN) { skb = build_skb(aq_buf_vaddr(&buff->rxdata), AQ_CFG_RX_FRAME_MAX); } else { skb = napi_alloc_skb(napi, AQ_CFG_RX_HDR_SIZE); There is a significant difference between the SKB produced by these 2 code paths. When napi_alloc_skb creates an SKB, there is a certain amount of headroom reserved. However, this is not done in the build_skb codepath. As the hardware buffer that build_skb is built around does not handle the presence of the SKB header, this code path is being removed and the napi_alloc_skb path will always be used. This code path does have to copy the packet header into the SKB, but it adds the packet data as a frag. Fixes: 018423e90bee ("net: ethernet: aquantia: Add ring support code") Signed-off-by: Lincoln Ramsay <lincoln.ramsay@opengear.com> Link: https://lore.kernel.org/r/MWHPR1001MB23184F3EAFA413E0D1910EC9E8FC0@MWHPR1001MB2318.namprd10.prod.outlook.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-24 10:59:17 -08:00
Vincent Whitchurch	d5a05e69ac	net: stmmac: Use hrtimer for TX coalescing This driver uses a normal timer for TX coalescing, which means that the with the default tx-usecs of 1000 microseconds the cleanups actually happen 10 ms or more later with HZ=100. This leads to very low througput with TCP when bridged to a slow link such as a 4G modem. Fix this by using an hrtimer instead. On my ARM platform with HZ=100 and the default TX coalescing settings (tx-frames 25 tx-usecs 1000), with "tc qdisc add dev eth0 root netem delay 60ms 40ms rate 50Mbit" run on the server, netperf's TCP_STREAM improves from ~5.5 Mbps to ~100 Mbps. Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Link: https://lore.kernel.org/r/20201120150208.6838-1-vincent.whitchurch@axis.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-24 08:47:44 -08:00
Karen Sornek	24453a8428	igbvf: Refactor traces Refactoring "PF still resetting" and changing "Failed to add vlan id" to "Vlan id is not added" messages because previous version looked like a bug - it informed about changes that worked as designed but might confuse users Signed-off-by: Karen Sornek <karen.sornek@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-11-24 08:02:24 -08:00
Stefan Assmann	6ec12e1e94	i40e: report correct VF link speed when link state is set to enable When the virtual link state was set to "enable" ethtool would report link speed as 40000Mb/s regardless of the underlying device. Report the correct link speed. Example from a XXV710 NIC. Before: $ ip link set ens3f0 vf 0 state auto $ ethtool enp8s2 \| grep Speed Speed: 25000Mb/s $ ip link set ens3f0 vf 0 state enable $ ethtool enp8s2 \| grep Speed Speed: 40000Mb/s After: $ ip link set ens3f0 vf 0 state auto $ ethtool enp8s2 \| grep Speed Speed: 25000Mb/s $ ip link set ens3f0 vf 0 state enable $ ethtool enp8s2 \| grep Speed Speed: 25000Mb/s Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-11-24 08:02:24 -08:00
Marek Majtyka	088d5360d0	i40e: remove redundant assignment Remove a redundant assignment of the software ring pointer in the i40e driver. The variable is assigned twice with no use in between, so just get rid of the first occurrence. Fixes: 3b4f0b66c2b3 ("i40e, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL") Signed-off-by: Marek Majtyka <marekx.majtyka@intel.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2020-11-24 08:02:24 -08:00
Peter Zijlstra	545b8c8df4	smp: Cleanup smp_call_function*() Get rid of the __call_single_node union and cleanup the API a little to avoid external code relying on the structure layout as much. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org>	2020-11-24 16:47:49 +01:00
Jakub Kicinski	cc69837fca	net: don't include ethtool.h from netdevice.h linux/netdevice.h is included in very many places, touching any of its dependecies causes large incremental builds. Drop the linux/ethtool.h include, linux/netdevice.h just needs a forward declaration of struct ethtool_ops. Fix all the places which made use of this implicit include. Acked-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Shannon Nelson <snelson@pensando.io> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Link: https://lore.kernel.org/r/20201120225052.1427503-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-23 17:27:04 -08:00
Christophe JAILLET	7fd6372e27	net: pch_gbe: Use 'dma_free_coherent()' to undo 'dma_alloc_coherent()' Memory allocation are done with 'dma_alloc_coherent()'. Be consistent and use 'dma_free_coherent()' to free the corresponding memory. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/20201121090330.1332543-1-christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-23 17:01:48 -08:00
Christophe JAILLET	8ff39301ef	net: pch_gbe: Use dma_set_mask_and_coherent to simplify code 'pci_set_dma_mask()' + 'pci_set_consistent_dma_mask()' can be replaced by an equivalent 'dma_set_mask_and_coherent()' which is much less verbose. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/20201121090302.1332491-1-christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2020-11-23 17:01:48 -08:00

... 4 5 6 7 8 ...

35849 Commits