35849 Commits

Author SHA1 Message Date
Jakub Kicinski
5c39f26e67 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Trivial conflict in CAN, keep the net-next + the byteswap wrapper.

Conflicts:
	drivers/net/can/usb/gs_usb.c

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 18:25:27 -08:00
Ido Schimmel
ff47fa13c9 mlxsw: spectrum_router: Update adjacency index more efficiently
The device supports an operation that allows the driver to issue one
request to update the adjacency index for all the routes in a given
virtual router (VR) from old index and size to new ones. This is useful
in case the configuration of a certain nexthop group is updated and its
adjacency index changes.

Currently, the driver does not use this operation in an efficient
manner. It iterates over all the routes using the nexthop group and
issues an update request for the VR if it is not the same as the
previous VR.

Instead, use the VR tracking added in the previous patch to update the
adjacency index once for each VR currently using the nexthop group.

Example:

8k IPv6 routes were added in an alternating manner to two VRFs. All the
routes are using the same nexthop object ('nhid 1').

Before:

# perf stat -e devlink:devlink_hwmsg --filter='incoming==0' -- ip nexthop replace id 1 via 2001:db8:1::2 dev swp3

 Performance counter stats for 'ip nexthop replace id 1 via 2001:db8:1::2 dev swp3':

            16,385      devlink:devlink_hwmsg

       4.255933213 seconds time elapsed

       0.000000000 seconds user
       0.666923000 seconds sys

Number of EMAD transactions corresponds to number of routes using the
nexthop group.

After:

# perf stat -e devlink:devlink_hwmsg --filter='incoming==0' -- ip nexthop replace id 1 via 2001:db8:1::2 dev swp3

 Performance counter stats for 'ip nexthop replace id 1 via 2001:db8:1::2 dev swp3':

                 3      devlink:devlink_hwmsg

       0.077655094 seconds time elapsed

       0.000000000 seconds user
       0.076698000 seconds sys

Number of EMAD transactions corresponds to number of VRFs / VRs.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 17:17:33 -08:00
Ido Schimmel
d2141a42b9 mlxsw: spectrum_router: Track nexthop group virtual router membership
For each nexthop group, track in which virtual routers (VRs) the group is
used. This is going to be used by the next patch to perform a more
efficient adjacency index update whenever the group's adjacency index
changes.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 17:17:33 -08:00
Ido Schimmel
9a4ab10c74 mlxsw: spectrum_router: Rollback virtual router adjacency pointer update
In the rare case where the adjacency pointer cannot be updated for a
given virtual router, rollback the operation so that virtual routers
that are already using the new index will use the old one again.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 17:17:33 -08:00
Ido Schimmel
40e4413d5d mlxsw: spectrum_router: Pass virtual router parameters directly instead of pointer
mlxsw_sp_adj_index_mass_update_vr() only needs the virtual router's
identifier and protocol, so pass them directly. In a subsequent patch
the caller will not have access to the pointer.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 17:17:33 -08:00
Ido Schimmel
1c2c5eb6e1 mlxsw: spectrum_router: Fix error handling issue
Return error to the caller instead of suppressing it.

Fixes: e3ddfb45bacd ("mlxsw: spectrum_router: Allow returning errors from mlxsw_sp_nexthop_group_refresh()")
Addresses-Coverity: ("Error handling issues  (CHECKED_RETURN)")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-27 17:17:32 -08:00
Ingo Molnar
a787bdaff8 Merge branch 'linus' into sched/core, to resolve semantic conflict
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2020-11-27 11:10:50 +01:00
Parav Pandit
617b860c18 net/mlx5: Treat host PF vport as other (non eswitch manager) vport
When eswitch manager is running on ECPF, host PF should be treated
as non eswitch manager port, similar to other VF vports.
Fail to do so, results in firmware treating PF's vport as ECPF
vport for eswitch ACL tables.
Non zero check to figure out if a given vport is other vport or not
is not sufficient becase PF vport number = 0 on ECPF.
Hence, create esw acl tables with an attribute of other vport.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:45:03 -08:00
Parav Pandit
5bef709d76 net/mlx5: Enable host PF HCA after eswitch is initialized
Currently ECPF enables external host PF too early in the initialization
sequence for Ethernet links when ECPF is eswitch manager.

Due to this, when external host PF driver is loaded, host PF's HCA CAP has
inner_ip_version supported by NIC RX flow table.
This capability is later updated by firmware after ECPF driver enables
ENCAP/DECAP as eswitch manager.

This results into a timing race condition, where CREATE_TIR command
fails with a below syndrome on host PF.

mlx5_cmd_check:775:(pid 510): CREATE_TIR(0x900) op_mod(0x0) failed,
status bad parameter(0x3), syndrome (0x562b00)

Hence, enable the external host PF after necessary eswitch and per vport
initialization is completed.
Continue to enable host PF when eswitch manager capability is off for a
ECPF.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Bodong Wang <bodong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:45:03 -08:00
Parav Pandit
8a90f2fc67 net/mlx5: Rename peer_pf to host_pf
To match the hardware spec, rename peer_pf to host_pf.

Signed-off-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Bodong Wang <bodong@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:45:03 -08:00
Eli Cohen
8d2a9d8d64 net/mlx5: Export steering related functions
Export
 mlx5_create_flow_table()
 mlx5_create_flow_group()
 mlx5_destroy_flow_group().

These symbols are required by the VDPA implementation to create rules
that consume VDPA specific traffic.

We do not deal with putting the prototypes in a header file since they
already exist in include/linux/mlx5/fs.h.

Signed-off-by: Eli Cohen <elic@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:45:02 -08:00
Meir Lichtinger
dd8595eabe net/mlx5: Update the list of the PCI supported devices
Add the upcoming BlueField-3 device ID.

Signed-off-by: Meir Lichtinger <meirl@nvidia.com>
Reviewed-by: Eran Ben Elisha <eranbe@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:43:48 -08:00
Parav Pandit
e5dfe6b57e net/mlx5: Avoid exposing driver internal command helpers
mlx5 command init and cleanup routines are internal to mlx5_core driver.
Hence, avoid exporting them and move their definition to mlx5_core
driver's internal file mlx5_core.h

Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:43:48 -08:00
Muhammad Sammar
7da3ad6c26 net/mlx5: Add misc4 to mlx5_ifc_fte_match_param_bits
Add misc4 match params to enable matching on prog_sample_fields.

Signed-off-by: Muhammad Sammar <muhammads@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:43:47 -08:00
Muhammad Sammar
699d531f55 net/mlx5: Check dr mask size against mlx5_match_param size
This is to allow passing misc4 match param from userspace when
function like ib_flow_matcher_create is called.

Signed-off-by: Muhammad Sammar <muhammads@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:43:47 -08:00
Chris Mi
3873063088 net/mlx5: Add sampler destination type
The flow sampler object is a new destination type. Add a new member
for the flow destination.

Signed-off-by: Chris Mi <cmi@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2020-11-26 18:43:47 -08:00
Jakub Kicinski
594e31bceb Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
40GbE Intel Wired LAN Driver Updates 2020-11-24

This series contains updates to i40e and igbvf drivers.

Marek removes a redundant assignment for i40e.

Stefan Assmann corrects reporting of VF link speed for i40e.

Karen revises a couple of error messages to warnings for igbvf as they
could be misinterpreted as issues when they are not.

v2: Dropped PTP patch as it's being updated.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  igbvf: Refactor traces
  i40e: report correct VF link speed when link state is set to enable
  i40e: remove redundant assignment
====================

Link: https://lore.kernel.org/r/20201124165245.2844118-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 18:11:25 -08:00
Rohit Maheshwari
cbf3d60329 ch_ktls: lock is not freed
Currently lock gets freed only if timeout expires, but missed a
case when HW returns failure and goes for cleanup.

Fixes: efca3878a5fb ("ch_ktls: Issue if connection offload fails")
Signed-off-by: Rohit Maheshwari <rohitm@chelsio.com>
Link: https://lore.kernel.org/r/20201125072626.10861-1-rohitm@chelsio.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 17:44:42 -08:00
Vladimir Oltean
90cf87d16b enetc: Let the hardware auto-advance the taprio base-time of 0
The tc-taprio base time indicates the beginning of the tc-taprio
schedule, which is cyclic by definition (where the length of the cycle
in nanoseconds is called the cycle time). The base time is a 64-bit PTP
time in the TAI domain.

Logically, the base-time should be a future time. But that imposes some
restrictions to user space, which has to retrieve the current PTP time
from the NIC first, then calculate a base time that will still be larger
than the base time by the time the kernel driver programs this value
into the hardware. Actually ensuring that the programmed base time is in
the future is still a problem even if the kernel alone deals with this.

Luckily, the enetc hardware already advances a base-time that is in the
past into a congruent time in the immediate future, according to the
same formula that can be found in the software implementation of taprio
(in taprio_get_start_time):

	/* Schedule the start time for the beginning of the next
	 * cycle.
	 */
	n = div64_s64(ktime_sub_ns(now, base), cycle);
	*start = ktime_add_ns(base, (n + 1) * cycle);

There's only one problem: the driver doesn't let the hardware do that.
It interferes with the base-time passed from user space, by special-casing
the situation when the base-time is zero, and replaces that with the
current PTP time. This changes the intended effective base-time of the
schedule, which will in the end have a different phase offset than if
the base-time of 0.000000000 was to be advanced by an integer multiple
of the cycle-time.

Fixes: 34c6adf1977b ("enetc: Configure the Time-Aware Scheduler via tc-taprio offload")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20201124220259.3027991-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 12:36:27 -08:00
Christian Eggers
37e9d0559a mlxsw: spectrum_ptp: use PTP wide message type definitions
Use recently introduced PTP wide defines instead of a driver internal
enumeration.

Signed-off-by: Christian Eggers <ceggers@arri.de>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Cc: Petr Machata <petrm@mellanox.com>
Cc: Jiri Pirko <jiri@nvidia.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 12:23:14 -08:00
Antonio Borneo
12a8fe56c0 net: stmmac: fix incorrect merge of patch upstream
Commit 757926247836 ("net: stmmac: add flexible PPS to dwmac
4.10a") was intended to modify the struct dwmac410_ops, but it got
somehow badly merged and modified the struct dwmac4_ops.

Revert the modification in struct dwmac4_ops and re-apply it
properly in struct dwmac410_ops.

Fixes: 757926247836 ("net: stmmac: add flexible PPS to dwmac 4.10a")
Signed-off-by: Antonio Borneo <antonio.borneo@st.com>
Reported-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Link: https://lore.kernel.org/r/20201124223729.886992-1-antonio.borneo@st.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-25 11:23:40 -08:00
Lijun Pan
3ada288150 ibmvnic: enhance resetting status check during module exit
Based on the discussion with Sukadev Bhattiprolu and Dany Madden,
we believe that checking adapter->resetting bit is preferred
since RESETTING state flag is not as strict as resetting bit.
RESETTING state flag is removed since it is verbose now.

Fixes: 7d7195a026ba ("ibmvnic: Do not process device remove during device reset")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 16:28:34 -08:00
Lijun Pan
0e435befae ibmvnic: fix NULL pointer dereference in ibmvic_reset_crq
crq->msgs could be NULL if the previous reset did not complete after
freeing crq->msgs. Check for NULL before dereferencing them.

Snippet of call trace:
...
ibmvnic 30000003 env3 (unregistering): Releasing sub-CRQ
ibmvnic 30000003 env3 (unregistering): Releasing CRQ
BUG: Kernel NULL pointer dereference on read at 0x00000000
Faulting instruction address: 0xc0000000000c1a30
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: ibmvnic(E-) rpadlpar_io rpaphp xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables xsk_diag tcp_diag udp_diag tun raw_diag inet_diag unix_diag bridge af_packet_diag netlink_diag stp llc rfkill sunrpc pseries_rng xts vmx_crypto uio_pdrv_genirq uio binfmt_misc ip_tables xfs libcrc32c sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ibmvnic]
CPU: 20 PID: 8426 Comm: kworker/20:0 Tainted: G            E     5.10.0-rc1+ #12
Workqueue: events __ibmvnic_reset [ibmvnic]
NIP:  c0000000000c1a30 LR: c008000001b00c18 CTR: 0000000000000400
REGS: c00000000d05b7a0 TRAP: 0380   Tainted: G            E      (5.10.0-rc1+)
MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 44002480  XER: 20040000
CFAR: c0000000000c19ec IRQMASK: 0
GPR00: 0000000000000400 c00000000d05ba30 c008000001b17c00 0000000000000000
GPR04: 0000000000000000 0000000000000000 0000000000000000 00000000000001e2
GPR08: 000000000001f400 ffffffffffffd950 0000000000000000 c008000001b0b280
GPR12: c0000000000c19c8 c00000001ec72e00 c00000000019a778 c00000002647b440
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000006 0000000000000001 0000000000000003 0000000000000002
GPR24: 0000000000001000 c008000001b0d570 0000000000000005 c00000007ab5d550
GPR28: c00000007ab5c000 c000000032fcf848 c00000007ab5cc00 c000000032fcf800
NIP [c0000000000c1a30] memset+0x68/0x104
LR [c008000001b00c18] ibmvnic_reset_crq+0x70/0x110 [ibmvnic]
Call Trace:
[c00000000d05ba30] [0000000000000800] 0x800 (unreliable)
[c00000000d05bab0] [c008000001b0a930] do_reset.isra.40+0x224/0x634 [ibmvnic]
[c00000000d05bb80] [c008000001b08574] __ibmvnic_reset+0x17c/0x3c0 [ibmvnic]
[c00000000d05bc50] [c00000000018d9ac] process_one_work+0x2cc/0x800
[c00000000d05bd20] [c00000000018df58] worker_thread+0x78/0x520
[c00000000d05bdb0] [c00000000019a934] kthread+0x1c4/0x1d0
[c00000000d05be20] [c00000000000d5d0] ret_from_kernel_thread+0x5c/0x6c

Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 16:28:34 -08:00
Lijun Pan
a0faaa27c7 ibmvnic: fix NULL pointer dereference in reset_sub_crq_queues
adapter->tx_scrq and adapter->rx_scrq could be NULL if the previous reset
did not complete after freeing sub crqs. Check for NULL before
dereferencing them.

Snippet of call trace:
ibmvnic 30000006 env6: Releasing sub-CRQ
ibmvnic 30000006 env6: Releasing CRQ
...
ibmvnic 30000006 env6: Got Control IP offload Response
ibmvnic 30000006 env6: Re-setting tx_scrq[0]
BUG: Kernel NULL pointer dereference on read at 0x00000000
Faulting instruction address: 0xc008000003dea7cc
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: rpadlpar_io rpaphp xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables xsk_diag tcp_diag udp_diag raw_diag inet_diag unix_diag af_packet_diag netlink_diag tun bridge stp llc rfkill sunrpc pseries_rng xts vmx_crypto uio_pdrv_genirq uio binfmt_misc ip_tables xfs libcrc32c sd_mod t10_pi sg ibmvscsi ibmvnic ibmveth scsi_transport_srp dm_mirror dm_region_hash dm_log dm_mod
CPU: 80 PID: 1856 Comm: kworker/80:2 Tainted: G        W         5.8.0+ #4
Workqueue: events __ibmvnic_reset [ibmvnic]
NIP:  c008000003dea7cc LR: c008000003dea7bc CTR: 0000000000000000
REGS: c0000007ef7db860 TRAP: 0380   Tainted: G        W          (5.8.0+)
MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 28002422  XER: 0000000d
CFAR: c000000000bd9520 IRQMASK: 0
GPR00: c008000003dea7bc c0000007ef7dbaf0 c008000003df7400 c0000007fa26ec00
GPR04: c0000007fcd0d008 c0000007fcd96350 0000000000000027 c0000007fcd0d010
GPR08: 0000000000000023 0000000000000000 0000000000000000 0000000000000000
GPR12: 0000000000002000 c00000001ec18e00 c0000000001982f8 c0000007bad6e840
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 fffffffffffffef7
GPR24: 0000000000000402 c0000007fa26f3a8 0000000000000003 c00000016f8ec048
GPR28: 0000000000000000 0000000000000000 0000000000000000 c0000007fa26ec00
NIP [c008000003dea7cc] ibmvnic_reset_init+0x15c/0x258 [ibmvnic]
LR [c008000003dea7bc] ibmvnic_reset_init+0x14c/0x258 [ibmvnic]
Call Trace:
[c0000007ef7dbaf0] [c008000003dea7bc] ibmvnic_reset_init+0x14c/0x258 [ibmvnic] (unreliable)
[c0000007ef7dbb80] [c008000003de8860] __ibmvnic_reset+0x408/0x970 [ibmvnic]
[c0000007ef7dbc50] [c00000000018b7cc] process_one_work+0x2cc/0x800
[c0000007ef7dbd20] [c00000000018bd78] worker_thread+0x78/0x520
[c0000007ef7dbdb0] [c0000000001984c4] kthread+0x1d4/0x1e0
[c0000007ef7dbe20] [c00000000000cea8] ret_from_kernel_thread+0x5c/0x74

Fixes: 57a49436f4e8 ("ibmvnic: Reset sub-crqs during driver reset")
Signed-off-by: Lijun Pan <ljp@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 16:28:34 -08:00
Sven Van Asbroeck
470dfd808a lan743x: replace polling loop by wait_event_timeout()
The driver's ISR sends a 'software interrupt' event to the probe()
thread using the following method:
- probe(): write 0 to flag, enable s/w interrupt
- probe(): poll on flag, relax using usleep_range()
- ISR    : write 1 to flag

Replace with wake_up() / wait_event_timeout(). Besides being easier
to get right, this abstraction has better timing and memory
consistency properties.

Tested-by: Sven Van Asbroeck <thesven73@gmail.com> # lan7430
Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Link: https://lore.kernel.org/r/20201123191529.14908-2-TheSven73@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 16:14:04 -08:00
Sven Van Asbroeck
c31799bae8 lan743x: clean up software_isr function
For no apparent reason, this function reads the INT_STS register, and
checks if the software interrupt bit is set. These things have already
been carried out by this function's only caller.

Clean up by removing the redundant code.

Tested-by: Sven Van Asbroeck <thesven73@gmail.com> # lan7430
Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
Link: https://lore.kernel.org/r/20201123191529.14908-1-TheSven73@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 16:14:04 -08:00
Shay Agroskin
1396d3148b net: ena: fix packet's addresses for rx_offset feature
This patch fixes two lines in which the rx_offset received by the device
wasn't taken into account:

- prefetch function:
	In our driver the copied data would reside in
	rx_info->page + rx_headroom + rx_offset

	so the prefetch function is changed accordingly.

- setting page_offset to zero for descriptors > 1:
	for every descriptor but the first, the rx_offset is zero. Hence
	the page_offset value should be set to rx_headroom.

	The previous implementation changed the value of rx_info after
	the descriptor was added to the SKB (essentially providing wrong
	page offset).

Fixes: 68f236df93a9 ("net: ena: add support for the rx offset feature")
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 16:07:13 -08:00
Shay Agroskin
09323b3bca net: ena: set initial DMA width to avoid intel iommu issue
The ENA driver uses the readless mechanism, which uses DMA, to find
out what the DMA mask is supposed to be.

If DMA is used without setting the dma_mask first, it causes the
Intel IOMMU driver to think that ENA is a 32-bit device and therefore
disables IOMMU passthrough permanently.

This patch sets the dma_mask to be ENA_MAX_PHYS_ADDR_SIZE_BITS=48
before readless initialization in
ena_device_init()->ena_com_mmio_reg_read_request_init(),
which is large enough to workaround the intel_iommu issue.

DMA mask is set again to the correct value after it's received from the
device after readless is initialized.

The patch also changes the driver to use dma_set_mask_and_coherent()
function instead of the two pci_set_dma_mask() and
pci_set_consistent_dma_mask() ones. Both methods achieve the same
effect.

Fixes: 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Mike Cui <mikecui@amazon.com>
Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 16:07:13 -08:00
Shay Agroskin
5b7022cf1d net: ena: handle bad request id in ena_netdev
After request id is checked in validate_rx_req_id() its value is still
used in the line
	rx_ring->free_ids[next_to_clean] =
					rx_ring->ena_bufs[i].req_id;
even if it was found to be out-of-bound for the array free_ids.

The patch moves the request id to an earlier stage in the napi routine and
makes sure its value isn't used if it's found out-of-bounds.

Fixes: 30623e1ed116 ("net: ena: avoid memory access violation by validating req_id properly")
Signed-off-by: Ido Segev <idose@amazon.com>
Signed-off-by: Shay Agroskin <shayagr@amazon.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 16:07:13 -08:00
Lorenzo Bianconi
039fbc47f9 net: mvneta: alloc skb_shared_info on the mvneta_rx_swbm stack
Build skb_shared_info on mvneta_rx_swbm stack and sync it to xdp_buff
skb_shared_info area only on the last fragment. Leftover cache miss in
mvneta_swbm_rx_frame will be addressed introducing mb bit in
xdp_buff/xdp_frame struct

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 15:09:44 -08:00
Lorenzo Bianconi
eb33f11864 net: mvneta: move skb_shared_info in mvneta_xdp_put_buff caller
Pass skb_shared_info pointer from mvneta_xdp_put_buff caller. This is a
preliminary patch to reduce accesses to skb_shared_info area and reduce
cache misses.
Remove napi parameter in mvneta_xdp_put_buff signature since it is always
run in NAPI context

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 15:09:41 -08:00
Lorenzo Bianconi
05c748f7d0 net: mvneta: avoid unnecessary xdp_buff initialization
Explicitly initialize mandatory fields defining xdp_buff struct
in mvneta_rx_swbm routine

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 15:09:29 -08:00
Stefan Chulski
9a71baf719 net: mvpp2: divide fifo for dts-active ports only
Tx/Rx FIFO is a HW resource limited by total size, but shared
by all ports of same CP110 and impacting port-performance.
Do not divide the FIFO for ports which are not enabled in DTS,
so active ports could have more FIFO.
No change in FIFO allocation if all 3 ports on the communication
processor enabled in DTS.

The active port mapping should be done in probe before FIFO-init.

Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/1606154073-28267-1-git-send-email-stefanc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 15:05:04 -08:00
Ezequiel Garcia
078eb55cdf dpaa2-eth: Fix compile error due to missing devlink support
The dpaa2 driver depends on devlink, so it should select
NET_DEVLINK in order to fix compile errors, such as:

drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.o: in function `dpaa2_eth_rx_err':
dpaa2-eth.c:(.text+0x3cec): undefined reference to `devlink_trap_report'
drivers/net/ethernet/freescale/dpaa2/dpaa2-eth-devlink.o: in function `dpaa2_eth_dl_info_get':
dpaa2-eth-devlink.c:(.text+0x160): undefined reference to `devlink_info_driver_name_put'

Fixes: ceeb03ad8e22 ("dpaa2-eth: add basic devlink support")
Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Link: https://lore.kernel.org/r/20201123163553.1666476-1-ciorneiioana@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 14:57:55 -08:00
Colin Ian King
be419fcacf net: hns3: fix spelling mistake "memroy" -> "memory"
There are spelling mistakes in two dev_err messages. Fix them.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20201123103452.197708-1-colin.king@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 13:20:13 -08:00
Ido Schimmel
37b50e556e mlxsw: spectrum_trap: Add blackhole_nexthop trap
Register with devlink the blackhole_nexthop trap so that mlxsw will be
able to report packets dropped due to a blackhole nexthop.

The internal trap identifier is "DISCARD_ROUTER3", which traps packets
dropped in the adjacency table.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 12:14:56 -08:00
Ido Schimmel
68e92ad855 mlxsw: spectrum_router: Add support for blackhole nexthops
Add support for blackhole nexthops by programming them to the adjacency
table with a discard action.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 12:14:56 -08:00
Ido Schimmel
18c4b79d28 mlxsw: spectrum_router: Resolve RIF from nexthop struct instead of neighbour
The two are the same, but for blackhole nexthops we will not have an
associated neighbour struct, so resolve the RIF from the nexthop struct
itself instead.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 12:14:56 -08:00
Ido Schimmel
919f6aaa3a mlxsw: spectrum_router: Use loopback RIF for unresolved nexthops
Now that the driver creates a loopback RIF during its initialization, it
can be used to program the adjacency entries for unresolved nexthops
instead of other RIFs. The loopback RIF is guaranteed to exist for the
entire life time of the driver, unlike other RIFs that come and go.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 12:14:55 -08:00
Ido Schimmel
52d45575ec mlxsw: spectrum_router: Use different trap identifier for unresolved nexthops
Unresolved nexthops are currently written to the adjacency table with a
discard action. Packets hitting such entries are trapped to the CPU via
the 'DISCARD_ROUTER3' trap which can be enabled or disabled on demand,
but is always enabled in order to ensure the kernel can resolve the
unresolved neighbours.

This trap will be needed for blackhole nexthops support. Therefore, move
unresolved nexthops to explicitly program the adjacency entries with a
trap action and a different trap identifier, 'RTR_EGRESS0'.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 12:14:55 -08:00
Ido Schimmel
07c78536ef mlxsw: spectrum_router: Create loopback RIF during initialization
Up until now RIFs (router interfaces) were created on demand (e.g.,
when an IP address was added to a netdev). However, sometimes the device
needs to be provided with a RIF when one might not be available.

For example, adjacency entries that drop packets need to be programmed
with an egress RIF despite the RIF not being used to forward packets.

Create such a RIF during initialization so that it could be used later
on to support blackhole nexthops.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 12:14:55 -08:00
Lincoln Ramsay
9bd2702d29 aquantia: Remove the build_skb path
When performing IPv6 forwarding, there is an expectation that SKBs
will have some headroom. When forwarding a packet from the aquantia
driver, this does not always happen, triggering a kernel warning.

aq_ring.c has this code (edited slightly for brevity):

if (buff->is_eop && buff->len <= AQ_CFG_RX_FRAME_MAX - AQ_SKB_ALIGN) {
    skb = build_skb(aq_buf_vaddr(&buff->rxdata), AQ_CFG_RX_FRAME_MAX);
} else {
    skb = napi_alloc_skb(napi, AQ_CFG_RX_HDR_SIZE);

There is a significant difference between the SKB produced by these
2 code paths. When napi_alloc_skb creates an SKB, there is a certain
amount of headroom reserved. However, this is not done in the
build_skb codepath.

As the hardware buffer that build_skb is built around does not
handle the presence of the SKB header, this code path is being
removed and the napi_alloc_skb path will always be used. This code
path does have to copy the packet header into the SKB, but it adds
the packet data as a frag.

Fixes: 018423e90bee ("net: ethernet: aquantia: Add ring support code")
Signed-off-by: Lincoln Ramsay <lincoln.ramsay@opengear.com>
Link: https://lore.kernel.org/r/MWHPR1001MB23184F3EAFA413E0D1910EC9E8FC0@MWHPR1001MB2318.namprd10.prod.outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 10:59:17 -08:00
Vincent Whitchurch
d5a05e69ac net: stmmac: Use hrtimer for TX coalescing
This driver uses a normal timer for TX coalescing, which means that the
with the default tx-usecs of 1000 microseconds the cleanups actually
happen 10 ms or more later with HZ=100.  This leads to very low
througput with TCP when bridged to a slow link such as a 4G modem.  Fix
this by using an hrtimer instead.

On my ARM platform with HZ=100 and the default TX coalescing settings
(tx-frames 25 tx-usecs 1000), with "tc qdisc add dev eth0 root netem
delay 60ms 40ms rate 50Mbit" run on the server, netperf's TCP_STREAM
improves from ~5.5 Mbps to ~100 Mbps.

Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
Link: https://lore.kernel.org/r/20201120150208.6838-1-vincent.whitchurch@axis.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24 08:47:44 -08:00
Karen Sornek
24453a8428 igbvf: Refactor traces
Refactoring "PF still resetting" and changing "Failed
 to add vlan id" to "Vlan id is not added"
messages because previous version looked like a bug
- it informed about changes that worked as
designed but might confuse users

Signed-off-by: Karen Sornek <karen.sornek@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-11-24 08:02:24 -08:00
Stefan Assmann
6ec12e1e94 i40e: report correct VF link speed when link state is set to enable
When the virtual link state was set to "enable" ethtool would report
link speed as 40000Mb/s regardless of the underlying device.
Report the correct link speed.

Example from a XXV710 NIC.
Before:
$ ip link set ens3f0 vf 0 state auto
$  ethtool enp8s2 | grep Speed
        Speed: 25000Mb/s
$ ip link set ens3f0 vf 0 state enable
$ ethtool enp8s2 | grep Speed
        Speed: 40000Mb/s
After:
$ ip link set ens3f0 vf 0 state auto
$  ethtool enp8s2 | grep Speed
        Speed: 25000Mb/s
$ ip link set ens3f0 vf 0 state enable
$ ethtool enp8s2 | grep Speed
        Speed: 25000Mb/s

Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-11-24 08:02:24 -08:00
Marek Majtyka
088d5360d0 i40e: remove redundant assignment
Remove a redundant assignment of the software ring pointer in the i40e
driver. The variable is assigned twice with no use in between, so just
get rid of the first occurrence.

Fixes: 3b4f0b66c2b3 ("i40e, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL")
Signed-off-by: Marek Majtyka <marekx.majtyka@intel.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-11-24 08:02:24 -08:00
Peter Zijlstra
545b8c8df4 smp: Cleanup smp_call_function*()
Get rid of the __call_single_node union and cleanup the API a little
to avoid external code relying on the structure layout as much.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
2020-11-24 16:47:49 +01:00
Jakub Kicinski
cc69837fca net: don't include ethtool.h from netdevice.h
linux/netdevice.h is included in very many places, touching any
of its dependecies causes large incremental builds.

Drop the linux/ethtool.h include, linux/netdevice.h just needs
a forward declaration of struct ethtool_ops.

Fix all the places which made use of this implicit include.

Acked-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Shannon Nelson <snelson@pensando.io>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/r/20201120225052.1427503-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-23 17:27:04 -08:00
Christophe JAILLET
7fd6372e27 net: pch_gbe: Use 'dma_free_coherent()' to undo 'dma_alloc_coherent()'
Memory allocation are done with 'dma_alloc_coherent()'. Be consistent
and use 'dma_free_coherent()' to free the corresponding memory.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/20201121090330.1332543-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-23 17:01:48 -08:00
Christophe JAILLET
8ff39301ef net: pch_gbe: Use dma_set_mask_and_coherent to simplify code
'pci_set_dma_mask()' + 'pci_set_consistent_dma_mask()' can be replaced by
an equivalent 'dma_set_mask_and_coherent()' which is much less verbose.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Link: https://lore.kernel.org/r/20201121090302.1332491-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-23 17:01:48 -08:00