67740 Commits

Author SHA1 Message Date
Vivien Didelot
f1394b78a6 net: dsa: mv88e6xxx: add VTU GetNext operation
Add a new vtu_getnext operation to the chip info structure to differ the
various implementations of the VTU accesses.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 15:03:12 -04:00
Vivien Didelot
021e64ff76 net: dsa: mv88e6xxx: load STU entry with VTU entry
Now that the code writes both VTU and STU data when loading a VTU entry,
load the corresponding STU entry at the same time.

This allows us to get rid of the STU management in the
_mv88e6xxx_vtu_new helper and thus remove the separate implementations
of STU Load/Purge and STU GetNext, as well as the unused family checks.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 15:03:11 -04:00
Vivien Didelot
ef6fcea37f net: dsa: mv88e6xxx: get STU entry on VTU GetNext
Now that the code reads both VTU and STU data on VTU GetNext operation,
fetch the STU entry data of a VTU entry at the same time.

The STU data bits are masked with the VTU data bits and they are now all
read at the same time a VTU GetNext operation is issued.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 15:03:11 -04:00
Vivien Didelot
66a8e1f933 net: dsa: mv88e6xxx: move STU GetNext operation
Extract the generic portion of code to issue an STU GetNext operation,
which will be used in other implementations.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 15:03:11 -04:00
Vivien Didelot
c499a64f34 net: dsa: mv88e6xxx: move VTU Data accessors
The code to access the VTU Data registers currently only supports the
88E6185 family and alike: 2-bit membership adjacent to 2-bit port state.

Even though the 88E6352 family introduced an indirect table to program
the VLAN Spanning Tree states, the usage of the VTU Data registers
remains the same regardless the VTU or STU operation.

Now that the mv88e6xxx_vtu_entry structure contains both port membership
and states data, factorize the code to access them in global1_vtu.c.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 15:03:11 -04:00
Vivien Didelot
f169e5ee5f net: dsa: mv88e6xxx: move generic VTU GetNext
Even though every switch model has a different way to access the VTU
Data bits, the base implementation of the VTU GetNext operation remains
the same: wait, write the first VID to iterate from, start the
operation, and read the next VID.

Move this generic implementation into global1_vtu.c and abstract the
handling of the start VID (similarly to the ATU GetNext implementation),
before introducing a new chip operation for specific chips.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 15:03:10 -04:00
Vivien Didelot
3afb4bde6f net: dsa: mv88e6xxx: move VTU VID accessors
Add helpers to access the VTU VID register in the global1_vtu.c file.

At the same time, move mv88e6xxx_g1_vtu_vid_write at the beginning of
_mv88e6xxx_vtu_loadpurge, which adds no functional changes but makes
future patches simpler.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 15:03:10 -04:00
Vivien Didelot
d2ca1ea18d net: dsa: mv88e6xxx: move VTU SID accessors
Add helpers to access the VTU SID register in the global1_vtu.c file.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 15:03:10 -04:00
Vivien Didelot
8ee51f6b4f net: dsa: mv88e6xxx: move VTU FID accessors
Add helpers to access the VTU FID register in the global1_vtu.c file.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 15:03:10 -04:00
Vivien Didelot
b486d7c95c net: dsa: mv88e6xxx: move VTU flush
Move the VTU flush operation to global1_vtu.c and call it from a
mv88e6xxx_vtu_setup helper, similarly to the ATU and PVT setup.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 15:03:09 -04:00
Vivien Didelot
332aa5ccc8 net: dsa: mv88e6xxx: move VTU Operation accessors
Move the helper functions to access the Global 1 VTU Operation register
to a new global1_vtu.c file, and get rid of the old underscore prefix
naming convention. This file will be extended will all VTU/STU related
code.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 15:03:09 -04:00
Vivien Didelot
bd00e053ae net: dsa: mv88e6xxx: split VTU entry data member
VLAN aware Marvell chips can program 802.1Q VLAN membership as well as
802.1s per VLAN Spanning Tree state using the same 3 VTU Data registers.

Some chips such as 88E6185 use different Data registers offsets for
ports state and membership, and program them in a single operation.

Other chips such as 88E6352 use the same register layout but program
them in distinct operations (an indirect table is used for 802.1s.)

Newer chips such as 88E6390 use the same offsets for both state and
membership in distinct operations, thus require multiple data accesses.

To correctly abstract this, split the "data" structure member of
mv88e6xxx_vtu_entry in two "state" and "member" members, before adding
VTU support for newer chips.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 15:03:09 -04:00
Vivien Didelot
3cf3c8469f net: dsa: mv88e6xxx: add max VID to info
Some chips don't have a VLAN Table Unit, most of them do have a 4K
table, some others as the 88E6390 family has a 13th bit for the VID.

Add a new max_vid member to the info structure, used to check the
presence of a VTU as well as the value used to iterate from in VTU
GetNext operations.

This makes the MV88E6XXX_FLAG_VTU obsolete, thus remove it.

Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 15:03:09 -04:00
Ido Schimmel
b1e455260c mlxsw: spectrum_router: Simplify VRF enslavement
When a netdev is enslaved to a VRF master, its router interface (RIF)
needs to be destroyed (if exists) and a new one created using the
corresponding virtual router (VR).

>From the driver's perspective, the above is equivalent to an inetaddr
event sent for this netdev. Therefore, when a port netdev (or its
uppers) are enslaved to a VRF master, call the same function that
would've been called had a NETDEV_UP was sent for this netdev in the
inetaddr notification chain.

This patch also fixes a bug when a LAG netdev with an existing RIF is
enslaved to a VRF. Before this patch, each LAG port would drop the
reference on the RIF, but would re-join the same one (in the wrong VR)
soon after. With this patch, the corresponding RIF is first destroyed
and a new one is created using the correct VR.

Fixes: 7179eb5acd59 ("mlxsw: spectrum_router: Add support for VRFs")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 11:47:58 -04:00
David S. Miller
cedf90c0cc mlx5-updates-2017-04-30
Or says:
 ================
 mlx5 neigh update
 
 This series (whose code name is 'neigh update') from Hadar, enhances the
 mlx5 TC IP tunnel offloads to deal with changes to tunnel destination
 neighbours used in offloaded flows which involved encapsulation.
 
 In order to keep track on the validity state of such neighbours, we register
 a netevent notifier callback and act on NEIGH_UPDATE events: if a neighbour
 becomes valid, offload the related flows to HW (the other way around when
 neigh becomes invalid) and similarly when a neigh mac addresses changes.
 
 Since this traffic is offloaded from the host OS, the neighbour for the IP
 tunnel destination can mistakenly become STALE and deleted by the kernel
 since its 'used' value wasn't changed. To address that, we proactively
 update the neighbour 'used' value every DELAY_PROBE_TIME seconds, using
 time stamps generated by the existing driver code for HW flow counters.
 We use the DELAY_PROBE_TIME_UPDATE event to adjust the frequency of the updates.
 
 Prior to the core of the series, there's a patch from Saeed that introduces an
 extendable vport representor implementation scheme. It provides a separation
 between the eswitch to the netdev related aspects of the representors.
 
 We would like to thank Ido Schimmel and Ilya Lesokhin for their coaching && advice
 through the long design and review cycles while we struggled to understand and
 (hopefully correctly) implement the locking around the different driver flows(..) .
 
 - Or.
 =================
 
 Misc Updates:
 
 From Tariq:
 Some small performance and trivial code optimization for mlx5 netdev driver
 - Optimize poll ICOSQ completion queue
 - Use prefetchw when a write is to follow
 - Use u8 as ownership type in mlx5e_get_cqe()
 
 From Eran:
 - Disable LRO by default on specific setups
 
 From Eli:
 - Small cleanup for E-Switch to avoid redundant allocation
 
 Thanks,
 Saeed.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJZBeGlAAoJEEg/ir3gV/o+AuMH/2OfS+diXrTMU90tQGfucurg
 cPuv2qSGzSzcyFOjMUKL/hgj+y8HcUNZVSqts/LPhUj5CbR8On2jo/3XZb405SMX
 1eAffpD5JxDGm4xQCADwkDMRVOuWNbXlTceRMbD+8vDtboTDEZwbaXX20En6MIEW
 qcQPX0wv7/u8VpxGlhch2RRvl7+zBhhGCLNpwzkE5K+RggIPguE+F2hLpm+SH9PD
 AOHSuwGWnE40In8dZIPqsEnUsmsC9LmUQ17ip8S8dEtGDHMvwDelsKGZcRGMuwoz
 wSxpVTWn5JaU0EwN0t+rYtERFH18SqboG3OFwqb27qtRns6ALXIx3Dj1bb9sVBo=
 =3qOg
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2017-04-30' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

mlx5-updates-2017-04-30

Or says:
================
mlx5 neigh update

This series (whose code name is 'neigh update') from Hadar, enhances the
mlx5 TC IP tunnel offloads to deal with changes to tunnel destination
neighbours used in offloaded flows which involved encapsulation.

In order to keep track on the validity state of such neighbours, we register
a netevent notifier callback and act on NEIGH_UPDATE events: if a neighbour
becomes valid, offload the related flows to HW (the other way around when
neigh becomes invalid) and similarly when a neigh mac addresses changes.

Since this traffic is offloaded from the host OS, the neighbour for the IP
tunnel destination can mistakenly become STALE and deleted by the kernel
since its 'used' value wasn't changed. To address that, we proactively
update the neighbour 'used' value every DELAY_PROBE_TIME seconds, using
time stamps generated by the existing driver code for HW flow counters.
We use the DELAY_PROBE_TIME_UPDATE event to adjust the frequency of the updates.

Prior to the core of the series, there's a patch from Saeed that introduces an
extendable vport representor implementation scheme. It provides a separation
between the eswitch to the netdev related aspects of the representors.

We would like to thank Ido Schimmel and Ilya Lesokhin for their coaching && advice
through the long design and review cycles while we struggled to understand and
(hopefully correctly) implement the locking around the different driver flows(..) .

- Or.
=================

Misc Updates:

From Tariq:
Some small performance and trivial code optimization for mlx5 netdev driver
- Optimize poll ICOSQ completion queue
- Use prefetchw when a write is to follow
- Use u8 as ownership type in mlx5e_get_cqe()

From Eran:
- Disable LRO by default on specific setups

From Eli:
- Small cleanup for E-Switch to avoid redundant allocation

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 11:47:10 -04:00
Mintz, Yuval
07ff2ed03b qed: Prevent warning without CONFIG_RFS_ACCEL
After removing the PTP related initialization from slowpath start,
the remaining PTT entry is required only in case CONFIG_RFS_ACCEL is set.
Otherwise, it leads to a warning due to it being unused.

Fixes: d179bd1699fc ("qed: Acquire/release ptt_ptp lock when enabling/disabling PTP")
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 11:42:51 -04:00
Ram Amrani
20b1bd96e9 qed: output the DPM status and WID count
Output to the RDMA driver whether DPM mode is enabled or disabled in
the HW and if so what is the number of WIDs it supports

Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 11:42:15 -04:00
Ram Amrani
107392b75f qed: align DPI configuration to HW requirements
When calculating doorbell BAR partitioning round up the number of
CPUs to the nearest power of 2 so the size of the DPI (per user
section) configured in the hardware will be stored properly and
not truncated.

Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 11:42:15 -04:00
Ram Amrani
e015d58b44 qed: verify RoCE resource bitmaps are released
Add mechanism to verify RoCE resources are released prior to freeing the
bitmaps. If this is not the case, print what resources were not released.

Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 11:42:14 -04:00
Ram Amrani
105361943d qed: add error handling flow to TID deregistratin posting failure
If the posting of the ramrod for the purpose of TID deregistration
fails, abort the deregistration operation without using the FW's
return code.

Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 11:42:14 -04:00
Ram Amrani
ba0154e964 qed: remove unused SQ error state
The internal RoCE SQE QP state isn't being used. Instead we mark the
QP as in regular error state.

Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 11:42:13 -04:00
Ram Amrani
793ea8a9c7 qed: configure the RoCE max message size
Signed-off-by: Ram Amrani <Ram.Amrani@cavium.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 11:42:13 -04:00
Karim Eshapa
2faf265753 benet: Use time_before_eq for time comparison
Use time_before_eq for time comparison more safe and dealing
with timer wrapping to be future-proof.

Signed-off-by: Karim Eshapa <karim.eshapa@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 11:12:46 -04:00
Jakub Kicinski
9861ce039c virtio_net: make use of extended ack message reporting
Try to carry error messages to the user via the netlink extended
ack message attribute.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 10:35:48 -04:00
Jakub Kicinski
d957c0f711 nfp: make use of extended ack message reporting
Try to carry error messages to the user via the netlink extended
ack message attribute.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-01 10:35:47 -04:00
Abhishek Shah
733336262b net: phy: Allow BCM5481x PHYs to setup internal TX/RX clock delay
This patch allows users to enable/disable internal TX and/or RX
clock delay for BCM5481x series PHYs so as to satisfy RGMII timing
specifications.

On a particular platform, whether TX and/or RX clock delay is required
depends on how PHY connected to the MAC IP. This requirement can be
specified through "phy-mode" property in the platform device tree.

Signed-off-by: Abhishek Shah <abhishek.shah@broadcom.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30 22:56:19 -04:00
Colin Ian King
d832565032 net: sunhme: fix spelling mistakes: "ParityErro" -> "ParityError"
trivial fix to spelling mistakes in printk message.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30 22:55:47 -04:00
Scott Wood
9b70de6d02 bnx2x: Align RX buffers
The bnx2x driver is not providing proper alignment on the receive buffers it
passes to build_skb(), causing skb_shared_info to be misaligned.
skb_shared_info contains an atomic, and while PPC normally supports
unaligned accesses, it does not support unaligned atomics.

Aligning the size of rx buffers will ensure that page_frag_alloc() returns
aligned addresses.

This can be reproduced on PPC by setting the network MTU to 1450 (or other
non-multiple-of-4) and then generating sufficient inbound network traffic
(one or two large "wget"s usually does it), producing the following oops:

Unable to handle kernel paging request for unaligned access at address 0xc00000ffc43af656
Faulting instruction address: 0xc00000000080ef8c
Oops: Kernel access of bad area, sig: 7 [#1]
SMP NR_CPUS=2048
NUMA
PowerNV
Modules linked in: vmx_crypto powernv_rng rng_core powernv_op_panel leds_powernv led_class nfsd ip_tables x_tables autofs4 xfs lpfc bnx2x mdio libcrc32c crc_t10dif crct10dif_generic crct10dif_common
CPU: 104 PID: 0 Comm: swapper/104 Not tainted 4.11.0-rc8-00088-g4c761da #2
task: c00000ffd4892400 task.stack: c00000ffd4920000
NIP: c00000000080ef8c LR: c00000000080eee8 CTR: c0000000001f8320
REGS: c00000ffffc33710 TRAP: 0600   Not tainted  (4.11.0-rc8-00088-g4c761da)
MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>
  CR: 24082042  XER: 00000000
CFAR: c00000000080eea0 DAR: c00000ffc43af656 DSISR: 00000000 SOFTE: 1
GPR00: c000000000907f64 c00000ffffc33990 c000000000dd3b00 c00000ffcaf22100
GPR04: c00000ffcaf22e00 0000000000000000 0000000000000000 0000000000000000
GPR08: 0000000000b80008 c00000ffc43af636 c00000ffc43af656 0000000000000000
GPR12: c0000000001f6f00 c00000000fe1a000 000000000000049f 000000000000c51f
GPR16: 00000000ffffef33 0000000000000000 0000000000008a43 0000000000000001
GPR20: c00000ffc58a90c0 0000000000000000 000000000000dd86 0000000000000000
GPR24: c000007fd0ed10c0 00000000ffffffff 0000000000000158 000000000000014a
GPR28: c00000ffc43af010 c00000ffc9144000 c00000ffcaf22e00 c00000ffcaf22100
NIP [c00000000080ef8c] __skb_clone+0xdc/0x140
LR [c00000000080eee8] __skb_clone+0x38/0x140
Call Trace:
[c00000ffffc33990] [c00000000080fb74] skb_clone+0x74/0x110 (unreliable)
[c00000ffffc339c0] [c000000000907f64] packet_rcv+0x144/0x510
[c00000ffffc33a40] [c000000000827b64] __netif_receive_skb_core+0x5b4/0xd80
[c00000ffffc33b00] [c00000000082b2bc] netif_receive_skb_internal+0x2c/0xc0
[c00000ffffc33b40] [c00000000082c49c] napi_gro_receive+0x11c/0x260
[c00000ffffc33b80] [d000000066483d68] bnx2x_poll+0xcf8/0x17b0 [bnx2x]
[c00000ffffc33d00] [c00000000082babc] net_rx_action+0x31c/0x480
[c00000ffffc33e10] [c0000000000d5a44] __do_softirq+0x164/0x3d0
[c00000ffffc33f00] [c0000000000d60a8] irq_exit+0x108/0x120
[c00000ffffc33f20] [c000000000015b98] __do_irq+0x98/0x200
[c00000ffffc33f90] [c000000000027f14] call_do_irq+0x14/0x24
[c00000ffd4923a90] [c000000000015d94] do_IRQ+0x94/0x110
[c00000ffd4923ae0] [c000000000008d90] hardware_interrupt_common+0x150/0x160

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30 22:50:19 -04:00
Dan Carpenter
77041e89ce liquidio: silence a locking static checker warning
Presumably we never hit this return, but static checkers complain that
we need to unlock so we may as well fix that.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30 22:41:07 -04:00
Dan Carpenter
66117a9d9a qed: Unlock on error in qed_vf_pf_acquire()
My static checker complains that we're holding a mutex on this error
path.  Let's goto exit instead of returning directly.

Fixes: b0bccb69eba3 ("qed: Change locking scheme for VF channel")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30 22:40:45 -04:00
lipeng
804ffe5c61 net: hns: support deferred probe when no mdio
In the hip06 and hip07 SoCs, phy connect to mdio bus.The mdio
module is probed with module_init, and, as such,
is not guaranteed to probe before the HNS driver. So we need
to support deferred probe.

We check for probe deferral in the mac init, so we not init DSAF
when there is no mdio, and free all resource, to later learn that
we need to defer the probe.

Signed-off-by: lipeng <lipeng321@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Reviewed-by: Matthias Brugger <mbrugger@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30 22:39:24 -04:00
lipeng
2fdd6bafe3 net: hns: support deferred probe when can not obtain irq
In the hip06 and hip07 SoCs, the interrupt lines from the
DSAF controllers are connected to mbigen hw module.
The mbigen module is probed with module_init, and, as such,
is not guaranteed to probe before the HNS driver. So we need
to support deferred probe.

Signed-off-by: lipeng <lipeng321@huawei.com>
Reviewed-by: Yisen Zhuang <yisen.zhuang@huawei.com>
Reviewed-by: Matthias Brugger <mbrugger@suse.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30 22:39:24 -04:00
Jakub Kicinski
dbf637ff39 nfp: provide 256 bytes of XDP headroom in all configurations
For legacy reasons NFP FW may be compiled to DMA packets to a constant
offset into the buffer and use the space before it for metadata.  This
ensures that packets data always start at a certain offset regardless of
the amount of preceding metadata.

If rx offset is set to 0 there may still be up to 64 bytes of metadata
but metadata will start at the beginning of the buffer, instead of:

    data_start_offset = rx_offset - meta_len

Even though we make the buffers larger to accommodate up to 64 bytes of
metadata, if there is only N bytes of metadata, we will end up with
N bytes of headroom and 64 - N bytes of tailroom.  Therefore we can't
rely on that space for XDP headroom.  Make sure we always allocate
full 256 bytes.  This, unfortunately, means we can't fit the headroom
on an u8 any more.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30 22:37:00 -04:00
Jakub Kicinski
85cb207ee3 nfp: don't completely refuse to work with old flashes
Right now the required Service Process ABI version is still tied
to max ID of known commands.  For new NSP commands we are adding
we are checking if NSP version is recent enough on command-by-command
basis.  The driver doesn't have to force the device to have the
very latest flash, anything newer than 0.8 should do.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30 22:37:00 -04:00
Jakub Kicinski
d38df0d364 nfp: avoid reading TX queue indexes from the device
Reading TX queue indexes from the device memory on each interrupt
is expensive.  It's doubly expensive with XDP running since we have
two TX rings to check there.  If the software indexes indicate that
the TX queue is completely empty, however, we don't need to look at
the device completion index at all.

The queuing CPU is doing a wmb() before kicking the device TX so
we should be safe to assume on the CPU handling the completions will
never see old value of the software copy of the index.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30 22:37:00 -04:00
Jakub Kicinski
92e68195eb nfp: do simple XDP TX buffer recycling
On the RX path we follow the "drop if allocation of replacement
buffer fails" rule.  With XDP we extended that to the TX action,
so if XDP prog returned TX but allocation of replacement RX buffer
failed, we will drop the packet.

To improve our XDP TX performance extend the idea of rings being
always full to XDP TX rings.  Pre-fill the XDP TX rings with RX
buffers, and when XDP prog returns TX action swap the RX buffer
with the next buffer from the TX ring.

XDP TX complete will no longer free the buffers but let them
sit on the TX ring and wait for swap with RX buffer, instead.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30 22:37:00 -04:00
Jakub Kicinski
d78005a50f nfp: drop rx_ring param from buffer allocation
We will soon allocate RX buffers for caching on XDP TX rings.
The rx_ring parameter passed to nfp_net_rx_alloc_one() is not
actually used, remove it.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30 22:37:00 -04:00
Jakub Kicinski
46c505188b nfp: replace -ENOTSUPP with -EOPNOTSUPP
As Or points out in commit 423b3aecf290 ("net/mlx4: Change ENOTSUPP
to EOPNOTSUPP"), ENOTSUPP is NFS specific error.  Replace it with
EOPNOTSUPP.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30 22:37:00 -04:00
Willem de Bruijn
1d11e732e7 virtio-net: use netif_tx_napi_add for tx napi
Avoid hashing the tx napi struct into napi_hash[], which is used for
busy polling receive queues.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30 22:33:04 -04:00
Girish Moodalbail
5e0740c445 geneve: fix incorrect setting of UDP checksum flag
Creating a geneve link with 'udpcsum' set results in a creation of link
for which UDP checksum will NOT be computed on outbound packets, as can
be seen below.

11: gen0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
    link/ether c2:85:27:b6:b4:15 brd ff:ff:ff:ff:ff:ff promiscuity 0
    geneve id 200 remote 192.168.13.1 dstport 6081 noudpcsum

Similarly, creating a link with 'noudpcsum' set results in a creation
of link for which UDP checksum will be computed on outbound packets.

Fixes: 9b4437a5b870 ("geneve: Unify LWT and netdev handling.")
Signed-off-by: Girish Moodalbail <girish.moodalbail@oracle.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Lance Richardson <lrichard@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30 22:31:16 -04:00
Jiri Benc
baf4d78607 vxlan: do not output confusing error message
The message "Cannot bind port X, err=Y" creates only confusion. In metadata
based mode, failure of IPv6 socket creation is okay if IPv6 is disabled and
no error message should be printed. But when IPv6 tunnel was requested, such
failure is fatal. The vxlan_socket_create does not know when the error is
harmless and when it's not.

Instead of passing such information down to vxlan_socket_create, remove the
message completely. It's not useful. We propagate the error code up to the
user space and the port number comes from the user space. There's nothing in
the message that the process creating vxlan interface does not know.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30 22:30:13 -04:00
Jiri Benc
d074bf9600 vxlan: correctly handle ipv6.disable module parameter
When IPv6 is compiled but disabled at runtime, __vxlan_sock_add returns
-EAFNOSUPPORT. For metadata based tunnels, this causes failure of the whole
operation of bringing up the tunnel.

Ignore failure of IPv6 socket creation for metadata based tunnels caused by
IPv6 not being available.

Fixes: b1be00a6c39f ("vxlan: support both IPv4 and IPv6 sockets in a single vxlan device")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30 22:30:13 -04:00
Andy Shevchenko
90a1bb9816 bnx2x: Get rid of useless temporary variable
Replace pattern

 int status;
 ...
 status = func(...);
 return status;

by

 return func(...);

No functional change intented.

Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30 22:28:00 -04:00
Andy Shevchenko
b77f016726 bnx2x: Reuse bnx2x_null_format_ver()
Reuse bnx2x_null_format_ver() in functions where it's appropriated
instead of open coded variant.

Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30 22:28:00 -04:00
Andy Shevchenko
55b218c12c bnx2x: Replace custom scnprintf()
Use scnprintf() when printing version instead of custom open coded variants.

Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30 22:28:00 -04:00
Alexandre Belloni
ae3696c167 net: macb: fix phy interrupt parsing
Since 83a77e9ec415, the phydev irq is explicitly set to PHY_POLL when
there is no pdata. It doesn't work on DT enabled platforms because the
phydev irq is already set by libphy before.

Fixes: 83a77e9ec415 ("net: macb: Added PCI wrapper for Platform Driver.")
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30 22:21:49 -04:00
David S. Miller
8dd5b698c2 Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2017-04-30

This series contains updates to i40e and i40evf only.

Jake provides majority of the changes in this series, starting with the
renaming of a flag to avoid confusion.  Then renamed a variable to a
more meaningful name to clarify what is actually being done and to
reduce confusion.  Amortizes the wait time when initializing or disabling
lots of VFs by using i40e_reset_all_vfs() and
i40e_vsi_stop_rings_no_wait().  Cleaned up a unnecessary delay since
pci_disable_sriov() already has its own delay, so need to add a additional
delay when removing VFs.  Avoid using the same name flags for both
vsi->state and pf->state, to make code review easier and assist future
work to use the correct state field when checking bits.  Use
DECLARE_BITMAP() to ensure that we always allocate enough space for flags.
Replace hw_disabled_flags with the new _AUTO_DISABLED flags, which are
more readable because we are not setting an *_ENABLED flag to
disable the feature.

Alex corrects a oversight where we were not reprogramming the ports
after a reset, which was causing us to lose all of the receive tunnel
offloads.

Arnd Bergmann moves the declaration of a local variable to avoid a
warning seen on architectures with larger pages about an unused variable.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-30 11:33:37 -04:00
Eli Cohen
0a0ab1d2cc net/mlx5: E-Switch, Avoid redundant memory allocation
struct esw_mc_addr is a small struct that can be part of struct
mlx5_eswitch. Define it as a field and not as a pointer and save the
kzalloc call and then error flow handling.

Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-04-30 16:03:21 +03:00
Eran Ben Elisha
0f6e4cf674 net/mlx5e: Disable HW LRO when PCI is slower than link on striding RQ
We will activate the HW LRO only on servers with PCI BW > MAX LINK BW,
or when PCI BW > 16Gbps. On other cases we do not want LRO by default as
LRO sessions might get timeout and add redundant software overhead.

Tested:
	ethtool -k <ifs-name> | grep large-receive-offload
	On systems with and without the limitations.

Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-04-30 16:03:20 +03:00
Tariq Toukan
b1b03bded1 net/mlx5e: Use u8 as ownership type in mlx5e_get_cqe()
CQE ownership indication is as small as a single bit.
Use u8 to speedup the comparison.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-04-30 16:03:19 +03:00