840740 Commits

Author SHA1 Message Date
Ido Schimmel
4b14cc313f mlxsw: spectrum: Disallow prio-tagged packets when PVID is removed
When PVID is removed from a bridge port, the Linux bridge drops both
untagged and prio-tagged packets. Align mlxsw with this behavior.

Fixes: 148f472da5db ("mlxsw: reg: Add the Switch Port Acceptable Frame Types register")
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12 11:08:15 -07:00
Petr Machata
e891ce1dd2 mlxsw: spectrum_buffers: Reduce pool size on Spectrum-2
Due to an issue on Spectrum-2, in front-panel ports split four ways, 2 out
of 32 port buffers cannot be used. To work around this, the next FW release
will mark them as unused, and will report correspondingly lower total
shared buffer size. mlxsw will pick up the new value through a query to
cap_total_buffer_size resource. However the initial size for shared buffer
pool 0 is hard-coded and therefore needs to be updated.

Thus reduce the pool size by 2.7 MiB (which corresponds to 2/32 of the
total size of 42 MiB), and round down to the whole number of cells.

Fixes: fe099bf682ab ("mlxsw: spectrum_buffers: Add Spectrum-2 shared buffer configuration")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12 11:08:15 -07:00
Jiri Pirko
0b0c009834 selftests: tc_flower: Add TOS matching test
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12 11:08:14 -07:00
Jiri Pirko
e49f9adffb mlxsw: spectrum_flower: Fix TOS matching
The TOS value was not extracted correctly. Fix it.

Fixes: 87996f91f739 ("mlxsw: spectrum_flower: Add support for ip tos")
Reported-by: Alexander Petrovskiy <alexpe@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12 11:08:14 -07:00
Ido Schimmel
45a69b70f5 selftests: mlxsw: Test nexthop offload indication
Test that IPv4 and IPv6 nexthops are correctly marked with offload
indication in response to neighbour events.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12 11:08:14 -07:00
Ido Schimmel
83d5782681 mlxsw: spectrum_router: Refresh nexthop neighbour when it becomes dead
The driver tries to periodically refresh neighbours that are used to
reach nexthops. This is done by periodically calling neigh_event_send().

However, if the neighbour becomes dead, there is nothing we can do to
return it to a connected state and the above function call is basically
a NOP.

This results in the nexthop never being written to the device's
adjacency table and therefore never used to forward packets.

Fix this by dropping our reference from the dead neighbour and
associating the nexthop with a new neigbhour which we will try to
refresh.

Fixes: a7ff87acd995 ("mlxsw: spectrum_router: Implement next-hop routing")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Alex Veber <alexve@mellanox.com>
Tested-by: Alex Veber <alexve@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12 11:08:14 -07:00
Ido Schimmel
ee02c26993 mlxsw: spectrum: Use different seeds for ECMP and LAG hash
The same hash function and seed are used for both ECMP and LAG hash.
Therefore, when a LAG device is used as a nexthop device as part of an
ECMP group, hash polarization can occur and all the traffic will be
hashed to a single LAG slave.

Fix this by using a different seed for the LAG hash.

Fixes: fa73989f2697 ("mlxsw: spectrum: Use a stable ECMP/LAG seed")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Alex Veber <alexve@mellanox.com>
Tested-by: Alex Veber <alexve@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12 11:08:14 -07:00
John Fastabend
648ee6cea7 net: tls, correctly account for copied bytes with multiple sk_msgs
tls_sw_do_sendpage needs to return the total number of bytes sent
regardless of how many sk_msgs are allocated. Unfortunately, copied
(the value we return up the stack) is zero'd before each new sk_msg
is allocated so we only return the copied size of the last sk_msg used.

The caller (splice, etc.) of sendpage will then believe only part
of its data was sent and send the missing chunks again. However,
because the data actually was sent the receiver will get multiple
copies of the same data.

To reproduce this do multiple sendfile calls with a length close to
the max record size. This will in turn call splice/sendpage, sendpage
may use multiple sk_msg in this case and then returns the incorrect
number of bytes. This will cause splice to resend creating duplicate
data on the receiver. Andre created a C program that can easily
generate this case so we will push a similar selftest for this to
bpf-next shortly.

The fix is to _not_ zero the copied field so that the total sent
bytes is returned.

Reported-by: Steinar H. Gunderson <steinar+kernel@gunderson.no>
Reported-by: Andre Tomt <andre@tomt.net>
Tested-by: Andre Tomt <andre@tomt.net>
Fixes: d829e9c4112b ("tls: convert to generic sk_msg interface")
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12 11:04:35 -07:00
Stephen Suryaputra
e1ae5c2ea4 vrf: Increment Icmp6InMsgs on the original netdev
Get the ingress interface and increment ICMP counters based on that
instead of skb->dev when the the dev is a VRF device.

This is a follow up on the following message:
https://www.spinics.net/lists/netdev/msg560268.html

v2: Avoid changing skb->dev since it has unintended effect for local
    delivery (David Ahern).
Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12 11:00:11 -07:00
Maxime Chevallier
f0d2ca1531 net: ethtool: Allow matching on vlan DEI bit
Using ethtool, users can specify a classification action matching on the
full vlan tag, which includes the DEI bit (also previously called CFI).

However, when converting the ethool_flow_spec to a flow_rule, we use
dissector keys to represent the matching patterns.

Since the vlan dissector key doesn't include the DEI bit, this
information was silently discarded when translating the ethtool
flow spec in to a flow_rule.

This commit adds the DEI bit into the vlan dissector key, and allows
propagating the information to the driver when parsing the ethtool flow
spec.

Fixes: eca4205f9ec3 ("ethtool: add ethtool_rx_flow_spec to flow_rule structure translator")
Reported-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12 10:09:56 -07:00
Masanari Iida
bb2e05e0c8 linux-next: DOC: RDS: Fix a typo in rds.txt
This patch fixes a spelling typo in rds.txt

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12 09:56:29 -07:00
Matteo Croce
ec66854c83 mpls: fix af_mpls dependencies for real
Randy reported that selecting MPLS_ROUTING without PROC_FS breaks
the build, because since commit c1a9d65954c6 ("mpls: fix af_mpls
dependencies"), MPLS_ROUTING selects PROC_SYSCTL, but Kconfig's select
doesn't recursively handle dependencies.
Change the select into a dependency.

Fixes: c1a9d65954c6 ("mpls: fix af_mpls dependencies")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-12 09:42:34 -07:00
Ilya Maximets
01d76b5317 xdp: check device pointer before clearing
We should not call 'ndo_bpf()' or 'dev_put()' with NULL argument.

Fixes: c9b47cc1fabc ("xsk: fix bug when trying to use both copy and zero-copy on one queue id")
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-12 16:41:47 +02:00
Martin KaFai Lau
f12dd75959 bpf: net: Set sk_bpf_storage back to NULL for cloned sk
The cloned sk should not carry its parent-listener's sk_bpf_storage.
This patch fixes it by setting it back to NULL.

Fixes: 6ac99e8f23d4 ("bpf: Introduce bpf sk local storage")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-12 16:38:20 +02:00
David S. Miller
93c65f83f2 Merge branch 'vxlan-geneve-linear'
Stefano Brivio says:

====================
Don't assume linear buffers in error handlers for VXLAN and GENEVE

Guillaume noticed the same issue fixed by commit 26fc181e6cac ("fou, fou6:
do not assume linear skbs") for fou and fou6 is also present in VXLAN and
GENEVE error handlers: we can't assume linear buffers there, we need to
use pskb_may_pull() instead.
====================

Acked-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-11 12:08:05 -07:00
Stefano Brivio
eccc73a6b2 geneve: Don't assume linear buffers in error handler
In commit a07966447f39 ("geneve: ICMP error lookup handler") I wrongly
assumed buffers from icmp_socket_deliver() would be linear. This is not
the case: icmp_socket_deliver() only guarantees we have 8 bytes of linear
data.

Eric fixed this same issue for fou and fou6 in commits 26fc181e6cac
("fou, fou6: do not assume linear skbs") and 5355ed6388e2 ("fou, fou6:
avoid uninit-value in gue_err() and gue6_err()").

Use pskb_may_pull() instead of checking skb->len, and take into account
the fact we later access the GENEVE header with udp_hdr(), so we also
need to sum skb_transport_header() here.

Reported-by: Guillaume Nault <gnault@redhat.com>
Fixes: a07966447f39 ("geneve: ICMP error lookup handler")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-11 12:07:33 -07:00
Stefano Brivio
8399a6930d vxlan: Don't assume linear buffers in error handler
In commit c3a43b9fec8a ("vxlan: ICMP error lookup handler") I wrongly
assumed buffers from icmp_socket_deliver() would be linear. This is not
the case: icmp_socket_deliver() only guarantees we have 8 bytes of linear
data.

Eric fixed this same issue for fou and fou6 in commits 26fc181e6cac
("fou, fou6: do not assume linear skbs") and 5355ed6388e2 ("fou, fou6:
avoid uninit-value in gue_err() and gue6_err()").

Use pskb_may_pull() instead of checking skb->len, and take into account
the fact we later access the VXLAN header with udp_hdr(), so we also
need to sum skb_transport_header() here.

Reported-by: Guillaume Nault <gnault@redhat.com>
Fixes: c3a43b9fec8a ("vxlan: ICMP error lookup handler")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-11 12:07:33 -07:00
Taehee Yoo
309b66970e net: openvswitch: do not free vport if register_netdevice() is failed.
In order to create an internal vport, internal_dev_create() is used and
that calls register_netdevice() internally.
If register_netdevice() fails, it calls dev->priv_destructor() to free
private data of netdev. actually, a private data of this is a vport.

Hence internal_dev_create() should not free and use a vport after failure
of register_netdevice().

Test command
    ovs-dpctl add-dp bonding_masters

Splat looks like:
[ 1035.667767] kasan: GPF could be caused by NULL-ptr deref or user memory access
[ 1035.675958] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[ 1035.676916] CPU: 1 PID: 1028 Comm: ovs-vswitchd Tainted: G    B             5.2.0-rc3+ #240
[ 1035.676916] RIP: 0010:internal_dev_create+0x2e5/0x4e0 [openvswitch]
[ 1035.676916] Code: 48 c1 ea 03 80 3c 02 00 0f 85 9f 01 00 00 4c 8b 23 48 b8 00 00 00 00 00 fc ff df 49 8d bc 24 60 05 00 00 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 86 01 00 00 49 8b bc 24 60 05 00 00 e8 e4 68 f4
[ 1035.713720] RSP: 0018:ffff88810dcb7578 EFLAGS: 00010206
[ 1035.713720] RAX: dffffc0000000000 RBX: ffff88810d13fe08 RCX: ffffffff84297704
[ 1035.713720] RDX: 00000000000000ac RSI: 0000000000000000 RDI: 0000000000000560
[ 1035.713720] RBP: 00000000ffffffef R08: fffffbfff0d3b881 R09: fffffbfff0d3b881
[ 1035.713720] R10: 0000000000000001 R11: fffffbfff0d3b880 R12: 0000000000000000
[ 1035.768776] R13: 0000607ee460b900 R14: ffff88810dcb7690 R15: ffff88810dcb7698
[ 1035.777709] FS:  00007f02095fc980(0000) GS:ffff88811b400000(0000) knlGS:0000000000000000
[ 1035.777709] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1035.777709] CR2: 00007ffdf01d2f28 CR3: 0000000108258000 CR4: 00000000001006e0
[ 1035.777709] Call Trace:
[ 1035.777709]  ovs_vport_add+0x267/0x4f0 [openvswitch]
[ 1035.777709]  new_vport+0x15/0x1e0 [openvswitch]
[ 1035.777709]  ovs_vport_cmd_new+0x567/0xd10 [openvswitch]
[ 1035.777709]  ? ovs_dp_cmd_dump+0x490/0x490 [openvswitch]
[ 1035.777709]  ? __kmalloc+0x131/0x2e0
[ 1035.777709]  ? genl_family_rcv_msg+0xa54/0x1030
[ 1035.777709]  genl_family_rcv_msg+0x63a/0x1030
[ 1035.777709]  ? genl_unregister_family+0x630/0x630
[ 1035.841681]  ? debug_show_all_locks+0x2d0/0x2d0
[ ... ]

Fixes: cf124db566e6 ("net: Fix inconsistent teardown and release of private netdev state.")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-11 11:54:01 -07:00
Willem de Bruijn
522924b583 net: correct udp zerocopy refcnt also when zerocopy only on append
The below patch fixes an incorrect zerocopy refcnt increment when
appending with MSG_MORE to an existing zerocopy udp skb.

  send(.., MSG_ZEROCOPY | MSG_MORE);	// refcnt 1
  send(.., MSG_ZEROCOPY | MSG_MORE);	// refcnt still 1 (bar frags)

But it missed that zerocopy need not be passed at the first send. The
right test whether the uarg is newly allocated and thus has extra
refcnt 1 is not !skb, but !skb_zcopy.

  send(.., MSG_MORE);			// <no uarg>
  send(.., MSG_ZEROCOPY);		// refcnt 1

Fixes: 100f6d8e09905 ("net: correct zerocopy refcnt with udp MSG_MORE")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-11 11:40:54 -07:00
Jonathan Lemon
da2577fdd0 bpf: lpm_trie: check left child of last leftmost node for NULL
If the leftmost parent node of the tree has does not have a child
on the left side, then trie_get_next_key (and bpftool map dump) will
not look at the child on the right.  This leads to the traversal
missing elements.

Lookup is not affected.

Update selftest to handle this case.

Reproducer:

 bpftool map create /sys/fs/bpf/lpm type lpm_trie key 6 \
     value 1 entries 256 name test_lpm flags 1
 bpftool map update pinned /sys/fs/bpf/lpm key  8 0 0 0  0   0 value 1
 bpftool map update pinned /sys/fs/bpf/lpm key 16 0 0 0  0 128 value 2
 bpftool map dump   pinned /sys/fs/bpf/lpm

Returns only 1 element. (2 expected)

Fixes: b471f2f1de8b ("bpf: implement MAP_GET_NEXT_KEY command for LPM_TRIE")
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-06-11 13:52:37 +02:00
John Hurley
dce5ccccd1 nfp: ensure skb network header is set for packet redirect
Packets received at the NFP driver may be redirected to egress of another
netdev (e.g. in the case of OvS internal ports). On the egress path, some
processes, like TC egress hooks, may expect the network header offset
field in the skb to be correctly set. If this is not the case there is
potential for abnormal behaviour and even the triggering of BUG() calls.

Set the skb network header field before the mac header pull when doing a
packet redirect.

Fixes: 27f54b582567 ("nfp: allow fallback packets from non-reprs")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-09 20:08:09 -07:00
Yuchung Cheng
fcc2202a9d tcp: fix undo spurious SYNACK in passive Fast Open
Commit 794200d66273 ("tcp: undo cwnd on Fast Open spurious SYNACK
retransmit") may cause tcp_fastretrans_alert() to warn about pending
retransmission in Open state. This is triggered when the Fast Open
server both sends data and has spurious SYNACK retransmission during
the handshake, and the data packets were lost or reordered.

The root cause is a bit complicated:

(1) Upon receiving SYN-data: a full socket is created with
    snd_una = ISN + 1 by tcp_create_openreq_child()

(2) On SYNACK timeout the server/sender enters CA_Loss state.

(3) Upon receiving the final ACK to complete the handshake, sender
    does not mark FLAG_SND_UNA_ADVANCED since (1)

    Sender then calls tcp_process_loss since state is CA_loss by (2)

(4) tcp_process_loss() does not invoke undo operations but instead
    mark REXMIT_LOST to force retransmission

(5) tcp_rcv_synrecv_state_fastopen() calls tcp_try_undo_loss(). It
    changes state to CA_Open but has positive tp->retrans_out

(6) Next ACK triggers the WARN_ON in tcp_fastretrans_alert()

The step that goes wrong is (4) where the undo operation should
have been invoked because the ACK successfully acknowledged the
SYN sequence. This fixes that by specifically checking undo
when the SYN-ACK sequence is acknowledged. Then after
tcp_process_loss() the state would be further adjusted based
in tcp_fastretrans_alert() to avoid triggering the warning in (6).

Fixes: 794200d66273 ("tcp: undo cwnd on Fast Open spurious SYNACK retransmit")
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-09 20:04:11 -07:00
Matteo Croce
c1a9d65954 mpls: fix af_mpls dependencies
MPLS routing code relies on sysctl to work, so let it select PROC_SYSCTL.

Reported-by: Randy Dunlap <rdunlap@infradead.org>
Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Matteo Croce <mcroce@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-09 19:57:24 -07:00
David S. Miller
7f0b44a42e Merge branch 'ibmvnic-Fixes-for-device-reset-handling'
Thomas Falcon says:

====================
ibmvnic: Fixes for device reset handling

This series contains three unrelated fixes to issues seen during
device resets. The first patch fixes an error when the driver requests
to deactivate the link of an uninitialized device, resulting in a
failure to reset. Next, a patch to fix multicast transmission
failures seen after a driver reset. The final patch fixes mishandling
of memory allocation failures during device initialization, which
caused a kernel oops.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-09 19:51:28 -07:00
Thomas Falcon
7c940b1a52 ibmvnic: Fix unchecked return codes of memory allocations
The return values for these memory allocations are unchecked,
which may cause an oops if the driver does not handle them after
a failure. Fix by checking the function's return code.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-09 19:51:28 -07:00
Thomas Falcon
be32a24372 ibmvnic: Refresh device multicast list after reset
It was observed that multicast packets were no longer received after
a device reset.  The fix is to resend the current multicast list to
the backing device after recovery.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-09 19:51:28 -07:00
Thomas Falcon
1f94608b0c ibmvnic: Do not close unopened driver during reset
Check driver state before halting it during a reset. If the driver is
not running, do nothing. Otherwise, a request to deactivate a down link
can cause an error and the reset will fail.

Signed-off-by: Thomas Falcon <tlfalcon@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-09 19:51:28 -07:00
David S. Miller
4172eadb08 mlx5-fixes-2019-06-07
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAlz62dYACgkQSD+KveBX
 +j6zVggAw6zCNNIoLES7++v6utSmZ+3Vs9Qo420OiHrPJ/IgzWmKRhMbKUh1OZQW
 88Clc1hJAkGlRvoj9fhUEgLebHtUllyf4NKwcaufbwWWlt3ihi2TFi8IgzB+MNkn
 ak8b0zx3/RsEA1Ee2APFPuAc/ybER0lcdHucAqVA8GtfMaHoE6LmRTGeXIXo0tQ5
 Rp8O499B5qTUm0jd5j3UnJr8YBUVio0bSZk9TbOEvDGokq+U+bslyxcfjGiMCV41
 +jfqfr5rjiW/75i7UNS5w2D320UTSUqccefc0zBBTu4/t2ySRllLW5/GGE1/Bjqp
 UihNI2R63fxoYkg/bYCbF1T2QRaJnw==
 =JTmU
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2019-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2019-06-07

This series introduces some fixes to mlx5 driver.

Please pull and let me know if there is any problem.

For -stable v4.17
  ('net/mlx5: Avoid reloading already removed devices')

For -stable v5.0
  ('net/mlx5e: Avoid detaching non-existing netdev under switchdev mode')

For -stable v5.1
  ('net/mlx5e: Fix source port matching in fdb peer flow rule')
  ('net/mlx5e: Support tagged tunnel over bond')
  ('net/mlx5e: Add ndo_set_feature for uplink representor')
  ('net/mlx5: Update pci error handler entries and command translation')
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-09 19:45:54 -07:00
David S. Miller
62f42a114b linux-can-fixes-for-5.2-20190607
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEmvEkXzgOfc881GuFWsYho5HknSAFAlz60fwTHG1rbEBwZW5n
 dXRyb25peC5kZQAKCRBaxiGjkeSdILI8B/41j6UxwMSYMetM/Vw2AyqPt0z671Gu
 mAVzDK3PZF5WIzD5JsDlUhwN1dqvfvAZeSS1tQwQ18ZpQLkRSxlAKhkLLlxBVKpJ
 0uDwYuFWwXF8DbdlRwZq+Db0GGAXW5+8WtA4gp9GTiIP6kTgmqnnaZvWPnQzQmKJ
 RCY8YAzF52jXa4hBfLsXiG7NoO6cZHQJMFKLsQEpHV+GMqTcAYJHzBP6YVVXC8PG
 495TREmTq5lvRKc2U3QS6//yPWleFYr5ViW/psEcBPCGdruyJI8LjDRaqnGWRBJX
 vPyEVSFHtQZu49Vue3g4jbjhVr8OWTV/MsFgE/zK5Ahe+2G5969pID3n
 =iCgZ
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-fixes-for-5.2-20190607' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2019-06-07

this is a pull reqeust of 9 patches for net/master.

The first patch is by Alexander Dahl and removes a duplicate menu entry from
the Kconfig. The next patch by Joakim Zhang fixes the timeout in the flexcan
driver when setting small bit rates. Anssi Hannula's patch for the xilinx_can
driver fixes the bittiming_const for CAN FD core. The two patches by Sean
Nyekjaer bring mcp25625 to the existing mcp251x driver. The patch by Eugen
Hristev implements an errata for the m_can driver. YueHaibing's patch fixes the
error handling ing can_init(). The patch by Fabio Estevam for the flexcan
driver removes an unneeded registration message during flexcan_probe(). And the
last patch is by Willem de Bruijn and adds the missing purging the  socket
error queue on sock destruct.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-09 19:44:01 -07:00
George Wilkie
2f3f7d1fa0 mpls: fix warning with multi-label encap
If you configure a route with multiple labels, e.g.
  ip route add 10.10.3.0/24 encap mpls 16/100 via 10.10.2.2 dev ens4
A warning is logged:
  kernel: [  130.561819] netlink: 'ip': attribute type 1 has an invalid
  length.

This happens because mpls_iptunnel_policy has set the type of
MPLS_IPTUNNEL_DST to fixed size NLA_U32.
Change it to a minimum size.
nla_get_labels() does the remaining validation.

Fixes: e3e4712ec096 ("mpls: ip tunnel support")
Signed-off-by: George Wilkie <gwilkie@vyatta.att-mail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-09 13:26:34 -07:00
Michael Schmitz
a9520543b1 net: phy: rename Asix Electronics PHY driver
[Resent to net instead of net-next - may clash with Anders Roxell's patch
series addressing duplicate module names]

Commit 31dd83b96641 ("net-next: phy: new Asix Electronics PHY driver")
introduced a new PHY driver drivers/net/phy/asix.c that causes a module
name conflict with a pre-existiting driver (drivers/net/usb/asix.c).

The PHY driver is used by the X-Surf 100 ethernet card driver, and loaded
by that driver via its PHY ID. A rename of the driver looks unproblematic.

Rename PHY driver to ax88796b.c in order to resolve name conflict.

Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
Tested-by: Michael Schmitz <schmitzmic@gmail.com>
Fixes: 31dd83b96641 ("net-next: phy: new Asix Electronics PHY driver")
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-09 13:24:17 -07:00
Eric Dumazet
65a3c497c0 ipv6: flowlabel: fl6_sock_lookup() must use atomic_inc_not_zero
Before taking a refcount, make sure the object is not already
scheduled for deletion.

Same fix is needed in ipv6_flowlabel_opt()

Fixes: 18367681a10b ("ipv6 flowlabel: Convert np->ipv6_fl_list to RCU.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-09 13:07:14 -07:00
Enrico Weigelt
c3fee640bc net: ipv4: fib_semantics: fix uninitialized variable
fix an uninitialized variable:

  CC      net/ipv4/fib_semantics.o
net/ipv4/fib_semantics.c: In function 'fib_check_nh_v4_gw':
net/ipv4/fib_semantics.c:1027:12: warning: 'err' may be used uninitialized in this function [-Wmaybe-uninitialized]
   if (!tbl || err) {
            ^~

Signed-off-by: Enrico Weigelt <info@metux.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-09 12:47:30 -07:00
David S. Miller
38e406f600 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
pull-request: bpf 2019-06-07

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix several bugs in riscv64 JIT code emission which forgot to clear high
   32-bits for alu32 ops, from Björn and Luke with selftests covering all
   relevant BPF alu ops from Björn and Jiong.

2) Two fixes for UDP BPF reuseport that avoid calling the program in case of
   __udp6_lib_err and UDP GRO which broke reuseport_select_sock() assumption
   that skb->data is pointing to transport header, from Martin.

3) Two fixes for BPF sockmap: a use-after-free from sleep in psock's backlog
   workqueue, and a missing restore of sk_write_space when psock gets dropped,
   from Jakub and John.

4) Fix unconnected UDP sendmsg hook API which is insufficient as-is since it
   breaks standard applications like DNS if reverse NAT is not performed upon
   receive, from Daniel.

5) Fix an out-of-bounds read in __bpf_skc_lookup which in case of AF_INET6
   fails to verify that the length of the tuple is long enough, from Lorenz.

6) Fix libbpf's libbpf__probe_raw_btf to return an fd instead of 0/1 (for
   {un,}successful probe) as that is expected to be propagated as an fd to
   load_sk_storage_btf() and thus closing the wrong descriptor otherwise,
   from Michal.

7) Fix bpftool's JSON output for the case when a lookup fails, from Krzesimir.

8) Minor misc fixes in docs, samples and selftests, from various others.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-06-07 14:46:47 -07:00
Eli Britstein
45e7d4c0c1 net/mlx5e: Support tagged tunnel over bond
Stacked devices like bond interface may have a VLAN device on top of
them. Detect lag state correctly under this condition, and return the
correct routed net device, according to it the encap header is built.

Fixes: e32ee6c78efa ("net/mlx5e: Support tunnel encap over tagged Ethernet")
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-06-07 14:40:37 -07:00
Alaa Hleihel
47c9d2c99d net/mlx5e: Avoid detaching non-existing netdev under switchdev mode
After introducing dedicated uplink representor, the netdev instance
set over the esw manager vport (PF) became no longer in use, so it was
removed in the cited commit once we're on switchdev mode.
However, the mlx5e_detach function was not updated accordingly, and it
still tries to detach a non-existing netdev, causing a kernel crash.

This patch fixes this issue.

Fixes: aec002f6f82c ("net/mlx5e: Uninstantiate esw manager vport netdev on switchdev mode")
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-06-07 14:40:37 -07:00
Raed Salem
b83c073016 net/mlx5e: Fix source port matching in fdb peer flow rule
The cited commit changed the initialization placement of the eswitch
attributes so it is done prior to parse tc actions function call,
including among others the in_rep and in_mdev fields which are mistakenly
reassigned inside the parse actions function.

This breaks the source port matching criteria of the peer redirect rule.

Fix by removing the now redundant reassignment of the already initialized
fields.

Fixes: 988ab9c7363a ("net/mlx5e: Introduce mlx5e_flow_esw_attr_init() helper")
Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-06-07 14:40:37 -07:00
Shay Agroskin
57c70d8740 net/mlx5e: Replace reciprocal_scale in TX select queue function
The TX queue index returned by the fallback function ranges
between [0,NUM CHANNELS - 1] if QoS isn't set and
[0, (NUM CHANNELS)*(NUM TCs) -1] otherwise.

Our HW uses different TC mapping than the fallback function
(which is denoted as 'up', user priority) so we only need to extract
a channel number out of the returned value.

Since (NUM CHANNELS)*(NUM TCs) is a relatively small number, using
reciprocal scale almost always returns zero.
We instead access the 'txq2sq' table to extract the sq (and with it the
channel number) associated with the tx queue, thus getting
a more evenly distributed channel number.

Perf:

Rx/Tx side with Intel(R) Xeon(R) Silver 4108 CPU @ 1.80GHz and ConnectX-5.
Used 'iperf' UDP traffic, 10 threads, and priority 5.

Before:	0.566Mpps
After:	 2.37Mpps

As expected, releasing the existing bottleneck of steering all traffic
to TX queue zero significantly improves transmission rates.

Fixes: 7ccdd0841b30 ("net/mlx5e: Fix select queue callback")
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-06-07 14:40:37 -07:00
Chris Mi
d3cbd4254d net/mlx5e: Add ndo_set_feature for uplink representor
After we have a dedicated uplink representor, the new netdev ops
doesn't support ndo_set_feature. Because of that, we can't change
some features, eg. rxvlan. Now add it back.

In this patch, I also do a cleanup for the features flag handling,
eg. remove duplicate NETIF_F_HW_TC flag setting.

Fixes: aec002f6f82c ("net/mlx5e: Uninstantiate esw manager vport netdev on switchdev mode")
Signed-off-by: Chris Mi <chrism@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-06-07 14:40:37 -07:00
Alaa Hleihel
dd80857bf3 net/mlx5: Avoid reloading already removed devices
Prior to reloading a device we must first verify that it was not already
removed. Otherwise, the attempt to remove the device will do nothing, and
in that case we will end up proceeding with adding an new device that no
one was expecting to remove, leaving behind used resources such as EQs that
causes a failure to destroy comp EQs and syndrome (0x30f433).

Fix that by making sure that we try to remove and add a device (based on a
protocol) only if the device is already added.

Fixes: c5447c70594b ("net/mlx5: E-Switch, Reload IB interface when switching devlink modes")
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-06-07 14:40:36 -07:00
Edward Srouji
6a6fabbfa3 net/mlx5: Update pci error handler entries and command translation
Add missing entries for create/destroy UCTX and UMEM commands.
This could get us wrong "unknown FW command" error in flows
where we unbind the device or reset the driver.

Also the translation of these commands from opcodes to string
was missing.

Fixes: 6e3722baac04 ("IB/mlx5: Use the correct commands for UMEM and UCTX allocation")
Signed-off-by: Edward Srouji <edwards@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-06-07 14:40:36 -07:00
Willem de Bruijn
fd704bd5ee can: purge socket error queue on sock destruct
CAN supports software tx timestamps as of the below commit. Purge
any queued timestamp packets on socket destroy.

Fixes: 51f31cabe3ce ("ip: support for TX timestamps on UDP and RAW sockets")
Reported-by: syzbot+a90604060cb40f5bdd16@syzkaller.appspotmail.com
Signed-off-by: Willem de Bruijn <willemb@google.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-06-07 23:03:54 +02:00
Fabio Estevam
eb503004a7 can: flexcan: Remove unneeded registration message
Currently the following message is observed when the flexcan
driver is probed:

flexcan 2090000.flexcan: device registered (reg_base=(ptrval), irq=23)

The reason for printing 'ptrval' is explained at
Documentation/core-api/printk-formats.rst:

"Pointers printed without a specifier extension (i.e unadorned %p) are
hashed to prevent leaking information about the kernel memory layout. This
has the added benefit of providing a unique identifier. On 64-bit machines
the first 32 bits are zeroed. The kernel will print ``(ptrval)`` until it
gathers enough entropy."

Instead of passing %pK, which can print the correct address, simply
remove the entire message as it is not really that useful.

Signed-off-by: Fabio Estevam <festevam@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-06-07 23:03:54 +02:00
YueHaibing
c5a3aed1cd can: af_can: Fix error path of can_init()
This patch add error path for can_init() to avoid possible crash if some
error occurs.

Fixes: 0d66548a10cb ("[CAN]: Add PF_CAN core module")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-06-07 23:03:54 +02:00
Eugen Hristev
3e82f2f34c can: m_can: implement errata "Needless activation of MRAF irq"
During frame reception while the MCAN is in Error Passive state and the
Receive Error Counter has thevalue MCAN_ECR.REC = 127, it may happen
that MCAN_IR.MRAF is set although there was no Message RAM access
failure. If MCAN_IR.MRAF is enabled, an interrupt to the Host CPU is
generated.

Work around:
The Message RAM Access Failure interrupt routine needs to check whether

    MCAN_ECR.RP = '1' and MCAN_ECR.REC = '127'.

In this case, reset MCAN_IR.MRAF. No further action is required.
This affects versions older than 3.2.0

Errata explained on Sama5d2 SoC which includes this hardware block:
http://ww1.microchip.com/downloads/en/DeviceDoc/SAMA5D2-Family-Silicon-Errata-and-Data-Sheet-Clarification-DS80000803B.pdf
chapter 6.2

Reproducibility: If 2 devices with m_can are connected back to back,
configuring different bitrate on them will lead to interrupt storm on
the receiving side, with error "Message RAM access failure occurred".
Another way is to have a bad hardware connection. Bad wire connection
can lead to this issue as well.

This patch fixes the issue according to provided workaround.

Signed-off-by: Eugen Hristev <eugen.hristev@microchip.com>
Reviewed-by: Ludovic Desroches <ludovic.desroches@microchip.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-06-07 23:03:54 +02:00
Sean Nyekjaer
35b7fa4d07 can: mcp251x: add support for mcp25625
Fully compatible with mcp2515, the mcp25625 have integrated transceiver.

This patch adds support for the mcp25625 to the existing mcp251x driver.

Signed-off-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-06-07 23:03:53 +02:00
Sean Nyekjaer
0df82dcd55 dt-bindings: can: mcp251x: add mcp25625 support
Fully compatible with mcp2515, the mcp25625 have integrated transceiver.

This patch add the mcp25625 to the device tree bindings documentation.

Signed-off-by: Sean Nyekjaer <sean@geanix.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-06-07 23:03:53 +02:00
Anssi Hannula
904044dd8f can: xilinx_can: use correct bittiming_const for CAN FD core
Commit 9e5f1b273e6a ("can: xilinx_can: add support for Xilinx CAN FD
core") added a new can_bittiming_const structure for CAN FD cores that
support larger values for tseg1, tseg2, and sjw than previous Xilinx CAN
cores, but the commit did not actually take that into use.

Fix that.

Tested with CAN FD core on a ZynqMP board.

Fixes: 9e5f1b273e6a ("can: xilinx_can: add support for Xilinx CAN FD core")
Reported-by: Shubhrajyoti Datta <shubhrajyoti.datta@gmail.com>
Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi>
Cc: Michal Simek <michal.simek@xilinx.com>
Reviewed-by: Shubhrajyoti Datta <shubhrajyoti.datta@gmail.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-06-07 23:03:53 +02:00
Joakim Zhang
247e5356a7 can: flexcan: fix timeout when set small bitrate
Current we can meet timeout issue when setting a small bitrate like
10000 as follows on i.MX6UL EVK board (ipg clock = 66MHZ, per clock =
30MHZ):

| root@imx6ul7d:~# ip link set can0 up type can bitrate 10000

A link change request failed with some changes committed already.
Interface can0 may have been left with an inconsistent configuration,
please check.

| RTNETLINK answers: Connection timed out

It is caused by calling of flexcan_chip_unfreeze() timeout.

Originally the code is using usleep_range(10, 20) for unfreeze
operation, but the patch (8badd65 can: flexcan: avoid calling
usleep_range from interrupt context) changed it into udelay(10) which is
only a half delay of before, there're also some other delay changes.

After double to FLEXCAN_TIMEOUT_US to 100 can fix the issue.

Meanwhile, Rasmus Villemoes reported that even with a timeout of 100,
flexcan_probe() fails on the MPC8309, which requires a value of at least
140 to work reliably. 250 works for everyone.

Signed-off-by: Joakim Zhang <qiangqing.zhang@nxp.com>
Reviewed-by: Dong Aisheng <aisheng.dong@nxp.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-06-07 23:03:53 +02:00
Alexander Dahl
0ed89d777d can: usb: Kconfig: Remove duplicate menu entry
This seems to have slipped in by accident when sorting the entries.

Fixes: ffbdd9172ee2f53020f763574b4cdad8d9760a4f
Signed-off-by: Alexander Dahl <ada@thorsis.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2019-06-07 23:03:53 +02:00