Commit Graph

888572 Commits

Author SHA1 Message Date
Paul Durrant
9476654bd5 xen-netback: support dynamic unbind/bind
By re-attaching RX, TX, and CTL rings during connect() rather than
assuming they are freshly allocated (i.e. assuming the counters are zero),
and avoiding forcing state to Closed in netback_remove() it is possible
for vif instances to be unbound and re-bound from and to (respectively) a
running guest.

Dynamic unbind/bind is a highly useful feature for a backend module as it
allows it to be unloaded and re-loaded (i.e. updated) without requiring
domUs to be halted.

This has been tested by running iperf as a server in the test VM and
then running a client against it in a continuous loop, whilst also
running:

while true;
  do echo vif-$DOMID-$VIF >unbind;
  echo down;
  rmmod xen-netback;
  echo unloaded;
  modprobe xen-netback;
  cd $(pwd);
  brctl addif xenbr0 vif$DOMID.$VIF;
  ip link set vif$DOMID.$VIF up;
  echo up;
  sleep 5;
  done

in dom0 from /sys/bus/xen-backend/drivers/vif to continuously unbind,
unload, re-load, re-bind and re-plumb the backend.

Clearly a performance drop was seen but no TCP connection resets were
observed during this test and moreover a parallel SSH connection into the
guest remained perfectly usable throughout.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26 15:16:26 -08:00
Madalin Bucur
1c93fb4576 net: phy: aquantia: add suspend / resume ops for AQR105
The suspend/resume code for AQR107 works on AQR105 too.
This patch fixes issues with the partner not seeing the link down
when the interface using AQR105 is brought down.

Fixes: bee8259dd3 ("net: phy: add driver for aquantia phy")
Signed-off-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26 15:14:53 -08:00
Madalin Bucur
c27569fcd6 dpaa_eth: fix DMA mapping leak
On the error path some fragments remain DMA mapped. Adding a fix
that unmaps all the fragments. Rework cleanup path to be simpler.

Fixes: 8151ee88ba ("dpaa_eth: use page backed rx buffers")
Signed-off-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26 15:11:31 -08:00
David S. Miller
8d34799298 Merge branch 'RTL8211F-RGMII-RX-TX-delay-configuration-improvements'
Martin Blumenstingl says:

====================
RTL8211F: RGMII RX/TX delay configuration improvements

In discussion with Andrew [0] we figured out that it would be best to
make the RX delay of the RTL8211F PHY configurable (just like the TX
delay is already configurable).

While here I took the opportunity to add some logging to the TX delay
configuration as well.

There is no public documentation for the RX and TX delay registers.
I received this information a while ago (and created this RfC patch
back then: [1]). Realtek gave me permission to take the information
from the datasheet extracts and phase them in my own words and publish
that (I am not allowed to publish the datasheet extracts).

I have tested these patches on two boards:
- Amlogic Meson8b Odroid-C1
- Amlogic GXM Khadas VIM2
Both still behave as before these changes (iperf3 speeds are the same
in both directions: RX and TX), which is expected because they are
currently using phy-mode = "rgmii" with the RX delay not being generated
by the PHY.

[0] https://patchwork.ozlabs.org/patch/1215313/
[1] https://patchwork.ozlabs.org/patch/843946/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26 13:22:17 -08:00
Martin Blumenstingl
1b3047b520 net: phy: realtek: add support for configuring the RX delay on RTL8211F
On RTL8211F the RX and TX delays (2ns) can be configured in two ways:
- pin strapping (RXD1 for the TX delay and RXD0 for the RX delay, LOW
  means "off" and HIGH means "on") which is read during PHY reset
- using software to configure the TX and RX delay registers

So far only the configuration using pin strapping has been supported.
Add support for enabling or disabling the RGMII RX delay based on the
phy-mode to be able to get the RX delay into a known state. This is
important because the RX delay has to be coordinated between the PHY,
MAC and the PCB design (trace length). With an invalid RX delay applied
(for example if both PHY and MAC add a 2ns RX delay) Ethernet may not
work at all.

Also add debug logging when configuring the RX delay (just like the TX
delay) because this is a common source of problems.

Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26 13:22:17 -08:00
Martin Blumenstingl
3aec743d69 net: phy: realtek: add logging for the RGMII TX delay configuration
RGMII requires a delay of 2ns between the data and the clock signal.
There are at least three ways this can happen. One possibility is by
having the PHY generate this delay.
This is a common source for problems (for example with slow TX speeds or
packet loss when sending data). The TX delay configuration of the
RTL8211F PHY can be set either by pin-strappping the RXD1 pin (HIGH
means enabled, LOW means disabled) or through configuring a paged
register. The setting from the RXD1 pin is also reflected in the
register.

Add debug logging to the TX delay configuration on RTL8211F so it's
easier to spot these issues (for example if the TX delay is enabled for
both, the RTL8211F PHY and the MAC).
This is especially helpful because there is no public datasheet for the
RTL8211F PHY available with all the RX/TX delay specifics.

Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26 13:22:17 -08:00
David S. Miller
1f4f16fa19 Merge branch 'mlxsw-spectrum_router-Cleanups'
Ido Schimmel says:

====================
mlxsw: spectrum_router: Cleanups

This patch set removes from mlxsw code that is no longer necessary after
the simplification of the IPv4 and IPv6 route offload API.

The patches eliminate unnecessary code by taking advantage of the fact
that mlxsw no longer needs to maintain a list of identical routes,
following recent changes in route offload API.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26 13:13:22 -08:00
Ido Schimmel
7c4a7ec855 mlxsw: spectrum_router: Remove FIB entry list from FIB node
As explained in previous patches, the driver no longer needs to maintain
a list of identical FIB entries (i.e, same {tb_id, prefix, prefix
length}) and therefore each FIB node can only store one FIB entry.

Remove the FIB entry list and simplify the code.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26 13:13:22 -08:00
Ido Schimmel
b04720aee9 mlxsw: spectrum_router: Consolidate identical functions
After the last patch mlxsw_sp_fib{4,6}_node_entry_link() and
mlxsw_sp_fib{4,6}_node_entry_unlink() are identical and can therefore be
consolidated into the same common function.

Perform the consolidation.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26 13:13:22 -08:00
Ido Schimmel
0705297e51 mlxsw: spectrum_router: Make route creation and destruction symmetric
Host routes that perform decapsulation of IP in IP tunnels have a
special adjacency entry linked to them. This entry stores information
such as the expected underlay source IP. When the route is deleted this
entry needs to be freed.

The allocation of the adjacency entry happens in
mlxsw_sp_fib4_entry_type_set(), but it is freed in
mlxsw_sp_fib4_node_entry_unlink().

Create a new function - mlxsw_sp_fib4_entry_type_unset() - and free the
adjacency entry there.

This will allow us to consolidate mlxsw_sp_fib{4,6}_node_entry_unlink()
in the next patch.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26 13:13:22 -08:00
Ido Schimmel
0d2fb5aa93 mlxsw: spectrum_router: Eliminate dead code
Since the driver no longer maintains a list of identical routes there is
no route to promote when a route is deleted.

Remove that code that took care of it.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26 13:13:22 -08:00
Ido Schimmel
231c8d2bbc mlxsw: spectrum_router: Remove unnecessary checks
Now that the networking stack takes care of only notifying the routes of
interest, we do not need to maintain a list of identical routes.

Remove the check that tests if the route is the first route in the FIB
node.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26 13:13:22 -08:00
David S. Miller
ec34c01575 Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Fix endianness issue in flowtable TCP flags dissector,
   from Arnd Bergmann.

2) Extend flowtable test script with dnat rules, from Florian Westphal.

3) Reject padding in ebtables user entries and validate computed user
   offset, reported by syzbot, from Florian Westphal.

4) Fix endianness in nft_tproxy, from Phil Sutter.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26 13:11:40 -08:00
Andy Roulin
c1e4699026 bonding: rename AD_STATE_* to LACP_STATE_*
As the LACP actor/partner state is now part of the uapi, rename the
3ad state defines with LACP prefix. The LACP prefix is preferred over
BOND_3AD as the LACP standard moved to 802.1AX.

Fixes: 826f66b30c ("bonding: move 802.3ad port state flags to uapi")
Signed-off-by: Andy Roulin <aroulin@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26 13:09:37 -08:00
Kevin Kou
f643ee295c sctp: move trace_sctp_probe_path into sctp_outq_sack
The original patch bringed in the "SCTP ACK tracking trace event"
feature was committed at Dec.20, 2017, it replaced jprobe usage
with trace events, and bringed in two trace events, one is
TRACE_EVENT(sctp_probe), another one is TRACE_EVENT(sctp_probe_path).
The original patch intended to trigger the trace_sctp_probe_path in
TRACE_EVENT(sctp_probe) as below code,

+TRACE_EVENT(sctp_probe,
+
+	TP_PROTO(const struct sctp_endpoint *ep,
+		 const struct sctp_association *asoc,
+		 struct sctp_chunk *chunk),
+
+	TP_ARGS(ep, asoc, chunk),
+
+	TP_STRUCT__entry(
+		__field(__u64, asoc)
+		__field(__u32, mark)
+		__field(__u16, bind_port)
+		__field(__u16, peer_port)
+		__field(__u32, pathmtu)
+		__field(__u32, rwnd)
+		__field(__u16, unack_data)
+	),
+
+	TP_fast_assign(
+		struct sk_buff *skb = chunk->skb;
+
+		__entry->asoc = (unsigned long)asoc;
+		__entry->mark = skb->mark;
+		__entry->bind_port = ep->base.bind_addr.port;
+		__entry->peer_port = asoc->peer.port;
+		__entry->pathmtu = asoc->pathmtu;
+		__entry->rwnd = asoc->peer.rwnd;
+		__entry->unack_data = asoc->unack_data;
+
+		if (trace_sctp_probe_path_enabled()) {
+			struct sctp_transport *sp;
+
+			list_for_each_entry(sp, &asoc->peer.transport_addr_list,
+					    transports) {
+				trace_sctp_probe_path(sp, asoc);
+			}
+		}
+	),

But I found it did not work when I did testing, and trace_sctp_probe_path
had no output, I finally found that there is trace buffer lock
operation(trace_event_buffer_reserve) in include/trace/trace_events.h:

static notrace void							\
trace_event_raw_event_##call(void *__data, proto)			\
{									\
	struct trace_event_file *trace_file = __data;			\
	struct trace_event_data_offsets_##call __maybe_unused __data_offsets;\
	struct trace_event_buffer fbuffer;				\
	struct trace_event_raw_##call *entry;				\
	int __data_size;						\
									\
	if (trace_trigger_soft_disabled(trace_file))			\
		return;							\
									\
	__data_size = trace_event_get_offsets_##call(&__data_offsets, args); \
									\
	entry = trace_event_buffer_reserve(&fbuffer, trace_file,	\
				 sizeof(*entry) + __data_size);		\
									\
	if (!entry)							\
		return;							\
									\
	tstruct								\
									\
	{ assign; }							\
									\
	trace_event_buffer_commit(&fbuffer);				\
}

The reason caused no output of trace_sctp_probe_path is that
trace_sctp_probe_path written in TP_fast_assign part of
TRACE_EVENT(sctp_probe), and it will be placed( { assign; } ) after the
trace_event_buffer_reserve() when compiler expands Macro,

        entry = trace_event_buffer_reserve(&fbuffer, trace_file,        \
                                 sizeof(*entry) + __data_size);         \
                                                                        \
        if (!entry)                                                     \
                return;                                                 \
                                                                        \
        tstruct                                                         \
                                                                        \
        { assign; }                                                     \

so trace_sctp_probe_path finally can not acquire trace_event_buffer
and return no output, that is to say the nest of tracepoint entry function
is not allowed. The function call flow is:

trace_sctp_probe()
-> trace_event_raw_event_sctp_probe()
 -> lock buffer
 -> trace_sctp_probe_path()
   -> trace_event_raw_event_sctp_probe_path()  --nested
   -> buffer has been locked and return no output.

This patch is to remove trace_sctp_probe_path from the TP_fast_assign
part of TRACE_EVENT(sctp_probe) to avoid the nest of entry function,
and trigger sctp_probe_path_trace in sctp_outq_sack.

After this patch, you can enable both events individually,
  # cd /sys/kernel/debug/tracing
  # echo 1 > events/sctp/sctp_probe/enable
  # echo 1 > events/sctp/sctp_probe_path/enable

Or, you can enable all the events under sctp.

  # echo 1 > events/sctp/enable

Signed-off-by: Kevin Kou <qdkevin.kou@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26 13:06:45 -08:00
Vladyslav Tarasiuk
a5bcd72e05 net/mlxfw: Fix out-of-memory error in mfa2 flash burning
The burning process requires to perform internal allocations of large
chunks of memory. This memory doesn't need to be contiguous and can be
safely allocated by vzalloc() instead of kzalloc(). This patch changes
such allocation to avoid possible out-of-memory failure.

Fixes: 410ed13cae ("Add the mlxfw module for Mellanox firmware flash process")
Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
Reviewed-by: Aya Levin <ayal@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-26 11:48:39 -08:00
Florian Westphal
c14ceb0ec7 netfilter: nft_meta: add support for slave device ifindex matching
Allow to match on vrf slave ifindex or name.

In case there was no slave interface involved, store 0 in the
destination register just like existing iif/oif matching.

sdif(name) is restricted to the ipv4/ipv6 input and forward hooks,
as it depends on ip(6) stack parsing/storing info in skb->cb[].

Cc: Martin Willi <martin@strongswan.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: Shrijeet Mukherjee <shrijeet@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-12-26 17:41:34 +01:00
Florian Westphal
01a0fc8225 netfilter: nft_meta: place rtclassid handling in a helper
skb_dst is an inline helper with a WARN_ON(), so this is a bit more code
than it looks like.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-12-26 17:41:34 +01:00
Florian Westphal
6b2faee0ca netfilter: nft_meta: place prandom handling in a helper
Move this out of the main eval loop, the numgen expression
provides a better alternative to meta random.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-12-26 17:41:34 +01:00
Florian Westphal
8724e819cc netfilter: nft_meta: move all interface related keys to helper
Reduces repetiveness and reduces size of meta eval function.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-12-26 17:41:34 +01:00
Florian Westphal
a4150a1faa netfilter: nft_meta: move interface kind handling to helper
checkpatch complains about == NULL checks in original code,
so use !in instead.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-12-26 17:41:34 +01:00
Florian Westphal
b1327fbc29 netfilter: nft_meta: move cgroup handling to helper
Reduce size of main eval function.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-12-26 17:41:34 +01:00
Florian Westphal
726b44f044 netfilter: nft_meta: move sk uid/git handling to helper
Not a hot path.  Also, both have copy&paste case statements,
so use a common helper for both.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-12-26 17:41:34 +01:00
Florian Westphal
4a54594abd netfilter: nft_meta: move pkttype handling to helper
When pkttype is loopback, nft_meta performs guesswork to detect
broad/multicast packets. Place this in a helper, this is hardly a hot path.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-12-26 17:41:33 +01:00
Florian Westphal
db8f6f5c8d netfilter: nft_meta: move time handling to helper
reduce size of the (large) meta evaluation function.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2019-12-26 17:41:33 +01:00
Hechao Li
1162f84403 bpf: Print error message for bpftool cgroup show
Currently, when bpftool cgroup show <path> has an error, no error
message is printed. This is confusing because the user may think the
result is empty.

Before the change:

$ bpftool cgroup show /sys/fs/cgroup
ID       AttachType      AttachFlags     Name
$ echo $?
255

After the change:
$ ./bpftool cgroup show /sys/fs/cgroup
Error: can't query bpf programs attached to /sys/fs/cgroup: Operation
not permitted

v2: Rename check_query_cgroup_progs to cgroup_has_attached_progs

Signed-off-by: Hechao Li <hechaol@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191224011742.3714301-1-hechaol@fb.com
2019-12-26 10:35:54 +01:00
Andrii Nakryiko
8ab9da573d libbpf: Support CO-RE relocations for LDX/ST/STX instructions
Clang patch [0] enables emitting relocatable generic ALU/ALU64 instructions
(i.e, shifts and arithmetic operations), as well as generic load/store
instructions. The former ones are already supported by libbpf as is. This
patch adds further support for load/store instructions. Relocatable field
offset is encoded in BPF instruction's 16-bit offset section and are adjusted
by libbpf based on target kernel BTF.

These Clang changes and corresponding libbpf changes allow for more succinct
generated BPF code by encoding relocatable field reads as a single
ST/LDX/STX instruction. It also enables relocatable access to BPF context.
Previously, if context struct (e.g., __sk_buff) was accessed with CO-RE
relocations (e.g., due to preserve_access_index attribute), it would be
rejected by BPF verifier due to modified context pointer dereference. With
Clang patch, such context accesses are both relocatable and have a fixed
offset from the point of view of BPF verifier.

  [0] https://reviews.llvm.org/D71790

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20191223180305.86417-1-andriin@fb.com
2019-12-26 10:31:44 +01:00
David S. Miller
aea3dee86c Merge branch 'Peer-to-Peer-One-Step-time-stamping'
Richard Cochran says:

====================
Peer to Peer One-Step time stamping

This series adds support for PTP (IEEE 1588) P2P one-step time
stamping along with a driver for a hardware device that supports this.

If the hardware supports p2p one-step, it subtracts the ingress time
stamp value from the Pdelay_Request correction field.  The user space
software stack then simply copies the correction field into the
Pdelay_Response, and on transmission the hardware adds the egress time
stamp into the correction field.

This new functionality extends CONFIG_NETWORK_PHY_TIMESTAMPING to
cover MII snooping devices, but it still depends on phylib, just as
that option does.  Expanding beyond phylib is not within the scope of
the this series.

User space support is available in the current linuxptp master branch.

- Patch 1 adds phy_device methods for existing time stamping fields.
- Patches 2-5 convert the stack and drivers to the new methods.
- Patch 6 moves code around the dp83640 driver.
- Patches 7-10 add support for MII time stamping in non-PHY devices.
- Patch 11 adds the new P2P 1-step option.
- Patch 12 adds a driver implementing the new option.

Thanks,
Richard

Changed in v9:
~~~~~~~~~~~~~~

- Fix two more drivers' switch/case blocks WRT the new HWTSTAMP ioctl.
- Picked up two more review tags from Andrew.

Changed in v8:
~~~~~~~~~~~~~~

- Avoided adding forward functional declarations in the dp83640 driver.
- Picked up Florian's new review tags and another one from Andrew.

Changed in v7:
~~~~~~~~~~~~~~

- Converted pr_debug|err to dev_ variants in new driver.
- Fixed device tree documentation per Rob's v6 review.
- Picked up Andrew's and Rob's review tags.
- Silenced sparse warnings in new driver.

Changed in v6:
~~~~~~~~~~~~~~

- Added methods for accessing the phy_device time stamping fields.
- Adjust the device tree documentation per Rob's v5 review.
- Fixed the build failures due to missing exports.

Changed in v5:
~~~~~~~~~~~~~~

- Fixed build failure in macvlan.
- Fixed latent bug with its gcc warning in the driver.

Changed in v4:
~~~~~~~~~~~~~~

- Correct error paths and PTR_ERR return values in the framework.
- Expanded KernelDoc comments WRT PHY locking.
- Pick up Andrew's review tag.

Changed in v3:
~~~~~~~~~~~~~~

- Simplify the device tree binding and document the time stamping
  phandle by itself.

Changed in v2:
~~~~~~~~~~~~~~

- Per the v1 review, changed the modeling of MII time stamping
  devices.  They are no longer a kind of mdio device.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25 19:51:34 -08:00
Richard Cochran
bad1eaa6ac ptp: Add a driver for InES time stamping IP core.
The InES at the ZHAW offers a PTP time stamping IP core.  The FPGA
logic recognizes and time stamps PTP frames on the MII bus.  This
patch adds a driver for the core along with a device tree binding to
allow hooking the driver to MII buses.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25 19:51:34 -08:00
Richard Cochran
b6fd7b9636 net: Introduce peer to peer one step PTP time stamping.
The 1588 standard defines one step operation for both Sync and
PDelay_Resp messages.  Up until now, hardware with P2P one step has
been rare, and kernel support was lacking.  This patch adds support of
the mode in anticipation of new hardware developments.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25 19:51:34 -08:00
Richard Cochran
1dca22b184 net: mdio: of: Register discovered MII time stampers.
When parsing a PHY node, register its time stamper, if any, and attach
the instance to the PHY device.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25 19:51:33 -08:00
Richard Cochran
25d12e1dde dt-bindings: ptp: Introduce MII time stamping devices.
This patch add a new binding that allows non-PHY MII time stamping
devices to find their buses.  The new documentation covers both the
generic binding and one upcoming user.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25 19:51:33 -08:00
Richard Cochran
767ff48373 net: Add a layer for non-PHY MII time stamping drivers.
While PHY time stamping drivers can simply attach their interface
directly to the PHY instance, stand alone drivers require support in
order to manage their services.  Non-PHY MII time stamping drivers
have a control interface over another bus like I2C, SPI, UART, or via
a memory mapped peripheral.  The controller device will be associated
with one or more time stamping channels, each of which sits snoops in
on a MII bus.

This patch provides a glue layer that will enable time stamping
channels to find their controlling device.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25 19:51:33 -08:00
Richard Cochran
4715f65ffa net: Introduce a new MII time stamping interface.
Currently the stack supports time stamping in PHY devices.  However,
there are newer, non-PHY devices that can snoop an MII bus and provide
time stamps.  In order to support such devices, this patch introduces
a new interface to be used by both PHY and non-PHY devices.

In addition, the one and only user of the old PHY time stamping API is
converted to the new interface.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25 19:51:33 -08:00
Richard Cochran
12d0efb9e6 net: phy: dp83640: Move the probe and remove methods around.
An upcoming patch will change how the PHY time stamping functions are
registered with the networking stack, and adapting this driver would
entail adding forward declarations for four time stamping methods.
However, forward declarations are considered to be stylistic defects.
This patch avoids the issue by moving the probe and remove methods
immediately above the phy_driver interface structure.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25 19:51:33 -08:00
Richard Cochran
bfd57b5900 net: netcp_ethss: Use the PHY time stamping interface.
The netcp_ethss driver tests fields of the phy_device in order to
determine whether to defer to the PHY's time stamping functionality.
This patch replaces the open coded logic with an invocation of the
proper methods.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25 19:51:33 -08:00
Richard Cochran
7774ee2368 net: ethtool: Use the PHY time stamping interface.
The ethtool layer tests fields of the phy_device in order to determine
whether to invoke the PHY's tsinfo ethtool callback.  This patch
replaces the open coded logic with an invocation of the proper
methods.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25 19:51:33 -08:00
Richard Cochran
dfe6d68fc4 net: vlan: Use the PHY time stamping interface.
The vlan layer tests fields of the phy_device in order to determine
whether to invoke the PHY's tsinfo ethtool callback.  This patch
replaces the open coded logic with an invocation of the proper
methods.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25 19:51:33 -08:00
Richard Cochran
d25de984aa net: macvlan: Use the PHY time stamping interface.
The macvlan layer tests fields of the phy_device in order to determine
whether to invoke the PHY's tsinfo ethtool callback.  This patch
replaces the open coded logic with an invocation of the proper
methods.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25 19:51:33 -08:00
Richard Cochran
0e5dafc8a6 net: phy: Introduce helper functions for time stamping support.
Some parts of the networking stack and at least one driver test fields
within the 'struct phy_device' in order to query time stamping
capabilities and to invoke time stamping methods.  This patch adds a
functional interface around the time stamping fields.  This will allow
insulating the callers from future changes to the details of the time
stamping implemenation.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25 19:51:33 -08:00
Florian Fainelli
bf0e5013bc ata: ahci_brcm: Add missing clock management during recovery
The downstream implementation of ahci_brcm.c did contain clock
management recovery, but until recently, did that outside of the
libahci_platform helpers and this was unintentionally stripped out while
forward porting the patch upstream.

Add the missing clock management during recovery and sleep for 10
milliseconds per the design team recommendations to ensure the SATA PHY
controller and AFE have been fully quiesced.

Fixes: eb73390ae2 ("ata: ahci_brcm: Recover from failures to identify devices")
Cc: stable@vger.kernel.org
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-12-25 20:47:24 -07:00
Florian Fainelli
1a3d78cb6e ata: ahci_brcm: BCM7425 AHCI requires AHCI_HFLAG_DELAY_ENGINE
Set AHCI_HFLAG_DELAY_ENGINE for the BCM7425 AHCI controller thus making
it conforming to the 'strict' AHCI implementation which this controller
is based on.

This solves long link establishment with specific hard drives (e.g.:
Seagate ST1000VM002-9ZL1 SC12) that would otherwise have to complete the
error recovery handling before finally establishing a succesful SATA
link at the desired speed.

We re-order the hpriv->flags assignment to also remove the NONCQ quirk
since we can set the flag directly.

Fixes: 9586114cf1e9 ("ata: ahci_brcmstb: add support MIPS-based platforms")
Fixes: 423be77daabe ("ata: ahci_brcmstb: add quirk for broken ncq")
Cc: stable@vger.kernel.org
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-12-25 20:47:23 -07:00
Florian Fainelli
c0cdf2ac4b ata: ahci_brcm: Fix AHCI resources management
The AHCI resources management within ahci_brcm.c is a little
convoluted, largely because it historically had a dedicated clock that
was managed within this file in the downstream tree. Once brough
upstream though, the clock was left to be managed by libahci_platform.c
which is entirely appropriate.

This patch series ensures that the AHCI resources are fetched and
enabled before any register access is done, thus avoiding bus errors on
platforms which clock gate the controller by default.

As a result we need to re-arrange the suspend() and resume() functions
in order to avoid accessing registers after the clocks have been turned
off respectively before the clocks have been turned on. Finally, we can
refactor brcm_ahci_get_portmask() in order to fetch the number of ports
from hpriv->mmio which is now accessible without jumping through hoops
like we used to do.

The commit pointed in the Fixes tag is both old and new enough not to
require major headaches for backporting of this patch.

Fixes: eba68f8297 ("ata: ahci_brcmstb: rename to support across Broadcom SoC's")
Cc: stable@vger.kernel.org
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-12-25 20:47:21 -07:00
Florian Fainelli
84b032dbfd ata: libahci_platform: Export again ahci_platform_<en/dis>able_phys()
This reverts commit 6bb86fefa0
("libahci_platform: Staticize ahci_platform_<en/dis>able_phys()") we are
going to need ahci_platform_{enable,disable}_phys() in a subsequent
commit for ahci_brcm.c in order to properly control the PHY
initialization order.

Also make sure the function prototypes are declared in
include/linux/ahci_platform.h as a result.

Cc: stable@vger.kernel.org
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-12-25 20:47:19 -07:00
David S. Miller
095e90e080 Merge branch 'hsr-fix-several-bugs-in-hsr-module'
Taehee Yoo says:

====================
hsr: fix several bugs in hsr module

1. The first patch fixes debugfs warning when it's opened when hsr module
is being removed. debugfs file is opened, it tries to hold .owner module,
but it would print warning messages if it couldn't hold .owner module.
In order to avoid the warning message, this patch makes hsr module does
not set .owner. Unsetting .owner is safe because these are protected by
inode_lock().

2. The second patch fixes wrong error handling of hsr_dev_finalize()
a) hsr_dev_finalize() calls debugfs_create_{dir/file} to create debugfs.
it checks NULL pointer but debugfs don't return NULL so it's wrong code.
b) hsr_dev_finalize() calls register_netdevice(). so if it fails after
register_netdevice(), it should call unregister_netdevice().
But it doesn't.
c) debugfs doesn't affect any actual logic of hsr module.
So, the failure of creating of debugfs could be ignored.

3. The third patch adds hsr root debugfs directory.
When hsr interface is created, it creates debugfs directory in
/sys/kernel/debug/<interface name>.
It's a little bit faulty path because if an interface is the same with
another directory name in the same path, it will fail. If hsr root
directory is existing, the possibility of failure of creating debugfs
file will be reduced.

4. The fourth patch adds debugfs rename routine.
debugfs directory name is the same with hsr interface name.
So hsr interface name is changed, debugfs directory name should be
changed too.

5. The fifth patch fixes a race condition in node list add and del.
hsr nodes are protected by RCU and there is no write side lock.
But node insertions and deletions could be being operated concurrently.
So write side locking is needed.

6. The Sixth patch resets network header
Tap routine is enabled, below message will be printed.

[  175.852292][    C3] protocol 88fb is buggy, dev veth0

hsr module doesn't set network header for supervision frame.
But tap routine validates network header.
If network header wasn't set, it resets and warns about it.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25 16:35:35 -08:00
Taehee Yoo
3ed0a1d563 hsr: reset network header when supervision frame is created
The supervision frame is L2 frame.
When supervision frame is created, hsr module doesn't set network header.
If tap routine is enabled, dev_queue_xmit_nit() is called and it checks
network_header. If network_header pointer wasn't set(or invalid),
it resets network_header and warns.
In order to avoid unnecessary warning message, resetting network_header
is needed.

Test commands:
    ip netns add nst
    ip link add veth0 type veth peer name veth1
    ip link add veth2 type veth peer name veth3
    ip link set veth1 netns nst
    ip link set veth3 netns nst
    ip link set veth0 up
    ip link set veth2 up
    ip link add hsr0 type hsr slave1 veth0 slave2 veth2
    ip a a 192.168.100.1/24 dev hsr0
    ip link set hsr0 up
    ip netns exec nst ip link set veth1 up
    ip netns exec nst ip link set veth3 up
    ip netns exec nst ip link add hsr1 type hsr slave1 veth1 slave2 veth3
    ip netns exec nst ip a a 192.168.100.2/24 dev hsr1
    ip netns exec nst ip link set hsr1 up
    tcpdump -nei veth0

Splat looks like:
[  175.852292][    C3] protocol 88fb is buggy, dev veth0

Fixes: f421436a59 ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25 16:35:35 -08:00
Taehee Yoo
92a35678ec hsr: fix a race condition in node list insertion and deletion
hsr nodes are protected by RCU and there is no write side lock.
But node insertions and deletions could be being operated concurrently.
So write side locking is needed.

Test commands:
    ip netns add nst
    ip link add veth0 type veth peer name veth1
    ip link add veth2 type veth peer name veth3
    ip link set veth1 netns nst
    ip link set veth3 netns nst
    ip link set veth0 up
    ip link set veth2 up
    ip link add hsr0 type hsr slave1 veth0 slave2 veth2
    ip a a 192.168.100.1/24 dev hsr0
    ip link set hsr0 up
    ip netns exec nst ip link set veth1 up
    ip netns exec nst ip link set veth3 up
    ip netns exec nst ip link add hsr1 type hsr slave1 veth1 slave2 veth3
    ip netns exec nst ip a a 192.168.100.2/24 dev hsr1
    ip netns exec nst ip link set hsr1 up

    for i in {0..9}
    do
        for j in {0..9}
	do
	    for k in {0..9}
	    do
	        for l in {0..9}
		do
	        arping 192.168.100.2 -I hsr0 -s 00:01:3$i:4$j:5$k:6$l -c1 &
		done
	    done
	done
    done

Splat looks like:
[  236.066091][ T3286] list_add corruption. next->prev should be prev (ffff8880a5940300), but was ffff8880a5940d0.
[  236.069617][ T3286] ------------[ cut here ]------------
[  236.070545][ T3286] kernel BUG at lib/list_debug.c:25!
[  236.071391][ T3286] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[  236.072343][ T3286] CPU: 0 PID: 3286 Comm: arping Tainted: G        W         5.5.0-rc1+ #209
[  236.073463][ T3286] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  236.074695][ T3286] RIP: 0010:__list_add_valid+0x74/0xd0
[  236.075499][ T3286] Code: 48 39 da 75 27 48 39 f5 74 36 48 39 dd 74 31 48 83 c4 08 b8 01 00 00 00 5b 5d c3 48 b
[  236.078277][ T3286] RSP: 0018:ffff8880aaa97648 EFLAGS: 00010286
[  236.086991][ T3286] RAX: 0000000000000075 RBX: ffff8880d4624c20 RCX: 0000000000000000
[  236.088000][ T3286] RDX: 0000000000000075 RSI: 0000000000000008 RDI: ffffed1015552ebf
[  236.098897][ T3286] RBP: ffff88809b53d200 R08: ffffed101b3c04f9 R09: ffffed101b3c04f9
[  236.099960][ T3286] R10: 00000000308769a1 R11: ffffed101b3c04f8 R12: ffff8880d4624c28
[  236.100974][ T3286] R13: ffff8880d4624c20 R14: 0000000040310100 R15: ffff8880ce17ee02
[  236.138967][ T3286] FS:  00007f23479fa680(0000) GS:ffff8880d9c00000(0000) knlGS:0000000000000000
[  236.144852][ T3286] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  236.145720][ T3286] CR2: 00007f4a14bab210 CR3: 00000000a61c6001 CR4: 00000000000606f0
[  236.146776][ T3286] Call Trace:
[  236.147222][ T3286]  hsr_add_node+0x314/0x490 [hsr]
[  236.153633][ T3286]  hsr_forward_skb+0x2b6/0x1bc0 [hsr]
[  236.154362][ T3286]  ? rcu_read_lock_sched_held+0x90/0xc0
[  236.155091][ T3286]  ? rcu_read_lock_bh_held+0xa0/0xa0
[  236.156607][ T3286]  hsr_dev_xmit+0x70/0xd0 [hsr]
[  236.157254][ T3286]  dev_hard_start_xmit+0x160/0x740
[  236.157941][ T3286]  __dev_queue_xmit+0x1961/0x2e10
[  236.158565][ T3286]  ? netdev_core_pick_tx+0x2e0/0x2e0
[ ... ]

Reported-by: syzbot+3924327f9ad5f4d2b343@syzkaller.appspotmail.com
Fixes: f421436a59 ("net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0)")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25 16:35:35 -08:00
Taehee Yoo
4c2d5e33dc hsr: rename debugfs file when interface name is changed
hsr interface has own debugfs file, which name is same with interface name.
So, interface name is changed, debugfs file name should be changed too.

Fixes: fc4ecaeebd ("net: hsr: add debugfs support for display node list")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25 16:35:35 -08:00
Taehee Yoo
c6c4ccd7f9 hsr: add hsr root debugfs directory
In current hsr code, when hsr interface is created, it creates debugfs
directory /sys/kernel/debug/<interface name>.
If there is same directory or file name in there, it fails.
In order to reduce possibility of failure of creation of debugfs,
this patch adds root directory.

Test commands:
    ip link add dummy0 type dummy
    ip link add dummy1 type dummy
    ip link add hsr0 type hsr slave1 dummy0 slave2 dummy1

Before this patch:
    /sys/kernel/debug/hsr0/node_table

After this patch:
    /sys/kernel/debug/hsr/hsr0/node_table

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25 16:35:35 -08:00
Taehee Yoo
1d19e2d53e hsr: fix error handling routine in hsr_dev_finalize()
hsr_dev_finalize() is called to create new hsr interface.
There are some wrong error handling codes.

1. wrong checking return value of debugfs_create_{dir/file}.
These function doesn't return NULL. If error occurs in there,
it returns error pointer.
So, it should check error pointer instead of NULL.

2. It doesn't unregister interface if it fails to setup hsr interface.
If it fails to initialize hsr interface after register_netdevice(),
it should call unregister_netdevice().

3. Ignore failure of creation of debugfs
If creating of debugfs dir and file is failed, creating hsr interface
will be failed. But debugfs doesn't affect actual logic of hsr module.
So, ignoring this is more correct and this behavior is more general.

Fixes: c5a7591172 ("net/hsr: Use list_head (and rcu) instead of array for slave devices.")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-25 16:35:35 -08:00