Commit Graph

1045293 Commits

Author SHA1 Message Date
Alvin Šipraga
1521d5adfc net: dsa: tag_rtl8_4: add realtek 8 byte protocol 4 tag
This commit implements a basic version of the 8 byte tag protocol used
in the Realtek RTL8365MB-VC unmanaged switch, which carries with it a
protocol version of 0x04.

The implementation itself only handles the parsing of the EtherType
value and Realtek protocol version, together with the source or
destination port fields. The rest is left unimplemented for now.

The tag format is described in a confidential document provided to my
company by Realtek Semiconductor Corp. Permission has been granted by
the vendor to publish this driver based on that material, together with
an extract from the document describing the tag format and its fields.
It is hoped that this will help future implementors who do not have
access to the material but who wish to extend the functionality of
drivers for chips which use this protocol.

In addition, two possible values of the REASON field are specified,
based on experiments on my end. Realtek does not specify what value this
field can take.

Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 14:02:56 +01:00
Alvin Šipraga
2e405875f3 dt-bindings: net: dsa: realtek-smi: document new compatible rtl8365mb
rtl8365mb is a new realtek-smi subdriver for the RTL8365MB-VC 4+1 port
10/100/1000M Ethernet switch controller. Its compatible string is
"realtek,rtl8365mb".

Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Rob Herring <robh@kernel.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 14:02:56 +01:00
Alvin Šipraga
9cb8edda21 net: dsa: move NET_DSA_TAG_RTL4_A to right place in Kconfig/Makefile
Move things around a little so that this tag driver is alphabetically
ordered. The Kconfig file is sorted based on the tristate text.

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 14:02:55 +01:00
Alvin Šipraga
487d3855b6 net: dsa: allow reporting of standard ethtool stats for slave devices
Jakub pointed out that we have a new ethtool API for reporting device
statistics in a standardized way, via .get_eth_{phy,mac,ctrl}_stats.
Add a small amount of plumbing to allow DSA drivers to take advantage of
this when exposing statistics.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 14:02:55 +01:00
Alvin Šipraga
7bbbbfaa7a ether: add EtherType for proprietary Realtek protocols
Add a new EtherType ETH_P_REALTEK to the if_ether.h uapi header. The
EtherType 0x8899 is used in a number of different protocols from Realtek
Semiconductor Corp [1], so no general assumptions should be made when
trying to decode such packets. Observed protocols include:

  0x1 - Realtek Remote Control protocol [2]
  0x2 - Echo protocol [2]
  0x3 - Loop detection protocol [2]
  0x4 - RTL8365MB 4- and 8-byte switch CPU tag protocols [3]
  0x9 - RTL8306 switch CPU tag protocol [4]
  0xA - RTL8366RB switch CPU tag protocol [4]

[1] https://lore.kernel.org/netdev/CACRpkdYQthFgjwVzHyK3DeYUOdcYyWmdjDPG=Rf9B3VrJ12Rzg@mail.gmail.com/
[2] https://www.wireshark.org/lists/ethereal-dev/200409/msg00090.html
[3] https://lore.kernel.org/netdev/20210822193145.1312668-4-alvin@pqrs.dk/
[4] https://lore.kernel.org/netdev/20200708122537.1341307-2-linus.walleij@linaro.org/

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 14:02:55 +01:00
Jakub Kicinski
4abd7cffc0 ethernet: use eth_hw_addr_set() in unmaintained drivers
Commit 406f42fa0d ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it got through appropriate helpers.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 13:20:38 +01:00
Arnd Bergmann
0e9e7598c6 octeontx2-nic: fix mixed module build
Building the VF and PF side of this driver differently, with one being
a loadable module and the other one built-in results in a link failure
for the common PTP driver:

ld.lld: error: undefined symbol: __this_module
>>> referenced by otx2_ptp.c
>>>               net/ethernet/marvell/octeontx2/nic/otx2_ptp.o:(otx2_ptp_init) in archive drivers/built-in.a
>>> referenced by otx2_ptp.c
>>>               net/ethernet/marvell/octeontx2/nic/otx2_ptp.o:(otx2_ptp_init) in archive drivers/built-in.a

Move the otx2_ptp.c code into a separate module that gets built for
both configurations, making it built-in if at least one of the other
two is built-in.

Fixes: 43510ef4dd ("octeontx2-nicvf: Add PTP hardware clock support to NIX VF")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 13:14:00 +01:00
David S. Miller
91e1aef746 Merge branch 'uniphier-nx1'
Kunihiko Hayashi says:

====================
net: ethernet: ave: Introduce UniPhier NX1 SoC support

This series includes the patches to add basic support for new UniPhier NX1
SoC. NX1 SoC also has the same kinds of controls as the other UniPhier
SoCs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 13:10:08 +01:00
Kunihiko Hayashi
9fd3d5dced net: ethernet: ave: Add compatible string and SoC-dependent data for NX1 SoC
Add basic support for UniPhier NX1 SoC. This includes a compatible string
and SoC-dependent data.

Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 13:10:08 +01:00
Kunihiko Hayashi
8e60189d93 dt-bindings: net: ave: Add bindings for NX1 SoC
Update AVE binding document for UniPhier NX1 SoC.

Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 13:10:08 +01:00
Uwe Kleine-König
d40dfa0ceb net: w5100: Make w5100_remove() return void
Up to now w5100_remove() returns zero unconditionally. Make it return
void instead which makes it easier to see in the callers that there is
no error to handle.

Also the return value of platform and spi remove callbacks is ignored
anyway.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 12:59:12 +01:00
Uwe Kleine-König
2841bfd10a net: ks8851: Make ks8851_remove_common() return void
Up to now ks8851_remove_common() returns zero unconditionally. Make it
return void instead which makes it easier to see in the callers that
there is no error to handle.

Also the return value of platform and spi remove callbacks is ignored
anyway.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 12:59:12 +01:00
David S. Miller
f8ba22a142 Merge branch 'remove-qdisc-running-counter'
Sebastian Andrzej Siewior says:

====================
Try to simplify the gnet_stats and remove qdisc->running sequence counter.

The first few patches is a follow up to
    https://lore.kernel.org/all/20211007175000.2334713-1-bigeasy@linutronix.de/

The remaining patches (#5+) remove the seqcount_t (Qdisc::running) from
the Qdisc. The statistics (Qdisc::bstats and Qdisc::cpu_bstats) use
u64_stats_t and the "running state" is now represented by a bit in
Qdisc::state.

By removing the seqcount_t from Qdisc and decoupling the bstats
statistics from the seqcount_t it is possible to query the statistics
even if the Qdisc is running instead of waiting until it is idle again.

The try-lock like usage of the seqcount_t in qdisc_run_begin() is
problematic on PREEMPT_RT. Inside the qdisc_run_begin/end() qdisc->running
sequence counter write sections, at sch_direct_xmit(), the seqcount write
serialization lock is released then re-acquired. This is fine for !RT, because
the writer is in a BH disabled region and there is a no in-IRQ reader. For RT
though, BH sections are preemptible. The earlier introduced seqcount_LOCKNAME_t
mechanism, which for RT the reader acquires then relesaes the write
serailization lock to avoid infinite spinning if it preempts a seqcount write
section, cannot work: the qdisc->running write serialization lock is already
intermittingly released inside the seqcount write section.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 12:54:41 +01:00
Ahmed S. Darwish
29cbcd8582 net: sched: Remove Qdisc::running sequence counter
The Qdisc::running sequence counter has two uses:

  1. Reliably reading qdisc's tc statistics while the qdisc is running
     (a seqcount read/retry loop at gnet_stats_add_basic()).

  2. As a flag, indicating whether the qdisc in question is running
     (without any retry loops).

For the first usage, the Qdisc::running sequence counter write section,
qdisc_run_begin() => qdisc_run_end(), covers a much wider area than what
is actually needed: the raw qdisc's bstats update. A u64_stats sync
point was thus introduced (in previous commits) inside the bstats
structure itself. A local u64_stats write section is then started and
stopped for the bstats updates.

Use that u64_stats sync point mechanism for the bstats read/retry loop
at gnet_stats_add_basic().

For the second qdisc->running usage, a __QDISC_STATE_RUNNING bit flag,
accessed with atomic bitops, is sufficient. Using a bit flag instead of
a sequence counter at qdisc_run_begin/end() and qdisc_is_running() leads
to the SMP barriers implicitly added through raw_read_seqcount() and
write_seqcount_begin/end() getting removed. All call sites have been
surveyed though, and no required ordering was identified.

Now that the qdisc->running sequence counter is no longer used, remove
it.

Note, using u64_stats implies no sequence counter protection for 64-bit
architectures. This can lead to the qdisc tc statistics "packets" vs.
"bytes" values getting out of sync on rare occasions. The individual
values will still be valid.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 12:54:41 +01:00
Ahmed S. Darwish
50dc9a8572 net: sched: Merge Qdisc::bstats and Qdisc::cpu_bstats data types
The only factor differentiating per-CPU bstats data type (struct
gnet_stats_basic_cpu) from the packed non-per-CPU one (struct
gnet_stats_basic_packed) was a u64_stats sync point inside the former.
The two data types are now equivalent: earlier commits added a u64_stats
sync point to the latter.

Combine both data types into "struct gnet_stats_basic_sync". This
eliminates redundancy and simplifies the bstats read/write APIs.

Use u64_stats_t for bstats "packets" and "bytes" data types. On 64-bit
architectures, u64_stats sync points do not use sequence counter
protection.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 12:54:41 +01:00
Ahmed S. Darwish
f56940daa5 net: sched: Use _bstats_update/set() instead of raw writes
The Qdisc::running sequence counter, used to protect Qdisc::bstats reads
from parallel writes, is in the process of being removed. Qdisc::bstats
read/writes will synchronize using an internal u64_stats sync point
instead.

Modify all bstats writes to use _bstats_update(). This ensures that
the internal u64_stats sync point is always acquired and released as
appropriate.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 12:54:41 +01:00
Ahmed S. Darwish
67c9e6270f net: sched: Protect Qdisc::bstats with u64_stats
The not-per-CPU variant of qdisc tc (traffic control) statistics,
Qdisc::gnet_stats_basic_packed bstats, is protected with Qdisc::running
sequence counter.

This sequence counter is used for reliably protecting bstats reads from
parallel writes. Meanwhile, the seqcount's write section covers a much
wider area than bstats update: qdisc_run_begin() => qdisc_run_end().

That read/write section asymmetry can lead to needless retries of the
read section. To prepare for removing the Qdisc::running sequence
counter altogether, introduce a u64_stats sync point inside bstats
instead.

Modify _bstats_update() to start/end the bstats u64_stats write
section.

For bisectability, and finer commits granularity, the bstats read
section is still protected with a Qdisc::running read/retry loop and
qdisc_run_begin/end() still starts/ends that seqcount write section.
Once all call sites are modified to use _bstats_update(), the
Qdisc::running seqcount will be removed and bstats read/retry loop will
be modified to utilize the internal u64_stats sync point.

Note, using u64_stats implies no sequence counter protection for 64-bit
architectures. This can lead to the statistics "packets" vs. "bytes"
values getting out of sync on rare occasions. The individual values will
still be valid.

[bigeasy: Minor commit message edits, init all gnet_stats_basic_packed.]

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 12:54:41 +01:00
Ahmed S. Darwish
f2efdb1792 u64_stats: Introduce u64_stats_set()
Allow to directly set a u64_stats_t value which is used to provide an init
function which sets it directly to zero intead of memset() the value.

Add u64_stats_set() to the u64_stats API.

[bigeasy: commit message. ]

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 12:54:41 +01:00
Sebastian Andrzej Siewior
10940eb746 gen_stats: Move remaining users to gnet_stats_add_queue().
The gnet_stats_queue::qlen member is only used in the SMP-case.

qdisc_qstats_qlen_backlog() needs to add qdisc_qlen() to qstats.qlen to
have the same value as that provided by qdisc_qlen_sum().

gnet_stats_copy_queue() needs to overwritte the resulting qstats.qlen
field whith the caller submitted qlen value. It might be differ from the
submitted value.

Let both functions use gnet_stats_add_queue() and remove unused
__gnet_stats_copy_queue().

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 12:54:41 +01:00
Sebastian Andrzej Siewior
7361df4606 mq, mqprio: Use gnet_stats_add_queue().
gnet_stats_add_basic() and gnet_stats_add_queue() add up the statistics
so they can be used directly for both the per-CPU and global case.

gnet_stats_add_queue() copies either Qdisc's per-CPU
gnet_stats_queue::qlen or the global member. The global
gnet_stats_queue::qlen isn't touched in the per-CPU case so there is no
need to consider it in the global-case.

In the per-CPU case, the sum of global gnet_stats_queue::qlen and
the per-CPU gnet_stats_queue::qlen was assigned to sch->q.qlen and
sch->qstats.qlen. Now both fields are copied individually.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 12:54:41 +01:00
Sebastian Andrzej Siewior
448e163f8b gen_stats: Add gnet_stats_add_queue().
This function will replace __gnet_stats_copy_queue(). It reads all
arguments and adds them into the passed gnet_stats_queue argument.
In contrast to __gnet_stats_copy_queue() it also copies the qlen member.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 12:54:41 +01:00
Sebastian Andrzej Siewior
fbf307c89e gen_stats: Add instead Set the value in __gnet_stats_copy_basic().
__gnet_stats_copy_basic() always assigns the value to the bstats
argument overwriting the previous value. The later added per-CPU version
always accumulated the values in the returning gnet_stats_basic_packed
argument.

Based on review there are five users of that function as of today:
- est_fetch_counters(), ___gnet_stats_copy_basic()
  memsets() bstats to zero, single invocation.

- mq_dump(), mqprio_dump(), mqprio_dump_class_stats()
  memsets() bstats to zero, multiple invocation but does not use the
  function due to !qdisc_is_percpu_stats().

Add the values in __gnet_stats_copy_basic() instead overwriting. Rename
the function to gnet_stats_add_basic() to make it more obvious.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18 12:54:40 +01:00
Kyungrok Chung
254ec036db net: make use of helper netif_is_bridge_master()
Make use of netdev helper functions to improve code readability.
Replace 'dev->priv_flags & IFF_EBRIDGE' with netif_is_bridge_master(dev).

Signed-off-by: Kyungrok Chung <acadx0@gmail.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 15:02:56 +01:00
David S. Miller
4e6e167bc0 Merge branch 'smc-rv23'
Karsten Graul says:

====================
net/smc: introduce SMC-Rv2 support

Please apply the following patch series for smc to netdev's net-next tree.

SMC-Rv2 support (see https://www.ibm.com/support/pages/node/6326337)
provides routable RoCE support for SMC-R, eliminating the current
same-subnet restriction, by exploiting the UDP encapsulation feature
of the RoCE adapter hardware.

v2: resend of the v1 patch series, and CC linux-rdma this time
v3: rebase after net tree was merged into net-next
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 14:58:13 +01:00
Karsten Graul
29397e34c7 net/smc: stop links when their GID is removed
With SMC-Rv2 the GID is an IP address that can be deleted from the
device. When an IB_EVENT_GID_CHANGE event is provided then iterate over
all active links and check if their GID is still defined. Otherwise
stop the affected link.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 14:58:13 +01:00
Karsten Graul
b0539f5edd net/smc: add netlink support for SMC-Rv2
Implement the netlink support for SMC-Rv2 related attributes that are
provided to user space.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 14:58:13 +01:00
Karsten Graul
b4ba4652b3 net/smc: extend LLC layer for SMC-Rv2
Add support for large v2 LLC control messages in smc_llc.c.
The new large work request buffer allows to combine control
messages into one packet that had to be spread over several
packets before.
Add handling of the new v2 LLC messages.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 14:58:13 +01:00
Karsten Graul
8799e310fb net/smc: add v2 support to the work request layer
In the work request layer define one large v2 buffer for each link group
that is used to transmit and receive large LLC control messages.
Add the completion queue handling for this buffer.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 14:58:13 +01:00
Karsten Graul
24fb68111d net/smc: retrieve v2 gid from IB device
In smc_ib.c, scan for RoCE devices that support UDP encapsulation.
Find an eligible device and check that there is a route to the
remote peer.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 14:58:13 +01:00
Karsten Graul
8ade200c26 net/smc: add v2 format of CLC decline message
The CLC decline message changed with SMC-Rv2 and supports up to
4 additional diagnosis codes.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 14:58:13 +01:00
Karsten Graul
e49300a6bf net/smc: add listen processing for SMC-Rv2
Implement the server side of the SMC-Rv2 processing. Process incoming
CLC messages, find eligible devices and check for a valid route to the
remote peer.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 14:58:13 +01:00
Karsten Graul
e5c4744cfb net/smc: add SMC-Rv2 connection establishment
Send a CLC proposal message, and the remote side process this type of
message and determine the target GID. Check for a valid route to this
GID, and complete the connection establishment.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 14:58:12 +01:00
Karsten Graul
42042dbbc2 net/smc: prepare for SMC-Rv2 connection
Prepare the connection establishment with SMC-Rv2. Detect eligible
RoCE cards and indicate all supported SMC modes for the connection.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 14:58:12 +01:00
Karsten Graul
ed990df29f net/smc: save stack space and allocate smc_init_info
The struct smc_init_info grew over time, its time to save space on stack
and allocate this struct dynamically.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 14:58:12 +01:00
Jakub Kicinski
24bcbe1cc6 net: stream: don't purge sk_error_queue in sk_stream_kill_queues()
sk_stream_kill_queues() can be called on close when there are
still outstanding skbs to transmit. Those skbs may try to queue
notifications to the error queue (e.g. timestamps).
If sk_stream_kill_queues() purges the queue without taking
its lock the queue may get corrupted, and skbs leaked.

This shows up as a warning about an rmem leak:

WARNING: CPU: 24 PID: 0 at net/ipv4/af_inet.c:154 inet_sock_destruct+0x...

The leak is always a multiple of 0x300 bytes (the value is in
%rax on my builds, so RAX: 0000000000000300). 0x300 is truesize of
an empty sk_buff. Indeed if we dump the socket state at the time
of the warning the sk_error_queue is often (but not always)
corrupted. The ->next pointer points back at the list head,
but not the ->prev pointer. Indeed we can find the leaked skb
by scanning the kernel memory for something that looks like
an skb with ->sk = socket in question, and ->truesize = 0x300.
The contents of ->cb[] of the skb confirms the suspicion that
it is indeed a timestamp notification (as generated in
__skb_complete_tx_timestamp()).

Removing purging of sk_error_queue should be okay, since
inet_sock_destruct() does it again once all socket refs
are gone. Eric suggests this may cause sockets that go
thru disconnect() to maintain notifications from the
previous incarnations of the socket, but that should be
okay since the race was there anyway, and disconnect()
is not exactly dependable.

Thanks to Jonathan Lemon and Omar Sandoval for help at various
stages of tracing the issue.

Fixes: cb9eff0978 ("net: new user space API for time stamping of incoming and outgoing packets")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 09:06:09 +01:00
David S. Miller
4b0dd004e3 Merge branch 'dev_addr-conversions-part-1'
Jakub Kicinski says:

====================
ethernet: manual netdev->dev_addr conversions (part 1)

Manual conversions of drivers writing directly
to netdev->dev_addr (part 1 out of 3).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 08:53:46 +01:00
Jakub Kicinski
ec356edef7 ethernet: ixgb: use eth_hw_addr_set()
Commit 406f42fa0d ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it got through appropriate helpers.

Read the address into an array on the stack, then call
eth_hw_addr_set(). ixgb_get_ee_mac_addr() is used with
a non-nevdev->dev_addr pointer so we can't deal with the problem
inside it.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 08:53:46 +01:00
Jakub Kicinski
5c8b348534 ethernet: ibmveth: use ether_addr_to_u64()
We'll want to make netdev->dev_addr const, remove the local
helper which is missing a const qualifier on the argument
and use ether_addr_to_u64().

Similar story to mlx4.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 08:53:46 +01:00
Jakub Kicinski
d9ca87233b ethernet: enetc: use eth_hw_addr_set()
Commit 406f42fa0d ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it got through appropriate helpers.

Pass a netdev into the helper instead of just the address,
read the address into an array on the stack, then call
eth_hw_addr_set().

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 08:53:46 +01:00
Jakub Kicinski
10e6ded812 ethernet: ec_bhf: use eth_hw_addr_set()
Commit 406f42fa0d ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it got through appropriate helpers.

Copy the address into an array on the stack, then call
eth_hw_addr_set().

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 08:53:46 +01:00
Jakub Kicinski
41edfff572 ethernet: enic: use eth_hw_addr_set()
Commit 406f42fa0d ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it got through appropriate helpers.

Use a zero'ed array on the stack, then call eth_hw_addr_set().

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 08:53:46 +01:00
Jakub Kicinski
0c9e0c7931 ethernet: bcmgenet: use eth_hw_addr_set()
Commit 406f42fa0d ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it got through appropriate helpers.

Read the address into an array on the stack, then call
eth_hw_addr_set().

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 08:53:46 +01:00
Jakub Kicinski
a85c8f9ad2 ethernet: bnx2x: use eth_hw_addr_set()
Commit 406f42fa0d ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it got through appropriate helpers.

Read the address into an array on the stack, then call
eth_hw_addr_set().

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 08:53:46 +01:00
Jakub Kicinski
698c33d8b4 ethernet: aquantia: use eth_hw_addr_set()
Commit 406f42fa0d ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it got through appropriate helpers.

Use an array on the stack, then call eth_hw_addr_set().
eth_hw_addr_set() is after error checking, this should
be fine, error propagates all the way to failing probe.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 08:53:46 +01:00
Jakub Kicinski
f98c50509a ethernet: amd: use eth_hw_addr_set()
Commit 406f42fa0d ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it got through appropriate helpers.

Read the address into an array on the stack, then call
eth_hw_addr_set().

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 08:53:46 +01:00
Jakub Kicinski
ffaeca68fb ethernet: alteon: use eth_hw_addr_set()
Commit 406f42fa0d ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it got through appropriate helpers.

Break the address apart into an array on the stack, then call
eth_hw_addr_set().

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 08:53:46 +01:00
Jakub Kicinski
0d4c751715 ethernet: aeroflex: use eth_hw_addr_set()
Commit 406f42fa0d ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it got through appropriate helpers.

macaddr[] is a module param, and int, so copy the address into
an array of u8 on the stack, then call eth_hw_addr_set().

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 08:53:45 +01:00
Jakub Kicinski
8ec53ed9af ethernet: adaptec: use eth_hw_addr_set()
Commit 406f42fa0d ("net-next: When a bond have a massive amount
of VLANs...") introduced a rbtree for faster Ethernet address look
up. To maintain netdev->dev_addr in this tree we need to make all
the writes to it got through appropriate helpers.

Read the address into an array on the stack, then call
eth_hw_addr_set().

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 08:53:45 +01:00
Jean Sacren
a07a296bba net: ipvtap: fix template string argument of device_create() call
The last argument of device_create() call should be a template string.
The tap_name variable should be the argument to the string, but not the
argument of the call itself.  We should add the template string and turn
tap_name into its argument.

Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 08:51:22 +01:00
Jean Sacren
1c5b5b3f0e net: macvtap: fix template string argument of device_create() call
The last argument of device_create() call should be a template string.
The tap_name variable should be the argument to the string, but not the
argument of the call itself.  We should add the template string and turn
tap_name into its argument.

Signed-off-by: Jean Sacren <sakiwit@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 08:51:21 +01:00