Commit Graph

1186860 Commits

Author SHA1 Message Date
6ad49839ba tools: ynl-gen: fix unused / pad attribute handling
Unused and Pad attributes don't carry information.
Unused should never exist, and be rejected.
Pad should be silently skipped.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-02 22:10:46 -07:00
91dfaef243 tools: ynl-gen: add extra headers for user space
Make sure all relevant headers are included, we allocate memory,
use memcpy() and Linux types without including the headers.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-02 22:10:46 -07:00
3f06760c00 ipv4: Drop tos parameter from flowi4_update_output()
Callers of flowi4_update_output() never try to update ->flowi4_tos:

  * ip_route_connect() updates ->flowi4_tos with its own current
    value.

  * ip_route_newports() has two users: tcp_v4_connect() and
    dccp_v4_connect. Both initialise fl4 with ip_route_connect(), which
    in turn sets ->flowi4_tos with RT_TOS(inet_sk(sk)->tos) and
    ->flowi4_scope based on SOCK_LOCALROUTE.

    Then ip_route_newports() updates ->flowi4_tos with
    RT_CONN_FLAGS(sk), which is the same as RT_TOS(inet_sk(sk)->tos),
    unless SOCK_LOCALROUTE is set on the socket. In that case, the
    lowest order bit is set to 1, to eventually inform
    ip_route_output_key_hash() to restrict the scope to RT_SCOPE_LINK.
    This is equivalent to properly setting ->flowi4_scope as
    ip_route_connect() did.

  * ip_vs_xmit.c initialises ->flowi4_tos with memset(0), then calls
    flowi4_update_output() with tos=0.

  * sctp_v4_get_dst() uses the same RT_CONN_FLAGS_TOS() when
    initialising ->flowi4_tos and when calling flowi4_update_output().

In the end, ->flowi4_tos never changes. So let's just drop the tos
parameter. This will simplify the conversion of ->flowi4_tos from __u8
to dscp_t.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-02 10:52:38 +01:00
0f0f586868 net: lan743x: Remove extranous gotos
The gotos for cleanup aren't required, the function
might as well just return the actual error code.

Signed-off-by: Moritz Fischer <moritzf@google.com>
Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-02 10:48:48 +01:00
a92fb5c034 ip_gre: clean up some inconsistent indenting
No functional modification involved.

net/ipv4/ip_gre.c:192 ipgre_err() warn: inconsistent indenting.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5375
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-02 10:44:46 +01:00
23fcb62bc1 selftests: tls: add tests for poll behavior
Make sure we don't generate premature POLLIN events.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-02 09:51:12 +01:00
121dca784f tls: suppress wakeups unless we have a full record
TLS does not override .poll() so TLS-enabled socket will generate
an event whenever data arrives at the TCP socket. This leads to
unnecessary wakeups on slow connections.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-02 09:51:12 +01:00
a395b8d1c7 selftests/tc-testing: replace mq with invalid parent ID
The test case shown in [1] triggers the kernel to access the null pointer.
Therefore, add related test cases to mq.
The test results are as follows:

./tdc.py -e 0531
1..1
ok 1 0531 - Replace mq with invalid parent ID

./tdc.py -c mq
1..8
ok 1 ce7d - Add mq Qdisc to multi-queue device (4 queues)
ok 2 2f82 - Add mq Qdisc to multi-queue device (256 queues)
ok 3 c525 - Add duplicate mq Qdisc
ok 4 128a - Delete nonexistent mq Qdisc
ok 5 03a9 - Delete mq Qdisc twice
ok 6 be0f - Add mq Qdisc to single-queue device
ok 7 1023 - Show mq class
ok 8 0531 - Replace mq with invalid parent ID

[1] https://lore.kernel.org/all/20230527093747.3583502-1-shaozhengchao@huawei.com/
Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
Link: https://lore.kernel.org/r/20230601012250.52738-1-shaozhengchao@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01 21:40:28 -07:00
e8b6f79b41 net: phy: broadcom: Add LPI counter
Add the ability to read the PHY maintained LPI counter which is in the
Clause 45 vendor space, device address 7, offset 0x803F. The counter is
cleared on read.

Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230531231729.1873932-1-florian.fainelli@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01 21:40:10 -07:00
733b3e2765 r8169: use dev_err_probe in all appropriate places in rtl_init_one()
In addition to properly handling probe deferrals dev_err_probe()
conveniently combines printing an error message with returning
the errno. So let's use it for every error path in rtl_init_one()
to simplify the code.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/f0596a19-d517-e301-b649-304f9247b75a@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01 21:39:39 -07:00
b345b2a6cb Merge branch 'extend-dt-bindings-for-pse-pd-controllers-and-update-prtt1c-dts'
Oleksij Rempel says:

====================
Extend dt-bindings for PSE-PD controllers and update prtt1c dts

This patch set comes in response to issues identified while adding PoDL
PSE support to the stm32 prtt1c device tree. The existing pse-pd device
tree bindings did not allow node name patterns like "ethernet-pse-0" and
"ethernet-pse-1", leading to validation failures.

To address these false positives in validation, the device tree bindings
are extended to support these node name patterns. Alongside this, an
example node is added to aid in the improved validation process.
====================

Link: https://lore.kernel.org/r/20230531102113.3353065-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01 21:38:32 -07:00
bd415f6c74 dt-bindings: net: pse-pd: Allow -N suffix for ethernet-pse node names
Extend the pattern matching for PSE-PD controller nodes to allow -N
suffixes. This enables the use of multiple "ethernet-pse" nodes without the
need for a "reg" property.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01 21:38:28 -07:00
5ff9424ea0 devlink: bring port new reply back
In the offending fixes commit I mistakenly removed the reply message of
the port new command. I was under impression it is a new port
notification, partly due to the "notify" in the name of the helper
function. Bring the code sending reply with new port message back, this
time putting it directly to devlink_nl_cmd_port_new_doit()

Fixes: c496daeb86 ("devlink: remove duplicate port notification")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230531142025.2605001-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01 21:37:32 -07:00
a03a91bd68 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:

drivers/net/ethernet/sfc/tc.c
  622ab65634 ("sfc: fix error unwinds in TC offload")
  b6583d5e9e ("sfc: support TC decap rules matching on enc_src_port")

net/mptcp/protocol.c
  5b825727d0 ("mptcp: add annotations around msk->subflow accesses")
  e76c8ef5cc ("mptcp: refactor mptcp_stream_accept()")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01 15:38:26 -07:00
714069daa5 Merge tag 'net-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Happy Wear a Dress Day.

  Fairly standard-sized batch of fixes, accounting for the lack of
  sub-tree submissions this week. The mlx5 IRQ fixes are notable, people
  were complaining about that. No fires burning.

  Current release - regressions:

   - eth: mlx5e:
      - multiple fixes for dynamic IRQ allocation
      - prevent encap offload when neigh update is running

   - eth: mana: fix perf regression: remove rx_cqes, tx_cqes counters

  Current release - new code bugs:

   - eth: mlx5e: DR, add missing mutex init/destroy in pattern manager

  Previous releases - always broken:

   - tcp: deny tcp_disconnect() when threads are waiting

   - sched: prevent ingress Qdiscs from getting installed in random
     locations in the hierarchy and moving around

   - sched: flower: fix possible OOB write in fl_set_geneve_opt()

   - netlink: fix NETLINK_LIST_MEMBERSHIPS length report

   - udp6: fix race condition in udp6_sendmsg & connect

   - tcp: fix mishandling when the sack compression is deferred

   - rtnetlink: validate link attributes set at creation time

   - mptcp: fix connect timeout handling

   - eth: stmmac: fix call trace when stmmac_xdp_xmit() is invoked

   - eth: amd-xgbe: fix the false linkup in xgbe_phy_status

   - eth: mlx5e:
      - fix corner cases in internal buffer configuration
      - drain health before unregistering devlink

   - usb: qmi_wwan: set DTR quirk for BroadMobi BM818

  Misc:

   - tcp: return user_mss for TCP_MAXSEG in CLOSE/LISTEN state if
     user_mss set"

* tag 'net-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (71 commits)
  mptcp: fix active subflow finalization
  mptcp: add annotations around sk->sk_shutdown accesses
  mptcp: fix data race around msk->first access
  mptcp: consolidate passive msk socket initialization
  mptcp: add annotations around msk->subflow accesses
  mptcp: fix connect timeout handling
  rtnetlink: add the missing IFLA_GRO_ tb check in validate_linkmsg
  rtnetlink: move IFLA_GSO_ tb check to validate_linkmsg
  rtnetlink: call validate_linkmsg in rtnl_create_link
  ice: recycle/free all of the fragments from multi-buffer frame
  net: phy: mxl-gpy: extend interrupt fix to all impacted variants
  net: renesas: rswitch: Fix return value in error path of xmit
  net: dsa: mv88e6xxx: Increase wait after reset deactivation
  net: ipa: Use correct value for IPA_STATUS_SIZE
  tcp: fix mishandling when the sack compression is deferred.
  net/sched: flower: fix possible OOB write in fl_set_geneve_opt()
  sfc: fix error unwinds in TC offload
  net/mlx5: Read embedded cpu after init bit cleared
  net/mlx5e: Fix error handling in mlx5e_refresh_tirs
  net/mlx5: Ensure af_desc.mask is properly initialized
  ...
2023-06-01 17:29:18 -04:00
f9010dbdce fork, vhost: Use CLONE_THREAD to fix freezer/ps regression
When switching from kthreads to vhost_tasks two bugs were added:
1. The vhost worker tasks's now show up as processes so scripts doing
ps or ps a would not incorrectly detect the vhost task as another
process.  2. kthreads disabled freeze by setting PF_NOFREEZE, but
vhost tasks's didn't disable or add support for them.

To fix both bugs, this switches the vhost task to be thread in the
process that does the VHOST_SET_OWNER ioctl, and has vhost_worker call
get_signal to support SIGKILL/SIGSTOP and freeze signals. Note that
SIGKILL/STOP support is required because CLONE_THREAD requires
CLONE_SIGHAND which requires those 2 signals to be supported.

This is a modified version of the patch written by Mike Christie
<michael.christie@oracle.com> which was a modified version of patch
originally written by Linus.

Much of what depended upon PF_IO_WORKER now depends on PF_USER_WORKER.
Including ignoring signals, setting up the register state, and having
get_signal return instead of calling do_group_exit.

Tidied up the vhost_task abstraction so that the definition of
vhost_task only needs to be visible inside of vhost_task.c.  Making
it easier to review the code and tell what needs to be done where.
As part of this the main loop has been moved from vhost_worker into
vhost_task_fn.  vhost_worker now returns true if work was done.

The main loop has been updated to call get_signal which handles
SIGSTOP, freezing, and collects the message that tells the thread to
exit as part of process exit.  This collection clears
__fatal_signal_pending.  This collection is not guaranteed to
clear signal_pending() so clear that explicitly so the schedule()
sleeps.

For now the vhost thread continues to exist and run work until the
last file descriptor is closed and the release function is called as
part of freeing struct file.  To avoid hangs in the coredump
rendezvous and when killing threads in a multi-threaded exec.  The
coredump code and de_thread have been modified to ignore vhost threads.

Remvoing the special case for exec appears to require teaching
vhost_dev_flush how to directly complete transactions in case
the vhost thread is no longer running.

Removing the special case for coredump rendezvous requires either the
above fix needed for exec or moving the coredump rendezvous into
get_signal.

Fixes: 6e890c5d50 ("vhost: use vhost_tasks for worker threads")
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Co-developed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-06-01 17:15:33 -04:00
a451b8eb96 Merge tag 'mlx5-fixes-2023-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
mlx5 fixes 2023-05-31

This series provides bug fixes to mlx5 driver.

* tag 'mlx5-fixes-2023-05-31' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: Read embedded cpu after init bit cleared
  net/mlx5e: Fix error handling in mlx5e_refresh_tirs
  net/mlx5: Ensure af_desc.mask is properly initialized
  net/mlx5: Fix setting of irq->map.index for static IRQ case
  net/mlx5: Remove rmap also in case dynamic MSIX not supported
====================

Link: https://lore.kernel.org/r/20230601031051.131529-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01 10:15:43 -07:00
66dd101487 Merge branch 'mptcp-fixes-for-connect-timeout-access-annotations-and-subflow-init'
Mat Martineau says:

====================
mptcp: Fixes for connect timeout, access annotations, and subflow init

Patch 1 allows the SO_SNDTIMEO sockopt to correctly change the connect
timeout on MPTCP sockets.

Patches 2-5 add READ_ONCE()/WRITE_ONCE() annotations to fix KCSAN issues.

Patch 6 correctly initializes some subflow fields on outgoing connections.
====================

Link: https://lore.kernel.org/r/20230531-send-net-20230531-v1-0-47750c420571@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01 10:04:06 -07:00
55b47ca7d8 mptcp: fix active subflow finalization
Active subflow are inserted into the connection list at creation time.
When the MPJ handshake completes successfully, a new subflow creation
netlink event is generated correctly, but the current code wrongly
avoid initializing a couple of subflow data.

The above will cause misbehavior on a few exceptional events: unneeded
mptcp-level retransmission on msk-level sequence wrap-around and infinite
mapping fallback even when a MPJ socket is present.

Address the issue factoring out the needed initialization in a new helper
and invoking the latter from __mptcp_finish_join() time for passive
subflow and from mptcp_finish_join() for active ones.

Fixes: 0530020a7c ("mptcp: track and update contiguous data status")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01 10:04:04 -07:00
6b9831bfd9 mptcp: add annotations around sk->sk_shutdown accesses
Christoph reported the mptcp variant of a recently addressed plain
TCP issue. Similar to commit e14cadfd80 ("tcp: add annotations around
sk->sk_shutdown accesses") add READ/WRITE ONCE annotations to silence
KCSAN reports around lockless sk_shutdown access.

Fixes: 71ba088ce0 ("mptcp: cleanup accept and poll")
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/401
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01 10:04:04 -07:00
1b1b43ee7a mptcp: fix data race around msk->first access
The first subflow socket is accessed outside the msk socket lock
by mptcp_subflow_fail(), we need to annotate each write access
with WRITE_ONCE, but a few spots still lacks it.

Fixes: 76a13b3157 ("mptcp: invoke MP_FAIL response when needed")
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01 10:04:04 -07:00
7e8b88ec35 mptcp: consolidate passive msk socket initialization
When the msk socket is cloned at MPC handshake time, a few
fields are initialized in a racy way outside mptcp_sk_clone()
and the msk socket lock.

The above is due historical reasons: before commit a88d0092b2
("mptcp: simplify subflow_syn_recv_sock()") as the first subflow socket
carrying all the needed date was not available yet at msk creation
time

We can now refactor the code moving the missing initialization bit
under the socket lock, removing the init race and avoiding some
code duplication.

This will also simplify the next patch, as all msk->first write
access are now under the msk socket lock.

Fixes: 0397c6d85f ("mptcp: keep unaccepted MPC subflow into join list")
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01 10:04:04 -07:00
5b825727d0 mptcp: add annotations around msk->subflow accesses
The MPTCP can access the first subflow socket in a few spots
outside the socket lock scope. That is actually safe, as MPTCP
will delete the socket itself only after the msk sock close().

Still the such accesses causes a few KCSAN splats, as reported
by Christoph. Silence the harmless warning adding a few annotation
around the relevant accesses.

Fixes: 71ba088ce0 ("mptcp: cleanup accept and poll")
Reported-by: Christoph Paasch <cpaasch@apple.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/402
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01 10:04:04 -07:00
786fc12457 mptcp: fix connect timeout handling
Ondrej reported a functional issue WRT timeout handling on connect
with a nice reproducer.

The problem is that the current mptcp connect waits for both the
MPTCP socket level timeout, and the first subflow socket timeout.
The latter is not influenced/touched by the exposed setsockopt().

Overall the above makes the SO_SNDTIMEO a no-op on connect.

Since mptcp_connect is invoked via inet_stream_connect and the
latter properly handle the MPTCP level timeout, we can address the
issue making the nested subflow level connect always unblocking.

This also allow simplifying a bit the code, dropping an ugly hack
to handle the fastopen and custom proto_ops connect.

The issues predates the blamed commit below, but the current resolution
requires the infrastructure introduced there.

Fixes: 54f1944ed6 ("mptcp: factor out mptcp_connect()")
Reported-by: Ondrej Mosnacek <omosnace@redhat.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/399
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01 10:04:04 -07:00
3021dbfe3e Merge branch 'rtnetlink-a-couple-of-fixes-in-linkmsg-validation'
Xin Long says:

====================
rtnetlink: a couple of fixes in linkmsg validation

validate_linkmsg() was introduced to do linkmsg validation for existing
links. However, the new created links also need this linkmsg validation.

Add validate_linkmsg() check for link creating in Patch 1, and add more
tb checks into validate_linkmsg() in Patch 2 and 3.
====================

Link: https://lore.kernel.org/r/cover.1685548598.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01 09:59:45 -07:00
65d6914e25 rtnetlink: add the missing IFLA_GRO_ tb check in validate_linkmsg
This fixes the issue that dev gro_max_size and gso_ipv4_max_size
can be set to a huge value:

  # ip link add dummy1 type dummy
  # ip link set dummy1 gro_max_size 4294967295
  # ip -d link show dummy1
    dummy addrgenmode eui64 ... gro_max_size 4294967295

Fixes: 0fe79f28bf ("net: allow gro_max_size to exceed 65536")
Fixes: 9eefedd58a ("net: add gso_ipv4_max_size and gro_ipv4_max_size per device")
Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01 09:59:44 -07:00
fef5b228dd rtnetlink: move IFLA_GSO_ tb check to validate_linkmsg
These IFLA_GSO_* tb check should also be done for the new created link,
otherwise, they can be set to a huge value when creating links:

  # ip link add dummy1 gso_max_size 4294967295 type dummy
  # ip -d link show dummy1
    dummy addrgenmode eui64 ... gso_max_size 4294967295

Fixes: 46e6b992c2 ("rtnetlink: allow GSO maximums to be set on device creation")
Fixes: 9eefedd58a ("net: add gso_ipv4_max_size and gro_ipv4_max_size per device")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01 09:59:44 -07:00
b0ad3c1790 rtnetlink: call validate_linkmsg in rtnl_create_link
validate_linkmsg() was introduced by commit 1840bb13c2 ("[RTNL]:
Validate hardware and broadcast address attribute for RTM_NEWLINK")
to validate tb[IFLA_ADDRESS/BROADCAST] for existing links. The same
check should also be done for newly created links.

This patch adds validate_linkmsg() call in rtnl_create_link(), to
avoid the invalid address set when creating some devices like:

  # ip link add dummy0 type dummy
  # ip link add link dummy0 name mac0 address 01:02 type macsec

Fixes: 0e06877c6f ("[RTNETLINK]: rtnl_link: allow specifying initial device address")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01 09:59:43 -07:00
abaf8d51b0 ice: recycle/free all of the fragments from multi-buffer frame
The ice driver caches next_to_clean value at the beginning of
ice_clean_rx_irq() in order to remember the first buffer that has to be
freed/recycled after main Rx processing loop. The end boundary is
indicated by first descriptor of frame that Rx processing loop has ended
its duties. Note that if mentioned loop ended in the middle of gathering
multi-buffer frame, next_to_clean would be pointing to the descriptor in
the middle of the frame BUT freeing/recycling stage will stop at the
first descriptor. This means that next iteration of ice_clean_rx_irq()
will miss the (first_desc, next_to_clean - 1) entries.

 When running various 9K MTU workloads, such splats were observed:

[  540.780716] BUG: kernel NULL pointer dereference, address: 0000000000000000
[  540.787787] #PF: supervisor read access in kernel mode
[  540.793002] #PF: error_code(0x0000) - not-present page
[  540.798218] PGD 0 P4D 0
[  540.800801] Oops: 0000 [#1] PREEMPT SMP NOPTI
[  540.805231] CPU: 18 PID: 3984 Comm: xskxceiver Tainted: G        W          6.3.0-rc7+ #96
[  540.813619] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
[  540.824209] RIP: 0010:ice_clean_rx_irq+0x2b6/0xf00 [ice]
[  540.829678] Code: 74 24 10 e9 aa 00 00 00 8b 55 78 41 31 57 10 41 09 c4 4d 85 ff 0f 84 83 00 00 00 49 8b 57 08 41 8b 4f 1c 65 8b 35 1a fa 4b 3f <48> 8b 02 48 c1 e8 3a 39 c6 0f 85 a2 00 00 00 f6 42 08 02 0f 85 98
[  540.848717] RSP: 0018:ffffc9000f42fc50 EFLAGS: 00010282
[  540.854029] RAX: 0000000000000004 RBX: 0000000000000002 RCX: 000000000000fffe
[  540.861272] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00000000ffffffff
[  540.868519] RBP: ffff88984a05ac00 R08: 0000000000000000 R09: dead000000000100
[  540.875760] R10: ffff88983fffcd00 R11: 000000000010f2b8 R12: 0000000000000004
[  540.883008] R13: 0000000000000003 R14: 0000000000000800 R15: ffff889847a10040
[  540.890253] FS:  00007f6ddf7fe640(0000) GS:ffff88afdf800000(0000) knlGS:0000000000000000
[  540.898465] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  540.904299] CR2: 0000000000000000 CR3: 000000010d3da001 CR4: 00000000007706e0
[  540.911542] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  540.918789] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  540.926032] PKRU: 55555554
[  540.928790] Call Trace:
[  540.931276]  <TASK>
[  540.933418]  ice_napi_poll+0x4ca/0x6d0 [ice]
[  540.937804]  ? __pfx_ice_napi_poll+0x10/0x10 [ice]
[  540.942716]  napi_busy_loop+0xd7/0x320
[  540.946537]  xsk_recvmsg+0x143/0x170
[  540.950178]  sock_recvmsg+0x99/0xa0
[  540.953729]  __sys_recvfrom+0xa8/0x120
[  540.957543]  ? do_futex+0xbd/0x1d0
[  540.961008]  ? __x64_sys_futex+0x73/0x1d0
[  540.965083]  __x64_sys_recvfrom+0x20/0x30
[  540.969155]  do_syscall_64+0x38/0x90
[  540.972796]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
[  540.977934] RIP: 0033:0x7f6de5f27934

To fix this, set cached_ntc to first_desc so that at the end, when
freeing/recycling buffers, descriptors from first to ntc are not missed.

Fixes: 2fba7dc515 ("ice: Add support for XDP multi-buffer on Rx side")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20230531154457.3216621-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01 09:59:39 -07:00
519d648764 net: phy: mxl-gpy: extend interrupt fix to all impacted variants
The interrupt fix in commit 97a89ed101 should be applied on all variants
of GPY2xx PHY and GPY115C.

Fixes: 97a89ed101 ("net: phy: mxl-gpy: disable interrupts on GPY215 by default")
Signed-off-by: Xu Liang <lxu@maxlinear.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230531074822.39136-1-lxu@maxlinear.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01 09:59:36 -07:00
a60caf039e net: renesas: rswitch: Fix return value in error path of xmit
Fix return value in the error path of rswitch_start_xmit(). If TX
queues are full, this function should return NETDEV_TX_BUSY.

Fixes: 3590918b5d ("net: ethernet: renesas: Add support for "Ethernet Switch"")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Link: https://lore.kernel.org/r/20230529073817.1145208-1-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-01 09:57:08 -07:00
1874a42a7d Merge tag 'firewire-fixes-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394
Pull firewire fix from Takashi Sakamoto:
 "A single patch to use a flexible array rather than a zero-length one"

* tag 'firewire-fixes-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394:
  firewire: Replace zero-length array with flexible-array member
2023-06-01 11:18:20 -04:00
cd69bf361e Merge tag 'mailbox-fixes-6.4-rc5' of git://git.linaro.org/landing-teams/working/fujitsu/integration
Pull mailbox fix from Jassi Brar:
 "Fix missing mutex unlock in mailbox-test"

* tag 'mailbox-fixes-6.4-rc5' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
  mailbox: mailbox-test: fix a locking issue in mbox_test_message_write()
2023-06-01 11:13:10 -04:00
3c27f3d53d net: dsa: mv88e6xxx: Increase wait after reset deactivation
A switch held in reset by default needs to wait longer until we can
reliably detect it.

An issue was observed when testing on the Marvell 88E6393X (Link Street).
The driver failed to detect the switch on some upstarts. Increasing the
wait time after reset deactivation solves this issue.

The updated wait time is now also the same as the wait time in the
mv88e6xxx_hardware_reset function.

Fixes: 7b75e49de4 ("net: dsa: mv88e6xxx: wait after reset deactivation")
Signed-off-by: Andreas Svensson <andreas.svensson@axis.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230530145223.1223993-1-andreas.svensson@axis.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-01 15:46:46 +02:00
4420528254 firewire: Replace zero-length array with flexible-array member
Zero-length and one-element arrays are deprecated, and we are moving
towards adopting C99 flexible-array members, instead.

Address the following warnings found with GCC-13 and
-fstrict-flex-arrays=3 enabled:
sound/firewire/amdtp-stream.c: In function ‘build_it_pkt_header’:
sound/firewire/amdtp-stream.c:694:17: warning: ‘generate_cip_header’ accessing 8 bytes in a region of size 0 [-Wstringop-overflow=]
  694 |                 generate_cip_header(s, cip_header, data_block_counter, syt);
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
sound/firewire/amdtp-stream.c:694:17: note: referencing argument 2 of type ‘__be32[2]’ {aka ‘unsigned int[2]’}
sound/firewire/amdtp-stream.c:667:13: note: in a call to function ‘generate_cip_header’
  667 | static void generate_cip_header(struct amdtp_stream *s, __be32 cip_header[2],
      |             ^~~~~~~~~~~~~~~~~~~

This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
routines on memcpy() and help us make progress towards globally
enabling -fstrict-flex-arrays=3 [1].

Link: https://github.com/KSPP/linux/issues/21
Link: https://github.com/KSPP/linux/issues/303
Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html [1]
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/ZHT0V3SpvHyxCv5W@work
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
2023-06-01 22:41:14 +09:00
f39dda98bc Merge tag 'for-linus-2023060101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:

 - Regression fix for overlong long timeouts during initialization on
   some Logitech Unifying devices (Bastien Nocera)

 - error handling and overflow fixes for Wacom driver (Denis Arefev,
   Jason Gerecke, Nikita Zhandarovich)

* tag 'for-linus-2023060101' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: logitech-hidpp: Handle timeout differently from busy
  HID: wacom: Add error check to wacom_parse_and_register()
  HID: google: add jewel USB id
  HID: wacom: avoid integer overflow in wacom_intuos_inout()
  HID: wacom: Check for string overflow from strscpy calls
2023-06-01 09:02:04 -04:00
ba059590ec Merge tag 'ata-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata
Pull ata fix from Damien Le Moal:

 - Fix ata_find_dev() use of the device number to find a struct
   ata_device for a port. This addresses issues with some passthrough
   commands with libsas managed devices.

* tag 'ata-6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/libata:
  ata: libata-scsi: Use correct device no in ata_find_dev()
2023-06-01 08:41:33 -04:00
8828003759 Merge tag '6.4-rc4-smb3-server-fixes' of git://git.samba.org/ksmbd
Pull smb server fixes from Steve French:
 "Eight server fixes (most also for stable):

   - Two fixes for uninitialized pointer reads (rename and link)

   - Fix potential UAF in oplock break

   - Two fixes for potential out of bound reads in negotiate

   - Fix crediting bug

   - Two fixes for xfstests (allocation size fix for test 694 and lookup
     issue shown by test 464)"

* tag '6.4-rc4-smb3-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: call putname after using the last component
  ksmbd: fix incorrect AllocationSize set in smb2_get_info
  ksmbd: fix UAF issue from opinfo->conn
  ksmbd: fix multiple out-of-bounds read during context decoding
  ksmbd: fix slab-out-of-bounds read in smb2_handle_negotiate
  ksmbd: fix credit count leakage
  ksmbd: fix uninitialized pointer read in smb2_create_link()
  ksmbd: fix uninitialized pointer read in ksmbd_vfs_rename()
2023-06-01 08:27:34 -04:00
4ff3dfc91c Merge branch 'splice-net-handle-msg_splice_pages-in-chelsio-tls'
David Howells says:

====================
splice, net: Handle MSG_SPLICE_PAGES in Chelsio-TLS

Here are patches to make Chelsio-TLS handle the MSG_SPLICE_PAGES internal
sendmsg flag.  MSG_SPLICE_PAGES is an internal hint that tells the protocol
that it should splice the pages supplied if it can.  Its sendpage
implementation is then turned into a wrapper around that.
====================

Link: https://lore.kernel.org/r/20230531110008.642903-1-dhowells@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-01 13:41:40 +02:00
26acc982c1 chelsio: Convert chtls_sendpage() to use MSG_SPLICE_PAGES
Convert chtls_sendpage() to use sendmsg() with MSG_SPLICE_PAGES rather than
directly splicing in the pages itself.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Ayush Sawal <ayush.sawal@chelsio.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: netdev@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-01 13:41:36 +02:00
116f7b361e chelsio: Support MSG_SPLICE_PAGES
Make Chelsio's TLS offload sendmsg() support MSG_SPLICE_PAGES, splicing in
pages from the source iterator if possible and copying the data in
otherwise.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Ayush Sawal <ayush.sawal@chelsio.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: netdev@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-01 13:41:36 +02:00
be7f8012a5 net: ipa: Use correct value for IPA_STATUS_SIZE
IPA_STATUS_SIZE was introduced in commit b8dc7d0eea as a replacement
for the size of the removed struct ipa_status which had size
sizeof(__le32[8]). Use this value as IPA_STATUS_SIZE.

Fixes: b8dc7d0eea ("net: ipa: stop using sizeof(status)")
Signed-off-by: Bert Karwatzki <spasswolf@web.de>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230531103618.102608-1-spasswolf@web.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-01 13:29:18 +02:00
30c6f0bf95 tcp: fix mishandling when the sack compression is deferred.
In this patch, we mainly try to handle sending a compressed ack
correctly if it's deferred.

Here are more details in the old logic:
When sack compression is triggered in the tcp_compressed_ack_kick(),
if the sock is owned by user, it will set TCP_DELACK_TIMER_DEFERRED
and then defer to the release cb phrase. Later once user releases
the sock, tcp_delack_timer_handler() should send a ack as expected,
which, however, cannot happen due to lack of ICSK_ACK_TIMER flag.
Therefore, the receiver would not sent an ack until the sender's
retransmission timeout. It definitely increases unnecessary latency.

Fixes: 5d9f4262b7 ("tcp: add SACK compression")
Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: fuyuanli <fuyuanli@didiglobal.com>
Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://lore.kernel.org/netdev/20230529113804.GA20300@didi-ThinkCentre-M920t-N000/
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230531080150.GA20424@didi-ThinkCentre-M920t-N000
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-01 13:15:12 +02:00
4d56304e58 net/sched: flower: fix possible OOB write in fl_set_geneve_opt()
If we send two TCA_FLOWER_KEY_ENC_OPTS_GENEVE packets and their total
size is 252 bytes(key->enc_opts.len = 252) then
key->enc_opts.len = opt->length = data_len / 4 = 0 when the third
TCA_FLOWER_KEY_ENC_OPTS_GENEVE packet enters fl_set_geneve_opt. This
bypasses the next bounds check and results in an out-of-bounds.

Fixes: 0a6e77784f ("net/sched: allow flower to match tunnel options")
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Link: https://lore.kernel.org/r/20230531102805.27090-1-hbh25y@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-06-01 12:59:04 +02:00
735c9ee9a3 Merge branch 'wangxun-netdev-features-support'
Mengyuan Lou says:

====================
Wangxun netdev features support

Implement tx_csum and rx_csum to support hardware checksum offload.
Implement ndo_vlan_rx_add_vid and ndo_vlan_rx_kill_vid.
Implement ndo_set_features.
Enable macros in netdev features which wangxun can support.
====================

Link: https://lore.kernel.org/r/20230530022632.17938-1-mengyuanlou@net-swift.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-31 23:02:28 -07:00
7df4af51de net: txgbe: Implement vlan add and remove ops
txgbe add ndo_vlan_rx_add_vid and ndo_vlan_rx_kill_vid.

Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-31 23:02:26 -07:00
6670f1ece2 net: txgbe: Add netdev features support
Add features and hw_features that ngbe can support.

Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-31 23:02:26 -07:00
361bf4f47c net: ngbe: Implement vlan add and remove ops
ngbe add ndo_vlan_rx_add_vid and ndo_vlan_rx_kill_vid.

Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-31 23:02:26 -07:00
50a908a0bd net: ngbe: Add netdev features support
Add features and hw_features that ngbe can support.

Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-31 23:02:26 -07:00
6dbedcffcf net: libwx: Implement xx_set_features ops
Implement wx_set_features function which to support
ndo_set_features.

Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-31 23:02:26 -07:00