1155372 Commits

Author SHA1 Message Date
Paolo Abeni
d4e85922e3 mptcp: do not wait for bare sockets' timeout
If the peer closes all the existing subflows for a given
mptcp socket and later the application closes it, the current
implementation let it survive until the timewait timeout expires.

While the above is allowed by the protocol specification it
consumes resources for almost no reason and additionally
causes sporadic self-tests failures.

Let's move the mptcp socket to the TCP_CLOSE state when there are
no alive subflows at close time, so that the allocated resources
will be freed immediately.

Fixes: e16163b6e2b7 ("mptcp: refactor shutdown and close")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08 09:39:34 +00:00
Vladimir Oltean
1a3245fe0c net: ethernet: mtk_eth_soc: fix DSA TX tag hwaccel for switch port 0
Arınç reports that on his MT7621AT Unielec U7621-06 board and MT7623NI
Bananapi BPI-R2, packets received by the CPU over mt7530 switch port 0
(of which this driver acts as the DSA master) are not processed
correctly by software. More precisely, they arrive without a DSA tag
(in packet or in the hwaccel area - skb_metadata_dst()), so DSA cannot
demux them towards the switch's interface for port 0. Traffic from other
ports receives a skb_metadata_dst() with the correct port and is demuxed
properly.

Looking at mtk_poll_rx(), it becomes apparent that this driver uses the
skb vlan hwaccel area:

	union {
		u32		vlan_all;
		struct {
			__be16	vlan_proto;
			__u16	vlan_tci;
		};
	};

as a temporary storage for the VLAN hwaccel tag, or the DSA hwaccel tag.
If this is a DSA master it's a DSA hwaccel tag, and finally clears up
the skb VLAN hwaccel header.

I'm guessing that the problem is the (mis)use of API.
skb_vlan_tag_present() looks like this:

 #define skb_vlan_tag_present(__skb)	(!!(__skb)->vlan_all)

So if both vlan_proto and vlan_tci are zeroes, skb_vlan_tag_present()
returns precisely false. I don't know for sure what is the format of the
DSA hwaccel tag, but I surely know that lowermost 3 bits of vlan_proto
are 0 when receiving from port 0:

	unsigned int port = vlan_proto & GENMASK(2, 0);

If the RX descriptor has no other bits set to non-zero values in
RX_DMA_VTAG, then the call to __vlan_hwaccel_put_tag() will not, in
fact, make the subsequent skb_vlan_tag_present() return true, because
it's implemented like this:

static inline void __vlan_hwaccel_put_tag(struct sk_buff *skb,
					  __be16 vlan_proto, u16 vlan_tci)
{
	skb->vlan_proto = vlan_proto;
	skb->vlan_tci = vlan_tci;
}

What we need to do to fix this problem (assuming this is the problem) is
to stop using skb->vlan_all as temporary storage for driver affairs, and
just create some local variables that serve the same purpose, but
hopefully better. Instead of calling skb_vlan_tag_present(), let's look
at a boolean has_hwaccel_tag which we set to true when the RX DMA
descriptors have something. Disambiguate based on netdev_uses_dsa()
whether this is a VLAN or DSA hwaccel tag, and only call
__vlan_hwaccel_put_tag() if we're certain it's a VLAN tag.

Arınç confirms that the treatment works, so this validates the
assumption.

Link: https://lore.kernel.org/netdev/704f3a72-fc9e-714a-db54-272e17612637@arinc9.com/
Fixes: 2d7605a72906 ("net: ethernet: mtk_eth_soc: enable hardware DSA untagging")
Reported-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08 09:11:31 +00:00
Yu Xiao
821de68c1f nfp: ethtool: fix the bug of setting unsupported port speed
Unsupported port speed can be set and cause error. Now fixing it
and return an error if setting unsupported speed.

This fix depends on the following, which was included in v6.2-rc1:
commit a61474c41e8c ("nfp: ethtool: support reporting link modes").

Fixes: 7c698737270f ("nfp: add support for .set_link_ksettings()")
Signed-off-by: Yu Xiao <yu.xiao@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08 09:08:12 +00:00
Kevin Yang
c11204c78d txhash: fix sk->sk_txrehash default
This code fix a bug that sk->sk_txrehash gets its default enable
value from sysctl_txrehash only when the socket is a TCP listener.

We should have sysctl_txrehash to set the default sk->sk_txrehash,
no matter TCP, nor listerner/connector.

Tested by following packetdrill:
  0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
  +0 socket(..., SOCK_DGRAM, IPPROTO_UDP) = 4
  // SO_TXREHASH == 74, default to sysctl_txrehash == 1
  +0 getsockopt(3, SOL_SOCKET, 74, [1], [4]) = 0
  +0 getsockopt(4, SOL_SOCKET, 74, [1], [4]) = 0

Fixes: 26859240e4ee ("txhash: Add socket option to control TX hash rethink behavior")
Signed-off-by: Kevin Yang <yyd@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08 09:07:11 +00:00
Tariq Toukan
c966153d12 net: ethernet: mtk_eth_soc: fix wrong parameters order in __xdp_rxq_info_reg()
Parameters 'queue_index' and 'napi_id' are passed in a swapped order.
Fix it here.

Fixes: 23233e577ef9 ("net: ethernet: mtk_eth_soc: rely on page_pool for single page buffers")
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08 09:04:33 +00:00
Arınç ÜNAL
21386e6926 net: ethernet: mtk_eth_soc: enable special tag when any MAC uses DSA
The special tag is only enabled when the first MAC uses DSA. However, it
must be enabled when any MAC uses DSA. Change the check accordingly.

This fixes hardware DSA untagging not working on the second MAC of the
MT7621 and MT7623 SoCs, and likely other SoCs too. Therefore, remove the
check that disables hardware DSA untagging for the second MAC of the MT7621
and MT7623 SoCs.

Fixes: a1f47752fd62 ("net: ethernet: mtk_eth_soc: disable hardware DSA untagging for second MAC")
Co-developed-by: Richard van Schagen <richard@routerhints.com>
Signed-off-by: Richard van Schagen <richard@routerhints.com>
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-08 09:00:29 +00:00
Dan Carpenter
9cec2aaffe net: sched: sch: Fix off by one in htb_activate_prios()
The > needs be >= to prevent an out of bounds access.

Fixes: de5ca4c3852f ("net: sched: sch: Bounds check priority")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/Y+D+KN18FQI2DKLq@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-07 23:38:53 -08:00
Jakub Kicinski
91701f63d8 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2023-02-06 (ice)

This series contains updates to ice driver only.

Ani removes WQ_MEM_RECLAIM flag from workqueue to resolve
check_flush_dependency warning.

Michal fixes KASAN out-of-bounds warning.

Brett corrects behaviour for port VLAN Rx filters to prevent receiving
of unintended traffic.

Dan Carpenter fixes possible off by one issue.

Zhang Changzhong adjusts error path for switch recipe to prevent memory
leak.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
  ice: switch: fix potential memleak in ice_add_adv_recipe()
  ice: Fix off by one in ice_tc_forward_to_queue()
  ice: Fix disabling Rx VLAN filtering with port VLAN enabled
  ice: fix out-of-bounds KASAN warning in virtchnl
  ice: Do not use WQ_MEM_RECLAIM flag for workqueue
====================

Link: https://lore.kernel.org/r/20230206232934.634298-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-07 22:04:44 -08:00
Sasha Neftin
9b27517627 igc: Add ndo_tx_timeout support
On some platforms, 100/1000/2500 speeds seem to have sometimes problems
reporting false positive tx unit hang during stressful UDP traffic. Likely
other Intel drivers introduce responses to a tx hang. Update the 'tx hang'
comparator with the comparison of the head and tail of ring pointers and
restore the tx_timeout_factor to the previous value (one).

This can be test by using netperf or iperf3 applications.
Example:
iperf3 -s -p 5001
iperf3 -c 192.168.0.2 --udp -p 5001 --time 600 -b 0

netserver -p 16604
netperf -H 192.168.0.2 -l 600 -p 16604 -t UDP_STREAM -- -m 64000

Fixes: b27b8dc77b5e ("igc: Increase timeout value for Speed 100/1000/2500")
Signed-off-by: Sasha Neftin <sasha.neftin@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20230206235818.662384-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-07 21:57:26 -08:00
Haiyang Zhang
18a048370b net: mana: Fix accessing freed irq affinity_hint
After calling irq_set_affinity_and_hint(), the cpumask pointer is
saved in desc->affinity_hint, and will be used later when reading
/proc/irq/<num>/affinity_hint. So the cpumask variable needs to be
persistent. Otherwise, we are accessing freed memory when reading
the affinity_hint file.

Also, need to clear affinity_hint before free_irq(), otherwise there
is a one-time warning and stack trace during module unloading:

 [  243.948687] WARNING: CPU: 10 PID: 1589 at kernel/irq/manage.c:1913 free_irq+0x318/0x360
 ...
 [  243.948753] Call Trace:
 [  243.948754]  <TASK>
 [  243.948760]  mana_gd_remove_irqs+0x78/0xc0 [mana]
 [  243.948767]  mana_gd_remove+0x3e/0x80 [mana]
 [  243.948773]  pci_device_remove+0x3d/0xb0
 [  243.948778]  device_remove+0x46/0x70
 [  243.948782]  device_release_driver_internal+0x1fe/0x280
 [  243.948785]  driver_detach+0x4e/0xa0
 [  243.948787]  bus_remove_driver+0x70/0xf0
 [  243.948789]  driver_unregister+0x35/0x60
 [  243.948792]  pci_unregister_driver+0x44/0x90
 [  243.948794]  mana_driver_exit+0x14/0x3fe [mana]
 [  243.948800]  __do_sys_delete_module.constprop.0+0x185/0x2f0

To fix the bug, use the persistent mask, cpumask_of(cpu#), and set
affinity_hint to NULL before freeing the IRQ, as required by free_irq().

Cc: stable@vger.kernel.org
Fixes: 71fa6887eeca ("net: mana: Assign interrupts to CPUs based on NUMA nodes")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/1675718929-19565-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-07 21:30:22 -08:00
Jakub Kicinski
b1f4fbabbb linux-can-fixes-for-6.2-20230207
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEDs2BvajyNKlf9TJQvlAcSiqKBOgFAmPiWbYTHG1rbEBwZW5n
 dXRyb25peC5kZQAKCRC+UBxKKooE6HT3CACcKiUxmkFfS3I2MUCLt+spz2/4q3cG
 +MX8m03fzSU8efdKN1ckUOmQfsPp2RNP7NSeI/Anu83oka40zFuejC/Dfg9V1SBF
 EKD39VIWQTuPcYJP4GlxqKTkDHSVsA5iN/4vxzCFpp7C6LR3h2f0bTwj6SA44IvD
 eatH6Vzr/Y1rG6Yj27hZzFpC8HXqHzaIcC1UJpODqNTmhrux/hskem0XqH68VQXK
 FRi8K7xHgIs4MonamWGJj/oz1NkbEje3xvT4nZRkaMLgZY0prTQfJetSDaGwkBci
 WJmqJ15PpBRedDpdTIeXkZK8U95BWt/gZEp28XwgkohP6A8DsilJFS7F
 =Tcyi
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-fixes-for-6.2-20230207' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
can 2023-02-07

The patch is from Devid Antonio Filoni and fixes an address claiming
problem in the J1939 CAN protocol.

* tag 'linux-can-fixes-for-6.2-20230207' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
  can: j1939: do not wait 250 ms if the same addr was already claimed
====================

Link: https://lore.kernel.org/r/20230207140514.2885065-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-07 20:50:30 -08:00
Michael Kelley
c6aa9d3b43 hv_netvsc: Allocate memory in netvsc_dma_map() with GFP_ATOMIC
Memory allocations in the network transmit path must use GFP_ATOMIC
so they won't sleep.

Reported-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/lkml/8a4d08f94d3e6fe8b6da68440eaa89a088ad84f9.camel@redhat.com/
Fixes: 846da38de0e8 ("net: netvsc: Add Isolation VM support for netvsc driver")
Cc: stable@vger.kernel.org
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/1675714317-48577-1-git-send-email-mikelley@microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-07 20:39:26 -08:00
Shay Drory
8f0d1451ec net/mlx5: Serialize module cleanup with reload and remove
Currently, remove and reload flows can run in parallel to module cleanup.
This design is error prone. For example: aux_drivers callbacks are called
from both cleanup and remove flows with different lockings, which can
cause a deadlock[1].
Hence, serialize module cleanup with reload and remove.

[1]
       cleanup                        remove
       -------                        ------
   auxiliary_driver_unregister();
                                     devl_lock()
                                      auxiliary_device_delete(mlx5e_aux)
    device_lock(mlx5e_aux)
     devl_lock()
                                       device_lock(mlx5e_aux)

Fixes: 912cebf420c2 ("net/mlx5e: Connect ethernet part to auxiliary bus")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-02-07 19:01:07 -08:00
Shay Drory
184e1e4474 net/mlx5: fw_tracer, Zero consumer index when reloading the tracer
When tracer is reloaded, the device will log the traces at the
beginning of the log buffer. Also, driver is reading the log buffer in
chunks in accordance to the consumer index.
Hence, zero consumer index when reloading the tracer.

Fixes: 4383cfcc65e7 ("net/mlx5: Add devlink reload")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-02-07 19:01:06 -08:00
Shay Drory
db561fed6b net/mlx5: fw_tracer, Clear load bit when freeing string DBs buffers
Whenever the driver is reading the string DBs into buffers, the driver
is setting the load bit, but the driver never clears this bit.
As a result, in case load bit is on and the driver query the device for
new string DBs, the driver won't read again the string DBs.
Fix it by clearing the load bit when query the device for new string
DBs.

Fixes: 2d69356752ff ("net/mlx5: Add support for fw live patch event")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-02-07 19:01:06 -08:00
Maher Sanalla
9965bbebae net/mlx5: Expose SF firmware pages counter
Currently, each core device has VF pages counter which stores number of
fw pages used by its VFs and SFs.

The current design led to a hang when performing firmware reset on DPU,
where the DPU PFs stalled in sriov unload flow due to waiting on release
of SFs pages instead of waiting on only VFs pages.

Thus, Add a separate counter for SF firmware pages, which will prevent
the stall scenario described above.

Fixes: 1958fc2f0712 ("net/mlx5: SF, Add auxiliary device driver")
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-02-07 19:01:06 -08:00
Maher Sanalla
c3bdbaea65 net/mlx5: Store page counters in a single array
Currently, an independent page counter is used for tracking memory usage
for each function type such as VF, PF and host PF (DPU).

For better code-readibilty, use a single array that stores
the number of allocated memory pages for each function type.

Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-02-07 19:01:05 -08:00
Dragos Tatulea
8aa5f171d5 net/mlx5e: IPoIB, Show unknown speed instead of error
ethtool is returning an error for unknown speeds for the IPoIB interface:

$ ethtool ib0
netlink error: failed to retrieve link settings
netlink error: Invalid argument
netlink error: failed to retrieve link settings
netlink error: Invalid argument
Settings for ib0:
Link detected: no

After this change, ethtool will return success and show "unknown speed":

$ ethtool ib0
Settings for ib0:
Supported ports: [  ]
Supported link modes:   Not reported
Supported pause frame use: No
Supports auto-negotiation: No
Supported FEC modes: Not reported
Advertised link modes:  Not reported
Advertised pause frame use: No
Advertised auto-negotiation: No
Advertised FEC modes: Not reported
Speed: Unknown!
Duplex: Full
Auto-negotiation: off
Port: Other
PHYAD: 0
Transceiver: internal
Link detected: no

Fixes: eb234ee9d541 ("net/mlx5e: IPoIB, Add support for get_link_ksettings in ethtool")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-02-07 19:01:05 -08:00
Amir Tzin
8974aa9638 net/mlx5e: Fix crash unsetting rx-vlan-filter in switchdev mode
Moving to switchdev mode with rx-vlan-filter on and then setting it off
causes the kernel to crash since fs->vlan is freed during nic profile
cleanup flow.

RX VLAN filtering is not supported in switchdev mode so unset it when
changing to switchdev and restore its value when switching back to
legacy.

trace:
[] RIP: 0010:mlx5e_disable_cvlan_filter+0x43/0x70
[] set_feature_cvlan_filter+0x37/0x40 [mlx5_core]
[] mlx5e_handle_feature+0x3a/0x60 [mlx5_core]
[] mlx5e_set_features+0x6d/0x160 [mlx5_core]
[] __netdev_update_features+0x288/0xa70
[] ethnl_set_features+0x309/0x380
[] ? __nla_parse+0x21/0x30
[] genl_family_rcv_msg_doit.isra.17+0x110/0x150
[] genl_rcv_msg+0x112/0x260
[] ? features_reply_size+0xe0/0xe0
[] ? genl_family_rcv_msg_doit.isra.17+0x150/0x150
[] netlink_rcv_skb+0x4e/0x100
[] genl_rcv+0x24/0x40
[] netlink_unicast+0x1ab/0x290
[] netlink_sendmsg+0x257/0x4f0
[] sock_sendmsg+0x5c/0x70

Fixes: cb67b832921c ("net/mlx5e: Introduce SRIOV VF representors")
Signed-off-by: Amir Tzin <amirtz@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-02-07 19:01:05 -08:00
Vlad Buslov
da0c52426c net/mlx5: Bridge, fix ageing of peer FDB entries
SWITCHDEV_FDB_ADD_TO_BRIDGE event handler that updates FDB entry 'lastuse'
field is only executed for eswitch that owns the entry. However, if peer
entry processed packets at least once it will have hardware counter 'used'
value greater than entry 'lastuse' from that point on, which will cause FDB
entry not being aged out.

Process the event on all eswitch instances.

Fixes: ff9b7521468b ("net/mlx5: Bridge, support LAG")
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-02-07 19:01:05 -08:00
Yevgeny Kliteynik
288d85e07f net/mlx5: DR, Fix potential race in dr_rule_create_rule_nic
Selecting builder should be protected by the lock to prevent the case
where a new rule sets a builder in the nic_matcher while the previous
rule is still using the nic_matcher.

Fixing this issue and cleaning the error flow.

Fixes: b9b81e1e9382 ("net/mlx5: DR, For short chains of STEs, avoid allocating ste_arr dynamically")
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-02-07 19:01:04 -08:00
Adham Faris
1e66220948 net/mlx5e: Update rx ring hw mtu upon each rx-fcs flag change
rq->hw_mtu is used in function en_rx.c/mlx5e_skb_from_cqe_mpwrq_linear()
to catch oversized packets. If FCS is concatenated to the end of the
packet then the check should be updated accordingly.

Rx rings initialization (mlx5e_init_rxq_rq()) invoked for every new set
of channels, as part of mlx5e_safe_switch_params(), unknowingly if it
runs with default configuration or not. Current rq->hw_mtu
initialization assumes default configuration and ignores
params->scatter_fcs_en flag state.
Fix this, by accounting for params->scatter_fcs_en flag state during
rq->hw_mtu initialization.

In addition, updating rq->hw_mtu value during ingress traffic might
lead to packets drop and oversize_pkts_sw_drop counter increase with no
good reason. Hence we remove this optimization and switch the set of
channels with a new one, to make sure we don't get false positives on
the oversize_pkts_sw_drop counter.

Fixes: 102722fc6832 ("net/mlx5e: Add support for RXFCS feature flag")
Signed-off-by: Adham Faris <afaris@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2023-02-07 19:01:04 -08:00
Linus Torvalds
0983f6bf2b Devicetree fixes for v6.2, part 2:
- Fix handling of multiple OF framebuffer devices
 
 - Fix booting on Socionext Synquacer with bad 'dma-ranges' entries
 
 - Add DT binding .yamllint to .gitignore
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAmPiyF0ACgkQ+vtdtY28
 YcNZpA//adsF6J/7vPo9SiRj/Y+EFZ2m2J1mUubMP4UEAGtfG+JnkOmvjTgO3Rar
 aYivXNh2pqKqvOEw35XR5donzeygTxr953W4xzGRrPW282tZeWBeeKfz0bikVguv
 v7WzTlmYVL1I6P5KCNB2C5f5zZEXdDc7dA3pJmt+IM5sjEuduhqwlrw57Rd0uRdM
 OZOjlrLiwA1I9AIo081EmoPHUAcsckgxJzFqdLvh8Vz1JT3n8Rtj1GXuWQvSAueX
 p8vu771QeVJSgzEuZfA4g5Nia9T30g2XhxeD5bBtgndGoKdE8j2AzMoghBODJABa
 hvKd1DV9jP14c9HV7M/kgFb8JBEmXnKpARiSIyoP/KXLVraJS3HpbOSgbuTgdrp5
 lGxLbb44NYGGbPt1Sl25XPInr2b3BMfto/pvve9G9863fcY6SUk5aDuGIx6pNBsl
 lufniFQbaFTLdidskQ7bm187L0QEE4wGMY1ZDs77elMaMdtP6THXfim2Scq1UAKt
 tKMgYUciL/nps7rHCovmS8iXvysU7qztw6+F1nlOJMrrDld7LgrOZffWAjBnbNDa
 z1RZY1O1uyIIMam8VgeBbdMNoiqNv+2MNWH9STpY74DqS3+o2+pXyJitMS6+1U7p
 Rhlzbiwei4F1E0rFZqSGcWsgx+yy7HlgsfnEOKGkWjoFmBik26k=
 =T25l
 -----END PGP SIGNATURE-----

Merge tag 'devicetree-fixes-for-6.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull devicetree fixes from Rob Herring:

 - Fix handling of multiple OF framebuffer devices

 - Fix booting on Socionext Synquacer with bad 'dma-ranges' entries

 - Add DT binding .yamllint to .gitignore

* tag 'devicetree-fixes-for-6.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  dt-bindings: interrupt-controller: arm,gic-v3: Fix typo in description of msi-controller property
  dt-bindings: Fix .gitignore
  of/address: Return an error when no valid dma-ranges are found
  of: Make OF framebuffer device names unique
2023-02-07 14:17:12 -08:00
Linus Torvalds
513c1a3d3f Fix regression in poll() and select()
With the fix that made poll() and select() block if read would block
 caused a slight regression in rasdaemon, as it needed that kind
 of behavior. Add a way to make that behavior come back by writing
 zero into the "buffer_percentage", which means to never block on read.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCY+Jn3xQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qgQ6AQC30hHcPMPm8+drlH/P6wEYstRP6xbp
 nHYHcvT6qXNPtAD+OhUQR2Zav66m6cE0qvkdnZb72E0YHRTfBhN5OpshTgQ=
 =dJEF
 -----END PGP SIGNATURE-----

Merge tag 'trace-v6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing fix from Steven Rostedt:
 "Fix regression in poll() and select()

  With the fix that made poll() and select() block if read would block
  caused a slight regression in rasdaemon, as it needed that kind of
  behavior. Add a way to make that behavior come back by writing zero
  into the 'buffer_percentage', which means to never block on read"

* tag 'trace-v6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  tracing: Fix poll() and select() do not work on per_cpu trace_pipe and trace_pipe_raw
2023-02-07 07:54:40 -08:00
Devid Antonio Filoni
4ae5e1e97c can: j1939: do not wait 250 ms if the same addr was already claimed
The ISO 11783-5 standard, in "4.5.2 - Address claim requirements", states:
  d) No CF shall begin, or resume, transmission on the network until 250
     ms after it has successfully claimed an address except when
     responding to a request for address-claimed.

But "Figure 6" and "Figure 7" in "4.5.4.2 - Address-claim
prioritization" show that the CF begins the transmission after 250 ms
from the first AC (address-claimed) message even if it sends another AC
message during that time window to resolve the address contention with
another CF.

As stated in "4.4.2.3 - Address-claimed message":
  In order to successfully claim an address, the CF sending an address
  claimed message shall not receive a contending claim from another CF
  for at least 250 ms.

As stated in "4.4.3.2 - NAME management (NM) message":
  1) A commanding CF can
     d) request that a CF with a specified NAME transmit the address-
        claimed message with its current NAME.
  2) A target CF shall
     d) send an address-claimed message in response to a request for a
        matching NAME

Taking the above arguments into account, the 250 ms wait is requested
only during network initialization.

Do not restart the timer on AC message if both the NAME and the address
match and so if the address has already been claimed (timer has expired)
or the AC message has been sent to resolve the contention with another
CF (timer is still running).

Signed-off-by: Devid Antonio Filoni <devid.filoni@egluetechnologies.com>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/all/20221125170418.34575-1-devid.filoni@egluetechnologies.com
Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol")
Cc: stable@vger.kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2023-02-07 15:00:22 +01:00
Jiri Pirko
565b4824c3 devlink: change port event netdev notifier from per-net to global
Currently only the network namespace of devlink instance is monitored
for port events. If netdev is moved to a different namespace and then
unregistered, NETDEV_PRE_UNINIT is missed which leads to trigger
following WARN_ON in devl_port_unregister().
WARN_ON(devlink_port->type != DEVLINK_PORT_TYPE_NOTSET);

Fix this by changing the netdev notifier from per-net to global so no
event is missed.

Fixes: 02a68a47eade ("net: devlink: track netdev with devlink_port assigned")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20230206094151.2557264-1-jiri@resnulli.us
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-07 14:13:55 +01:00
Vladimir Oltean
bbb253b206 selftests: ocelot: tc_flower_chains: make test_vlan_ingress_modify() more comprehensive
We have two IS1 filters of the OCELOT_VCAP_KEY_ANY key type (the one with
"action vlan pop" and the one with "action vlan modify") and one of the
OCELOT_VCAP_KEY_IPV4 key type (the one with "action skbedit priority").
But we have no IS1 filter with the OCELOT_VCAP_KEY_ETYPE key type, and
there was an uncaught breakage there.

To increase test coverage, convert one of the OCELOT_VCAP_KEY_ANY
filters to OCELOT_VCAP_KEY_ETYPE, by making the filter also match on the
MAC SA of the traffic sent by mausezahn, $h1_mac.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20230205192409.1796428-2-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-07 12:20:21 +01:00
Vladimir Oltean
f964f8399d net: mscc: ocelot: fix VCAP filters not matching on MAC with "protocol 802.1Q"
Alternative short title: don't instruct the hardware to match on
EtherType with "protocol 802.1Q" flower filters. It doesn't work for the
reasons detailed below.

With a command such as the following:

tc filter add dev $swp1 ingress chain $(IS1 2) pref 3 \
	protocol 802.1Q flower skip_sw vlan_id 200 src_mac $h1_mac \
	action vlan modify id 300 \
	action goto chain $(IS2 0 0)

the created filter is set by ocelot_flower_parse_key() to be of type
OCELOT_VCAP_KEY_ETYPE, and etype is set to {value=0x8100, mask=0xffff}.
This gets propagated all the way to is1_entry_set() which commits it to
hardware (the VCAP_IS1_HK_ETYPE field of the key). Compare this to the
case where src_mac isn't specified - the key type is OCELOT_VCAP_KEY_ANY,
and is1_entry_set() doesn't populate VCAP_IS1_HK_ETYPE.

The problem is that for VLAN-tagged frames, the hardware interprets the
ETYPE field as holding the encapsulated VLAN protocol. So the above
filter will only match those packets which have an encapsulated protocol
of 0x8100, rather than all packets with VLAN ID 200 and the given src_mac.

The reason why this is allowed to occur is because, although we have a
block of code in ocelot_flower_parse_key() which sets "match_protocol"
to false when VLAN keys are present, that code executes too late.
There is another block of code, which executes for Ethernet addresses,
and has a "goto finished_key_parsing" and skips the VLAN header parsing.
By skipping it, "match_protocol" remains with the value it was
initialized with, i.e. "true", and "proto" is set to f->common.protocol,
or 0x8100.

The concept of ignoring some keys rather than erroring out when they are
present but can't be offloaded is dubious in itself, but is present
since the initial commit fe3490e6107e ("net: mscc: ocelot: Hardware
ofload for tc flower filter"), and it's outside of the scope of this
patch to change that.

The problem was introduced when the driver started to interpret the
flower filter's protocol, and populate the VCAP filter's ETYPE field
based on it.

To fix this, it is sufficient to move the code that parses the VLAN keys
earlier than the "goto finished_key_parsing" instruction. This will
ensure that if we have a flower filter with both VLAN and Ethernet
address keys, it won't match on ETYPE 0x8100, because the VLAN key
parsing sets "match_protocol = false".

Fixes: 86b956de119c ("net: mscc: ocelot: support matching on EtherType")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230205192409.1796428-1-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-07 12:20:21 +01:00
Vladimir Oltean
0b6d642510 net: dsa: mt7530: don't change PVC_EG_TAG when CPU port becomes VLAN-aware
Frank reports that in a mt7530 setup where some ports are standalone and
some are in a VLAN-aware bridge, 8021q uppers of the standalone ports
lose their VLAN tag on xmit, as seen by the link partner.

This seems to occur because once the other ports join the VLAN-aware
bridge, mt7530_port_vlan_filtering() also calls
mt7530_port_set_vlan_aware(ds, cpu_dp->index), and this affects the way
that the switch processes the traffic of the standalone port.

Relevant is the PVC_EG_TAG bit. The MT7530 documentation says about it:

EG_TAG: Incoming Port Egress Tag VLAN Attribution
0: disabled (system default)
1: consistent (keep the original ingress tag attribute)

My interpretation is that this setting applies on the ingress port, and
"disabled" is basically the normal behavior, where the egress tag format
of the packet (tagged or untagged) is decided by the VLAN table
(MT7530_VLAN_EGRESS_UNTAG or MT7530_VLAN_EGRESS_TAG).

But there is also an option of overriding the system default behavior,
and for the egress tagging format of packets to be decided not by the
VLAN table, but simply by copying the ingress tag format (if ingress was
tagged, egress is tagged; if ingress was untagged, egress is untagged;
aka "consistent). This is useful in 2 scenarios:

- VLAN-unaware bridge ports will always encounter a miss in the VLAN
  table. They should forward a packet as-is, though. So we use
  "consistent" there. See commit e045124e9399 ("net: dsa: mt7530: fix
  tagged frames pass-through in VLAN-unaware mode").

- Traffic injected from the CPU port. The operating system is in god
  mode; if it wants a packet to exit as VLAN-tagged, it sends it as
  VLAN-tagged. Otherwise it sends it as VLAN-untagged*.

*This is true only if we don't consider the bridge TX forwarding offload
feature, which mt7530 doesn't support.

So for now, make the CPU port always stay in "consistent" mode to allow
software VLANs to be forwarded to their egress ports with the VLAN tag
intact, and not stripped.

Link: https://lore.kernel.org/netdev/trinity-e6294d28-636c-4c40-bb8b-b523521b00be-1674233135062@3c-app-gmx-bs36/
Fixes: e045124e9399 ("net: dsa: mt7530: fix tagged frames pass-through in VLAN-unaware mode")
Reported-by: Frank Wunderlich <frank-w@public-files.de>
Tested-by: Frank Wunderlich <frank-w@public-files.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20230205140713.1609281-1-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-07 11:42:04 +01:00
ZhaoLong Wang
aa5465aeca cifs: Fix use-after-free in rdata->read_into_pages()
When the network status is unstable, use-after-free may occur when
read data from the server.

  BUG: KASAN: use-after-free in readpages_fill_pages+0x14c/0x7e0

  Call Trace:
   <TASK>
   dump_stack_lvl+0x38/0x4c
   print_report+0x16f/0x4a6
   kasan_report+0xb7/0x130
   readpages_fill_pages+0x14c/0x7e0
   cifs_readv_receive+0x46d/0xa40
   cifs_demultiplex_thread+0x121c/0x1490
   kthread+0x16b/0x1a0
   ret_from_fork+0x2c/0x50
   </TASK>

  Allocated by task 2535:
   kasan_save_stack+0x22/0x50
   kasan_set_track+0x25/0x30
   __kasan_kmalloc+0x82/0x90
   cifs_readdata_direct_alloc+0x2c/0x110
   cifs_readdata_alloc+0x2d/0x60
   cifs_readahead+0x393/0xfe0
   read_pages+0x12f/0x470
   page_cache_ra_unbounded+0x1b1/0x240
   filemap_get_pages+0x1c8/0x9a0
   filemap_read+0x1c0/0x540
   cifs_strict_readv+0x21b/0x240
   vfs_read+0x395/0x4b0
   ksys_read+0xb8/0x150
   do_syscall_64+0x3f/0x90
   entry_SYSCALL_64_after_hwframe+0x72/0xdc

  Freed by task 79:
   kasan_save_stack+0x22/0x50
   kasan_set_track+0x25/0x30
   kasan_save_free_info+0x2e/0x50
   __kasan_slab_free+0x10e/0x1a0
   __kmem_cache_free+0x7a/0x1a0
   cifs_readdata_release+0x49/0x60
   process_one_work+0x46c/0x760
   worker_thread+0x2a4/0x6f0
   kthread+0x16b/0x1a0
   ret_from_fork+0x2c/0x50

  Last potentially related work creation:
   kasan_save_stack+0x22/0x50
   __kasan_record_aux_stack+0x95/0xb0
   insert_work+0x2b/0x130
   __queue_work+0x1fe/0x660
   queue_work_on+0x4b/0x60
   smb2_readv_callback+0x396/0x800
   cifs_abort_connection+0x474/0x6a0
   cifs_reconnect+0x5cb/0xa50
   cifs_readv_from_socket.cold+0x22/0x6c
   cifs_read_page_from_socket+0xc1/0x100
   readpages_fill_pages.cold+0x2f/0x46
   cifs_readv_receive+0x46d/0xa40
   cifs_demultiplex_thread+0x121c/0x1490
   kthread+0x16b/0x1a0
   ret_from_fork+0x2c/0x50

The following function calls will cause UAF of the rdata pointer.

readpages_fill_pages
 cifs_read_page_from_socket
  cifs_readv_from_socket
   cifs_reconnect
    __cifs_reconnect
     cifs_abort_connection
      mid->callback() --> smb2_readv_callback
       queue_work(&rdata->work)  # if the worker completes first,
                                 # the rdata is freed
          cifs_readv_complete
            kref_put
              cifs_readdata_release
                kfree(rdata)
 return rdata->...               # UAF in readpages_fill_pages()

Similarly, this problem also occurs in the uncache_fill_pages().

Fix this by adjusts the order of condition judgment in the return
statement.

Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
Cc: stable@vger.kernel.org
Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-02-06 22:50:25 -06:00
Zhang Changzhong
4a606ce684 ice: switch: fix potential memleak in ice_add_adv_recipe()
When ice_add_special_words() fails, the 'rm' is not released, which will
lead to a memory leak. Fix this up by going to 'err_unroll' label.

Compile tested only.

Fixes: 8b032a55c1bd ("ice: low level support for tunnels")
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
2023-02-06 15:13:02 -08:00
Dan Carpenter
3f4870df1b ice: Fix off by one in ice_tc_forward_to_queue()
The > comparison should be >= to prevent reading one element beyond
the end of the array.

The "vsi->num_rxq" is not strictly speaking the number of elements in
the vsi->rxq_map[] array.  The array has "vsi->alloc_rxq" elements and
"vsi->num_rxq" is less than or equal to the number of elements in the
array.  The array is allocated in ice_vsi_alloc_arrays().  It's still
an off by one but it might not access outside the end of the array.

Fixes: 143b86f346c7 ("ice: Enable RX queue selection using skbedit action")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Amritha Nambiar <amritha.nambiar@intel.com>
Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
2023-02-06 15:13:02 -08:00
Brett Creeley
c793f8ea15 ice: Fix disabling Rx VLAN filtering with port VLAN enabled
If the user turns on the vf-true-promiscuous-support flag, then Rx VLAN
filtering will be disabled if the VF requests to enable promiscuous
mode. When the VF is in a port VLAN, this is the incorrect behavior
because it will allow the VF to receive traffic outside of its port VLAN
domain. Fortunately this only resulted in the VF(s) receiving broadcast
traffic outside of the VLAN domain because all of the VLAN promiscuous
rules are based on the port VLAN ID. Fix this by setting the
.disable_rx_filtering VLAN op to a no-op when a port VLAN is enabled on
the VF.

Also, make sure to make this fix for both Single VLAN Mode and Double
VLAN Mode enabled devices.

Fixes: c31af68a1b94 ("ice: Add outer_vlan_ops and VSI specific VLAN ops implementations")
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Karen Ostrowska <karen.ostrowska@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-02-06 15:13:02 -08:00
Michal Swiatkowski
b2dbde3ad4 ice: fix out-of-bounds KASAN warning in virtchnl
KASAN reported:
[ 9793.708867] BUG: KASAN: global-out-of-bounds in ice_get_link_speed+0x16/0x30 [ice]
[ 9793.709205] Read of size 4 at addr ffffffffc1271b1c by task kworker/6:1/402

[ 9793.709222] CPU: 6 PID: 402 Comm: kworker/6:1 Kdump: loaded Tainted: G    B      OE      6.1.0+ #3
[ 9793.709235] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.00.01.0014.070920180847 07/09/2018
[ 9793.709245] Workqueue: ice ice_service_task [ice]
[ 9793.709575] Call Trace:
[ 9793.709582]  <TASK>
[ 9793.709588]  dump_stack_lvl+0x44/0x5c
[ 9793.709613]  print_report+0x17f/0x47b
[ 9793.709632]  ? __cpuidle_text_end+0x5/0x5
[ 9793.709653]  ? ice_get_link_speed+0x16/0x30 [ice]
[ 9793.709986]  ? ice_get_link_speed+0x16/0x30 [ice]
[ 9793.710317]  kasan_report+0xb7/0x140
[ 9793.710335]  ? ice_get_link_speed+0x16/0x30 [ice]
[ 9793.710673]  ice_get_link_speed+0x16/0x30 [ice]
[ 9793.711006]  ice_vc_notify_vf_link_state+0x14c/0x160 [ice]
[ 9793.711351]  ? ice_vc_repr_cfg_promiscuous_mode+0x120/0x120 [ice]
[ 9793.711698]  ice_vc_process_vf_msg+0x7a7/0xc00 [ice]
[ 9793.712074]  __ice_clean_ctrlq+0x98f/0xd20 [ice]
[ 9793.712534]  ? ice_bridge_setlink+0x410/0x410 [ice]
[ 9793.712979]  ? __request_module+0x320/0x520
[ 9793.713014]  ? ice_process_vflr_event+0x27/0x130 [ice]
[ 9793.713489]  ice_service_task+0x11cf/0x1950 [ice]
[ 9793.713948]  ? io_schedule_timeout+0xb0/0xb0
[ 9793.713972]  process_one_work+0x3d0/0x6a0
[ 9793.714003]  worker_thread+0x8a/0x610
[ 9793.714031]  ? process_one_work+0x6a0/0x6a0
[ 9793.714049]  kthread+0x164/0x1a0
[ 9793.714071]  ? kthread_complete_and_exit+0x20/0x20
[ 9793.714100]  ret_from_fork+0x1f/0x30
[ 9793.714137]  </TASK>

[ 9793.714151] The buggy address belongs to the variable:
[ 9793.714158]  ice_aq_to_link_speed+0x3c/0xffffffffffff3520 [ice]

[ 9793.714632] Memory state around the buggy address:
[ 9793.714642]  ffffffffc1271a00: f9 f9 f9 f9 00 00 05 f9 f9 f9 f9 f9 00 00 02 f9
[ 9793.714656]  ffffffffc1271a80: f9 f9 f9 f9 00 00 04 f9 f9 f9 f9 f9 00 00 00 00
[ 9793.714670] >ffffffffc1271b00: 00 00 00 04 f9 f9 f9 f9 04 f9 f9 f9 f9 f9 f9 f9
[ 9793.714680]                             ^
[ 9793.714690]  ffffffffc1271b80: 00 00 00 00 00 04 f9 f9 f9 f9 f9 f9 00 00 00 00
[ 9793.714704]  ffffffffc1271c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

The ICE_AQ_LINK_SPEED_UNKNOWN define is BIT(15). The value is bigger
than both legacy and normal link speed tables. Add one element (0 -
unknown) to both tables. There is no need to explicitly set table size,
leave it empty.

Fixes: 1d0e28a9be1f ("ice: Remove and replace ice speed defines with ethtool.h versions")
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
2023-02-06 15:13:02 -08:00
Anirudh Venkataramanan
4d159f7884 ice: Do not use WQ_MEM_RECLAIM flag for workqueue
When both ice and the irdma driver are loaded, a warning in
check_flush_dependency is being triggered. This is due to ice driver
workqueue being allocated with the WQ_MEM_RECLAIM flag and the irdma one
is not.

According to kernel documentation, this flag should be set if the
workqueue will be involved in the kernel's memory reclamation flow.
Since it is not, there is no need for the ice driver's WQ to have this
flag set so remove it.

Example trace:

[  +0.000004] workqueue: WQ_MEM_RECLAIM ice:ice_service_task [ice] is flushing !WQ_MEM_RECLAIM infiniband:0x0
[  +0.000139] WARNING: CPU: 0 PID: 728 at kernel/workqueue.c:2632 check_flush_dependency+0x178/0x1a0
[  +0.000011] Modules linked in: bonding tls xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_cha
in_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink bridge stp llc rfkill vfat fat intel_rapl_msr intel
_rapl_common isst_if_common skx_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct1
0dif_pclmul crc32_pclmul ghash_clmulni_intel rapl intel_cstate rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_target_mod target_
core_mod ib_iser libiscsi scsi_transport_iscsi rdma_cm ib_cm iw_cm iTCO_wdt iTCO_vendor_support ipmi_ssif irdma mei_me ib_uverbs
ib_core intel_uncore joydev pcspkr i2c_i801 acpi_ipmi mei lpc_ich i2c_smbus intel_pch_thermal ioatdma ipmi_si acpi_power_meter
acpi_pad xfs libcrc32c sd_mod t10_pi crc64_rocksoft crc64 sg ahci ixgbe libahci ice i40e igb crc32c_intel mdio i2c_algo_bit liba
ta dca wmi dm_mirror dm_region_hash dm_log dm_mod ipmi_devintf ipmi_msghandler fuse
[  +0.000161]  [last unloaded: bonding]
[  +0.000006] CPU: 0 PID: 728 Comm: kworker/0:2 Tainted: G S                 6.2.0-rc2_next-queue-13jan-00458-gc20aabd57164 #1
[  +0.000006] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0010.010620200716 01/06/2020
[  +0.000003] Workqueue: ice ice_service_task [ice]
[  +0.000127] RIP: 0010:check_flush_dependency+0x178/0x1a0
[  +0.000005] Code: 89 8e 02 01 e8 49 3d 40 00 49 8b 55 18 48 8d 8d d0 00 00 00 48 8d b3 d0 00 00 00 4d 89 e0 48 c7 c7 e0 3b 08
9f e8 bb d3 07 01 <0f> 0b e9 be fe ff ff 80 3d 24 89 8e 02 00 0f 85 6b ff ff ff e9 06
[  +0.000004] RSP: 0018:ffff88810a39f990 EFLAGS: 00010282
[  +0.000005] RAX: 0000000000000000 RBX: ffff888141bc2400 RCX: 0000000000000000
[  +0.000004] RDX: 0000000000000001 RSI: dffffc0000000000 RDI: ffffffffa1213a80
[  +0.000003] RBP: ffff888194bf3400 R08: ffffed117b306112 R09: ffffed117b306112
[  +0.000003] R10: ffff888bd983088b R11: ffffed117b306111 R12: 0000000000000000
[  +0.000003] R13: ffff888111f84d00 R14: ffff88810a3943ac R15: ffff888194bf3400
[  +0.000004] FS:  0000000000000000(0000) GS:ffff888bd9800000(0000) knlGS:0000000000000000
[  +0.000003] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  +0.000003] CR2: 000056035b208b60 CR3: 000000017795e005 CR4: 00000000007706f0
[  +0.000003] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  +0.000003] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  +0.000002] PKRU: 55555554
[  +0.000003] Call Trace:
[  +0.000002]  <TASK>
[  +0.000003]  __flush_workqueue+0x203/0x840
[  +0.000006]  ? mutex_unlock+0x84/0xd0
[  +0.000008]  ? __pfx_mutex_unlock+0x10/0x10
[  +0.000004]  ? __pfx___flush_workqueue+0x10/0x10
[  +0.000006]  ? mutex_lock+0xa3/0xf0
[  +0.000005]  ib_cache_cleanup_one+0x39/0x190 [ib_core]
[  +0.000174]  __ib_unregister_device+0x84/0xf0 [ib_core]
[  +0.000094]  ib_unregister_device+0x25/0x30 [ib_core]
[  +0.000093]  irdma_ib_unregister_device+0x97/0xc0 [irdma]
[  +0.000064]  ? __pfx_irdma_ib_unregister_device+0x10/0x10 [irdma]
[  +0.000059]  ? up_write+0x5c/0x90
[  +0.000005]  irdma_remove+0x36/0x90 [irdma]
[  +0.000062]  auxiliary_bus_remove+0x32/0x50
[  +0.000007]  device_release_driver_internal+0xfa/0x1c0
[  +0.000005]  bus_remove_device+0x18a/0x260
[  +0.000007]  device_del+0x2e5/0x650
[  +0.000005]  ? __pfx_device_del+0x10/0x10
[  +0.000003]  ? mutex_unlock+0x84/0xd0
[  +0.000004]  ? __pfx_mutex_unlock+0x10/0x10
[  +0.000004]  ? _raw_spin_unlock+0x18/0x40
[  +0.000005]  ice_unplug_aux_dev+0x52/0x70 [ice]
[  +0.000160]  ice_service_task+0x1309/0x14f0 [ice]
[  +0.000134]  ? __pfx___schedule+0x10/0x10
[  +0.000006]  process_one_work+0x3b1/0x6c0
[  +0.000008]  worker_thread+0x69/0x670
[  +0.000005]  ? __kthread_parkme+0xec/0x110
[  +0.000007]  ? __pfx_worker_thread+0x10/0x10
[  +0.000005]  kthread+0x17f/0x1b0
[  +0.000005]  ? __pfx_kthread+0x10/0x10
[  +0.000004]  ret_from_fork+0x29/0x50
[  +0.000009]  </TASK>

Fixes: 940b61af02f4 ("ice: Initialize PF and setup miscellaneous interrupt")
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Tested-by: Jakub Andrysiak <jakub.andrysiak@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
2023-02-06 15:13:02 -08:00
Linus Torvalds
05ecb68070 cgroup fixes for v6.2-rc7
During v6.2 cycle, there were a series of changes to task cpu affinity
 handling which fixed cpuset inadvertently clobbering user-configured
 affinity masks. Unfortunately, they broke the affinity handling on hybrid
 heterogeneous CPUs which have cores that can execute both 64 and 32bit along
 with cores that can only execute 32bit code.
 
 This late pull request contains two fix patches for the above issue. While
 reverting the changes that caused the regression is definitely an option,
 the origial patches do improve how cpuset behave signficantly in some cases
 and the fixes seem fairly safe, so I think it'd be better to try to fix them
 first.
 -----BEGIN PGP SIGNATURE-----
 
 iIQEABYIACwWIQTfIjM1kS57o3GsC/uxYfJx3gVYGQUCY+F1Fg4cdGpAa2VybmVs
 Lm9yZwAKCRCxYfJx3gVYGXK8AP0SuFGMgUQL1gYVGfFG3gMCcxDKHVhdU+UP/N1l
 oJXEpgEA8DW1otuQZz0+MHUyYHkEIUQ5eVj1f2BJfHRkc5r5Cgo=
 =MXxv
 -----END PGP SIGNATURE-----

Merge tag 'cgroup-for-6.2-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup fixes from Tejun Heo:
 "During the v6.2 cycle, there were a series of changes to task cpu
  affinity handling which fixed cpuset inadvertently clobbering
  user-configured affinity masks. Unfortunately, they broke the affinity
  handling on hybrid heterogeneous CPUs which have cores that can
  execute both 64 and 32bit along with cores that can only execute 32bit
  code.

  This contains two fix patches for the above issue. While reverting the
  changes that caused the regression is definitely an option, the
  origial patches do improve how cpuset behave signficantly in some
  cases and the fixes seem fairly safe, so I think it'd be better to try
  to fix them first"

* tag 'cgroup-for-6.2-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cpuset: Call set_cpus_allowed_ptr() with appropriate mask for task
  cgroup/cpuset: Don't filter offline CPUs in cpuset_cpus_allowed() for top cpuset tasks
2023-02-06 14:07:38 -08:00
Linus Torvalds
66fcf74e5c for-6.2-rc7-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmPhSm8ACgkQxWXV+ddt
 WDtucA/+MYsOjRZtG76NFUzDVaWpgPJ0/M7lJlzQkhMpRZwjVheDBDCGDSlu/Xzq
 wLdvc4VR/o0xZD90KtnQNDPwq1jknBHynVUiWAUzt0FKWu81Jd5TvfRMmGKGQ5B2
 CxSdfB2iatL/1L+DZ3q4uUXg8L+MDKTtjk2xOb648pXrT2MIy3u3j9ZhlDiYhvWx
 6YlPyUehq7a9gLXq6TGmZjC4FUboqlI6hdf3iu3rHlCeFFXTPT4QKR9G8FpVRikc
 C7lH8X3qV2Sg6rGaFT3BIsamS/rQZHh3zOuj4EbI/n6ZXiSsr0Bo/2JAxgyGYoH0
 u5LkIRIpry7E4Pn2vc9mj9T7C+tpN7BP+rQ9wL6r9KIbDB/c1hOsfOp+uZikukpY
 Lg9EvHksHyp0Fcrro3FxswRlK1Q5Q7Vx/+VUoYB93WCl8iQtEiVOH2LSoR+ZtSiD
 /Iitx8i1qcNO5DiFPcZgVC0WbrEfDoVqnwPrvY77BsBMA7i4l6Pe/n5Kw/vzRGmY
 ywo08fri7Daqv3HulBk3QrVGw4lHFPOuUpN9DkI3WfUoXTNeclzTPFS+27XnaXZn
 bP3OLf7hU7zTRC8FukWk9X4nPSTLT0xJ8LllGdMp1Wi9ntavqIDiJAviGsyqvneC
 FTgTKHFuvXvzgnji66Lo61wMEPRbac49diAKcmSiQwua/I7aPRY=
 =5fdr
 -----END PGP SIGNATURE-----

Merge tag 'for-6.2-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - explicitly initialize zlib work memory to fix a KCSAN warning

 - limit number of send clones by maximum memory allocated

 - limit device size extent in case it device shrink races with chunk
   allocation

 - raid56 fixes:
     - fix copy&paste error in RAID6 stripe recovery
     - make error bitmap update atomic

* tag 'for-6.2-rc7-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: raid56: make error_bitmap update atomic
  btrfs: send: limit number of clones and allocated memory size
  btrfs: zlib: zero-initialize zlib workspace
  btrfs: limit device extents to the device size
  btrfs: raid56: fix stripes if vertical errors are found
2023-02-06 14:05:16 -08:00
Will Deacon
7a2127e66a cpuset: Call set_cpus_allowed_ptr() with appropriate mask for task
set_cpus_allowed_ptr() will fail with -EINVAL if the requested
affinity mask is not a subset of the task_cpu_possible_mask() for the
task being updated. Consequently, on a heterogeneous system with cpusets
spanning the different CPU types, updates to the cgroup hierarchy can
silently fail to update task affinities when the effective affinity
mask for the cpuset is expanded.

For example, consider an arm64 system with 4 CPUs, where CPUs 2-3 are
the only cores capable of executing 32-bit tasks. Attaching a 32-bit
task to a cpuset containing CPUs 0-2 will correctly affine the task to
CPU 2. Extending the cpuset to CPUs 0-3, however, will fail to extend
the affinity mask of the 32-bit task because update_tasks_cpumask() will
pass the full 0-3 mask to set_cpus_allowed_ptr().

Extend update_tasks_cpumask() to take a temporary 'cpumask' paramater
and use it to mask the 'effective_cpus' mask with the possible mask for
each task being updated.

Fixes: 431c69fac05b ("cpuset: Honour task_cpu_possible_mask() in guarantee_online_cpus()")
Signed-off-by: Will Deacon <will@kernel.org>
Acked-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2023-02-06 10:18:36 -10:00
Waiman Long
3fb906e7fa cgroup/cpuset: Don't filter offline CPUs in cpuset_cpus_allowed() for top cpuset tasks
Since commit 8f9ea86fdf99 ("sched: Always preserve the user
requested cpumask"), relax_compatible_cpus_allowed_ptr() is calling
__sched_setaffinity() unconditionally. This helps to expose a bug in
the current cpuset hotplug code where the cpumasks of the tasks in
the top cpuset are not updated at all when some CPUs become online or
offline. It is likely caused by the fact that some of the tasks in the
top cpuset, like percpu kthreads, cannot have their cpu affinity changed.

One way to reproduce this as suggested by Peter is:
 - boot machine
 - offline all CPUs except one
 - taskset -p ffffffff $$
 - online all CPUs

Fix this by allowing cpuset_cpus_allowed() to return a wider mask that
includes offline CPUs for those tasks that are in the top cpuset. For
tasks not in the top cpuset, the old rule applies and only online CPUs
will be returned in the mask since hotplug events will update their
cpumasks accordingly.

Fixes: 8f9ea86fdf99 ("sched: Always preserve the user requested cpumask")
Reported-by: Will Deacon <will@kernel.org>
Originally-from: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Will Deacon <will@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2023-02-06 10:15:08 -10:00
Luka Guzenko
ebebf05a4b HID: Ignore battery for ELAN touchscreen 29DF on HP
The touchscreen reports a battery status of 0% and jumps to 1% when a
stylus is used. The device ID was added and the battery ignore quirk was
enabled for it.

Signed-off-by: Luka Guzenko <l.guzenko@web.de>
Link: https://lore.kernel.org/r/20230120223741.3007-1-l.guzenko@web.de
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
2023-02-06 16:38:55 +01:00
Mario Limonciello
7bcfdab3f0 HID: amd_sfh: if no sensors are enabled, clean up
It was reported that commit b300667b33b2 ("HID: amd_sfh: Disable the
interrupt for all command") had caused increased resume time on HP Envy
x360.

Before this commit 3 sensors were reported, but they were not actually
functional.  After this commit the sensors are no longer reported, but
also the resume time increased.

To avoid this problem explicitly look for the number of disabled sensors.
If all the sensors are disabled, clean everything up.

Fixes: b300667b33b2 ("HID: amd_sfh: Disable the interrupt for all command")
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2115
Reported-by: Xaver Hugl <xaver.hugl@gmail.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
Link: https://lore.kernel.org/r/20230203220850.13924-1-mario.limonciello@amd.com
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
2023-02-06 11:00:36 +01:00
Alan Stern
811d581194 net: USB: Fix wrong-direction WARNING in plusb.c
The syzbot fuzzer detected a bug in the plusb network driver: A
zero-length control-OUT transfer was treated as a read instead of a
write.  In modern kernels this error provokes a WARNING:

usb 1-1: BOGUS control dir, pipe 80000280 doesn't match bRequestType c0
WARNING: CPU: 0 PID: 4645 at drivers/usb/core/urb.c:411
usb_submit_urb+0x14a7/0x1880 drivers/usb/core/urb.c:411
Modules linked in:
CPU: 1 PID: 4645 Comm: dhcpcd Not tainted
6.2.0-rc6-syzkaller-00050-g9f266ccaa2f5 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google
01/12/2023
RIP: 0010:usb_submit_urb+0x14a7/0x1880 drivers/usb/core/urb.c:411
...
Call Trace:
 <TASK>
 usb_start_wait_urb+0x101/0x4b0 drivers/usb/core/message.c:58
 usb_internal_control_msg drivers/usb/core/message.c:102 [inline]
 usb_control_msg+0x320/0x4a0 drivers/usb/core/message.c:153
 __usbnet_read_cmd+0xb9/0x390 drivers/net/usb/usbnet.c:2010
 usbnet_read_cmd+0x96/0xf0 drivers/net/usb/usbnet.c:2068
 pl_vendor_req drivers/net/usb/plusb.c:60 [inline]
 pl_set_QuickLink_features drivers/net/usb/plusb.c:75 [inline]
 pl_reset+0x2f/0xf0 drivers/net/usb/plusb.c:85
 usbnet_open+0xcc/0x5d0 drivers/net/usb/usbnet.c:889
 __dev_open+0x297/0x4d0 net/core/dev.c:1417
 __dev_change_flags+0x587/0x750 net/core/dev.c:8530
 dev_change_flags+0x97/0x170 net/core/dev.c:8602
 devinet_ioctl+0x15a2/0x1d70 net/ipv4/devinet.c:1147
 inet_ioctl+0x33f/0x380 net/ipv4/af_inet.c:979
 sock_do_ioctl+0xcc/0x230 net/socket.c:1169
 sock_ioctl+0x1f8/0x680 net/socket.c:1286
 vfs_ioctl fs/ioctl.c:51 [inline]
 __do_sys_ioctl fs/ioctl.c:870 [inline]
 __se_sys_ioctl fs/ioctl.c:856 [inline]
 __x64_sys_ioctl+0x197/0x210 fs/ioctl.c:856
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

The fix is to call usbnet_write_cmd() instead of usbnet_read_cmd() and
remove the USB_DIR_IN flag.

Reported-and-tested-by: syzbot+2a0e7abd24f1eb90ce25@syzkaller.appspotmail.com
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Fixes: 090ffa9d0e90 ("[PATCH] USB: usbnet (9/9) module for pl2301/2302 cables")
CC: stable@vger.kernel.org
Link: https://lore.kernel.org/r/00000000000052099f05f3b3e298@google.com/
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-06 09:59:35 +00:00
Bastien Nocera
690eb7dec7 HID: logitech: Disable hi-res scrolling on USB
On some Logitech mice, such as the G903, and possibly the G403, the HID
events are generated on a different interface to the HID++ one.

If we enable hi-res through the HID++ interface, the HID interface
wouldn't know anything about it, and handle the events as if they were
regular scroll events, making the mouse unusable.

Disable hi-res scrolling on those devices until we implement scroll
events through HID++.

Signed-off-by: Bastien Nocera <hadess@hadess.net>
Tested-by: Tobias Klausmann <klausman@schwarzvogel.de>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216885
Fixes: 908d325e1665 ("HID: logitech-hidpp: Detect hi-res scrolling support")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230203101800.139380-1-hadess@hadess.net
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
2023-02-06 10:58:15 +01:00
Xin Zhao
ea427a222d HID: core: Fix deadloop in hid_apply_multiplier.
The initial value of hid->collection[].parent_idx if 0. When
Report descriptor doesn't contain "HID Collection", the value
remains as 0.

In the meanwhile, when the Report descriptor fullfill
all following conditions, it will trigger hid_apply_multiplier
function call.
1. Usage page is Generic Desktop Ctrls (0x01)
2. Usage is RESOLUTION_MULTIPLIER (0x48)
3. Contain any FEATURE items

The while loop in hid_apply_multiplier will search the top-most
collection by searching parent_idx == -1. Because all parent_idx
is 0. The loop will run forever.

There is a Report Descriptor triggerring the deadloop
0x05, 0x01,        // Usage Page (Generic Desktop Ctrls)
0x09, 0x48,        // Usage (0x48)
0x95, 0x01,        // Report Count (1)
0x75, 0x08,        // Report Size (8)
0xB1, 0x01,        // Feature

Signed-off-by: Xin Zhao <xnzhao@google.com>
Link: https://lore.kernel.org/r/20230130212947.1315941-1-xnzhao@google.com
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
2023-02-06 10:57:27 +01:00
Casper Andersson
d7d94b2612 net: microchip: sparx5: fix PTP init/deinit not checking all ports
Check all ports instead of just port_count ports. PTP init was only
checking ports 0 to port_count. If the hardware ports are not mapped
starting from 0 then they would be missed, e.g. if only ports 20-30 were
mapped it would attempt to init ports 0-10, resulting in NULL pointers
when attempting to timestamp. Now it will init all mapped ports.

Fixes: 70dfe25cd866 ("net: sparx5: Update extraction/injection for timestamping")
Signed-off-by: Casper Andersson <casper.casan@gmail.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-06 09:26:04 +00:00
Herton R. Krzesinski
03702d4d29 uapi: add missing ip/ipv6 header dependencies for linux/stddef.h
Since commit 58e0be1ef6118 ("net: use struct_group to copy ip/ipv6
header addresses"), ip and ipv6 headers started to use the __struct_group
definition, which is defined at include/uapi/linux/stddef.h. However,
linux/stddef.h isn't explicitly included in include/uapi/linux/{ip,ipv6}.h,
which breaks build of xskxceiver bpf selftest if you install the uapi
headers in the system:

$ make V=1 xskxceiver -C tools/testing/selftests/bpf
...
make: Entering directory '(...)/tools/testing/selftests/bpf'
gcc -g -O0 -rdynamic -Wall -Werror (...)
In file included from xskxceiver.c:79:
/usr/include/linux/ip.h:103:9: error: expected specifier-qualifier-list before ‘__struct_group’
  103 |         __struct_group(/* no tag */, addrs, /* no attrs */,
      |         ^~~~~~~~~~~~~~
...

Include the missing <linux/stddef.h> dependency in ip.h and do the
same for the ipv6.h header.

Fixes: 58e0be1ef611 ("net: use struct_group to copy ip/ipv6 header addresses")
Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>
Tested-by: Carlos O'Donell <carlos@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-06 09:01:00 +00:00
Julian Anastasov
c1d2ecdf5e neigh: make sure used and confirmed times are valid
Entries can linger in cache without timer for days, thanks to
the gc_thresh1 limit. As result, without traffic, the confirmed
time can be outdated and to appear to be in the future. Later,
on traffic, NUD_STALE entries can switch to NUD_DELAY and start
the timer which can see the invalid confirmed time and wrongly
switch to NUD_REACHABLE state instead of NUD_PROBE. As result,
timer is set many days in the future. This is more visible on
32-bit platforms, with higher HZ value.

Why this is a problem? While we expect unused entries to expire,
such entries stay in REACHABLE state for too long, locked in
cache. They are not expired normally, only when cache is full.

Problem and the wrong state change reported by Zhang Changzhong:

172.16.1.18 dev bond0 lladdr 0a:0e:0f:01:12:01 ref 1 used 350521/15994171/350520 probes 4 REACHABLE

350520 seconds have elapsed since this entry was last updated, but it is
still in the REACHABLE state (base_reachable_time_ms is 30000),
preventing lladdr from being updated through probe.

Fix it by ensuring timer is started with valid used/confirmed
times. Considering the valid time range is LONG_MAX jiffies,
we try not to go too much in the past while we are in
DELAY/PROBE state. There are also places that need
used/updated times to be validated while timer is not running.

Reported-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Tested-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-06 08:44:31 +00:00
Linus Torvalds
d2d11f342b Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull ELF fix from Al Viro:
 "One of the many equivalent build warning fixes for !CONFIG_ELF_CORE
  configs. Geert's is the earliest one I've been able to find"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  coredump: Move dump_emit_page() to kill unused warning
2023-02-05 17:17:10 -08:00
Linus Torvalds
4ec5183ec4 Linux 6.2-rc7 v6.2-rc7 2023-02-05 13:13:28 -08:00
Linus Torvalds
c608f6b58f USB fixes for 6.2-rc7
Here are some small USB fixes for 6.2-rc7 that resolve some reported
 problems.  These include:
   - gadget driver fixes
   - dwc3 driver fix
   - typec driver fix
   - MAINTAINERS file update.
 
 All of these have been in linux-next with no reported problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY9+WCw8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yn/VwCfR7EXWarbaSbNM12kGQd5aU/r2QgAoLZy81Yg
 lkxWtksTfGGrHI6g9JAs
 =1VJG
 -----END PGP SIGNATURE-----

Merge tag 'usb-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are some small USB fixes that resolve some reported problems.
  These include:

   - gadget driver fixes

   - dwc3 driver fix

   - typec driver fix

   - MAINTAINERS file update.

  All of these have been in linux-next with no reported problems"

* tag 'usb-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: typec: ucsi: Don't attempt to resume the ports before they exist
  usb: gadget: udc: do not clear gadget driver.bus
  usb: gadget: f_uac2: Fix incorrect increment of bNumEndpoints
  usb: gadget: f_fs: Fix unbalanced spinlock in __ffs_ep0_queue_wait
  usb: dwc3: qcom: enable vbus override when in OTG dr-mode
  MAINTAINERS: Add myself as UVC Gadget Maintainer
2023-02-05 12:19:55 -08:00