linux

iv/linux

Author	SHA1	Message	Date
Claudiu Beznea	21feaf558f	net: ravb: Keep reverse order of operations in ravb_remove() [ Upstream commit edf9bc396e05081ca281ffb0cd41e44db478ff26 ] On RZ/G3S SMARC Carrier II board having RGMII connections b/w Ethernet MACs and PHYs it has been discovered that doing unbind/bind for ravb driver in a loop leads to wrong speed and duplex for Ethernet links and broken connectivity (the connectivity cannot be restored even with bringing interface down/up). Before doing unbind/bind the Ethernet interfaces were configured though systemd. The sh instructions used to do unbind/bind were: $ cd /sys/bus/platform/drivers/ravb/ $ while :; do echo 11c30000.ethernet > unbind ; \ echo 11c30000.ethernet > bind; done It has been discovered that there is a race b/w IOCTLs initialized by systemd at the response of success binding and the "ravb_write(ndev, CCC_OPC_RESET, CCC)" call in ravb_remove() as follows: 1/ as a result of bind success the user space open/configures the interfaces tough an IOCTL; the following stack trace has been identified on RZ/G3S: Call trace: dump_backtrace+0x9c/0x100 show_stack+0x20/0x38 dump_stack_lvl+0x48/0x60 dump_stack+0x18/0x28 ravb_open+0x70/0xa58 __dev_open+0xf4/0x1e8 __dev_change_flags+0x198/0x218 dev_change_flags+0x2c/0x80 devinet_ioctl+0x640/0x708 inet_ioctl+0x1e4/0x200 sock_do_ioctl+0x50/0x108 sock_ioctl+0x240/0x358 __arm64_sys_ioctl+0xb0/0x100 invoke_syscall+0x50/0x128 el0_svc_common.constprop.0+0xc8/0xf0 do_el0_svc+0x24/0x38 el0_svc+0x34/0xb8 el0t_64_sync_handler+0xc0/0xc8 el0t_64_sync+0x190/0x198 2/ this call may execute concurrently with ravb_remove() as the unbind/bind operation was executed in a loop 3/ if the operation mode is changed to RESET (through ravb_write(ndev, CCC_OPC_RESET, CCC) call in ravb_remove()) while the above ravb_open() is in progress it may lead to MAC (or PHY, or MAC-PHY connection, the right point hasn't been identified at the moment) to be broken, thus the Ethernet connectivity fails to restore. The simple fix for this is to move ravb_write(ndev, CCC_OPC_RESET, CCC)) after unregister_netdev() to avoid resetting the controller while the netdev interface is still registered. To avoid future issues in ravb_remove(), the patch follows the proper order of operations in ravb_remove(): reverse order compared with ravb_probe(). This avoids described races as the IOCTLs as well as unregister_netdev() (called now at the beginning of ravb_remove()) calls rtnl_lock() before continuing and IOCTLs check (though devinet_ioctl()) if device is still registered just after taking the lock: int devinet_ioctl(struct net net, unsigned int cmd, struct ifreq ifr) { // ... rtnl_lock(); ret = -ENODEV; dev = __dev_get_by_name(net, ifr->ifr_name); if (!dev) goto done; // ... done: rtnl_unlock(); out: return ret; } Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:18 +01:00
Claudiu Beznea	8d04278ff4	net: ravb: Stop DMA in case of failures on ravb_open() [ Upstream commit eac16a733427ba0de2449ffc7bd3da32ddb65cb7 ] In case ravb_phy_start() returns with error the settings applied in ravb_dmac_init() are not reverted (e.g. config mode). For this call ravb_stop_dma() on failure path of ravb_open(). Fixes: a0d2f20650e8 ("Renesas Ethernet AVB PTP clock driver") Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:18 +01:00
Claudiu Beznea	52b751686c	net: ravb: Start TX queues after HW initialization succeeded [ Upstream commit 6f32c086602050fc11157adeafaa1c1eb393f0af ] ravb_phy_start() may fail. If that happens, the TX queues will remain started. Thus, move the netif_tx_start_all_queues() after PHY is successfully initialized. Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:18 +01:00
Claudiu Beznea	e2db25d16c	net: ravb: Make write access to CXR35 first before accessing other EMAC registers [ Upstream commit d78c0ced60d5e2f8b5a4a0468a5c400b24aeadf2 ] Hardware manual of RZ/G3S (and RZ/G2L) specifies the following on the description of CXR35 register (chapter "PHY interface select register (CXR35)"): "After release reset, make write-access to this register before making write-access to other registers (except MDIOMOD). Even if not need to change the value of this register, make write-access to this register at least one time. Because RGMII/MII MODE is recognized by accessing this register". The setup procedure for EMAC module (chapter "Setup procedure" of RZ/G3S, RZ/G2L manuals) specifies the E-MAC.CXR35 register is the first EMAC register that is to be configured. Note [A] from chapter "PHY interface select register (CXR35)" specifies the following: [A] The case which CXR35 SEL_XMII is used for the selection of RGMII/MII in APB Clock 100 MHz. (1) To use RGMII interface, Set ‘H’03E8_0000’ to this register. (2) To use MII interface, Set ‘H’03E8_0002’ to this register. Take into account these indication. Fixes: 1089877ada8d ("ravb: Add RZ/G2L MII interface support") Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:18 +01:00
Claudiu Beznea	f5c649ce79	net: ravb: Use pm_runtime_resume_and_get() [ Upstream commit 88b74831faaee455c2af380382d979fc38e79270 ] pm_runtime_get_sync() may return an error. In case it returns with an error dev->power.usage_count needs to be decremented. pm_runtime_resume_and_get() takes care of this. Thus use it. Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:18 +01:00
Claudiu Beznea	149b2fe12a	net: ravb: Check return value of reset_control_deassert() [ Upstream commit d8eb6ea4b302e7ff78535c205510e359ac10a0bd ] reset_control_deassert() could return an error. Some devices cannot work if reset signal de-assert operation fails. To avoid this check the return code of reset_control_deassert() in ravb_probe() and take proper action. Along with it, the free_netdev() call from the error path was moved after reset_control_assert() on its own label (out_free_netdev) to free netdev in case reset_control_deassert() fails. Fixes: 0d13a1a464a0 ("ravb: Add reset support") Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:18 +01:00
Yoshihiro Shimoda	7ed2e4c2d0	ravb: Fix races between ravb_tx_timeout_work() and net related ops [ Upstream commit 9870257a0a338cd8d6c1cddab74e703f490f6779 ] Fix races between ravb_tx_timeout_work() and functions of net_device_ops and ethtool_ops by using rtnl_trylock() and rtnl_unlock(). Note that since ravb_close() is under the rtnl lock and calls cancel_work_sync(), ravb_tx_timeout_work() should calls rtnl_trylock(). Otherwise, a deadlock may happen in ravb_tx_timeout_work() like below: CPU0 CPU1 ravb_tx_timeout() schedule_work() ... __dev_close_many() // Under rtnl lock ravb_close() cancel_work_sync() // Waiting ravb_tx_timeout_work() rtnl_lock() // This is possible to cause a deadlock If rtnl_trylock() fails, rescheduling the work with sleep for 1 msec. Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/20231127122420.3706751-1-yoshihiro.shimoda.uh@renesas.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:18 +01:00
Heiner Kallweit	8b1d088be5	r8169: prevent potential deadlock in rtl8169_close [ Upstream commit 91d3d149978ba7b238198dd80e4b823756aa7cfa ] ndo_stop() is RTNL-protected by net core, and the worker function takes RTNL as well. Therefore we will deadlock when trying to execute a pending work synchronously. To fix this execute any pending work asynchronously. This will do no harm because netif_running() is false in ndo_stop(), and therefore the work function is effectively a no-op. However we have to ensure that no task is running or pending after rtl_remove_one(), therefore add a call to cancel_work_sync(). Fixes: abe5fc42f9ce ("r8169: use RTNL to protect critical sections") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/12395867-1d17-4cac-aa7d-c691938fcddf@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:18 +01:00
Subbaraya Sundeep	9c4ac2d98a	octeontx2-pf: Restore TC ingress police rules when interface is up [ Upstream commit fd7f98b2e12a3d96a92bde6640657ec7116f4372 ] TC ingress policer rules depends on interface receive queue contexts since the bandwidth profiles are attached to RQ contexts. When an interface is brought down all the queue contexts are freed. This in turn frees bandwidth profiles in hardware causing ingress police rules non-functional after the interface is brought up. Fix this by applying all the ingress police rules config to hardware in otx2_open. Also allow adding ingress rules only when interface is running since no contexts exist for the interface when it is down. Fixes: 68fbff68dbea ("octeontx2-pf: Add police action for TC flower") Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://lore.kernel.org/r/1700930217-5707-1-git-send-email-sbhatta@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:18 +01:00
Suman Ghosh	aef2d5b3e5	octeontx2-af: Install TC filter rules in hardware based on priority [ Upstream commit ec87f05402f592d27507e1aa6b2fd21c486f2cc0 ] As of today, hardware does not support installing tc filter rules based on priority. This patch adds support to install the hardware rules based on priority. The final hardware rules will not be dependent on rule installation order, it will be strictly priority based, same as software. Signed-off-by: Suman Ghosh <sumang@marvell.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230721043925.2627806-1-sumang@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Stable-dep-of: fd7f98b2e12a ("octeontx2-pf: Restore TC ingress police rules when interface is up") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:17 +01:00
Geetha sowjanya	662b887084	octeontx2-pf: Fix adding mbox work queue entry when num_vfs > 64 [ Upstream commit 51597219e0cd5157401d4d0ccb5daa4d9961676f ] When more than 64 VFs are enabled for a PF then mbox communication between VF and PF is not working as mbox work queueing for few VFs are skipped due to wrong calculation of VF numbers. Fixes: d424b6c02415 ("octeontx2-pf: Enable SRIOV and added VF mbox handling") Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://lore.kernel.org/r/1700930042-5400-1-git-send-email-sbhatta@marvell.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:17 +01:00
Furong Xu	5d5bcfb1ca	net: stmmac: xgmac: Disable FPE MMC interrupts [ Upstream commit e54d628a2721bfbb002c19f6e8ca6746cec7640f ] Commit aeb18dd07692 ("net: stmmac: xgmac: Disable MMC interrupts by default") tries to disable MMC interrupts to avoid a storm of unhandled interrupts, but leaves the FPE(Frame Preemption) MMC interrupts enabled, FPE MMC interrupts can cause the same problem. Now we mask FPE TX and RX interrupts to disable all MMC interrupts. Fixes: aeb18dd07692 ("net: stmmac: xgmac: Disable MMC interrupts by default") Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Furong Xu <0x1207@gmail.com> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Link: https://lore.kernel.org/r/20231125060126.2328690-1-0x1207@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:17 +01:00
Elena Salomatkina	334e6378c2	octeontx2-af: Fix possible buffer overflow [ Upstream commit ad31c629ca3c87f6d557488c1f9faaebfbcd203c ] A loop in rvu_mbox_handler_nix_bandprof_free() contains a break if (idx == MAX_BANDPROF_PER_PFFUNC), but if idx may reach MAX_BANDPROF_PER_PFFUNC buffer '(*req->prof_idx)[layer]' overflow happens before that check. The patch moves the break to the beginning of the loop. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: e8e095b3b370 ("octeontx2-af: cn10k: Bandwidth profiles config support"). Signed-off-by: Elena Salomatkina <elena.salomatkina.cmc@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://lore.kernel.org/r/20231124210802.109763-1-elena.salomatkina.cmc@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:17 +01:00
Willem de Bruijn	c3e974e9c4	selftests/net: mptcp: fix uninitialized variable warnings [ Upstream commit 00a4f8fd9c750f20d8fd4535c71c9caa7ef5ff2f ] Same init_rng() in both tests. The function reads /dev/urandom to initialize srand(). In case of failure, it falls back onto the entropy in the uninitialized variable. Not sure if this is on purpose. But failure reading urandom should be rare, so just fail hard. While at it, convert to getrandom(). Which man 4 random suggests is simpler and more robust. mptcp_inq.c:525:6: mptcp_connect.c:1131:6: error: variable 'foo' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized] Fixes: 048d19d444be ("mptcp: add basic kselftest for mptcp") Fixes: b51880568f20 ("selftests: mptcp: add inq test case") Cc: Florian Westphal <fw@strlen.de> Signed-off-by: Willem de Bruijn <willemb@google.com> ---- When input is randomized because this is expected to meaningfully explore edge cases, should we also add 1. logging the random seed to stdout and 2. adding a command line argument to replay from a specific seed I can do this in net-next, if authors find it useful in this case. Reviewed-by: Matthieu Baerts <matttbe@kernel.org> Link: https://lore.kernel.org/r/20231124171645.1011043-5-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:17 +01:00
Willem de Bruijn	12dd4c1bf3	selftests/net: unix: fix unused variable compiler warning [ Upstream commit 59fef379d453781f0dabfa1f1a1e86e78aee919a ] Remove an unused variable. diag_uid.c:151:24: error: unused variable 'udr' [-Werror,-Wunused-variable] Fixes: ac011361bd4f ("af_unix: Add test for sock_diag and UDIAG_SHOW_UID.") Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20231124171645.1011043-4-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:17 +01:00
Willem de Bruijn	4e999af7cf	selftests/net: fix a char signedness issue [ Upstream commit 7b29828c5af6841bdeb9fafa32fdfeff7ab9c407 ] Signedness of char is signed on x86_64, but unsigned on arm64. Fix the warning building cmsg_sender.c on signed platforms or forced with -fsigned-char: msg_sender.c:455:12: error: implicit conversion from 'int' to 'char' changes value from 128 to -128 [-Werror,-Wconstant-conversion] buf[0] = ICMPV6_ECHO_REQUEST; constant ICMPV6_ECHO_REQUEST is 128. Link: https://lwn.net/Articles/911914 Fixes: de17e305a810 ("selftests: net: cmsg_sender: support icmp and raw sockets") Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20231124171645.1011043-3-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:17 +01:00
Willem de Bruijn	249ceee95c	selftests/net: ipsec: fix constant out of range [ Upstream commit 088559815477c6f623a5db5993491ddd7facbec7 ] Fix a small compiler warning. nr_process must be a signed long: it is assigned a signed long by strtol() and is compared against LONG_MIN and LONG_MAX. ipsec.c:2280:65: error: result of comparison of constant -9223372036854775808 with expression of type 'unsigned int' is always false [-Werror,-Wtautological-constant-out-of-range-compare] if ((errno == ERANGE && (nr_process == LONG_MAX \|\| nr_process == LONG_MIN)) Fixes: bc2652b7ae1e ("selftest/net/xfrm: Add test for ipsec tunnel") Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Dmitry Safonov <0x7f454c46@gmail.com> Link: https://lore.kernel.org/r/20231124171645.1011043-2-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:17 +01:00
Dmitry Antipov	e01249a839	uapi: propagate __struct_group() attributes to the container union [ Upstream commit 4e86f32a13af1970d21be94f659cae56bbe487ee ] Recently the kernel test robot has reported an ARM-specific BUILD_BUG_ON() in an old and unmaintained wil6210 wireless driver. The problem comes from the structure packing rules of old ARM ABI ('-mabi=apcs-gnu'). For example, the following structure is packed to 18 bytes instead of 16: struct poorly_packed { unsigned int a; unsigned int b; unsigned short c; union { struct { unsigned short d; unsigned int e; } __attribute__((packed)); struct { unsigned short d; unsigned int e; } __attribute__((packed)) inner; }; } __attribute__((packed)); To fit it into 16 bytes, it's required to add packed attribute to the container union as well: struct poorly_packed { unsigned int a; unsigned int b; unsigned short c; union { struct { unsigned short d; unsigned int e; } __attribute__((packed)); struct { unsigned short d; unsigned int e; } __attribute__((packed)) inner; } __attribute__((packed)); } __attribute__((packed)); Thanks to Andrew Pinski of GCC team for sorting the things out at https://gcc.gnu.org/pipermail/gcc/2023-November/242888.html. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202311150821.cI4yciFE-lkp@intel.com Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Link: https://lore.kernel.org/r/20231120110607.98956-1-dmantipov@yandex.ru Fixes: 50d7bd38c3aa ("stddef: Introduce struct_group() helper macro") Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:17 +01:00
Ioana Ciornei	fd91b48f10	dpaa2-eth: increase the needed headroom to account for alignment [ Upstream commit f422abe3f23d483cf01f386819f26fb3fe0dbb2b ] Increase the needed headroom to account for a 64 byte alignment restriction which, with this patch, we make mandatory on the Tx path. The case in which the amount of headroom needed is not available is already handled by the driver which instead sends a S/G frame with the first buffer only holding the SW and HW annotation areas. Without this patch, we can empirically see data corruption happening between Tx and Tx confirmation which sometimes leads to the SW annotation area being overwritten. Since this is an old IP where the hardware team cannot help to understand the underlying behavior, we make the Tx alignment mandatory for all frames to avoid the crash on Tx conf. Also, remove the comment that suggested that this is just an optimization. This patch also sets the needed_headroom net device field to the usual value that the driver would need on the Tx path: - 64 bytes for the software annotation area - 64 bytes to account for a 64 byte aligned buffer address Fixes: 6e2387e8f19e ("staging: fsl-dpaa2/eth: Add Freescale DPAA2 Ethernet driver") Closes: https://lore.kernel.org/netdev/aa784d0c-85eb-4e5d-968b-c8f74fa86be6@gin.de/ Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:17 +01:00
Zhengchao Shao	94445d9583	ipv4: igmp: fix refcnt uaf issue when receiving igmp query packet [ Upstream commit e2b706c691905fe78468c361aaabc719d0a496f1 ] When I perform the following test operations: 1.ip link add br0 type bridge 2.brctl addif br0 eth0 3.ip addr add 239.0.0.1/32 dev eth0 4.ip addr add 239.0.0.1/32 dev br0 5.ip addr add 224.0.0.1/32 dev br0 6.while ((1)) do ifconfig br0 up ifconfig br0 down done 7.send IGMPv2 query packets to port eth0 continuously. For example, ./mausezahn ethX -c 0 "01 00 5e 00 00 01 00 72 19 88 aa 02 08 00 45 00 00 1c 00 01 00 00 01 02 0e 7f c0 a8 0a b7 e0 00 00 01 11 64 ee 9b 00 00 00 00" The preceding tests may trigger the refcnt uaf issue of the mc list. The stack is as follows: refcount_t: addition on 0; use-after-free. WARNING: CPU: 21 PID: 144 at lib/refcount.c:25 refcount_warn_saturate (lib/refcount.c:25) CPU: 21 PID: 144 Comm: ksoftirqd/21 Kdump: loaded Not tainted 6.7.0-rc1-next-20231117-dirty #80 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:refcount_warn_saturate (lib/refcount.c:25) RSP: 0018:ffffb68f00657910 EFLAGS: 00010286 RAX: 0000000000000000 RBX: ffff8a00c3bf96c0 RCX: ffff8a07b6160908 RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff8a07b6160900 RBP: ffff8a00cba36862 R08: 0000000000000000 R09: 00000000ffff7fff R10: ffffb68f006577c0 R11: ffffffffb0fdcdc8 R12: ffff8a00c3bf9680 R13: ffff8a00c3bf96f0 R14: 0000000000000000 R15: ffff8a00d8766e00 FS: 0000000000000000(0000) GS:ffff8a07b6140000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055f10b520b28 CR3: 000000039741a000 CR4: 00000000000006f0 Call Trace: <TASK> igmp_heard_query (net/ipv4/igmp.c:1068) igmp_rcv (net/ipv4/igmp.c:1132) ip_protocol_deliver_rcu (net/ipv4/ip_input.c:205) ip_local_deliver_finish (net/ipv4/ip_input.c:234) __netif_receive_skb_one_core (net/core/dev.c:5529) netif_receive_skb_internal (net/core/dev.c:5729) netif_receive_skb (net/core/dev.c:5788) br_handle_frame_finish (net/bridge/br_input.c:216) nf_hook_bridge_pre (net/bridge/br_input.c:294) __netif_receive_skb_core (net/core/dev.c:5423) __netif_receive_skb_list_core (net/core/dev.c:5606) __netif_receive_skb_list (net/core/dev.c:5674) netif_receive_skb_list_internal (net/core/dev.c:5764) napi_gro_receive (net/core/gro.c:609) e1000_clean_rx_irq (drivers/net/ethernet/intel/e1000/e1000_main.c:4467) e1000_clean (drivers/net/ethernet/intel/e1000/e1000_main.c:3805) __napi_poll (net/core/dev.c:6533) net_rx_action (net/core/dev.c:6735) __do_softirq (kernel/softirq.c:554) run_ksoftirqd (kernel/softirq.c:913) smpboot_thread_fn (kernel/smpboot.c:164) kthread (kernel/kthread.c:388) ret_from_fork (arch/x86/kernel/process.c:153) ret_from_fork_asm (arch/x86/entry/entry_64.S:250) </TASK> The root causes are as follows: Thread A Thread B ... netif_receive_skb br_dev_stop ... br_multicast_leave_snoopers ... __ip_mc_dec_group ... __igmp_group_dropped igmp_rcv igmp_stop_timer igmp_heard_query //ref = 1 ip_ma_put igmp_mod_timer refcount_dec_and_test igmp_start_timer //ref = 0 ... refcount_inc //ref increases from 0 When the device receives an IGMPv2 Query message, it starts the timer immediately, regardless of whether the device is running. If the device is down and has left the multicast group, it will cause the mc list refcount uaf issue. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:17 +01:00
Niklas Neronin	f89fef7710	usb: config: fix iteration issue in 'usb_get_bos_descriptor()' [ Upstream commit 974bba5c118f4c2baf00de0356e3e4f7928b4cbc ] The BOS descriptor defines a root descriptor and is the base descriptor for accessing a family of related descriptors. Function 'usb_get_bos_descriptor()' encounters an iteration issue when skipping the 'USB_DT_DEVICE_CAPABILITY' descriptor type. This results in the same descriptor being read repeatedly. To address this issue, a 'goto' statement is introduced to ensure that the pointer and the amount read is updated correctly. This ensures that the function iterates to the next descriptor instead of reading the same descriptor repeatedly. Cc: stable@vger.kernel.org Fixes: 3dd550a2d365 ("USB: usbcore: Fix slab-out-of-bounds bug during device reset") Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Acked-by: Mathias Nyman <mathias.nyman@linux.intel.com> Reviewed-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/20231115121325.471454-1-niklas.neronin@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:17 +01:00
Alan Stern	9aff7c51b4	USB: core: Change configuration warnings to notices [ Upstream commit 7a09c1269702db8eccb6f718da2b00173e1e0034 ] It has been pointed out that the kernel log messages warning about problems in USB configuration and related descriptors are vexing for users. The warning log level has a fairly high priority, but the user can do nothing to fix the underlying errors in the device's firmware. To reduce the amount of useless information produced by tools that filter high-priority log messages, we can change these warnings to notices, i.e., change dev_warn() to dev_notice(). The same holds for a few messages that currently use dev_err(): Unless they indicate a failure that might make a device unusable (such as inability to transfer a config descriptor), change them to dev_notice() also. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216630 Suggested-by: Artem S. Tashkinov <aros@gmx.com> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Link: https://lore.kernel.org/r/Y2KzPx0h6z1jXCuN@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Stable-dep-of: 974bba5c118f ("usb: config: fix iteration issue in 'usb_get_bos_descriptor()'") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:17 +01:00
Johan Hovold	c89b34eef3	USB: xhci-plat: fix legacy PHY double init [ Upstream commit 16b7e0cccb243033de4406ffb4d892365041a1e7 ] Commits 7b8ef22ea547 ("usb: xhci: plat: Add USB phy support") and 9134c1fd0503 ("usb: xhci: plat: Add USB 3.0 phy support") added support for looking up legacy PHYs from the sysdev devicetree node and initialising them. This broke drivers such as dwc3 which manages PHYs themself as the PHYs would now be initialised twice, something which specifically can lead to resources being left enabled during suspend (e.g. with the usb_phy_generic PHY driver). As the dwc3 driver uses driver-name matching for the xhci platform device, fix this by only looking up and initialising PHYs for devices that have been matched using OF. Note that checking that the platform device has a devicetree node would currently be sufficient, but that could lead to subtle breakages in case anyone ever tries to reuse an ancestor's node. Fixes: 7b8ef22ea547 ("usb: xhci: plat: Add USB phy support") Fixes: 9134c1fd0503 ("usb: xhci: plat: Add USB 3.0 phy support") Cc: stable@vger.kernel.org # 4.1 Cc: Maxime Ripard <mripard@kernel.org> Cc: Stanley Chang <stanley_chang@realtek.com> Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Tested-by: Stefan Eichenberger <stefan.eichenberger@toradex.com> Tested-by: Stanley Chang <stanley_chang@realtek.com> Link: https://lore.kernel.org/r/20231103164323.14294-1-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:17 +01:00
Johannes Berg	307a6525c8	wifi: cfg80211: fix CQM for non-range use commit 7e7efdda6adb385fbdfd6f819d76bc68c923c394 upstream. My prior race fix here broke CQM when ranges aren't used, as the reporting worker now requires the cqm_config to be set in the wdev, but isn't set when there's no range configured. Rather than continuing to special-case the range version, set the cqm_config always and configure accordingly, also tracking if range was used or not to be able to clear the configuration appropriately with the same API, which was actually not right if both were implemented by a driver for some reason, as is the case with mac80211 (though there the implementations are equivalent so it doesn't matter.) Also, the original multiple-RSSI commit lost checking for the callback, so might have potentially crashed if a driver had neither implementation, and userspace tried to use it despite not being advertised as supported. Cc: stable@vger.kernel.org Fixes: 4a4b8169501b ("cfg80211: Accept multiple RSSI thresholds for CQM") Fixes: 37c20b2effe9 ("wifi: cfg80211: fix cqm_config access race") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:16 +01:00
Hugo Villeneuve	e8c1105c0c	serial: sc16is7xx: add missing support for rs485 devicetree properties commit b4a778303ea0fcabcaff974721477a5743e1f8ec upstream. Retrieve rs485 devicetree properties on registration of sc16is7xx ports in case they are attached to an rs485 transceiver. Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Lech Perczak <lech.perczak@camlingroup.com> Tested-by: Lech Perczak <lech.perczak@camlingroup.com> Link: https://lore.kernel.org/r/20230807214556.540627-7-hugo@hugovil.com Cc: Hugo Villeneuve <hugo@hugovil.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:16 +01:00
Hui Wang	55061c3230	serial: sc16is7xx: Put IOControl register into regmap_volatile commit 77a82cebf0eb023203b4cb2235cab75afc77cccf upstream. According to the IOControl register bits description in the page 31 of the product datasheet, we know the bit 3 of IOControl register is softreset, this bit will self-clearing once the reset finish. In the probe, the softreset bit is set, and when we read this register from debugfs/regmap interface, we found the softreset bit is still setting, this confused us for a while. Finally we found this register is cached, to read the real value from register, we could put it into the regmap_volatile(). Signed-off-by: Hui Wang <hui.wang@canonical.com> Link: https://lore.kernel.org/r/20230724034727.17335-1-hui.wang@canonical.com Cc: Hugo Villeneuve <hugo@hugovil.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:16 +01:00
Hugo Villeneuve	a491c7be35	auxdisplay: hd44780: move cursor home after clear display command commit 35b464e32c8bccef435e415db955787ead4ab44c upstream. The DISPLAY_CLEAR command on the NewHaven NHD-0220DZW-AG5 display does NOT change the DDRAM address to 00h (home position) like the standard Hitachi HD44780 controller. As a consequence, the starting position of the initial string LCD_INIT_TEXT is not guaranteed to be at 0,0 depending on where the cursor was before the DISPLAY_CLEAR command. Extract of DISPLAY_CLEAR command from datasheets of: Hitachi HD44780: ... It then sets DDRAM address 0 into the address counter... NewHaven NHD-0220DZW-AG5 datasheet: ... This instruction does not change the DDRAM Address Move the cursor home after sending DISPLAY_CLEAR command to support non-standard LCDs. Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com> Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org> Tested-by: David Reaver <me@davidreaver.com> Link: https://lore.kernel.org/r/20230722180925.1408885-1-hugo@hugovil.com Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:16 +01:00
Max Nguyen	7f21167775	Input: xpad - add HyperX Clutch Gladiate Support commit e28a0974d749e5105d77233c0a84d35c37da047e upstream. Add HyperX controller support to xpad_device and xpad_table. Suggested-by: Chris Toledanes <chris.toledanes@hp.com> Reviewed-by: Carl Ng <carl.ng@hp.com> Signed-off-by: Max Nguyen <maxwell.nguyen@hp.com> Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com> Link: https://lore.kernel.org/r/20230906231514.4291-1-hphyperxdev@gmail.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:16 +01:00
David Sterba	7a105de275	btrfs: fix 64bit compat send ioctl arguments not initializing version member commit 5de0434bc064606d6b7467ec3e5ad22963a18c04 upstream. When the send protocol versioning was added in 5.16 e77fbf990316 ("btrfs: send: prepare for v2 protocol"), the 32/64bit compat code was not updated (added by 2351f431f727 ("btrfs: fix send ioctl on 32bit with 64bit kernel")), missing the version struct member. The compat code is probably rarely used, nobody reported any bugs. Found by tool https://github.com/jirislaby/clang-struct . Fixes: e77fbf990316 ("btrfs: send: prepare for v2 protocol") CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:16 +01:00
Filipe Manana	32912ee869	btrfs: make error messages more clear when getting a chunk map commit 7d410d5efe04e42a6cd959bfe6d59d559fdf8b25 upstream. When getting a chunk map, at btrfs_get_chunk_map(), we do some sanity checks to verify we found a chunk map and that map found covers the logical address the caller passed in. However the messages aren't very clear in the sense that don't mention the issue is with a chunk map and one of them prints the 'length' argument as if it were the end offset of the requested range (while the in the string format we use %llu-%llu which suggests a range, and the second %llu-%llu is actually a range for the chunk map). So improve these two details in the error messages. CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:16 +01:00
Jann Horn	4fc9c61c02	btrfs: send: ensure send_fd is writable commit 0ac1d13a55eb37d398b63e6ff6db4a09a2c9128c upstream. kernel_write() requires the caller to ensure that the file is writable. Let's do that directly after looking up the ->send_fd. We don't need a separate bailout path because the "out" path already does fput() if ->send_filp is non-NULL. This has no security impact for two reasons: - the ioctl requires CAP_SYS_ADMIN - __kernel_write() bails out on read-only files - but only since 5.8, see commit a01ac27be472 ("fs: check FMODE_WRITE in __kernel_write") Reported-and-tested-by: syzbot+12e098239d20385264d3@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=12e098239d20385264d3 Fixes: 31db9f7c23fb ("Btrfs: introduce BTRFS_IOC_SEND for btrfs send/receive") CC: stable@vger.kernel.org # 4.14+ Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:16 +01:00
Filipe Manana	86742a963f	btrfs: fix off-by-one when checking chunk map includes logical address commit 5fba5a571858ce2d787fdaf55814e42725bfa895 upstream. At btrfs_get_chunk_map() we get the extent map for the chunk that contains the given logical address stored in the 'logical' argument. Then we do sanity checks to verify the extent map contains the logical address. One of these checks verifies if the extent map covers a range with an end offset behind the target logical address - however this check has an off-by-one error since it will consider an extent map whose start offset plus its length matches the target logical address as inclusive, while the fact is that the last byte it covers is behind the target logical address (by 1). So fix this condition by using '<=' rather than '<' when comparing the extent map's "start + length" against the target logical address. CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:16 +01:00
Bragatheswaran Manickavel	9fe447c485	btrfs: ref-verify: fix memory leaks in btrfs_ref_tree_mod() commit f91192cd68591c6b037da345bc9fcd5e50540358 upstream. In btrfs_ref_tree_mod(), when !parent 're' was allocated through kmalloc(). In the following code, if an error occurs, the execution will be redirected to 'out' or 'out_unlock' and the function will be exited. However, on some of the paths, 're' are not deallocated and may lead to memory leaks. For example: lookup_block_entry() for 'be' returns NULL, the out label will be invoked. During that flow ref and 'ra' are freed but not 're', which can potentially lead to a memory leak. CC: stable@vger.kernel.org # 5.10+ Reported-and-tested-by: syzbot+d66de4cbf532749df35f@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=d66de4cbf532749df35f Signed-off-by: Bragatheswaran Manickavel <bragathemanick0908@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:16 +01:00
Qu Wenruo	3f26d9b257	btrfs: add dmesg output for first mount and last unmount of a filesystem commit 2db313205f8b96eea467691917138d646bb50aef upstream. There is a feature request to add dmesg output when unmounting a btrfs. There are several alternative methods to do the same thing, but with their own problems: - Use eBPF to watch btrfs_put_super()/open_ctree() Not end user friendly, they have to dip their head into the source code. - Watch for directory /sys/fs/<uuid>/ This is way more simple, but still requires some simple device -> uuid lookups. And a script needs to use inotify to watch /sys/fs/. Compared to all these, directly outputting the information into dmesg would be the most simple one, with both device and UUID included. And since we're here, also add the output when mounting a filesystem for the first time for parity. A more fine grained monitoring of subvolume mounts should be done by another layer, like audit. Now mounting a btrfs with all default mkfs options would look like this: [81.906566] BTRFS info (device dm-8): first mount of filesystem 633b5c16-afe3-4b79-b195-138fe145e4f2 [81.907494] BTRFS info (device dm-8): using crc32c (crc32c-intel) checksum algorithm [81.908258] BTRFS info (device dm-8): using free space tree [81.912644] BTRFS info (device dm-8): auto enabling async discard [81.913277] BTRFS info (device dm-8): checking UUID tree [91.668256] BTRFS info (device dm-8): last unmount of filesystem 633b5c16-afe3-4b79-b195-138fe145e4f2 CC: stable@vger.kernel.org # 5.4+ Link: https://github.com/kdave/btrfs-progs/issues/689 Reviewed-by: Anand Jain <anand.jain@oracle.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> [ update changelog ] Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:16 +01:00
Helge Deller	0ad7d59e79	parisc: Mark altinstructions read-only and 32-bit aligned commit 33f806da2df68606f77d7b892cd1298ba3d463e8 upstream. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:16 +01:00
Helge Deller	cf2ae6494d	parisc: Ensure 32-bit alignment on parisc unwind section commit c9fcb2b65c2849e8ff3be23fd8828312fb68dc19 upstream. Make sure the .PARISC.unwind section will be 32-bit aligned. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:16 +01:00
Helge Deller	2acfff5730	parisc: Mark jump_table naturally aligned commit 07eecff8ae78df7f28800484d31337e1f9bfca3a upstream. The jump_table stores two 32-bit words and one 32- (on 32-bit kernel) or one 64-bit word (on 64-bit kernel). Ensure that the last word is always 64-bit aligned on a 64-bit kernel by aligning the whole structure on sizeof(long). Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:15 +01:00
Helge Deller	3793cd2ded	parisc: Drop the HP-UX ENOSYM and EREMOTERELEASE error codes commit e5f3e299a2b1e9c3ece24a38adfc089aef307e8a upstream. Those return codes are only defined for the parisc architecture and are leftovers from when we wanted to be HP-UX compatible. They are not returned by any Linux kernel syscall but do trigger problems with the glibc strerrorname_np() and strerror() functions as reported in glibc issue #31080. There is no need to keep them, so simply remove them. Signed-off-by: Helge Deller <deller@gmx.de> Reported-by: Bruno Haible <bruno@clisp.org> Closes: https://sourceware.org/bugzilla/show_bug.cgi?id=31080 Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:15 +01:00
Helge Deller	79a1fdf4c2	parisc: Mark lock_aligned variables 16-byte aligned on SMP commit b28fc0d8739c03e7b6c44914a9d00d4c6dddc0ea upstream. On parisc we need 16-byte alignment for variables which are used for locking. Mark the __lock_aligned attribute acordingly so that the .data..lock_aligned section will get that alignment in the generated object files. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:15 +01:00
Helge Deller	41d7852a0a	parisc: Use natural CPU alignment for bug_table commit fe76a1349f235969381832c83d703bc911021eb6 upstream. Make sure that the __bug_table section gets 32- or 64-bit aligned, depending if a 32- or 64-bit kernel is being built. Mark it non-writeable and use .blockz instead of the .org assembler directive to pad the struct. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:15 +01:00
Helge Deller	c7c78a4aa6	parisc: Mark ex_table entries 32-bit aligned in uaccess.h commit a80aeb86542a50aa8521729ea4cc731ee7174f03 upstream. Add an align statement to tell the linker that all ex_table entries and as such the whole ex_table section should be 32-bit aligned in vmlinux and modules. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:15 +01:00
Helge Deller	645e4b693b	parisc: Mark ex_table entries 32-bit aligned in assembly.h commit e11d4cccd094a7cd4696c8c42e672c76c092dad5 upstream. Add an align statement to tell the linker that all ex_table entries and as such the whole ex_table section should be 32-bit aligned in vmlinux and modules. Signed-off-by: Helge Deller <deller@gmx.de> Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:15 +01:00
Timothy Pearson	c23b9eaca8	powerpc: Don't clobber f0/vs0 during fp\|altivec register save commit 5e1d824f9a283cbf90f25241b66d1f69adb3835b upstream. During floating point and vector save to thread data f0/vs0 are clobbered by the FPSCR/VSCR store routine. This has been obvserved to lead to userspace register corruption and application data corruption with io-uring. Fix it by restoring f0/vs0 after FPSCR/VSCR store has completed for all the FP, altivec, VMX register save paths. Tested under QEMU in kvm mode, running on a Talos II workstation with dual POWER9 DD2.2 CPUs. Additional detail (mpe): Typically save_fpu() is called from __giveup_fpu() which saves the FP regs and also turns off FP in the tasks MSR, meaning the kernel will reload the FP regs from the thread struct before letting the task use FP again. So in that case save_fpu() is free to clobber f0 because the FP regs no longer hold live values for the task. There is another case though, which is the path via: sys_clone() ... copy_process() dup_task_struct() arch_dup_task_struct() flush_all_to_thread() save_all() That path saves the FP regs but leaves them live. That's meant as an optimisation for a process that's using FP/VSX and then calls fork(), leaving the regs live means the parent process doesn't have to take a fault after the fork to get its FP regs back. The optimisation was added in commit 8792468da5e1 ("powerpc: Add the ability to save FPU without giving it up"). That path does clobber f0, but f0 is volatile across function calls, and typically programs reach copy_process() from userspace via a syscall wrapper function. So in normal usage f0 being clobbered across a syscall doesn't cause visible data corruption. But there is now a new path, because io-uring can call copy_process() via create_io_thread() from the signal handling path. That's OK if the signal is handled as part of syscall return, but it's not OK if the signal is handled due to some other interrupt. That path is: interrupt_return_srr_user() interrupt_exit_user_prepare() interrupt_exit_user_prepare_main() do_notify_resume() get_signal() task_work_run() create_worker_cb() create_io_worker() copy_process() dup_task_struct() arch_dup_task_struct() flush_all_to_thread() save_all() if (tsk->thread.regs->msr & MSR_FP) save_fpu() # f0 is clobbered and potentially live in userspace Note the above discussion applies equally to save_altivec(). Fixes: 8792468da5e1 ("powerpc: Add the ability to save FPU without giving it up") Cc: stable@vger.kernel.org # v4.6+ Closes: https://lore.kernel.org/all/480932026.45576726.1699374859845.JavaMail.zimbra@raptorengineeringinc.com/ Closes: https://lore.kernel.org/linuxppc-dev/480221078.47953493.1700206777956.JavaMail.zimbra@raptorengineeringinc.com/ Tested-by: Timothy Pearson <tpearson@raptorengineering.com> Tested-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Timothy Pearson <tpearson@raptorengineering.com> [mpe: Reword change log to describe exact path of corruption & other minor tweaks] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/1921539696.48534988.1700407082933.JavaMail.zimbra@raptorengineeringinc.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:15 +01:00
Nicholas Piggin	e6bc42fae6	KVM: PPC: Book3S HV: Fix KVM_RUN clobbering FP/VEC user registers commit dc158d23b33df9033bcc8e7117e8591dd2f9d125 upstream. Before running a guest, the host process (e.g., QEMU) FP/VEC registers are saved if they were being used, similarly to when the kernel uses FP registers. The guest values are then loaded into regs, and the host process registers will be restored lazily when it uses FP/VEC. KVM HV has a bug here: the host process registers do get saved, but the user MSR bits remain enabled, which indicates the registers are valid for the process. After they are clobbered by running the guest, this valid indication causes the host process to take on the FP/VEC register values of the guest. Fixes: 34e119c96b2b ("KVM: PPC: Book3S HV P9: Reduce mtmsrd instructions required to save host SPRs") Cc: stable@vger.kernel.org # v5.17+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231122025811.2973-1-npiggin@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:15 +01:00
Abdul Halim, Mohd Syazwan	59419ebcc0	iommu/vt-d: Add MTL to quirk list to skip TE disabling commit 85b80fdffa867d75dfb9084a839e7949e29064e8 upstream. The VT-d spec requires (10.4.4 Global Command Register, TE field) that: Hardware implementations supporting DMA draining must drain any in-flight DMA read/write requests queued within the Root-Complex before switching address translation on or off and reflecting the status of the command through the TES field in the Global Status register. Unfortunately, some integrated graphic devices fail to do so after some kind of power state transition. As the result, the system might stuck in iommu_disable_translation(), waiting for the completion of TE transition. Add MTL to the quirk list for those devices and skips TE disabling if the qurik hits. Fixes: b1012ca8dc4f ("iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu") Cc: stable@vger.kernel.org Signed-off-by: Abdul Halim, Mohd Syazwan <mohd.syazwan.abdul.halim@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/20231116022324.30120-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:15 +01:00
Markus Weippert	0b48970ce1	bcache: revert replacing IS_ERR_OR_NULL with IS_ERR commit bb6cc253861bd5a7cf8439e2118659696df9619f upstream. Commit 028ddcac477b ("bcache: Remove unnecessary NULL point check in node allocations") replaced IS_ERR_OR_NULL by IS_ERR. This leads to a NULL pointer dereference. BUG: kernel NULL pointer dereference, address: 0000000000000080 Call Trace: ? __die_body.cold+0x1a/0x1f ? page_fault_oops+0xd2/0x2b0 ? exc_page_fault+0x70/0x170 ? asm_exc_page_fault+0x22/0x30 ? btree_node_free+0xf/0x160 [bcache] ? up_write+0x32/0x60 btree_gc_coalesce+0x2aa/0x890 [bcache] ? bch_extent_bad+0x70/0x170 [bcache] btree_gc_recurse+0x130/0x390 [bcache] ? btree_gc_mark_node+0x72/0x230 [bcache] bch_btree_gc+0x5da/0x600 [bcache] ? cpuusage_read+0x10/0x10 ? bch_btree_gc+0x600/0x600 [bcache] bch_gc_thread+0x135/0x180 [bcache] The relevant code starts with: new_nodes[0] = NULL; for (i = 0; i < nodes; i++) { if (__bch_keylist_realloc(&keylist, bkey_u64s(&r[i].b->key))) goto out_nocoalesce; // ... out_nocoalesce: // ... for (i = 0; i < nodes; i++) if (!IS_ERR(new_nodes[i])) { // IS_ERR_OR_NULL before 028ddcac477b btree_node_free(new_nodes[i]); // new_nodes[0] is NULL rw_unlock(true, new_nodes[i]); } This patch replaces IS_ERR() by IS_ERR_OR_NULL() to fix this. Fixes: 028ddcac477b ("bcache: Remove unnecessary NULL point check in node allocations") Link: https://lore.kernel.org/all/3DF4A87A-2AC1-4893-AE5F-E921478419A9@suse.de/ Cc: stable@vger.kernel.org Cc: Zheng Wang <zyytlz.wz@163.com> Cc: Coly Li <colyli@suse.de> Signed-off-by: Markus Weippert <markus@gekmihesg.de> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:15 +01:00
Christian König	fc98ea2699	dma-buf: fix check in dma_resv_add_fence commit 95ba893c9f4feb836ddce627efd0bb6af6667031 upstream. It's valid to add the same fence multiple times to a dma-resv object and we shouldn't need one extra slot for each. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: a3f7c10a269d5 ("dma-buf/dma-resv: check if the new fence is really later") Cc: stable@vger.kernel.org # v5.19+ Link: https://patchwork.freedesktop.org/patch/msgid/20231115093035.1889-1-christian.koenig@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:15 +01:00
Gautham R. Shenoy	4da1556996	cpufreq/amd-pstate: Fix the return value of amd_pstate_fast_switch() commit bb87be267b8ee9b40917fb5bf51be5ddb33c37c2 upstream. cpufreq_driver->fast_switch() callback expects a frequency as a return value. amd_pstate_fast_switch() was returning the return value of amd_pstate_update_freq(), which only indicates a success or failure. Fix this by making amd_pstate_fast_switch() return the target_freq when the call to amd_pstate_update_freq() is successful, and return the current frequency from policy->cur when the call to amd_pstate_update_freq() is unsuccessful. Fixes: 4badf2eb1e98 ("cpufreq: amd-pstate: Add ->fast_switch() callback") Acked-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Wyes Karny <wyes.karny@amd.com> Reviewed-by: Perry Yuan <perry.yuan@amd.com> Cc: 6.4+ <stable@vger.kernel.org> # v6.4+ Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:15 +01:00
Lukasz Luba	9d4c721c18	powercap: DTPM: Fix unneeded conversions to micro-Watts commit b817f1488fca548fe50e2654d84a1956a16a1a8a upstream. The power values coming from the Energy Model are already in uW. The PowerCap and DTPM frameworks operate on uW, so all places should just use the values from the EM. Fix the code by removing all of the conversion to uW still present in it. Fixes: ae6ccaa65038 (PM: EM: convert power field to micro-Watts precision and align drivers) Cc: 5.19+ <stable@vger.kernel.org> # v5.19+ Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:15 +01:00
Ewan D. Milne	a62ca58bb3	nvme: check for valid nvme_identify_ns() before using it commit d8b90d600aff181936457f032d116dbd8534db06 upstream. When scanning namespaces, it is possible to get valid data from the first call to nvme_identify_ns() in nvme_alloc_ns(), but not from the second call in nvme_update_ns_info_block(). In particular, if the NSID becomes inactive between the two commands, a storage device may return a buffer filled with zero as per 4.1.5.1. In this case, we can get a kernel crash due to a divide-by-zero in blk_stack_limits() because ns->lba_shift will be set to zero. PID: 326 TASK: ffff95fec3cd8000 CPU: 29 COMMAND: "kworker/u98:10" #0 [ffffad8f8702f9e0] machine_kexec at ffffffff91c76ec7 #1 [ffffad8f8702fa38] __crash_kexec at ffffffff91dea4fa #2 [ffffad8f8702faf8] crash_kexec at ffffffff91deb788 #3 [ffffad8f8702fb00] oops_end at ffffffff91c2e4bb #4 [ffffad8f8702fb20] do_trap at ffffffff91c2a4ce #5 [ffffad8f8702fb70] do_error_trap at ffffffff91c2a595 #6 [ffffad8f8702fbb0] exc_divide_error at ffffffff928506e6 #7 [ffffad8f8702fbd0] asm_exc_divide_error at ffffffff92a00926 [exception RIP: blk_stack_limits+434] RIP: ffffffff92191872 RSP: ffffad8f8702fc80 RFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff95efa0c91800 RCX: 0000000000000001 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000001 RBP: 00000000ffffffff R8: ffff95fec7df35a8 R9: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: ffff95fed33c09a8 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #8 [ffffad8f8702fce0] nvme_update_ns_info_block at ffffffffc06d3533 [nvme_core] #9 [ffffad8f8702fd18] nvme_scan_ns at ffffffffc06d6fa7 [nvme_core] This happened when the check for valid data was moved out of nvme_identify_ns() into one of the callers. Fix this by checking in both callers. Link: https://bugzilla.kernel.org/show_bug.cgi?id=218186 Fixes: 0dd6fff2aad4 ("nvme: bring back auto-removal of deleted namespaces during sequential scan") Cc: stable@vger.kernel.org Signed-off-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-08 08:51:15 +01:00

... 10 11 12 13 14 ...

1151177 Commits