linux

iv/linux

Author	SHA1	Message	Date
Dmytro Linkin	e251225864	net/mlx5e: Fix nest_level for vlan pop action [ Upstream commit `70f478ca08` ] Current value of nest_level, assigned from net_device lower_level value, does not reflect the actual number of vlan headers, needed to pop. For ex., if we have untagged ingress traffic sended over vlan devices, instead of one pop action, driver will perform two pop actions. To fix that, calculate nest_level as difference between vlan device and parent device lower_levels. Fixes: `f3b0a18bb6` ("net: remove unnecessary variables and callback") Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com> Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-21 09:04:49 +02:00
Eran Ben Elisha	cb8892f52e	net/mlx5e: Add missing release firmware call [ Upstream commit `d19987ccf5` ] Once driver finishes flashing the firmware image, it should release it. Fixes: `9c8bca2637` ("mlx5: Move firmware flash implementation to devlink") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Aya Levin <ayal@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-21 09:04:49 +02:00
Moshe Shemesh	34310505d4	net/mlx5: Fix frequent ioread PCI access during recovery [ Upstream commit `8c702a53bb` ] High frequency of PCI ioread calls during recovery flow may cause the following trace on powerpc: [ 248.670288] EEH: 2100000 reads ignored for recovering device at location=Slot1 driver=mlx5_core pci addr=0000:01:00.1 [ 248.670331] EEH: Might be infinite loop in mlx5_core driver [ 248.670361] CPU: 2 PID: 35247 Comm: kworker/u192:11 Kdump: loaded Tainted: G OE ------------ 4.14.0-115.14.1.el7a.ppc64le #1 [ 248.670425] Workqueue: mlx5_health0000:01:00.1 health_recover_work [mlx5_core] [ 248.670471] Call Trace: [ 248.670492] [c00020391c11b960] [c000000000c217ac] dump_stack+0xb0/0xf4 (unreliable) [ 248.670548] [c00020391c11b9a0] [c000000000045818] eeh_check_failure+0x5c8/0x630 [ 248.670631] [c00020391c11ba50] [c00000000068fce4] ioread32be+0x114/0x1c0 [ 248.670692] [c00020391c11bac0] [c00800000dd8b400] mlx5_error_sw_reset+0x160/0x510 [mlx5_core] [ 248.670752] [c00020391c11bb60] [c00800000dd75824] mlx5_disable_device+0x34/0x1d0 [mlx5_core] [ 248.670822] [c00020391c11bbe0] [c00800000dd8affc] health_recover_work+0x11c/0x3c0 [mlx5_core] [ 248.670891] [c00020391c11bc80] [c000000000164fcc] process_one_work+0x1bc/0x5f0 [ 248.670955] [c00020391c11bd20] [c000000000167f8c] worker_thread+0xac/0x6b0 [ 248.671015] [c00020391c11bdc0] [c000000000171618] kthread+0x168/0x1b0 [ 248.671067] [c00020391c11be30] [c00000000000b65c] ret_from_kernel_thread+0x5c/0x80 Reduce the PCI ioread frequency during recovery by using msleep() instead of cond_resched() Fixes: `3e5b72ac2f` ("net/mlx5: Issue SW reset on FW assert") Signed-off-by: Moshe Shemesh <moshe@mellanox.com> Reviewed-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-21 09:04:48 +02:00
René van Dorst	1ff0732cf8	net: ethernet: mediatek: move mt7623 settings out off the mt7530 [ Upstream commit `a5d7553829` ] Moving mt7623 logic out off mt7530, is required to make hardware setting consistent after we introduce phylink to mtk driver. Fixes: `b8fc9f3082` ("net: ethernet: mediatek: Add basic PHYLINK support") Reviewed-by: Sean Wang <sean.wang@mediatek.com> Tested-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: René van Dorst <opensource@vdorst.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-21 09:04:48 +02:00
René van Dorst	f749a8bfdd	net: dsa: mt7530: move mt7623 settings out off the mt7530 [ Upstream commit `84d2f7b708` ] Moving mt7623 logic out off mt7530, is required to make hardware setting consistent after we introduce phylink to mtk driver. Fixes: `ca366d6c88` ("net: dsa: mt7530: Convert to PHYLINK API") Reviewed-by: Sean Wang <sean.wang@mediatek.com> Tested-by: Sean Wang <sean.wang@mediatek.com> Signed-off-by: René van Dorst <opensource@vdorst.com> Tested-by: Frank Wunderlich <frank-w@public-files.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-21 09:04:48 +02:00
Gilberto Bertin	bb54dcca3f	net: tun: record RX queue in skb before do_xdp_generic() [ Upstream commit `3fe260e00c` ] This allows netif_receive_generic_xdp() to correctly determine the RX queue from which the skb is coming, so that the context passed to the XDP program will contain the correct RX queue index. Signed-off-by: Gilberto Bertin <me@jibi.io> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-21 09:04:48 +02:00
Konstantin Khlebnikov	f6b264f2a0	net: revert default NAPI poll timeout to 2 jiffies [ Upstream commit `a4837980fd` ] For HZ < 1000 timeout 2000us rounds up to 1 jiffy but expires randomly because next timer interrupt could come shortly after starting softirq. For commonly used CONFIG_HZ=1000 nothing changes. Fixes: `7acf8a1e8a` ("Replace 2 jiffies with sysctl netdev_budget_usecs to enable softirq tuning") Reported-by: Dmitry Yakunin <zeil@yandex-team.ru> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-21 09:04:47 +02:00
Wang Wenhu	6126048679	net: qrtr: send msgs from local of same id as broadcast [ Upstream commit `6dbf02acef` ] If the local node id(qrtr_local_nid) is not modified after its initialization, it equals to the broadcast node id(QRTR_NODE_BCAST). So the messages from local node should not be taken as broadcast and keep the process going to send them out anyway. The definitions are as follow: static unsigned int qrtr_local_nid = NUMA_NO_NODE; Fixes: `fdf5fd3975` ("net: qrtr: Broadcast messages only from control port") Signed-off-by: Wang Wenhu <wenhu.wang@vivo.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-21 09:04:47 +02:00
Atsushi Nemoto	81dc4e9bff	net: phy: micrel: use genphy_read_status for KSZ9131 [ Upstream commit `68dac3eb50` ] KSZ9131 will not work with some switches due to workaround for KSZ9031 introduced in commit `d2fd719bcb` ("net/phy: micrel: Add workaround for bad autoneg"). Use genphy_read_status instead of dedicated ksz9031_read_status. Fixes: `bff5b4b373` ("net: phy: micrel: add Microchip KSZ9131 initial driver") Signed-off-by: Atsushi Nemoto <atsushi.nemoto@sord.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-21 09:04:46 +02:00
Tim Stallard	a9a851f0ec	net: ipv6: do not consider routes via gateways for anycast address check [ Upstream commit `03e2a984b6` ] The behaviour for what is considered an anycast address changed in commit `45e4fd2668` ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception"). This now considers the first address in a subnet where there is a route via a gateway to be an anycast address. This breaks path MTU discovery and traceroutes when a host in a remote network uses the address at the start of a prefix (eg 2600:: advertised as 2600::/48 in the DFZ) as ICMP errors will not be sent to anycast addresses. This patch excludes any routes with a gateway, or via point to point links, like the behaviour previously from rt6_is_gw_or_nonexthop in net/ipv6/route.c. This can be tested with: ip link add v1 type veth peer name v2 ip netns add test ip netns exec test ip link set lo up ip link set v2 netns test ip link set v1 up ip netns exec test ip link set v2 up ip addr add 2001:db8::1/64 dev v1 nodad ip addr add 2001:db8:100:: dev lo nodad ip netns exec test ip addr add 2001:db8::2/64 dev v2 nodad ip netns exec test ip route add unreachable 2001:db8:1::1 ip netns exec test ip route add 2001:db8:100::/64 via 2001:db8::1 ip netns exec test sysctl net.ipv6.conf.all.forwarding=1 ip route add 2001:db8:1::1 via 2001:db8::2 ping -I 2001:db8::1 2001:db8:1::1 -c1 ping -I 2001:db8:100:: 2001:db8:1::1 -c1 ip addr delete 2001:db8:100:: dev lo ip netns delete test Currently the first ping will get back a destination unreachable ICMP error, but the second will never get a response, with "icmp6_send: acast source" logged. After this patch, both get destination unreachable ICMP replies. Fixes: `45e4fd2668` ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception") Signed-off-by: Tim Stallard <code@timstallard.me.uk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-21 09:04:45 +02:00
Taras Chornyi	22e56cb2f9	net: ipv4: devinet: Fix crash when add/del multicast IP with autojoin [ Upstream commit `690cc86321` ] When CONFIG_IP_MULTICAST is not set and multicast ip is added to the device with autojoin flag or when multicast ip is deleted kernel will crash. steps to reproduce: ip addr add 224.0.0.0/32 dev eth0 ip addr del 224.0.0.0/32 dev eth0 or ip addr add 224.0.0.0/32 dev eth0 autojoin Unable to handle kernel NULL pointer dereference at virtual address 0000000000000088 pc : _raw_write_lock_irqsave+0x1e0/0x2ac lr : lock_sock_nested+0x1c/0x60 Call trace: _raw_write_lock_irqsave+0x1e0/0x2ac lock_sock_nested+0x1c/0x60 ip_mc_config.isra.28+0x50/0xe0 inet_rtm_deladdr+0x1a8/0x1f0 rtnetlink_rcv_msg+0x120/0x350 netlink_rcv_skb+0x58/0x120 rtnetlink_rcv+0x14/0x20 netlink_unicast+0x1b8/0x270 netlink_sendmsg+0x1a0/0x3b0 ____sys_sendmsg+0x248/0x290 ___sys_sendmsg+0x80/0xc0 __sys_sendmsg+0x68/0xc0 __arm64_sys_sendmsg+0x20/0x30 el0_svc_common.constprop.2+0x88/0x150 do_el0_svc+0x20/0x80 el0_sync_handler+0x118/0x190 el0_sync+0x140/0x180 Fixes: `93a714d6b5` ("multicast: Extend ip address command to enable multicast group join/leave on") Signed-off-by: Taras Chornyi <taras.chornyi@plvision.eu> Signed-off-by: Vadym Kochan <vadym.kochan@plvision.eu> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-21 09:04:45 +02:00
DENG Qingfang	3ca8547431	net: dsa: mt7530: fix tagged frames pass-through in VLAN-unaware mode [ Upstream commit `e045124e93` ] In VLAN-unaware mode, the Egress Tag (EG_TAG) field in Port VLAN Control register must be set to Consistent to let tagged frames pass through as is, otherwise their tags will be stripped. Fixes: `83163f7dca` ("net: dsa: mediatek: add VLAN support for MT7530") Signed-off-by: DENG Qingfang <dqfext@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: René van Dorst <opensource@vdorst.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-21 09:04:44 +02:00
Michael Weiß	016e3531d5	l2tp: Allow management of tunnels and session in user namespace [ Upstream commit `2abe05234f` ] Creation and management of L2TPv3 tunnels and session through netlink requires CAP_NET_ADMIN. However, a process with CAP_NET_ADMIN in a non-initial user namespace gets an EPERM due to the use of the genetlink GENL_ADMIN_PERM flag. Thus, management of L2TP VPNs inside an unprivileged container won't work. We replaced the GENL_ADMIN_PERM by the GENL_UNS_ADMIN_PERM flag similar to other network modules which also had this problem, e.g., openvswitch (commit `4a92602aa1` "openvswitch: allow management from inside user namespaces") and nl80211 (commit `5617c6cd6f` "nl80211: Allow privileged operations from user namespaces"). I tested this in the container runtime trustm3 (trustm3.github.io) and was able to create l2tp tunnels and sessions in unpriviliged (user namespaced) containers using a private network namespace. For other runtimes such as docker or lxc this should work, too. Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-21 09:04:44 +02:00
Taehee Yoo	22ea267a9c	hsr: check protocol version in hsr_newlink() [ Upstream commit `4faab8c446` ] In the current hsr code, only 0 and 1 protocol versions are valid. But current hsr code doesn't check the version, which is received by userspace. Test commands: ip link add dummy0 type dummy ip link add dummy1 type dummy ip link add hsr0 type hsr slave1 dummy0 slave2 dummy1 version 4 In the test commands, version 4 is invalid. So, the command should be failed. After this patch, following error will occur. "Error: hsr: Only versions 0..1 are supported." Fixes: `ee1c279772` ("net/hsr: Added support for HSR v1") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-21 09:04:44 +02:00
Sebastian Andrzej Siewior	ced57064a0	amd-xgbe: Use __napi_schedule() in BH context [ Upstream commit `d518691cbd` ] The driver uses __napi_schedule_irqoff() which is fine as long as it is invoked with disabled interrupts by everybody. Since the commit mentioned below the driver may invoke xgbe_isr_task() in tasklet/softirq context. This may lead to list corruption if another driver uses __napi_schedule_irqoff() in IRQ context. Use __napi_schedule() which safe to use from IRQ and softirq context. Fixes: `85b85c8534` ("amd-xgbe: Re-issue interrupt if interrupt status not cleared") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-21 09:04:43 +02:00
Greg Kroah-Hartman	dc4059d21d	Linux 5.4.33	2020-04-17 10:50:26 +02:00
James Smart	484cc15ad0	scsi: lpfc: fix inlining of lpfc_sli4_cleanup_poll_list() [ Upstream commit `d480e57809` ] Compilation can fail due to having an inline function reference where the function body is not present. Fix by removing the inline tag. Fixes: `93a4d6f401` ("scsi: lpfc: Add registration for CPU Offline/Online events") Link: https://lore.kernel.org/r/20191111230401.12958-4-jsmart2021@gmail.com Reviewed-by: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:26 +02:00
Julia Lawall	8dead2c275	ASoC: stm32: sai: Add missing cleanup [ Upstream commit `7506baeed8` ] The commit `0d6defc7e0` ("ASoC: stm32: sai: manage rebind issue") converts some function calls to their non-devm equivalents. The appropriate cleanup code was added to the remove function, but not to the probe function. Add a call to snd_dmaengine_pcm_unregister to compensate for the call to snd_dmaengine_pcm_register in case of subsequent failure. Fixes: commit `0d6defc7e0` ("ASoC: stm32: sai: manage rebind issue") Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Acked-by: Olivier Moysan <olivier.moysan@st.com> Link: https://lore.kernel.org/r/1586099028-5104-1-git-send-email-Julia.Lawall@inria.fr Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:25 +02:00
Gary Lin	aed5ee6bef	efi/x86: Fix the deletion of variables in mixed mode [ Upstream commit `a4b81ccfd4` ] efi_thunk_set_variable() treated the NULL "data" pointer as an invalid parameter, and this broke the deletion of variables in mixed mode. This commit fixes the check of data so that the userspace program can delete a variable in mixed mode. Fixes: `8319e9d5ad` ("efi/x86: Handle by-ref arguments covering multiple pages in mixed mode") Signed-off-by: Gary Lin <glin@suse.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20200408081606.1504-1-glin@suse.com Link: https://lore.kernel.org/r/20200409130434.6736-9-ardb@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:25 +02:00
Andy Shevchenko	0c839eee16	mfd: dln2: Fix sanity checking for endpoints [ Upstream commit `fb945c95a4` ] While the commit `2b8bd606b1` ("mfd: dln2: More sanity checking for endpoints") tries to harden the sanity checks it made at the same time a regression, i.e. mixed in and out endpoints. Obviously it should have been not tested on real hardware at that time, but unluckily it didn't happen. So, fix above mentioned typo and make device being enumerated again. While here, introduce an enumerator for magic values to prevent similar issue to happen in the future. Fixes: `2b8bd606b1` ("mfd: dln2: More sanity checking for endpoints") Cc: Oliver Neukum <oneukum@suse.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:25 +02:00
Jann Horn	b70eb420e9	bpf: Fix tnum constraints for 32-bit comparisons [ Upstream commit `604dca5e3a` ] The BPF verifier tried to track values based on 32-bit comparisons by (ab)using the tnum state via `581738a681` ("bpf: Provide better register bounds after jmp32 instructions"). The idea is that after a check like this: if ((u32)r0 > 3) exit We can't meaningfully constrain the arithmetic-range-based tracking, but we can update the tnum state to (value=0,mask=0xffff'ffff'0000'0003). However, the implementation from `581738a681` didn't compute the tnum constraint based on the fixed operand, but instead derives it from the arithmetic-range-based tracking. This means that after the following sequence of operations: if (r0 >= 0x1'0000'0001) exit if ((u32)r0 > 7) exit The verifier assumed that the lower half of r0 is in the range (0, 0) and apply the tnum constraint (value=0,mask=0xffff'ffff'0000'0000) thus causing the overall tnum to be (value=0,mask=0x1'0000'0000), which was incorrect. Provide a fixed implementation. Fixes: `581738a681` ("bpf: Provide better register bounds after jmp32 instructions") Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200330160324.15259-3-daniel@iogearbox.net Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:25 +02:00
Faiz Abbas	26711cc7e0	mmc: sdhci: Refactor sdhci_set_timeout() [ Upstream commit `7d76ed77cf` ] Refactor sdhci_set_timeout() such that platform drivers can do some functionality in a set_timeout() callback and then call __sdhci_set_timeout() to complete the operation. Signed-off-by: Faiz Abbas <faiz_abbas@ti.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20200116105154.7685-7-faiz_abbas@ti.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:25 +02:00
Faiz Abbas	56a296657e	mmc: sdhci: Convert sdhci_set_timeout_irq() to non-static [ Upstream commit `7907ebe741` ] Export sdhci_set_timeout_irq() so that it is accessible from platform drivers. Signed-off-by: Faiz Abbas <faiz_abbas@ti.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20200116105154.7685-6-faiz_abbas@ti.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:25 +02:00
Christophe Leroy	c1f3e1d8d7	powerpc/kasan: Fix kasan_remap_early_shadow_ro() [ Upstream commit `af92bad615` ] At the moment kasan_remap_early_shadow_ro() does nothing, because k_end is 0 and k_cur < 0 is always true. Change the test to k_cur != k_end, as done in kasan_init_shadow_page_tables() Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Fixes: `cbd18991e2` ("powerpc/mm: Fix an Oops in kasan_mmu_init()") Cc: stable@vger.kernel.org Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/4e7b56865e01569058914c991143f5961b5d4719.1583507333.git.christophe.leroy@c-s.fr Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:25 +02:00
Imre Deak	36b0b1f639	drm/i915/icl+: Don't enable DDI IO power on a TypeC port in TBT mode The DDI IO power well must not be enabled for a TypeC port in TBT mode, ensure this during driver loading/system resume. This gets rid of error messages like [drm] ERROR power well DDI E TC2 IO state mismatch (refcount 1/enabled 0) and avoids leaking the power ref when disabling the output. Cc: <stable@vger.kernel.org> # v5.4+ Signed-off-by: Imre Deak <imre.deak@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200330152244.11316-1-imre.deak@intel.com (cherry picked from commit `f77a2db27f`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:25 +02:00
Prike Liang	bdac1d76a3	drm/amdgpu: fix gfx hang during suspend with video playback (v2) [ Upstream commit `487eca11a3` ] The system will be hang up during S3 suspend because of SMU is pending for GC not respose the register CP_HQD_ACTIVE access request.This issue root cause of accessing the GC register under enter GFX CGGPG and can be fixed by disable GFX CGPG before perform suspend. v2: Use disable the GFX CGPG instead of RLC safe mode guard. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Tested-by: Mengbing Wang <Mengbing.Wang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:25 +02:00
Lyude Paul	d1bbdf003c	drm/dp_mst: Fix clearing payload state on topology disable [ Upstream commit `8732fe46b2` ] The issues caused by: commit `64e62bdf04` ("drm/dp_mst: Remove VCPI while disabling topology mgr") Prompted me to take a closer look at how we clear the payload state in general when disabling the topology, and it turns out there's actually two subtle issues here. The first is that we're not grabbing &mgr.payload_lock when clearing the payloads in drm_dp_mst_topology_mgr_set_mst(). Seeing as the canonical lock order is &mgr.payload_lock -> &mgr.lock (because we always want &mgr.lock to be the inner-most lock so topology validation always works), this makes perfect sense. It also means that -technically- there could be racing between someone calling drm_dp_mst_topology_mgr_set_mst() to disable the topology, along with a modeset occurring that's modifying the payload state at the same time. The second is the more obvious issue that Wayne Lin discovered, that we're not clearing proposed_payloads when disabling the topology. I actually can't see any obvious places where the racing caused by the first issue would break something, and it could be that some of our higher-level locks already prevent this by happenstance, but better safe then sorry. So, let's make it so that drm_dp_mst_topology_mgr_set_mst() first grabs &mgr.payload_lock followed by &mgr.lock so that we never race when modifying the payload state. Then, we also clear proposed_payloads to fix the original issue of enabling a new topology with a dirty payload state. This doesn't clear any of the drm_dp_vcpi structures, but those are getting destroyed along with the ports anyway. Changes since v1: * Use sizeof(mgr->payloads[0])/sizeof(mgr->proposed_vcpis[0]) instead - vsyrjala Cc: Sean Paul <sean@poorly.run> Cc: Wayne Lin <Wayne.Lin@amd.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200122194321.14953-1-lyude@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:25 +02:00
Sasha Levin	7676e69c67	Revert "drm/dp_mst: Remove VCPI while disabling topology mgr" [ Upstream commit `a86675968e` ] This reverts commit `64e62bdf04`. This commit ends up causing some lockdep splats due to trying to grab the payload lock while holding the mgr's lock: [ 54.010099] [ 54.011765] ====================================================== [ 54.018670] WARNING: possible circular locking dependency detected [ 54.025577] 5.5.0-rc6-02274-g77381c23ee63 #47 Not tainted [ 54.031610] ------------------------------------------------------ [ 54.038516] kworker/1:6/1040 is trying to acquire lock: [ 54.044354] ffff888272af3228 (&mgr->payload_lock){+.+.}, at: drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4 [ 54.054957] [ 54.054957] but task is already holding lock: [ 54.061473] ffff888272af3060 (&mgr->lock){+.+.}, at: drm_dp_mst_topology_mgr_set_mst+0x3c/0x2e4 [ 54.071193] [ 54.071193] which lock already depends on the new lock. [ 54.071193] [ 54.080334] [ 54.080334] the existing dependency chain (in reverse order) is: [ 54.088697] [ 54.088697] -> #1 (&mgr->lock){+.+.}: [ 54.094440] __mutex_lock+0xc3/0x498 [ 54.099015] drm_dp_mst_topology_get_port_validated+0x25/0x80 [ 54.106018] drm_dp_update_payload_part1+0xa2/0x2e2 [ 54.112051] intel_mst_pre_enable_dp+0x144/0x18f [ 54.117791] intel_encoders_pre_enable+0x63/0x70 [ 54.123532] hsw_crtc_enable+0xa1/0x722 [ 54.128396] intel_update_crtc+0x50/0x194 [ 54.133455] skl_commit_modeset_enables+0x40c/0x540 [ 54.139485] intel_atomic_commit_tail+0x5f7/0x130d [ 54.145418] intel_atomic_commit+0x2c8/0x2d8 [ 54.150770] drm_atomic_helper_set_config+0x5a/0x70 [ 54.156801] drm_mode_setcrtc+0x2ab/0x833 [ 54.161862] drm_ioctl+0x2e5/0x424 [ 54.166242] vfs_ioctl+0x21/0x2f [ 54.170426] do_vfs_ioctl+0x5fb/0x61e [ 54.175096] ksys_ioctl+0x55/0x75 [ 54.179377] __x64_sys_ioctl+0x1a/0x1e [ 54.184146] do_syscall_64+0x5c/0x6d [ 54.188721] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 54.194946] [ 54.194946] -> #0 (&mgr->payload_lock){+.+.}: [ 54.201463] [ 54.201463] other info that might help us debug this: [ 54.201463] [ 54.210410] Possible unsafe locking scenario: [ 54.210410] [ 54.217025] CPU0 CPU1 [ 54.222082] ---- ---- [ 54.227138] lock(&mgr->lock); [ 54.230643] lock(&mgr->payload_lock); [ 54.237742] lock(&mgr->lock); [ 54.244062] lock(&mgr->payload_lock); [ 54.248346] [ 54.248346] * DEADLOCK * [ 54.248346] [ 54.254959] 7 locks held by kworker/1:6/1040: [ 54.259822] #0: ffff888275c4f528 ((wq_completion)events){+.+.}, at: worker_thread+0x455/0x6e2 [ 54.269451] #1: ffffc9000119beb0 ((work_completion)(&(&dev_priv->hotplug.hotplug_work)->work)){+.+.}, at: worker_thread+0x455/0x6e2 [ 54.282768] #2: ffff888272a403f0 (&dev->mode_config.mutex){+.+.}, at: i915_hotplug_work_func+0x4b/0x2be [ 54.293368] #3: ffffffff824fc6c0 (drm_connector_list_iter){.+.+}, at: i915_hotplug_work_func+0x17e/0x2be [ 54.304061] #4: ffffc9000119bc58 (crtc_ww_class_acquire){+.+.}, at: drm_helper_probe_detect_ctx+0x40/0xfd [ 54.314855] #5: ffff888272a40470 (crtc_ww_class_mutex){+.+.}, at: drm_modeset_lock+0x74/0xe2 [ 54.324385] #6: ffff888272af3060 (&mgr->lock){+.+.}, at: drm_dp_mst_topology_mgr_set_mst+0x3c/0x2e4 [ 54.334597] [ 54.334597] stack backtrace: [ 54.339464] CPU: 1 PID: 1040 Comm: kworker/1:6 Not tainted 5.5.0-rc6-02274-g77381c23ee63 #47 [ 54.348893] Hardware name: Google Fizz/Fizz, BIOS Google_Fizz.10139.39.0 01/04/2018 [ 54.357451] Workqueue: events i915_hotplug_work_func [ 54.362995] Call Trace: [ 54.365724] dump_stack+0x71/0x9c [ 54.369427] check_noncircular+0x91/0xbc [ 54.373809] ? __lock_acquire+0xc9e/0xf66 [ 54.378286] ? __lock_acquire+0xc9e/0xf66 [ 54.382763] ? lock_acquire+0x175/0x1ac [ 54.387048] ? drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4 [ 54.393177] ? __mutex_lock+0xc3/0x498 [ 54.397362] ? drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4 [ 54.403492] ? drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4 [ 54.409620] ? drm_dp_dpcd_access+0xd9/0x101 [ 54.414390] ? drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4 [ 54.420517] ? drm_dp_mst_topology_mgr_set_mst+0x218/0x2e4 [ 54.426645] ? intel_digital_port_connected+0x34d/0x35c [ 54.432482] ? intel_dp_detect+0x227/0x44e [ 54.437056] ? ww_mutex_lock+0x49/0x9a [ 54.441242] ? drm_helper_probe_detect_ctx+0x75/0xfd [ 54.446789] ? intel_encoder_hotplug+0x4b/0x97 [ 54.451752] ? intel_ddi_hotplug+0x61/0x2e0 [ 54.456423] ? mark_held_locks+0x53/0x68 [ 54.460803] ? _raw_spin_unlock_irqrestore+0x3a/0x51 [ 54.466347] ? lockdep_hardirqs_on+0x187/0x1a4 [ 54.471310] ? drm_connector_list_iter_next+0x89/0x9a [ 54.476953] ? i915_hotplug_work_func+0x206/0x2be [ 54.482208] ? worker_thread+0x4d5/0x6e2 [ 54.486587] ? worker_thread+0x455/0x6e2 [ 54.490966] ? queue_work_on+0x64/0x64 [ 54.495151] ? kthread+0x1e9/0x1f1 [ 54.498946] ? queue_work_on+0x64/0x64 [ 54.503130] ? kthread_unpark+0x5e/0x5e [ 54.507413] ? ret_from_fork+0x3a/0x50 The proper fix for this is probably cleanup the VCPI allocations when we're enabling the topology, or on the first payload allocation. For now though, let's just revert. Signed-off-by: Lyude Paul <lyude@redhat.com> Fixes: `64e62bdf04` ("drm/dp_mst: Remove VCPI while disabling topology mgr") Cc: Sean Paul <sean@poorly.run> Cc: Wayne Lin <Wayne.Lin@amd.com> Reviewed-by: Sean Paul <sean@poorly.run> Link: https://patchwork.freedesktop.org/patch/msgid/20200117205149.97262-1-lyude@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:24 +02:00
James Smart	ba74ab0c29	scsi: lpfc: Fix broken Credit Recovery after driver load [ Upstream commit `835214f5d5` ] When driver is set to enable bb credit recovery, the switch displayed the setting as inactive. If the link bounces, it switches to Active. During link up processing, the driver currently does a MBX_READ_SPARAM followed by a MBX_CONFIG_LINK. These mbox commands are queued to be executed, one at a time and the completion is processed by the worker thread. Since the MBX_READ_SPARAM is done BEFORE the MBX_CONFIG_LINK, the BB_SC_N bit is never set the the returned values. BB Credit recovery status only gets set after the driver requests the feature in CONFIG_LINK, which is done after the link up. Thus the ordering of READ_SPARAM needs to follow the CONFIG_LINK. Fix by reordering so that READ_SPARAM is done after CONFIG_LINK. Added a HBA_DEFER_FLOGI flag so that any FLOGI handling waits until after the READ_SPARAM is done so that the proper BB credit value is set in the FLOGI payload. Fixes: `6bfb162082` ("scsi: lpfc: Fix configuration of BB credit recovery in service parameters") Cc: <stable@vger.kernel.org> # v5.4+ Link: https://lore.kernel.org/r/20200128002312.16346-4-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:24 +02:00
James Smart	33ebae4f3b	scsi: lpfc: Fix configuration of BB credit recovery in service parameters [ Upstream commit `6bfb162082` ] The driver today is reading service parameters from the firmware and then overwriting the firmware-provided values with values of its own. There are some switch features that require preliminary FLOGI's that are switch-specific and done prior to the actual fabric FLOGI for traffic. The fw will perform those FLOGIs and will revise the service parameters for the features configured. As the driver later overwrites those values with its own values, it misconfigures things like BBSCN use by doing so. Correct by eliminating the driver-overwrite of firmware values. The driver correctly re-reads the service parameters after each link up to obtain the latest values from firmware. Link: https://lore.kernel.org/r/20191105005708.7399-3-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:24 +02:00
James Smart	037b0b5521	scsi: lpfc: Fix Fabric hostname registration if system hostname changes [ Upstream commit `e3ba04c9ba` ] There are reports of multiple ports on the same system displaying different hostnames in fabric FDMI displays. Currently, the driver registers the hostname at initialization and obtains the hostname via init_utsname()->nodename queried at the time the FC link comes up. Unfortunately, if the machine hostname is updated after initialization, such as via DHCP or admin command, the value registered initially will be incorrect. Fix by having the driver save the hostname that was registered with FDMI. The driver then runs a heartbeat action that will check the hostname. If the name changes, reregister the FMDI data. The hostname is used in RSNN_NN, FDMI RPA and FDMI RHBA. Link: https://lore.kernel.org/r/20191218235808.31922-5-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:24 +02:00
James Smart	f48e759352	scsi: lpfc: Add registration for CPU Offline/Online events [ Upstream commit `93a4d6f401` ] The recent affinitization didn't address cpu offlining/onlining. If an interrupt vector is shared and the low order cpu owning the vector is offlined, as interrupts are managed, the vector is taken offline. This causes the other CPUs sharing the vector will hang as they can't get io completions. Correct by registering callbacks with the system for Offline/Online events. When a cpu is taken offline, its eq, which is tied to an interrupt vector is found. If the cpu is the "owner" of the vector and if the eq/vector is shared by other CPUs, the eq is placed into a polled mode. Additionally, code paths that perform io submission on the "sharing CPUs" will check the eq state and poll for completion after submission of new io to a wq that uses the eq. Similarly, when a cpu comes back online and owns an offlined vector, the eq is taken out of polled mode and rearmed to start driving interrupts for eq. Link: https://lore.kernel.org/r/20191105005708.7399-9-jsmart2021@gmail.com Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <jsmart2021@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:24 +02:00
Nikos Tsironis	33344a7661	dm clone: Add missing casts to prevent overflows and data corruption [ Upstream commit `9fc06ff568` ] Add missing casts when converting from regions to sectors. In case BITS_PER_LONG == 32, the lack of the appropriate casts can lead to overflows and miscalculation of the device sector. As a result, we could end up discarding and/or copying the wrong parts of the device, thus corrupting the device's data. Fixes: `7431b7835f` ("dm: add clone target") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:24 +02:00
Nikos Tsironis	2d7eb7ee36	dm clone: Fix handling of partial region discards [ Upstream commit `4b5142905d` ] There is a bug in the way dm-clone handles discards, which can lead to discarding the wrong blocks or trying to discard blocks beyond the end of the device. This could lead to data corruption, if the destination device indeed discards the underlying blocks, i.e., if the discard operation results in the original contents of a block to be lost. The root of the problem is the code that calculates the range of regions covered by a discard request and decides which regions to discard. Since dm-clone handles the device in units of regions, we don't discard parts of a region, only whole regions. The range is calculated as: rs = dm_sector_div_up(bio->bi_iter.bi_sector, clone->region_size); re = bio_end_sector(bio) >> clone->region_shift; , where 'rs' is the first region to discard and (re - rs) is the number of regions to discard. The bug manifests when we try to discard part of a single region, i.e., when we try to discard a block with size < region_size, and the discard request both starts at an offset with respect to the beginning of that region and ends before the end of the region. The root cause is the following comparison: if (rs == re) // skip discard and complete original bio immediately , which doesn't take into account that 'rs' might be greater than 're'. Thus, we then issue a discard request for the wrong blocks, instead of skipping the discard all together. Fix the check to also take into account the above case, so we don't end up discarding the wrong blocks. Also, add some range checks to dm_clone_set_region_hydrated() and dm_clone_cond_set_range(), which update dm-clone's region bitmap. Note that the aforementioned bug doesn't cause invalid memory accesses, because dm_clone_is_range_hydrated() returns True for this case, so the checks are just precautionary. Fixes: `7431b7835f` ("dm: add clone target") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:24 +02:00
Mikulas Patocka	dcf2f00b08	dm clone: replace spin_lock_irqsave with spin_lock_irq [ Upstream commit `6ca43ed837` ] If we are in a place where it is known that interrupts are enabled, functions spin_lock_irq/spin_unlock_irq should be used instead of spin_lock_irqsave/spin_unlock_irqrestore. spin_lock_irq and spin_unlock_irq are faster because they don't need to push and pop the flags register. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:23 +02:00
Bob Liu	fddfa591da	dm zoned: remove duplicate nr_rnd_zones increase in dmz_init_zone() [ Upstream commit `b8fdd09037` ] zmd->nr_rnd_zones was increased twice by mistake. The other place it is increased in dmz_init_zone() is the only one needed: 1131 zmd->nr_useable_zones++; 1132 if (dmz_is_rnd(zone)) { 1133 zmd->nr_rnd_zones++; ^^^ Fixes: `3b1a94c88b` ("dm zoned: drive-managed zoned block device target") Cc: stable@vger.kernel.org Signed-off-by: Bob Liu <bob.liu@oracle.com> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-04-17 10:50:23 +02:00
Mark Brown	1ba26c2aed	arm64: Always force a branch protection mode when the compiler has one commit `b8fdef311a` upstream. Compilers with branch protection support can be configured to enable it by default, it is likely that distributions will do this as part of deploying branch protection system wide. As well as the slight overhead from having some extra NOPs for unused branch protection features this can cause more serious problems when the kernel is providing pointer authentication to userspace but not built for pointer authentication itself. In that case our switching of keys for userspace can affect the kernel unexpectedly, causing pointer authentication instructions in the kernel to corrupt addresses. To ensure that we get consistent and reliable behaviour always explicitly initialise the branch protection mode, ensuring that the kernel is built the same way regardless of the compiler defaults. [This is a reworked version of `b8fdef311a` ("arm64: Always force a branch protection mode when the compiler has one") for backport. Kernels prior to `74afda4016` ("arm64: compile the kernel with ptrauth return address signing") don't have any Makefile machinery for forcing on pointer auth but still have issues if the compiler defaults it on so need this reworked version. -- broonie] Fixes: `7503197562` (arm64: add basic pointer authentication support) Reported-by: Szabolcs Nagy <szabolcs.nagy@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org [catalin.marinas@arm.com: remove Kconfig option in favour of Makefile check] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:50:23 +02:00
Clement Courbet	ba7581be85	powerpc: Make setjmp/longjmp signature standard commit `c17eb4dca5` upstream. Declaring setjmp()/longjmp() as taking longs makes the signature non-standard, and makes clang complain. In the past, this has been worked around by adding -ffreestanding to the compile flags. The implementation looks like it only ever propagates the value (in longjmp) or sets it to 1 (in setjmp), and we only call longjmp with integer parameters. This allows removing -ffreestanding from the compilation flags. Fixes: `c9029ef9c9` ("powerpc: Avoid clang warnings around setjmp and longjmp") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Clement Courbet <courbet@google.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200330080400.124803-1-courbet@google.com Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:50:23 +02:00
Sreekanth Reddy	3457b2232e	scsi: mpt3sas: Fix kernel panic observed on soft HBA unplug commit `cc41f11a21` upstream. Generic protection fault type kernel panic is observed when user performs soft (ordered) HBA unplug operation while IOs are running on drives connected to HBA. When user performs ordered HBA removal operation, the kernel calls PCI device's .remove() call back function where driver is flushing out all the outstanding SCSI IO commands with DID_NO_CONNECT host byte and also unmaps sg buffers allocated for these IO commands. However, in the ordered HBA removal case (unlike of real HBA hot removal), HBA device is still alive and hence HBA hardware is performing the DMA operations to those buffers on the system memory which are already unmapped while flushing out the outstanding SCSI IO commands and this leads to kernel panic. Don't flush out the outstanding IOs from .remove() path in case of ordered removal since HBA will be still alive in this case and it can complete the outstanding IOs. Flush out the outstanding IOs only in case of 'physical HBA hot unplug' where there won't be any communication with the HBA. During shutdown also it is possible that HBA hardware can perform DMA operations on those outstanding IO buffers which are completed with DID_NO_CONNECT by the driver from .shutdown(). So same above fix is applied in shutdown path as well. It is safe to drop the outstanding commands when HBA is inaccessible such as when permanent PCI failure happens, when HBA is in non-operational state, or when someone does a real HBA hot unplug operation. Since driver knows that HBA is inaccessible during these cases, it is safe to drop the outstanding commands instead of waiting for SCSI error recovery to kick in and clear these outstanding commands. Link: https://lore.kernel.org/r/1585302763-23007-1-git-send-email-sreekanth.reddy@broadcom.com Fixes: `c666d3be99` ("scsi: mpt3sas: wait for and flush running commands on shutdown/unload") Cc: stable@vger.kernel.org #v4.14.174+ Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:50:23 +02:00
Michael Ellerman	e294f8a5ad	powerpc/64: Prevent stack protection in early boot commit `7053f80d96` upstream. The previous commit reduced the amount of code that is run before we setup a paca. However there are still a few remaining functions that run with no paca, or worse, with an arbitrary value in r13 that will be used as a paca pointer. In particular the stack protector canary is stored in the paca, so if stack protector is activated for any of these functions we will read the stack canary from wherever r13 points. If r13 happens to point outside of memory we will get a machine check / checkstop. For example if we modify initialise_paca() to trigger stack protection, and then boot in the mambo simulator with r13 poisoned in skiboot before calling the kernel: DEBUG: 19952232: (19952232): INSTRUCTION: PC=0xC0000000191FC1E8: [0x3C4C006D]: addis r2,r12,0x6D [fetch] DEBUG: 19952236: (19952236): INSTRUCTION: PC=0xC00000001807EAD8: [0x7D8802A6]: mflr r12 [fetch] FATAL ERROR: 19952276: (19952276): Check Stop for 0:0: Machine Check with ME bit of MSR off DEBUG: 19952276: (19952276): INSTRUCTION: PC=0xC0000000191FCA7C: [0xE90D0CF8]: ld r8,0xCF8(r13) [Instruction Failed] INFO: 19952276: (19952277): Execution stopped: Mambo Error, Machine Check Stop, systemsim % bt pc: 0xC0000000191FCA7C initialise_paca+0x54 lr: 0xC0000000191FC22C early_setup+0x44 stack:0x00000000198CBED0 0x0 +0x0 stack:0x00000000198CBF00 0xC0000000191FC22C early_setup+0x44 stack:0x00000000198CBF90 0x1801C968 +0x1801C968 So annotate the relevant functions to ensure stack protection is never enabled for them. Fixes: `06ec27aea9` ("powerpc/64: add stack protector support") Cc: stable@vger.kernel.org # v4.20+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200320032116.1024773-2-mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:50:23 +02:00
Christophe Leroy	fc8755dc01	powerpc/kprobes: Ignore traps that happened in real mode commit `21f8b2fa3c` upstream. When a program check exception happens while MMU translation is disabled, following Oops happens in kprobe_handler() in the following code: } else if (*addr != BREAKPOINT_INSTRUCTION) { BUG: Unable to handle kernel data access on read at 0x0000e268 Faulting instruction address: 0xc000ec34 Oops: Kernel access of bad area, sig: 11 [#1] BE PAGE_SIZE=16K PREEMPT CMPC885 Modules linked in: CPU: 0 PID: 429 Comm: cat Not tainted 5.6.0-rc1-s3k-dev-00824-g84195dc6c58a #3267 NIP: c000ec34 LR: c000ecd8 CTR: c019cab8 REGS: ca4d3b58 TRAP: 0300 Not tainted (5.6.0-rc1-s3k-dev-00824-g84195dc6c58a) MSR: 00001032 <ME,IR,DR,RI> CR: 2a4d3c52 XER: 00000000 DAR: 0000e268 DSISR: c0000000 GPR00: c000b09c ca4d3c10 c66d0620 00000000 ca4d3c60 00000000 00009032 00000000 GPR08: 00020000 00000000 c087de44 c000afe0 c66d0ad0 100d3dd6 fffffff3 00000000 GPR16: 00000000 00000041 00000000 ca4d3d70 00000000 00000000 0000416d 00000000 GPR24: 00000004 c53b6128 00000000 0000e268 00000000 c07c0000 c07bb6fc ca4d3c60 NIP [c000ec34] kprobe_handler+0x128/0x290 LR [c000ecd8] kprobe_handler+0x1cc/0x290 Call Trace: [ca4d3c30] [c000b09c] program_check_exception+0xbc/0x6fc [ca4d3c50] [c000e43c] ret_from_except_full+0x0/0x4 --- interrupt: 700 at 0xe268 Instruction dump: 913e0008 81220000 38600001 3929ffff 91220000 80010024 bb410008 7c0803a6 38210020 4e800020 38600000 4e800020 <813b0000> 6d2a7fe0 2f8a0008 419e0154 ---[ end trace 5b9152d4cdadd06d ]--- kprobe is not prepared to handle events in real mode and functions running in real mode should have been blacklisted, so kprobe_handler() can safely bail out telling 'this trap is not mine' for any trap that happened while in real-mode. If the trap happened with MSR_IR or MSR_DR cleared, return 0 immediately. Reported-by: Larry Finger <Larry.Finger@lwfinger.net> Fixes: `6cc89bad60` ("powerpc/kprobes: Invoke handlers directly") Cc: stable@vger.kernel.org # v4.10+ Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/424331e2006e7291a1bfe40e7f3fa58825f565e1.1582054578.git.christophe.leroy@c-s.fr Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:50:23 +02:00
Cédric Le Goater	ed6f6b2b39	powerpc/xive: Fix xmon support on the PowerNV platform commit `97ef275077` upstream. The PowerNV platform has multiple IRQ chips and the xmon command dumping the state of the XIVE interrupt should only operate on the XIVE IRQ chip. Fixes: `5896163f7f` ("powerpc/xmon: Improve output of XIVE interrupts") Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306150143.5551-3-clg@kaod.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:50:23 +02:00
Daniel Axtens	1ab730b659	powerpc/64: Setup a paca before parsing device tree etc. commit `d4a8e98621` upstream. Currently we set up the paca after parsing the device tree for CPU features. Prior to that, r13 contains random data, which means there is random data in r13 while we're running the generic dt parsing code. This random data varies depending on whether we boot through a vmlinux or a zImage: for the vmlinux case it's usually around zero, but for zImages we see random values like 912a72603d420015. This is poor practice, and can also lead to difficult-to-debug crashes. For example, when kcov is enabled, the kcov instrumentation attempts to read preempt_count out of the current task, which goes via the paca. This then crashes in the zImage case. Similarly stack protector can cause crashes if r13 is bogus, by reading from the stack canary in the paca. To resolve this: - move the paca setup to before the CPU feature parsing. - because we no longer have access to CPU feature flags in paca setup, change the HV feature test in the paca setup path to consider the actual value of the MSR rather than the CPU feature. Translations get switched on once we leave early_setup, so I think we'd already catch any other cases where the paca or task aren't set up. Boot tested on a P9 guest and host. Fixes: `fb0b0a73b2` ("powerpc: Enable kcov") Fixes: `06ec27aea9` ("powerpc/64: add stack protector support") Cc: stable@vger.kernel.org # v4.20+ Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Daniel Axtens <dja@axtens.net> [mpe: Reword comments & change log a bit to mention stack protector] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200320032116.1024773-1-mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:50:23 +02:00
Cédric Le Goater	9240f83aa9	powerpc/xive: Use XIVE_BAD_IRQ instead of zero to catch non configured IPIs commit `b1a504a650` upstream. When a CPU is brought up, an IPI number is allocated and recorded under the XIVE CPU structure. Invalid IPI numbers are tracked with interrupt number 0x0. On the PowerNV platform, the interrupt number space starts at 0x10 and this works fine. However, on the sPAPR platform, it is possible to allocate the interrupt number 0x0 and this raises an issue when CPU 0 is unplugged. The XIVE spapr driver tracks allocated interrupt numbers in a bitmask and it is not correctly updated when interrupt number 0x0 is freed. It stays allocated and it is then impossible to reallocate. Fix by using the XIVE_BAD_IRQ value instead of zero on both platforms. Reported-by: David Gibson <david@gibson.dropbear.id.au> Fixes: `eac1e731b5` ("powerpc/xive: guest exploitation of the XIVE interrupt controller") Cc: stable@vger.kernel.org # v4.14+ Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Tested-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200306150143.5551-2-clg@kaod.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:50:22 +02:00
Aneesh Kumar K.V	bd0fa14473	powerpc/hash64/devmap: Use H_PAGE_THP_HUGE when setting up huge devmap PTE entries commit `36b78402d9` upstream. H_PAGE_THP_HUGE is used to differentiate between a THP hugepage and hugetlb hugepage entries. The difference is WRT how we handle hash fault on these address. THP address enables MPSS in segments. We want to manage devmap hugepage entries similar to THP pt entries. Hence use H_PAGE_THP_HUGE for devmap huge PTE entries. With current code while handling hash PTE fault, we do set is_thp = true when finding devmap PTE huge PTE entries. Current code also does the below sequence we setting up huge devmap entries. entry = pmd_mkhuge(pfn_t_pmd(pfn, prot)); if (pfn_t_devmap(pfn)) entry = pmd_mkdevmap(entry); In that case we would find both H_PAGE_THP_HUGE and PAGE_DEVMAP set for huge devmap PTE entries. This results in false positive error like below. kernel BUG at /home/kvaneesh/src/linux/mm/memory.c:4321! Oops: Exception in kernel mode, sig: 5 [#1] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries Modules linked in: CPU: 56 PID: 67996 Comm: t_mmap_dio Not tainted 5.6.0-rc4-59640-g371c804dedbc #128 .... NIP [c00000000044c9e4] __follow_pte_pmd+0x264/0x900 LR [c0000000005d45f8] dax_writeback_one+0x1a8/0x740 Call Trace: str_spec.74809+0x22ffb4/0x2d116c (unreliable) dax_writeback_one+0x1a8/0x740 dax_writeback_mapping_range+0x26c/0x700 ext4_dax_writepages+0x150/0x5a0 do_writepages+0x68/0x180 __filemap_fdatawrite_range+0x138/0x180 file_write_and_wait_range+0xa4/0x110 ext4_sync_file+0x370/0x6e0 vfs_fsync_range+0x70/0xf0 sys_msync+0x220/0x2e0 system_call+0x5c/0x68 This is because our pmd_trans_huge check doesn't exclude _PAGE_DEVMAP. To make this all consistent, update pmd_mkdevmap to set H_PAGE_THP_HUGE and pmd_trans_huge check now excludes _PAGE_DEVMAP correctly. Fixes: `ebd3119793` ("powerpc/mm: Add devmap support for ppc64") Cc: stable@vger.kernel.org # v4.13+ Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200313094842.351830-1-aneesh.kumar@linux.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:50:22 +02:00
Laurentiu Tudor	81b9336ab2	powerpc/fsl_booke: Avoid creating duplicate tlb1 entry commit `aa4113340a` upstream. In the current implementation, the call to loadcam_multi() is wrapped between switch_to_as1() and restore_to_as0() calls so, when it tries to create its own temporary AS=1 TLB1 entry, it ends up duplicating the existing one created by switch_to_as1(). Add a check to skip creating the temporary entry if already running in AS=1. Fixes: `d9e1831a42` ("powerpc/85xx: Load all early TLB entries at once") Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com> Acked-by: Scott Wood <oss@buserror.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200123111914.2565-1-laurentiu.tudor@nxp.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:50:22 +02:00
Michael Ellerman	38aa7f32df	powerpc/64/tm: Don't let userspace set regs->trap via sigreturn commit `c7def7fbde` upstream. In restore_tm_sigcontexts() we take the trap value directly from the user sigcontext with no checking: err \|= __get_user(regs->trap, &sc->gp_regs[PT_TRAP]); This means we can be in the kernel with an arbitrary regs->trap value. Although that's not immediately problematic, there is a risk we could trigger one of the uses of CHECK_FULL_REGS(): #define CHECK_FULL_REGS(regs) BUG_ON(regs->trap & 1) It can also cause us to unnecessarily save non-volatile GPRs again in save_nvgprs(), which shouldn't be problematic but is still wrong. It's also possible it could trick the syscall restart machinery, which relies on regs->trap not being == 0xc00 (see `9a81c16b52` ("powerpc: fix double syscall restarts")), though I haven't been able to make that happen. Finally it doesn't match the behaviour of the non-TM case, in restore_sigcontext() which zeroes regs->trap. So change restore_tm_sigcontexts() to zero regs->trap. This was discovered while testing Nick's upcoming rewrite of the syscall entry path. In that series the call to save_nvgprs() prior to signal handling (do_notify_resume()) is removed, which leaves the low-bit of regs->trap uncleared which can then trigger the FULL_REGS() WARNs in setup_tm_sigcontexts(). Fixes: `2b0a576d15` ("powerpc: Add new transactional memory state to the signal context") Cc: stable@vger.kernel.org # v3.9+ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20200401023836.3286664-1-mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:50:22 +02:00
Juergen Gross	0abc07d23c	xen/blkfront: fix memory allocation flags in blkfront_setup_indirect() commit `3a169c0be7` upstream. Commit `1d5c76e664` ("xen-blkfront: switch kcalloc to kvcalloc for large array allocation") didn't fix the issue it was meant to, as the flags for allocating the memory are GFP_NOIO, which will lead the memory allocation falling back to kmalloc(). So instead of GFP_NOIO use GFP_KERNEL and do all the memory allocation in blkfront_setup_indirect() in a memalloc_noio_{save,restore} section. Fixes: `1d5c76e664` ("xen-blkfront: switch kcalloc to kvcalloc for large array allocation") Cc: stable@vger.kernel.org Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Link: https://lore.kernel.org/r/20200403090034.8753-1-jgross@suse.com Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:50:22 +02:00
Wen Yang	5fdf01181c	ipmi: fix hung processes in __get_guid() commit `32830a0534` upstream. The wait_event() function is used to detect command completion. When send_guid_cmd() returns an error, smi_send() has not been called to send data. Therefore, wait_event() should not be used on the error path, otherwise it will cause the following warning: [ 1361.588808] systemd-udevd D 0 1501 1436 0x00000004 [ 1361.588813] ffff883f4b1298c0 0000000000000000 ffff883f4b188000 ffff887f7e3d9f40 [ 1361.677952] ffff887f64bd4280 ffffc90037297a68 ffffffff8173ca3b ffffc90000000010 [ 1361.767077] 00ffc90037297ad0 ffff887f7e3d9f40 0000000000000286 ffff883f4b188000 [ 1361.856199] Call Trace: [ 1361.885578] [<ffffffff8173ca3b>] ? __schedule+0x23b/0x780 [ 1361.951406] [<ffffffff8173cfb6>] schedule+0x36/0x80 [ 1362.010979] [<ffffffffa071f178>] get_guid+0x118/0x150 [ipmi_msghandler] [ 1362.091281] [<ffffffff810d5350>] ? prepare_to_wait_event+0x100/0x100 [ 1362.168533] [<ffffffffa071f755>] ipmi_register_smi+0x405/0x940 [ipmi_msghandler] [ 1362.258337] [<ffffffffa0230ae9>] try_smi_init+0x529/0x950 [ipmi_si] [ 1362.334521] [<ffffffffa022f350>] ? std_irq_setup+0xd0/0xd0 [ipmi_si] [ 1362.411701] [<ffffffffa0232bd2>] init_ipmi_si+0x492/0x9e0 [ipmi_si] [ 1362.487917] [<ffffffffa0232740>] ? ipmi_pci_probe+0x280/0x280 [ipmi_si] [ 1362.568219] [<ffffffff810021a0>] do_one_initcall+0x50/0x180 [ 1362.636109] [<ffffffff812231b2>] ? kmem_cache_alloc_trace+0x142/0x190 [ 1362.714330] [<ffffffff811b2ae1>] do_init_module+0x5f/0x200 [ 1362.781208] [<ffffffff81123ca8>] load_module+0x1898/0x1de0 [ 1362.848069] [<ffffffff811202e0>] ? __symbol_put+0x60/0x60 [ 1362.913886] [<ffffffff8130696b>] ? security_kernel_post_read_file+0x6b/0x80 [ 1362.998514] [<ffffffff81124465>] SYSC_finit_module+0xe5/0x120 [ 1363.068463] [<ffffffff81124465>] ? SYSC_finit_module+0xe5/0x120 [ 1363.140513] [<ffffffff811244be>] SyS_finit_module+0xe/0x10 [ 1363.207364] [<ffffffff81003c04>] do_syscall_64+0x74/0x180 Fixes: `50c812b2b9` ("[PATCH] ipmi: add full sysfs support") Signed-off-by: Wen Yang <wenyang@linux.alibaba.com> Cc: Corey Minyard <minyard@acm.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: openipmi-developer@lists.sourceforge.net Cc: linux-kernel@vger.kernel.org Cc: stable@vger.kernel.org # 2.6.17- Message-Id: <20200403090408.58745-1-wenyang@linux.alibaba.com> Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:50:22 +02:00
Kai-Heng Feng	d0b9bd4804	libata: Return correct status in sata_pmp_eh_recover_pm() when ATA_DFLAG_DETACH is set commit `8305f72f95` upstream. During system resume from suspend, this can be observed on ASM1062 PMP controller: ata10.01: SATA link down (SStatus 0 SControl 330) ata10.02: hard resetting link ata10.02: SATA link down (SStatus 0 SControl 330) ata10.00: configured for UDMA/133 Kernel panic - not syncing: stack-protector: Kernel in: sata_pmp_eh_recover+0xa2b/0xa40 CPU: 2 PID: 230 Comm: scsi_eh_9 Tainted: P OE #49-Ubuntu Hardware name: System manufacturer System Product 1001 12/10/2017 Call Trace: dump_stack+0x63/0x8b panic+0xe4/0x244 ? sata_pmp_eh_recover+0xa2b/0xa40 __stack_chk_fail+0x19/0x20 sata_pmp_eh_recover+0xa2b/0xa40 ? ahci_do_softreset+0x260/0x260 [libahci] ? ahci_do_hardreset+0x140/0x140 [libahci] ? ata_phys_link_offline+0x60/0x60 ? ahci_stop_engine+0xc0/0xc0 [libahci] sata_pmp_error_handler+0x22/0x30 ahci_error_handler+0x45/0x80 [libahci] ata_scsi_port_error_handler+0x29b/0x770 ? ata_scsi_cmd_error_handler+0x101/0x140 ata_scsi_error+0x95/0xd0 ? scsi_try_target_reset+0x90/0x90 scsi_error_handler+0xd0/0x5b0 kthread+0x121/0x140 ? scsi_eh_get_sense+0x200/0x200 ? kthread_create_worker_on_cpu+0x70/0x70 ret_from_fork+0x22/0x40 Kernel Offset: 0xcc00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) Since sata_pmp_eh_recover_pmp() doens't set rc when ATA_DFLAG_DETACH is set, sata_pmp_eh_recover() continues to run. During retry it triggers the stack protector. Set correct rc in sata_pmp_eh_recover_pmp() to let sata_pmp_eh_recover() jump to pmp_fail directly. BugLink: https://bugs.launchpad.net/bugs/1821434 Cc: stable@vger.kernel.org Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-04-17 10:50:22 +02:00

1 2 3 4 5 ...

877892 Commits