linux

iv/linux

Author	SHA1	Message	Date
Jan Höppner	76b40319a1	s390/dasd: Use correct lock while counting channel queue length commit ccc45cb4e7271c74dbb27776ae8f73d84557f5c6 upstream. The lock around counting the channel queue length in the BIODASDINFO ioctl was incorrectly changed to the dasd_block->queue_lock with commit 583d6535cb9d ("dasd: remove dead code"). This can lead to endless list iterations and a subsequent crash. The queue_lock is supposed to be used only for queue lists belonging to dasd_block. For dasd_device related queue lists the ccwdev lock must be used. Fix the mentioned issues by correctly using the ccwdev lock instead of the queue lock. Fixes: 583d6535cb9d ("dasd: remove dead code") Cc: stable@vger.kernel.org # v5.0+ Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com> Reviewed-by: Stefan Haberland <sth@linux.ibm.com> Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Link: https://lore.kernel.org/r/20230609153750.1258763-2-sth@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-06-14 11:09:51 +02:00
Holger Dengler	0467212806	s390/pkey: zeroize key blobs [ Upstream commit 844cf829e5f33e00b279230470c8c93b58b8c16f ] Key blobs for the IOCTLs PKEY_KBLOB2PROTK[23] may contain clear key material. Zeroize the copies of these keys in kernel memory after creating the protected key. Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Holger Dengler <dengler@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-06-09 10:30:11 +02:00
Heiko Carstens	75258f0838	s390/qdio: fix do_sqbs() inline assembly constraint [ Upstream commit 2862a2fdfae875888e3c1c3634e3422e01d98147 ] Use "a" constraint instead of "d" constraint to pass the state parameter to the do_sqbs() inline assembly. This prevents that general purpose register zero is used for the state parameter. If the compiler would select general purpose register zero this would be problematic for the used instruction in rsy format: the register used for the state parameter is a base register. If the base register is general purpose register zero the contents of the register are unexpectedly ignored when the instruction is executed. This only applies to z/VM guests using QIOASSIST with dedicated (pass through) QDIO-based devices such as FCP [zfcp driver] as well as real OSA or HiperSockets [qeth driver]. A possible symptom for this case using zfcp is the following repeating kernel message pattern: zfcp <devbusid>: A QDIO problem occurred zfcp <devbusid>: A QDIO problem occurred zfcp <devbusid>: qdio: ZFCP on SC <sc> using AI:1 QEBSM:1 PRI:1 TDD:1 SIGA: W zfcp <devbusid>: A QDIO problem occurred zfcp <devbusid>: A QDIO problem occurred Each of the qdio problem message can be accompanied by the following entries for the affected subchannel <sc> in /sys/kernel/debug/s390dbf/qdio_error/hex_ascii for zfcp or qeth: <sc> ccq: 69.... <sc> SQBS ERROR. Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Cc: Steffen Maier <maier@linux.ibm.com> Fixes: 8129ee164267 ("[PATCH] s390: qdio V=V pass-through") Cc: <stable@vger.kernel.org> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-30 12:57:55 +01:00
Heiko Carstens	3681a0287a	s390/qdio: get rid of register asm [ Upstream commit d3e2ff5436d6ee38b572ba5c01dc7994769bec54 ] Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Stable-dep-of: 2862a2fdfae8 ("s390/qdio: fix do_sqbs() inline assembly constraint") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-05-30 12:57:55 +01:00
Stefan Haberland	abb427cb77	s390/dasd: fix hanging blockdevice after request requeue commit d8898ee50edecacdf0141f26fd90acf43d7e9cd7 upstream. The DASD driver does not kick the requeue list when requeuing IO requests to the blocklayer. This might lead to hanging blockdevice when there is no other trigger for this. Fix by automatically kick the requeue list when requeuing DASD requests to the blocklayer. Fixes: e443343e509a ("s390/dasd: blk-mq conversion") CC: stable@vger.kernel.org # 4.14+ Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Link: https://lore.kernel.org/r/20230405142017.2446986-8-sth@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-05-17 11:48:06 +02:00
Tony Krowiak	ee17dea307	s390/vfio-ap: fix memory leak in vfio_ap device driver [ Upstream commit 8f8cf767589f2131ae5d40f3758429095c701c84 ] The device release callback function invoked to release the matrix device uses the dev_get_drvdata(device *dev) function to retrieve the pointer to the vfio_matrix_dev object in order to free its storage. The problem is, this object is not stored as drvdata with the device; since the kfree function will accept a NULL pointer, the memory for the vfio_matrix_dev object is never freed. Since the device being released is contained within the vfio_matrix_dev object, the container_of macro will be used to retrieve its pointer. Fixes: 1fde573413b5 ("s390: vfio-ap: base implementation of VFIO AP device driver") Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com> Reviewed-by: Harald Freudenberger <freude@linux.ibm.com> Link: https://lore.kernel.org/r/20230320150447.34557-1-akrowiak@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-04-05 11:23:48 +02:00
Stefan Haberland	aa8579bc08	s390/dasd: add missing discipline function commit c0c8a8397fa8a74d04915f4d3d28cb4a5d401427 upstream. Fix crash with illegal operation exception in dasd_device_tasklet. Commit b72949328869 ("s390/dasd: Prepare for additional path event handling") renamed the verify_path function for ECKD but not for FBA and DIAG. This leads to a panic when the path verification function is called for a FBA or DIAG device. Fix by defining a wrapper function for dasd_generic_verify_path(). Fixes: b72949328869 ("s390/dasd: Prepare for additional path event handling") Cc: <stable@vger.kernel.org> #5.11 Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com> Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/20210525125006.157531-2-sth@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-03-17 08:45:17 +01:00
Qiheng Lin	a50e28d433	s390/dasd: Fix potential memleak in dasd_eckd_init() [ Upstream commit 460e9bed82e49db1b823dcb4e421783854d86c40 ] `dasd_reserve_req` is allocated before `dasd_vol_info_req`, and it also needs to be freed before the error returns, just like the other cases in this function. Fixes: 9e12e54c7a8f ("s390/dasd: Handle out-of-space constraint") Signed-off-by: Qiheng Lin <linqiheng@huawei.com> Link: https://lore.kernel.org/r/20221208133809.16796-1-linqiheng@huawei.com Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Link: https://lore.kernel.org/r/20230210000253.1644903-3-sth@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-03-11 16:39:15 +01:00
Jan Höppner	72aebdac39	s390/dasd: Prepare for additional path event handling [ Upstream commit b72949328869dfd45f6452c2410647afd7db5f1a ] As more path events need to be handled for ECKD the current path verification infrastructure can be reused. Rename all path verifcation code to fit the more broadly based task of path event handling and put the path verification in a new separate function. Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com> Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Reviewed-by: Stefan Haberland <sth@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Stable-dep-of: 460e9bed82e4 ("s390/dasd: Fix potential memleak in dasd_eckd_init()") Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-03-11 16:39:15 +01:00
Nathan Chancellor	ebc3c77785	s390/lcs: Fix return type of lcs_start_xmit() [ Upstream commit bb16db8393658e0978c3f0d30ae069e878264fa3 ] With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG), indirect call targets are validated against the expected function pointer prototype to make sure the call target is valid to help mitigate ROP attacks. If they are not identical, there is a failure at run time, which manifests as either a kernel panic or thread getting killed. A proposed warning in clang aims to catch these at compile time, which reveals: drivers/s390/net/lcs.c:2090:21: error: incompatible function pointer types initializing 'netdev_tx_t ()(struct sk_buff , struct net_device )' (aka 'enum netdev_tx ()(struct sk_buff , struct net_device )') with an expression of type 'int (struct sk_buff , struct net_device )' [-Werror,-Wincompatible-function-pointer-types-strict] .ndo_start_xmit = lcs_start_xmit, ^~~~~~~~~~~~~~ drivers/s390/net/lcs.c:2097:21: error: incompatible function pointer types initializing 'netdev_tx_t ()(struct sk_buff , struct net_device )' (aka 'enum netdev_tx ()(struct sk_buff , struct net_device )') with an expression of type 'int (struct sk_buff , struct net_device )' [-Werror,-Wincompatible-function-pointer-types-strict] .ndo_start_xmit = lcs_start_xmit, ^~~~~~~~~~~~~~ ->ndo_start_xmit() in 'struct net_device_ops' expects a return type of 'netdev_tx_t', not 'int'. Adjust the return type of lcs_start_xmit() to match the prototype's to resolve the warning and potential CFI failure, should s390 select ARCH_SUPPORTS_CFI_CLANG in the future. Link: https://github.com/ClangBuiltLinux/linux/issues/1750 Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-01-14 10:16:17 +01:00
Nathan Chancellor	3ac0217ca9	s390/netiucv: Fix return type of netiucv_tx() [ Upstream commit 88d86d18d7cf7e9137c95f9d212bb9fff8a1b4be ] With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG), indirect call targets are validated against the expected function pointer prototype to make sure the call target is valid to help mitigate ROP attacks. If they are not identical, there is a failure at run time, which manifests as either a kernel panic or thread getting killed. A proposed warning in clang aims to catch these at compile time, which reveals: drivers/s390/net/netiucv.c:1854:21: error: incompatible function pointer types initializing 'netdev_tx_t ()(struct sk_buff , struct net_device )' (aka 'enum netdev_tx ()(struct sk_buff , struct net_device )') with an expression of type 'int (struct sk_buff , struct net_device )' [-Werror,-Wincompatible-function-pointer-types-strict] .ndo_start_xmit = netiucv_tx, ^~~~~~~~~~ ->ndo_start_xmit() in 'struct net_device_ops' expects a return type of 'netdev_tx_t', not 'int'. Adjust the return type of netiucv_tx() to match the prototype's to resolve the warning and potential CFI failure, should s390 select ARCH_SUPPORTS_CFI_CLANG in the future. Additionally, while in the area, remove a comment block that is no longer relevant. Link: https://github.com/ClangBuiltLinux/linux/issues/1750 Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-01-14 10:16:17 +01:00
Nathan Chancellor	eeb75f80bc	s390/ctcm: Fix return type of ctc{mp,}m_tx() [ Upstream commit aa5bf80c3c067b82b4362cd6e8e2194623bcaca6 ] With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG), indirect call targets are validated against the expected function pointer prototype to make sure the call target is valid to help mitigate ROP attacks. If they are not identical, there is a failure at run time, which manifests as either a kernel panic or thread getting killed. A proposed warning in clang aims to catch these at compile time, which reveals: drivers/s390/net/ctcm_main.c:1064:21: error: incompatible function pointer types initializing 'netdev_tx_t ()(struct sk_buff , struct net_device )' (aka 'enum netdev_tx ()(struct sk_buff , struct net_device )') with an expression of type 'int (struct sk_buff , struct net_device )' [-Werror,-Wincompatible-function-pointer-types-strict] .ndo_start_xmit = ctcm_tx, ^~~~~~~ drivers/s390/net/ctcm_main.c:1072:21: error: incompatible function pointer types initializing 'netdev_tx_t ()(struct sk_buff , struct net_device )' (aka 'enum netdev_tx ()(struct sk_buff , struct net_device )') with an expression of type 'int (struct sk_buff , struct net_device )' [-Werror,-Wincompatible-function-pointer-types-strict] .ndo_start_xmit = ctcmpc_tx, ^~~~~~~~~ ->ndo_start_xmit() in 'struct net_device_ops' expects a return type of 'netdev_tx_t', not 'int'. Adjust the return type of ctc{mp,}m_tx() to match the prototype's to resolve the warning and potential CFI failure, should s390 select ARCH_SUPPORTS_CFI_CLANG in the future. Additionally, while in the area, remove a comment block that is no longer relevant. Link: https://github.com/ClangBuiltLinux/linux/issues/1750 Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-01-14 10:16:16 +01:00
Stefan Haberland	e61b00374a	s390/dasd: fix no record found for raw_track_access [ Upstream commit 590ce6d96d6a224b470a3862c33a483d5022bfdb ] For DASD devices in raw_track_access mode only full track images are read and written. For this purpose it is not necessary to do search operation in the locate record extended function. The documentation even states that this might fail if the searched record is not found on a track. Currently the driver sets a value of 1 in the search field for the first record after record zero. This is the default for disks not in raw_track_access mode but record 1 might be missing on a completely empty track. There has not been any problem with this on IBM storage servers but it might lead to errors with DASD devices on other vendors storage servers. Fix this by setting the search field to 0. Record zero is always available even on a completely empty track. Fixes: e4dbb0f2b5dd ("[S390] dasd: Add support for raw ECKD access.") Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com> Link: https://lore.kernel.org/r/20221123160719.3002694-4-sth@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-12-02 17:40:02 +01:00
Benjamin Block	d2c7d8f58e	scsi: zfcp: Fix double free of FSF request when qdio send fails commit 0954256e970ecf371b03a6c9af2cf91b9c4085ff upstream. We used to use the wrong type of integer in 'zfcp_fsf_req_send()' to cache the FSF request ID when sending a new FSF request. This is used in case the sending fails and we need to remove the request from our internal hash table again (so we don't keep an invalid reference and use it when we free the request again). In 'zfcp_fsf_req_send()' we used to cache the ID as 'int' (signed and 32 bit wide), but the rest of the zfcp code (and the firmware specification) handles the ID as 'unsigned long'/'u64' (unsigned and 64 bit wide [s390x ELF ABI]). For one this has the obvious problem that when the ID grows past 32 bit (this can happen reasonably fast) it is truncated to 32 bit when storing it in the cache variable and so doesn't match the original ID anymore. The second less obvious problem is that even when the original ID has not yet grown past 32 bit, as soon as the 32nd bit is set in the original ID (0x80000000 = 2'147'483'648) we will have a mismatch when we cast it back to 'unsigned long'. As the cached variable is of a signed type, the compiler will choose a sign-extending instruction to load the 32 bit variable into a 64 bit register (e.g.: 'lgf %r11,188(%r15)'). So once we pass the cached variable into 'zfcp_reqlist_find_rm()' to remove the request again all the leading zeros will be flipped to ones to extend the sign and won't match the original ID anymore (this has been observed in practice). If we can't successfully remove the request from the hash table again after 'zfcp_qdio_send()' fails (this happens regularly when zfcp cannot notify the adapter about new work because the adapter is already gone during e.g. a ChpID toggle) we will end up with a double free. We unconditionally free the request in the calling function when 'zfcp_fsf_req_send()' fails, but because the request is still in the hash table we end up with a stale memory reference, and once the zfcp adapter is either reset during recovery or shutdown we end up freeing the same memory twice. The resulting stack traces vary depending on the kernel and have no direct correlation to the place where the bug occurs. Here are three examples that have been seen in practice: list_del corruption. next->prev should be 00000001b9d13800, but was 00000000dead4ead. (next=00000001bd131a00) ------------[ cut here ]------------ kernel BUG at lib/list_debug.c:62! monitor event: 0040 ilc:2 [#1] PREEMPT SMP Modules linked in: ... CPU: 9 PID: 1617 Comm: zfcperp0.0.1740 Kdump: loaded Hardware name: ... Krnl PSW : 0704d00180000000 00000003cbeea1f8 (__list_del_entry_valid+0x98/0x140) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3 Krnl GPRS: 00000000916d12f1 0000000080000000 000000000000006d 00000003cb665cd6 0000000000000001 0000000000000000 0000000000000000 00000000d28d21e8 00000000d3844000 00000380099efd28 00000001bd131a00 00000001b9d13800 00000000d3290100 0000000000000000 00000003cbeea1f4 00000380099efc70 Krnl Code: 00000003cbeea1e8: c020004f68a7 larl %r2,00000003cc8d7336 00000003cbeea1ee: c0e50027fd65 brasl %r14,00000003cc3e9cb8 #00000003cbeea1f4: af000000 mc 0,0 >00000003cbeea1f8: c02000920440 larl %r2,00000003cd12aa78 00000003cbeea1fe: c0e500289c25 brasl %r14,00000003cc3fda48 00000003cbeea204: b9040043 lgr %r4,%r3 00000003cbeea208: b9040051 lgr %r5,%r1 00000003cbeea20c: b9040032 lgr %r3,%r2 Call Trace: [<00000003cbeea1f8>] __list_del_entry_valid+0x98/0x140 ([<00000003cbeea1f4>] __list_del_entry_valid+0x94/0x140) [<000003ff7ff502fe>] zfcp_fsf_req_dismiss_all+0xde/0x150 [zfcp] [<000003ff7ff49cd0>] zfcp_erp_strategy_do_action+0x160/0x280 [zfcp] [<000003ff7ff4a22e>] zfcp_erp_strategy+0x21e/0xca0 [zfcp] [<000003ff7ff4ad34>] zfcp_erp_thread+0x84/0x1a0 [zfcp] [<00000003cb5eece8>] kthread+0x138/0x150 [<00000003cb557f3c>] __ret_from_fork+0x3c/0x60 [<00000003cc4172ea>] ret_from_fork+0xa/0x40 INFO: lockdep is turned off. Last Breaking-Event-Address: [<00000003cc3e9d04>] _printk+0x4c/0x58 Kernel panic - not syncing: Fatal exception: panic_on_oops or: Unable to handle kernel pointer dereference in virtual kernel address space Failing address: 6b6b6b6b6b6b6000 TEID: 6b6b6b6b6b6b6803 Fault in home space mode while using kernel ASCE. AS:0000000063b10007 R3:0000000000000024 Oops: 0038 ilc:3 [#1] SMP Modules linked in: ... CPU: 10 PID: 0 Comm: swapper/10 Kdump: loaded Hardware name: ... Krnl PSW : 0404d00180000000 000003ff7febaf8e (zfcp_fsf_reqid_check+0x86/0x158 [zfcp]) R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3 Krnl GPRS: 5a6f1cfa89c49ac3 00000000aff2c4c8 6b6b6b6b6b6b6b6b 00000000000002a8 0000000000000000 0000000000000055 0000000000000000 00000000a8515800 0700000000000000 00000000a6e14500 00000000aff2c000 000000008003c44c 000000008093c700 0000000000000010 00000380009ebba8 00000380009ebb48 Krnl Code: 000003ff7febaf7e: a7f4003d brc 15,000003ff7febaff8 000003ff7febaf82: e32020000004 lg %r2,0(%r2) #000003ff7febaf88: ec2100388064 cgrj %r2,%r1,8,000003ff7febaff8 >000003ff7febaf8e: e3b020100020 cg %r11,16(%r2) 000003ff7febaf94: a774fff7 brc 7,000003ff7febaf82 000003ff7febaf98: ec280030007c cgij %r2,0,8,000003ff7febaff8 000003ff7febaf9e: e31020080004 lg %r1,8(%r2) 000003ff7febafa4: e33020000004 lg %r3,0(%r2) Call Trace: [<000003ff7febaf8e>] zfcp_fsf_reqid_check+0x86/0x158 [zfcp] [<000003ff7febbdbc>] zfcp_qdio_int_resp+0x6c/0x170 [zfcp] [<000003ff7febbf90>] zfcp_qdio_irq_tasklet+0xd0/0x108 [zfcp] [<0000000061d90a04>] tasklet_action_common.constprop.0+0xdc/0x128 [<000000006292f300>] __do_softirq+0x130/0x3c0 [<0000000061d906c6>] irq_exit_rcu+0xfe/0x118 [<000000006291e818>] do_io_irq+0xc8/0x168 [<000000006292d516>] io_int_handler+0xd6/0x110 [<000000006292d596>] psw_idle_exit+0x0/0xa ([<0000000061d3be50>] arch_cpu_idle+0x40/0xd0) [<000000006292ceea>] default_idle_call+0x52/0xf8 [<0000000061de4fa4>] do_idle+0xd4/0x168 [<0000000061de51fe>] cpu_startup_entry+0x36/0x40 [<0000000061d4faac>] smp_start_secondary+0x12c/0x138 [<000000006292d88e>] restart_int_handler+0x6e/0x90 Last Breaking-Event-Address: [<000003ff7febaf94>] zfcp_fsf_reqid_check+0x8c/0x158 [zfcp] Kernel panic - not syncing: Fatal exception in interrupt or: Unable to handle kernel pointer dereference in virtual kernel address space Failing address: 523b05d3ae76a000 TEID: 523b05d3ae76a803 Fault in home space mode while using kernel ASCE. AS:0000000077c40007 R3:0000000000000024 Oops: 0038 ilc:3 [#1] SMP Modules linked in: ... CPU: 3 PID: 453 Comm: kworker/3:1H Kdump: loaded Hardware name: ... Workqueue: kblockd blk_mq_run_work_fn Krnl PSW : 0404d00180000000 0000000076fc0312 (__kmalloc+0xd2/0x398) R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3 Krnl GPRS: ffffffffffffffff 523b05d3ae76abf6 0000000000000000 0000000000092a20 0000000000000002 00000007e49b5cc0 00000007eda8f000 0000000000092a20 00000007eda8f000 00000003b02856b9 00000000000000a8 523b05d3ae76abf6 00000007dd662000 00000007eda8f000 0000000076fc02b2 000003e0037637a0 Krnl Code: 0000000076fc0302: c004000000d4 brcl 0,76fc04aa 0000000076fc0308: b904001b lgr %r1,%r11 #0000000076fc030c: e3106020001a algf %r1,32(%r6) >0000000076fc0312: e31010000082 xg %r1,0(%r1) 0000000076fc0318: b9040001 lgr %r0,%r1 0000000076fc031c: e30061700082 xg %r0,368(%r6) 0000000076fc0322: ec59000100d9 aghik %r5,%r9,1 0000000076fc0328: e34003b80004 lg %r4,952 Call Trace: [<0000000076fc0312>] __kmalloc+0xd2/0x398 [<0000000076f318f2>] mempool_alloc+0x72/0x1f8 [<000003ff8027c5f8>] zfcp_fsf_req_create.isra.7+0x40/0x268 [zfcp] [<000003ff8027f1bc>] zfcp_fsf_fcp_cmnd+0xac/0x3f0 [zfcp] [<000003ff80280f1a>] zfcp_scsi_queuecommand+0x122/0x1d0 [zfcp] [<000003ff800b4218>] scsi_queue_rq+0x778/0xa10 [scsi_mod] [<00000000771782a0>] __blk_mq_try_issue_directly+0x130/0x208 [<000000007717a124>] blk_mq_request_issue_directly+0x4c/0xa8 [<000003ff801302e2>] dm_mq_queue_rq+0x2ea/0x468 [dm_mod] [<0000000077178c12>] blk_mq_dispatch_rq_list+0x33a/0x818 [<000000007717f064>] __blk_mq_do_dispatch_sched+0x284/0x2f0 [<000000007717f44c>] __blk_mq_sched_dispatch_requests+0x1c4/0x218 [<000000007717fa7a>] blk_mq_sched_dispatch_requests+0x52/0x90 [<0000000077176d74>] __blk_mq_run_hw_queue+0x9c/0xc0 [<0000000076da6d74>] process_one_work+0x274/0x4d0 [<0000000076da7018>] worker_thread+0x48/0x560 [<0000000076daef18>] kthread+0x140/0x160 [<000000007751d144>] ret_from_fork+0x28/0x30 Last Breaking-Event-Address: [<0000000076fc0474>] __kmalloc+0x234/0x398 Kernel panic - not syncing: Fatal exception: panic_on_oops To fix this, simply change the type of the cache variable to 'unsigned long', like the rest of zfcp and also the argument for 'zfcp_reqlist_find_rm()'. This prevents truncation and wrong sign extension and so can successfully remove the request from the hash table. Fixes: e60a6d69f1f8 ("[SCSI] zfcp: Remove function zfcp_reqlist_find_safe") Cc: <stable@vger.kernel.org> #v2.6.34+ Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Link: https://lore.kernel.org/r/979f6e6019d15f91ba56182f1aaf68d61bf37fc6.1668595505.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-25 17:45:53 +01:00
Stefan Haberland	f5fcc9d6d7	s390/dasd: fix Oops in dasd_alias_get_start_dev due to missing pavgroup commit db7ba07108a48c0f95b74fabbfd5d63e924f992d upstream. Fix Oops in dasd_alias_get_start_dev() function caused by the pavgroup pointer being NULL. The pavgroup pointer is checked on the entrance of the function but without the lcu->lock being held. Therefore there is a race window between dasd_alias_get_start_dev() and _lcu_update() which sets pavgroup to NULL with the lcu->lock held. Fix by checking the pavgroup pointer with lcu->lock held. Cc: <stable@vger.kernel.org> # 2.6.25+ Fixes: 8e09f21574ea ("[S390] dasd: add hyper PAV support to DASD device driver, part 1") Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com> Link: https://lore.kernel.org/r/20220919154931.4123002-2-sth@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-09-28 11:10:38 +02:00
Steffen Maier	b8aad5eba7	scsi: zfcp: Fix missing auto port scan and thus missing target ports commit 4da8c5f76825269f28d6a89fa752934a4bcb6dfa upstream. Case (1): The only waiter on wka_port->completion_wq is zfcp_fc_wka_port_get() trying to open a WKA port. As such it should only be woken up by WKA port open responses, not by WKA port close responses. Case (2): A close WKA port response coming in just after having sent a new open WKA port request and before blocking for the open response with wait_event() in zfcp_fc_wka_port_get() erroneously renders the wait_event a NOP because the close handler overwrites wka_port->status. Hence the wait_event condition is erroneously true and it does not enter blocking state. With non-negligible probability, the following time space sequence happens depending on timing without this fix: user process ERP thread zfcp work queue tasklet system work queue ============ ========== =============== ======= ================= $ echo 1 > online zfcp_ccw_set_online zfcp_ccw_activate zfcp_erp_adapter_reopen msleep scan backoff zfcp_erp_strategy \| ... \| zfcp_erp_action_cleanup \| ... \| queue delayed scan_work \| queue ns_up_work \| ns_up_work: \| zfcp_fc_wka_port_get \| open wka request \| open response \| GSPN FC-GS \| RSPN FC-GS [NPIV-only] \| zfcp_fc_wka_port_put \| (--wka->refcount==0) \| sched delayed wka->work \| ~~~Case (1)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ zfcp_erp_wait flush scan_work \| wka->work: \| wka->status=CLOSING \| close wka request \| scan_work: \| zfcp_fc_wka_port_get \| (wka->status==CLOSING) \| wka->status=OPENING \| open wka request \| wait_event \| \| close response \| \| wka->status=OFFLINE \| \| wake_up /WRONG/ ~~~Case (2)~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ \| wka->work: \| wka->status=CLOSING \| close wka request zfcp_erp_wait flush scan_work \| scan_work: \| zfcp_fc_wka_port_get \| (wka->status==CLOSING) \| wka->status=OPENING \| open wka request \| close response \| wka->status=OFFLINE \| wake_up /WRONG&NOP/ \| wait_event /NOP/ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ \| (wka->status!=ONLINE) \| return -EIO \| return early open response wka->status=ONLINE wake_up /NOP/ So we erroneously end up with no automatic port scan. This is a big problem when it happens during boot. The timing is influenced by v3.19 commit 18f87a67e6d6 ("zfcp: auto port scan resiliency"). Fix it by fully mutually excluding zfcp_fc_wka_port_get() and zfcp_fc_wka_port_offline(). For that to work, we make the latter block until we got the response for a close WKA port. In order not to penalize the system workqueue, we move wka_port->work to our own adapter workqueue. Note that before v2.6.30 commit 828bc1212a68 ("[SCSI] zfcp: Set WKA-port to offline on adapter deactivation"), zfcp did block in zfcp_fc_wka_port_offline() as well, but with a different condition. While at it, make non-functional cleanups to improve code reading in zfcp_fc_wka_port_get(). If we cannot send the WKA port open request, don't rely on the subsequent wait_event condition to immediately let this case pass without blocking. Also don't want to rely on the additional condition handling the refcount to be skipped just to finally return with -EIO. Link: https://lore.kernel.org/r/20220729162529.1620730-1-maier@linux.ibm.com Fixes: 5ab944f97e09 ("[SCSI] zfcp: attach and release SAN nameserver port on demand") Cc: <stable@vger.kernel.org> #v2.6.28+ Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-08-21 15:16:13 +02:00
Alexander Gordeev	9c2ad32ed9	s390/zcore: fix race when reading from hardware system area [ Upstream commit 9ffed254d938c9e99eb7761c7f739294c84e0367 ] Memory buffer used for reading out data from hardware system area is not protected against concurrent access. Reported-by: Matthew Wilcox <willy@infradead.org> Fixes: 411ed3225733 ("[S390] zfcpdump support.") Acked-by: Heiko Carstens <hca@linux.ibm.com> Tested-by: Alexander Egorenkov <egorenar@linux.ibm.com> Link: https://lore.kernel.org/r/e68137f0f9a0d2558f37becc20af18e2939934f6.1658206891.git.agordeev@linux.ibm.com Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-08-21 15:16:09 +02:00
Eric Farman	b16d653bc7	vfio/ccw: Do not change FSM state in subchannel event [ Upstream commit cffcc109fd682075dee79bade3d60a07152a8fd1 ] The routine vfio_ccw_sch_event() is tasked with handling subchannel events, specifically machine checks, on behalf of vfio-ccw. It correctly calls cio_update_schib(), and if that fails (meaning the subchannel is gone) it makes an FSM event call to mark the subchannel Not Operational. If that worked, however, then it decides that if the FSM state was already Not Operational (implying the subchannel just came back), then it should simply change the FSM to partially- or fully-open. Remove this trickery, since a subchannel returning will require more probing than simply "oh all is well again" to ensure it works correctly. Fixes: bbe37e4cb8970 ("vfio: ccw: introduce a finite state machine") Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Link: https://lore.kernel.org/r/20220707135737.720765-4-farman@linux.ibm.com Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-08-21 15:16:07 +02:00
Christoph Hellwig	1cb3032406	block: remove the request_queue to argument request based tracepoints [ Upstream commit a54895fa057c67700270777f7661d8d3c7fda88a ] The request_queue can trivially be derived from the request. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-08-21 15:15:36 +02:00
Jiri Slaby	4d374625cc	tty: the rest, stop using tty_schedule_flip() commit b68b914494df4f79b4e9b58953110574af1cb7a2 upstream. Since commit a9c3f68f3cd8d (tty: Fix low_latency BUG) in 2014, tty_flip_buffer_push() is only a wrapper to tty_schedule_flip(). We are going to remove the latter (as it is used less), so call the former in the rest of the users. Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: William Hubbs <w.d.hubbs@gmail.com> Cc: Chris Brannon <chris@the-brannons.com> Cc: Kirk Reiser <kirk@reisers.ca> Cc: Samuel Thibault <samuel.thibault@ens-lyon.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Reviewed-by: Johan Hovold <johan@kernel.org> Signed-off-by: Jiri Slaby <jslaby@suse.cz> Link: https://lore.kernel.org/r/20211122111648.30379-3-jslaby@suse.cz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-07-29 17:19:28 +02:00
Alexandra Winter	b816ed53f3	s390/lcs: fix variable dereferenced before check [ Upstream commit 671bb35c8e746439f0ed70815968f9a4f20a8deb ] smatch complains about drivers/s390/net/lcs.c:1741 lcs_get_control() warn: variable dereferenced before check 'card->dev' (see line 1739) Fixes: 27eb5ac8f015 ("[PATCH] s390: lcs driver bug fixes and improvements [1/2]") Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-05-18 10:23:44 +02:00
Alexandra Winter	4d3c6d7418	s390/ctcm: fix potential memory leak [ Upstream commit 0c0b20587b9f25a2ad14db7f80ebe49bdf29920a ] smatch complains about drivers/s390/net/ctcm_mpc.c:1210 ctcmpc_unpack_skb() warn: possible memory leak of 'mpcginfo' mpc_action_discontact() did not free mpcginfo. Consolidate the freeing in ctcmpc_unpack_skb(). Fixes: 293d984f0e36 ("ctcm: infrastructure for replaced ctc driver") Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-05-18 10:23:44 +02:00
Alexandra Winter	5497f87edc	s390/ctcm: fix variable dereferenced before check [ Upstream commit 2c50c6867c85afee6f2b3bcbc50fc9d0083d1343 ] Found by cppcheck and smatch. smatch complains about drivers/s390/net/ctcm_sysfs.c:43 ctcm_buffer_write() warn: variable dereferenced before check 'priv' (see line 42) Fixes: 3c09e2647b5e ("ctcm: rename READ/WRITE defines to avoid redefinitions") Reported-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-05-18 10:23:44 +02:00
Jan Höppner	6c7c0e131e	s390/dasd: Fix read inconsistency for ESE DASD devices commit b9c10f68e23c13f56685559a0d6fdaca9f838324 upstream. Read requests that return with NRF error are partially completed in dasd_eckd_ese_read(). The function keeps track of the amount of processed bytes and the driver will eventually return this information back to the block layer for further processing via __dasd_cleanup_cqr() when the request is in the final stage of processing (from the driver's perspective). For this, blk_update_request() is used which requires the number of bytes to complete the request. As per documentation the nr_bytes parameter is described as follows: "number of bytes to complete for @req". This was mistakenly interpreted as "number of bytes _left_ for @req" leading to new requests with incorrect data length. The consequence are inconsistent and completely wrong read requests as data from random memory areas are read back. Fix this by correctly specifying the amount of bytes that should be used to complete the request. Fixes: 5e6bdd37c552 ("s390/dasd: fix data corruption for thin provisioned devices") Cc: stable@vger.kernel.org # 5.3+ Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com> Reviewed-by: Stefan Haberland <sth@linux.ibm.com> Link: https://lore.kernel.org/r/20220505141733.1989450-5-sth@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-05-12 12:25:34 +02:00
Jan Höppner	6e02c0413a	s390/dasd: Fix read for ESE with blksize < 4k commit cd68c48ea15c85f1577a442dc4c285e112ff1b37 upstream. When reading unformatted tracks on ESE devices, the corresponding memory areas are simply set to zero for each segment. This is done incorrectly for blocksizes < 4096. There are two problems. First, the increment of dst is done using the counter of the loop (off), which is increased by blksize every iteration. This leads to a much bigger increment for dst as actually intended. Second, the increment of dst is done before the memory area is set to 0, skipping a significant amount of bytes of memory. This leads to illegal overwriting of memory and ultimately to a kernel panic. This is not a problem with 4k blocksize because blk_queue_max_segment_size is set to PAGE_SIZE, always resulting in a single iteration for the inner segment loop (bv.bv_len == blksize). The incorrectly used 'off' value to increment dst is 0 and the correct memory area is used. In order to fix this for blksize < 4k, increment dst correctly using the blksize and only do it at the end of the loop. Fixes: 5e2b17e712cf ("s390/dasd: Add dynamic formatting support for ESE volumes") Cc: stable@vger.kernel.org # v5.3+ Signed-off-by: Jan Höppner <hoeppner@linux.ibm.com> Reviewed-by: Stefan Haberland <sth@linux.ibm.com> Link: https://lore.kernel.org/r/20220505141733.1989450-4-sth@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-05-12 12:25:34 +02:00
Stefan Haberland	ecc8396827	s390/dasd: prevent double format of tracks for ESE devices commit 71f3871657370dbbaf942a1c758f64e49a36c70f upstream. For ESE devices we get an error for write operations on an unformatted track. Afterwards the track will be formatted and the IO operation restarted. When using alias devices a track might be accessed by multiple requests simultaneously and there is a race window that a track gets formatted twice resulting in data loss. Prevent this by remembering the amount of formatted tracks when starting a request and comparing this number before actually formatting a track on the fly. If the number has changed there is a chance that the current track was finally formatted in between. As a result do not format the track and restart the current IO to check. The number of formatted tracks does not match the overall number of formatted tracks on the device and it might wrap around but this is no problem. It is only needed to recognize that a track has been formatted at all in between. Fixes: 5e2b17e712cf ("s390/dasd: Add dynamic formatting support for ESE volumes") Cc: stable@vger.kernel.org # 5.3+ Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com> Link: https://lore.kernel.org/r/20220505141733.1989450-3-sth@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-05-12 12:25:34 +02:00
Stefan Haberland	30e008ab3f	s390/dasd: fix data corruption for ESE devices commit 5b53a405e4658580e1faf7c217db3f55a21ba849 upstream. For ESE devices we get an error when accessing an unformatted track. The handling of this error will return zero data for read requests and format the track on demand before writing to it. To do this the code needs to distinguish between read and write requests. This is done with data from the blocklayer request. A pointer to the blocklayer request is stored in the CQR. If there is an error on the device an ERP request is built to do error recovery. While the ERP request is mostly a copy of the original CQR the pointer to the blocklayer request is not copied to not accidentally pass it back to the blocklayer without cleanup. This leads to the error that during ESE handling after an ERP request was built it is not possible to determine the IO direction. This leads to the formatting of a track for read requests which might in turn lead to data corruption. Fixes: 5e2b17e712cf ("s390/dasd: Add dynamic formatting support for ESE volumes") Cc: stable@vger.kernel.org # 5.3+ Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com> Link: https://lore.kernel.org/r/20220505141733.1989450-2-sth@linux.ibm.com Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-05-12 12:25:34 +02:00
Steffen Maier	f08801252d	scsi: zfcp: Fix failed recovery on gone remote port with non-NPIV FCP devices commit 8c9db6679be4348b8aae108e11d4be2f83976e30 upstream. Suppose we have an environment with a number of non-NPIV FCP devices (virtual HBAs / FCP devices / zfcp "adapter"s) sharing the same physical FCP channel (HBA port) and its I_T nexus. Plus a number of storage target ports zoned to such shared channel. Now one target port logs out of the fabric causing an RSCN. Zfcp reacts with an ADISC ELS and subsequent port recovery depending on the ADISC result. This happens on all such FCP devices (in different Linux images) concurrently as they all receive a copy of this RSCN. In the following we look at one of those FCP devices. Requests other than FSF_QTCB_FCP_CMND can be slow until they get a response. Depending on which requests are affected by slow responses, there are different recovery outcomes. Here we want to fix failed recoveries on port or adapter level by avoiding recovery requests that can be slow. We need the cached N_Port_ID for the remote port "link" test with ADISC. Just before sending the ADISC, we now intentionally forget the old cached N_Port_ID. The idea is that on receiving an RSCN for a port, we have to assume that any cached information about this port is stale. This forces a fresh new GID_PN [FC-GS] nameserver lookup on any subsequent recovery for the same port. Since we typically can still communicate with the nameserver efficiently, we now reach steady state quicker: Either the nameserver still does not know about the port so we stop recovery, or the nameserver already knows the port potentially with a new N_Port_ID and we can successfully and quickly perform open port recovery. For the one case, where ADISC returns successfully, we re-initialize port->d_id because that case does not involve any port recovery. This also solves a problem if the storage WWPN quickly logs into the fabric again but with a different N_Port_ID. Such as on virtual WWPN takeover during target NPIV failover. [https://www.redbooks.ibm.com/abstracts/redp5477.html] In that case the RSCN from the storage FDISC was ignored by zfcp and we could not successfully recover the failover. On some later failback on the storage, we could have been lucky if the virtual WWPN got the same old N_Port_ID from the SAN switch as we still had cached. Then the related RSCN triggered a successful port reopen recovery. However, there is no guarantee to get the same N_Port_ID on NPIV FDISC. Even though NPIV-enabled FCP devices are not affected by this problem, this code change optimizes recovery time for gone remote ports as a side effect. The timely drop of cached N_Port_IDs prevents unnecessary slow open port attempts. While the problem might have been in code before v2.6.32 commit 799b76d09aee ("[SCSI] zfcp: Decouple gid_pn requests from erp") this fix depends on the gid_pn_work introduced with that commit, so we mark it as culprit to satisfy fix dependencies. Note: Point-to-point remote port is already handled separately and gets its N_Port_ID from the cached peer_d_id. So resetting port->d_id in general does not affect PtP. Link: https://lore.kernel.org/r/20220118165803.3667947-1-maier@linux.ibm.com Fixes: 799b76d09aee ("[SCSI] zfcp: Decouple gid_pn requests from erp") Cc: <stable@vger.kernel.org> #2.6.32+ Suggested-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-02-01 17:25:39 +01:00
Halil Pasic	1f420818df	s390/cio: make ccw_device_dma_* more robust commit ad9a14517263a16af040598c7920c09ca9670a31 upstream. Since commit 48720ba56891 ("virtio/s390: use DMA memory for ccw I/O and classic notifiers") we were supposed to make sure that virtio_ccw_release_dev() completes before the ccw device and the attached dma pool are torn down, but unfortunately we did not. Before that commit it used to be OK to delay cleaning up the memory allocated by virtio-ccw indefinitely (which isn't really intuitive for guys used to destruction happens in reverse construction order), but now we trigger a BUG_ON if the genpool is destroyed before all memory allocated from it is deallocated. Which brings down the guest. We can observe this problem, when unregister_virtio_device() does not give up the last reference to the virtio_device (e.g. because a virtio-scsi attached scsi disk got removed without previously unmounting its previously mounted partition). To make sure that the genpool is only destroyed after all the necessary freeing is done let us take a reference on the ccw device on each ccw_device_dma_zalloc() and give it up on each ccw_device_dma_free(). Actually there are multiple approaches to fixing the problem at hand that can work. The upside of this one is that it is the safest one while remaining simple. We don't crash the guest even if the driver does not pair allocations and frees. The downside is the reference counting overhead, that the reference counting for ccw devices becomes more complex, in a sense that we need to pair the calls to the aforementioned functions for it to be correct, and that if we happen to leak, we leak more than necessary (the whole ccw device instead of just the genpool). Some alternatives to this approach are taking a reference in virtio_ccw_online() and giving it up in virtio_ccw_release_dev() or making sure virtio_ccw_release_dev() completes its work before virtio_ccw_remove() returns. The downside of these approaches is that these are less safe against programming errors. Cc: <stable@vger.kernel.org> # v5.3 Signed-off-by: Halil Pasic <pasic@linux.ibm.com> Fixes: 48720ba56891 ("virtio/s390: use DMA memory for ccw I/O and classic notifiers") Reported-by: bfu@redhat.com Reviewed-by: Vineeth Vijayan <vneethv@linux.ibm.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-11-18 14:04:30 +01:00
Harald Freudenberger	c9ca9669de	s390/ap: Fix hanging ioctl caused by orphaned replies commit 3826350e6dd435e244eb6e47abad5a47c169ebc2 upstream. When a queue is switched to soft offline during heavy load and later switched to soft online again and now used, it may be that the caller is blocked forever in the ioctl call. The failure occurs because there is a pending reply after the queue(s) have been switched to offline. This orphaned reply is received when the queue is switched to online and is accidentally counted for the outstanding replies. So when there was a valid outstanding reply and this orphaned reply is received it counts as the outstanding one thus dropping the outstanding counter to 0. Voila, with this counter the receive function is not called any more and the real outstanding reply is never received (until another request comes in...) and the ioctl blocks. The fix is simple. However, instead of readjusting the counter when an orphaned reply is detected, I check the queue status for not empty and compare this to the outstanding counter. So if the queue is not empty then the counter must not drop to 0 but at least have a value of 1. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-11-18 14:04:30 +01:00
Sven Schnelle	57de1fbecf	s390/tape: fix timer initialization in tape_std_assign() commit 213fca9e23b59581c573d558aa477556f00b8198 upstream. commit 9c6c273aa424 ("timer: Remove init_timer_on_stack() in favor of timer_setup_on_stack()") changed the timer setup from init_timer_on_stack(() to timer_setup(), but missed to change the mod_timer() call. And while at it, use msecs_to_jiffies() instead of the open coded timeout calculation. Cc: stable@vger.kernel.org Fixes: 9c6c273aa424 ("timer: Remove init_timer_on_stack() in favor of timer_setup_on_stack()") Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-11-18 14:04:30 +01:00
Vineeth Vijayan	1174298a5b	s390/cio: check the subchannel validity for dev_busid commit a4751f157c194431fae9e9c493f456df8272b871 upstream. Check the validity of subchanel before reading other fields in the schib. Fixes: d3683c055212 ("s390/cio: add dev_busid sysfs entry for each subchannel") CC: <stable@vger.kernel.org> Reported-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Link: https://lore.kernel.org/r/20211105154451.847288-1-vneethv@linux.ibm.com Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-11-18 14:04:30 +01:00
Julian Wiedmann	db94f89e1d	s390/qeth: fix NULL deref in qeth_clear_working_pool_list() [ Upstream commit 248f064af222a1f97ee02c84a98013dfbccad386 ] When qeth_set_online() calls qeth_clear_working_pool_list() to roll back after an error exit from qeth_hardsetup_card(), we are at risk of accessing card->qdio.in_q before it was allocated by qeth_alloc_qdio_queues() via qeth_mpc_initialize(). qeth_clear_working_pool_list() then dereferences NULL, and by writing to queue->bufs[i].pool_entry scribbles all over the CPU's lowcore. Resulting in a crash when those lowcore areas are used next (eg. on the next machine-check interrupt). Such a scenario would typically happen when the device is first set online and its queues aren't allocated yet. An early IO error or certain misconfigs (eg. mismatched transport mode, bad portno) then cause us to error out from qeth_hardsetup_card() with card->qdio.in_q still being NULL. Fix it by checking the pointer for NULL before accessing it. Note that we also have (rare) paths inside qeth_mpc_initialize() where a configuration change can cause us to free the existing queues, expecting that subsequent code will allocate them again. If we then error out before that re-allocation happens, the same bug occurs. Fixes: eff73e16ee11 ("s390/qeth: tolerate pre-filled RX buffer") Reported-by: Stefan Raspl <raspl@linux.ibm.com> Root-caused-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-09-30 10:11:03 +02:00
Alexander Egorenkov	0346f8a2c5	s390/sclp: fix Secure-IPL facility detection commit d76b14f3971a0638b6cd0da289f8b48acee287d0 upstream. Prevent out-of-range access if the returned SCLP SCCB response is smaller in size than the address of the Secure-IPL flag. Fixes: c9896acc7851 ("s390/ipl: Provide has_secure sysfs attribute") Cc: stable@vger.kernel.org # 5.2+ Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-09-22 12:27:56 +02:00
Julian Wiedmann	76668bdee0	s390/qdio: cancel the ESTABLISH ccw after timeout commit 1c1dc8bda3a05c60877a6649775894db5343bdea upstream. When the ESTABLISH ccw does not complete within the specified timeout, try our best to cancel the ccw program that is still active on the device. Otherwise the IO subsystem might be accessing it even after the driver eg. called qdio_free(). Fixes: 779e6e1c724d ("[S390] qdio: new qdio driver.") Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Cc: <stable@vger.kernel.org> # 2.6.27 Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-09-18 13:40:09 +02:00
Julian Wiedmann	bcc0c767f9	s390/qdio: fix roll-back after timeout on ESTABLISH ccw commit 2c197870e4701610ec3b1143808d4e31152caf30 upstream. When qdio_establish() times out while waiting for the ESTABLISH ccw to complete, it calls qdio_shutdown() to roll back all of its previous actions. But at this point the qdio_irq's state is still QDIO_IRQ_STATE_INACTIVE, so qdio_shutdown() will exit immediately without doing any actual work. Which means that eg. the qdio_irq's thinint-indicator stays registered, and cdev->handler isn't restored to its old value. And since commit 954d6235be41 ("s390/qdio: make thinint registration symmetric") the qdio_irq also stays on the tiq_list, so on the next qdio_establish() we might get a helpful BUG from the list-debugging code: ... [ 4633.512591] list_add double add: new=00000000005a4110, prev=00000001b357db78, next=00000000005a4110. [ 4633.512621] ------------[ cut here ]------------ [ 4633.512623] kernel BUG at lib/list_debug.c:29! ... [ 4633.512796] [<00000001b2c6ee9a>] __list_add_valid+0x82/0xa0 [ 4633.512798] ([<00000001b2c6ee96>] __list_add_valid+0x7e/0xa0) [ 4633.512800] [<00000001b2fcecce>] qdio_establish_thinint+0x116/0x190 [ 4633.512805] [<00000001b2fcbe58>] qdio_establish+0x128/0x498 ... Fix this by extracting a goto-chain from the existing error exits in qdio_establish(), and check the return value of the wait_event_...() to detect the timeout condition. Fixes: 779e6e1c724d ("[S390] qdio: new qdio driver.") Root-caused-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Cc: <stable@vger.kernel.org> # 2.6.27 Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-09-18 13:40:09 +02:00
Harald Freudenberger	a758b1d4ca	s390/ap: fix state machine hang after failure to enable irq [ Upstream commit cabebb697c98fb1f05cc950a747a9b6ec61a5b01 ] If for any reason the interrupt enable for an ap queue fails the state machine run for the queue returned wrong return codes to the caller. So the caller assumed interrupt support for this queue in enabled and thus did not re-establish the high resolution timer used for polling. In the end this let to a hang for the user space process waiting "forever" for the reply. This patch reworks these return codes to return correct indications for the caller to re-establish the timer when a queue runs without interrupt support. Please note that this is fixing a wrong behavior after a first failure (enable interrupt support for the queue) failed. However, looks like this occasionally happens on KVM systems. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-09-15 09:50:27 +02:00
Harald Freudenberger	cf619a528e	s390/zcrypt: fix wrong offset index for APKA master key valid state [ Upstream commit 8617bb74006252cb2286008afe7d6575a6425857 ] Tests showed a mismatch between what the CCA tool reports about the APKA master key state and what's displayed by the zcrypt dd in sysfs. After some investigation, we found out that the documentation which was the source for the zcrypt dd implementation lacks the listing of 3 fields. So this patch now moves the evaluation of the APKA master key state to the correct offset. Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-09-15 09:50:26 +02:00
Vineeth Vijayan	b4aa00bf8a	s390/cio: add dev_busid sysfs entry for each subchannel [ Upstream commit d3683c055212bf910d4e318f7944910ce10dbee6 ] Introduce dev_busid, which exports the device-id associated with the io-subchannel (and message-subchannel). The dev_busid indicates that of the device which may be physically installed on the corrosponding subchannel. The dev_busid value "none" indicates that the subchannel is not valid, there is no I/O device currently associated with the subchannel. The dev_busid information would be helpful to write device-specific udev-rules associated with the subchannel. The dev_busid interface would be available even when the sch is not bound to any driver or if there is no operational device connected on it. Hence this attribute can be used to write udev-rules which are specific to the device associated with the subchannel. Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-09-15 09:50:26 +02:00
Valentin Vidic	78cddc9aa6	s390/sclp_vt220: fix console name to match device [ Upstream commit b7d91d230a119fdcc334d10c9889ce9c5e15118b ] Console name reported in /proc/consoles: ttyS1 -W- (EC p ) 4:65 does not match the char device name: crw--w---- 1 root root 4, 65 May 17 12:18 /dev/ttysclp0 so debian-installer inside a QEMU s390x instance gets confused and fails to start with the following error: steal-ctty: No such file or directory Signed-off-by: Valentin Vidic <vvidic@valentin-vidic.from.hr> Link: https://lore.kernel.org/r/20210427194010.9330-1-vvidic@valentin-vidic.from.hr Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-07-20 16:05:42 +02:00
Steffen Maier	e1261c7a84	scsi: zfcp: Report port fc_security as unknown early during remote cable pull commit 8b3bdd99c092bbaeaa7d9eecb1a3e5dc9112002b upstream. On remote cable pull, a zfcp_port keeps its status and only gets ZFCP_STATUS_PORT_LINK_TEST added. Only after an ADISC timeout, we would actually start port recovery and remove ZFCP_STATUS_COMMON_UNBLOCKED which zfcp_sysfs_port_fc_security_show() detected and reported as "unknown" instead of the old and possibly stale zfcp_port->connection_info. Add check for ZFCP_STATUS_PORT_LINK_TEST for timely "unknown" report. Link: https://lore.kernel.org/r/20210702160922.2667874-1-maier@linux.ibm.com Fixes: a17c78460093 ("scsi: zfcp: report FC Endpoint Security in sysfs") Cc: <stable@vger.kernel.org> #5.7+ Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-07-20 16:05:36 +02:00
Vineeth Vijayan	38a2ba82e2	s390/cio: dont call css_wait_for_slow_path() inside a lock commit c749d8c018daf5fba6dfac7b6c5c78b27efd7d65 upstream. Currently css_wait_for_slow_path() gets called inside the chp->lock. The path-verification-loop of slowpath inside this lock could lead to deadlock as reported by the lockdep validator. The ccw_device_get_chp_desc() during the instance of a device-set-online would try to acquire the same 'chp->lock' to read the chp->desc. The instance of this function can get called from multiple scenario, like probing or setting-device online manually. This could, in some corner-cases lead to the deadlock. lockdep validator reported this as, CPU0 CPU1 ---- ---- lock(&chp->lock); lock(kn->active#43); lock(&chp->lock); lock((wq_completion)cio); The chp->lock was introduced to serialize the access of struct channel_path. This lock is not needed for the css_wait_for_slow_path() function, so invoke the slow-path function outside this lock. Fixes: b730f3a93395 ("[S390] cio: add lock to struct channel_path") Cc: <stable@vger.kernel.org> Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com> Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-07-14 16:55:43 +02:00
Harald Freudenberger	b516daed99	s390/ap: Fix hanging ioctl caused by wrong msg counter commit e73a99f3287a740a07d6618e9470f4d6cb217da8 upstream. When a AP queue is switched to soft offline, all pending requests are purged out of the pending requests list and 'received' by the upper layer like zcrypt device drivers. This is also done for requests which are already enqueued into the firmware queue. A request in a firmware queue may eventually produce an response message, but there is no waiting process any more. However, the response was counted with the queue_counter and as this counter was reset to 0 with the offline switch, the pending response caused the queue_counter to get negative. The next request increased this counter to 0 (instead of 1) which caused the ap code to assume there is nothing to receive and so the response for this valid request was never tried to fetch from the firmware queue. This all caused a queue to not work properly after a switch offline/online and in the end processes to hang forever when trying to send a crypto request after an queue offline/online switch cicle. Fixed by a) making sure the counter does not drop below 0 and b) on a successful enqueue of a message has at least a value of 1. Additionally a warning is emitted, when a reply can't get assigned to a waiting process. This may be normal operation (process had timeout or has been killed) but may give a hint that something unexpected happened (like this odd behavior described above). Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Cc: stable@vger.kernel.org Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-23 14:42:51 +02:00
Eric Farman	01905f3232	vfio-ccw: Serialize FSM IDLE state with I/O completion [ Upstream commit 2af7a834a435460d546f0cf0a8b8e4d259f1d910 ] Today, the stacked call to vfio_ccw_sch_io_todo() does three things: 1) Update a solicited IRB with CP information, and release the CP if the interrupt was the end of a START operation. 2) Copy the IRB data into the io_region, under the protection of the io_mutex 3) Reset the vfio-ccw FSM state to IDLE to acknowledge that vfio-ccw can accept more work. The trouble is that step 3 is (A) invoked for both solicited and unsolicited interrupts, and (B) sitting after the mutex for step 2. This second piece becomes a problem if it processes an interrupt for a CLEAR SUBCHANNEL while another thread initiates a START, thus allowing the CP and FSM states to get out of sync. That is: CPU 1 CPU 2 fsm_do_clear() fsm_irq() fsm_io_request() vfio_ccw_sch_io_todo() fsm_io_helper() Since the FSM state and CP should be kept in sync, let's make a note when the CP is released, and rely on that as an indication that the FSM should also be reset at the end of this routine and open up the device for more work. Signed-off-by: Eric Farman <farman@linux.ibm.com> Acked-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Message-Id: <20210511195631.3995081-4-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-16 12:01:35 +02:00
Eric Farman	cad3dc73c0	vfio-ccw: Reset FSM state to IDLE inside FSM [ Upstream commit 6c02ac4c9211edabe17bda437ac97e578756f31b ] When an I/O request is made, the fsm_io_request() routine moves the FSM state from IDLE to CP_PROCESSING, and then fsm_io_helper() moves it to CP_PENDING if the START SUBCHANNEL received a cc0. Yet, the error case to go from CP_PROCESSING back to IDLE is done after the FSM call returns. Let's move this up into the FSM proper, to provide some better symmetry when unwinding in this case. Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Matthew Rosato <mjrosato@linux.ibm.com> Message-Id: <20210511195631.3995081-3-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-16 12:01:35 +02:00
Eric Farman	cd37040ba9	vfio-ccw: Check initialized flag in cp_init() [ Upstream commit c6c82e0cd8125d30f2f1b29205c7e1a2f1a6785b ] We have a really nice flag in the channel_program struct that indicates if it had been initialized by cp_init(), and use it as a guard in the other cp accessor routines, but not for a duplicate call into cp_init(). The possibility of this occurring is low, because that flow is protected by the private->io_mutex and FSM CP_PROCESSING state. But then why bother checking it in (for example) cp_prefetch() then? Let's just be consistent and check for that in cp_init() too. Fixes: 71189f263f8a3 ("vfio-ccw: make it safe to access channel programs") Signed-off-by: Eric Farman <farman@linux.ibm.com> Reviewed-by: Cornelia Huck <cohuck@redhat.com> Acked-by: Matthew Rosato <mjrosato@linux.ibm.com> Message-Id: <20210511195631.3995081-2-farman@linux.ibm.com> Signed-off-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-03 09:00:46 +02:00
Harald Freudenberger	026499a9c2	s390/zcrypt: fix zcard and zqueue hot-unplug memleak commit 70fac8088cfad9f3b379c9082832b4d7532c16c2 upstream. Tests with kvm and a kmemdebug kernel showed, that on hot unplug the zcard and zqueue structs for the unplugged card or queue are not properly freed because of a mismatch with get/put for the embedded kref counter. This fix now adjusts the handling of the kref counters. With init the kref counter starts with 1. This initial value needs to drop to zero with the unregister of the card or queue to trigger the release and free the object. Fixes: 29c2680fd2bf ("s390/ap: fix ap devices reference counting") Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com> Signed-off-by: Harald Freudenberger <freude@linux.ibm.com> Cc: stable@vger.kernel.org Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-05-11 14:47:11 +02:00
Julian Wiedmann	e8e99acd08	s390/qeth: schedule TX NAPI on QAOB completion [ Upstream commit 3e83d467a08e25b27c44c885f511624a71c84f7c ] When a QAOB notifies us that a pending TX buffer has been delivered, the actual TX completion processing by qeth_tx_complete_pending_bufs() is done within the context of a TX NAPI instance. We shouldn't rely on this instance being scheduled by some other TX event, but just do it ourselves. qeth_qdio_handle_aob() is called from qeth_poll(), ie. our main NAPI instance. To avoid touching the TX queue's NAPI instance before/after it is (un-)registered, reorder the code in qeth_open() and qeth_stop() accordingly. Fixes: 0da9581ddb0f ("qeth: exploit asynchronous delivery of storage blocks") Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-25 09:04:13 +01:00
Stefan Haberland	08bccd7212	s390/dasd: fix hanging IO request during DASD driver unbind commit 66f669a272898feb1c69b770e1504aa2ec7723d1 upstream. Prevent that an IO request is build during device shutdown initiated by a driver unbind. This request will never be able to be processed or canceled and will hang forever. This will lead also to a hanging unbind. Fix by checking not only if the device is in READY state but also check that there is no device offline initiated before building a new IO request. Fixes: e443343e509a ("s390/dasd: blk-mq conversion") Cc: <stable@vger.kernel.org> # v4.14+ Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Tested-by: Bjoern Walk <bwalk@linux.ibm.com> Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:06:28 +01:00
Stefan Haberland	5d76645551	s390/dasd: fix hanging DASD driver unbind commit 7d365bd0bff3c0310c39ebaffc9a8458e036d666 upstream. In case of an unbind of the DASD device driver the function dasd_generic_remove() is called which shuts down the device. Among others this functions removes the int_handler from the cdev. During shutdown the device cancels all outstanding IO requests and waits for completion of the clear request. Unfortunately the clear interrupt will never be received when there is no interrupt handler connected. Fix by moving the int_handler removal after the call to the state machine where no request or interrupt is outstanding. Cc: stable@vger.kernel.org Signed-off-by: Stefan Haberland <sth@linux.ibm.com> Tested-by: Bjoern Walk <bwalk@linux.ibm.com> Reviewed-by: Jan Hoeppner <hoeppner@linux.ibm.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-17 17:06:28 +01:00

1 2 3 4 5 ...

5158 Commits