scsi: target: core: Fix hard lockup when executing a compare-and-write command
While handling an I/O completion for the compare portion of a COMPARE_AND_WRITE command, it may happen that the compare_and_write_callback function submits new bio structs while still in softirq context. Low level drivers like md raid5 do not expect their make_request call to be used in softirq context, they call into schedule() and create a deadlocked system. __schedule at ffffffff873a0807 schedule at ffffffff873a0cc5 raid5_get_active_stripe at ffffffffc0875744 [raid456] raid5_make_request at ffffffffc0875a50 [raid456] md_handle_request at ffffffff8713b9f9 md_make_request at ffffffff8713bacb generic_make_request at ffffffff86e6f14b submit_bio at ffffffff86e6f27c iblock_submit_bios at ffffffffc0b4e4dc [target_core_iblock] iblock_execute_rw at ffffffffc0b4f3ce [target_core_iblock] __target_execute_cmd at ffffffffc1090079 [target_core_mod] compare_and_write_callback at ffffffffc1093602 [target_core_mod] target_cmd_interrupted at ffffffffc108d1ec [target_core_mod] target_complete_cmd_with_sense at ffffffffc108d27c [target_core_mod] iblock_complete_cmd at ffffffffc0b4e23a [target_core_iblock] dm_io_dec_pending at ffffffffc00db29e [dm_mod] clone_endio at ffffffffc00dbf07 [dm_mod] raid5_align_endio at ffffffffc086d6c2 [raid456] blk_update_request at ffffffff86e6d950 scsi_end_request at ffffffff87063d48 scsi_io_completion at ffffffff87063ee8 blk_complete_reqs at ffffffff86e77b05 __softirqentry_text_start at ffffffff876000d7 This problem appears to be an issue between target_cmd_interrupted() and compare_and_write_callback(). target_cmd_interrupted() calls the se_cmd's transport_complete_callback function pointer if the se_cmd is being stopped or aborted, and CMD_T_ABORTED was set on the se_cmd. When calling compare_and_write_callback(), the success parameter was set to false. target_cmd_interrupted() seems to expect this means the callback will do cleanup that does not require a process context. But compare_and_write_callback() ignores the parameter if there was I/O done for the compare part of COMPARE_AND_WRITE. Since there was data, the function continued on, passed the compare, and issued a write while ignoring the value of the success parameter. The submit of a bio for the write portion of the COMPARE_AND_WRITE then causes schedule to be unsafely called from the softirq context. Fix the bug in compare_and_write_callback by jumping to the out label if success == "false", after checking if we have been called by transport_generic_request_failure(); The command is being aborted or stopped so there is no need to submit the write bio for the write part of the COMPARE_AND_WRITE command. Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Link: https://lore.kernel.org/r/20221121092703.316489-1-mlombard@redhat.com Reviewed-by: Mike Christie <michael.christie@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
This commit is contained in:
parent
fec1b2fa62
commit
a72629b5cd
@ -446,12 +446,22 @@ static sense_reason_t compare_and_write_callback(struct se_cmd *cmd, bool succes
|
||||
sense_reason_t ret = TCM_NO_SENSE;
|
||||
int i;
|
||||
|
||||
/*
|
||||
* Handle early failure in transport_generic_request_failure(),
|
||||
* which will not have taken ->caw_sem yet..
|
||||
*/
|
||||
if (!success && (!cmd->t_data_sg || !cmd->t_bidi_data_sg))
|
||||
return TCM_NO_SENSE;
|
||||
if (!success) {
|
||||
/*
|
||||
* Handle early failure in transport_generic_request_failure(),
|
||||
* which will not have taken ->caw_sem yet..
|
||||
*/
|
||||
if (!cmd->t_data_sg || !cmd->t_bidi_data_sg)
|
||||
return TCM_NO_SENSE;
|
||||
|
||||
/*
|
||||
* The command has been stopped or aborted so
|
||||
* we don't have to perform the write operation.
|
||||
*/
|
||||
WARN_ON(!(cmd->transport_state &
|
||||
(CMD_T_ABORTED | CMD_T_STOP)));
|
||||
goto out;
|
||||
}
|
||||
/*
|
||||
* Handle special case for zero-length COMPARE_AND_WRITE
|
||||
*/
|
||||
|
Loading…
Reference in New Issue
Block a user