linux

iv/linux

Author	SHA1	Message	Date
Kees Cook	bbe21d7a97	scsi: ufs: ufshcd: Remove VLA usage On the quest to remove all VLAs from the kernel[1] this moves buffers off the stack. In the second instance, this collapses two separately allocated buffers into a single buffer, since they are used consecutively, which saves 256 bytes (QUERY_DESC_MAX_SIZE + 1) of stack space. [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Subhash Jadavani <subhashj@codeaurora.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-18 10:36:09 -04:00
Alexander Potapenko	a45b599ad8	scsi: sg: allocate with __GFP_ZERO in sg_build_indirect() This shall help avoid copying uninitialized memory to the userspace when calling ioctl(fd, SG_IO) with an empty command. Reported-by: syzbot+7d26fc1eea198488deab@syzkaller.appspotmail.com Cc: stable@vger.kernel.org Signed-off-by: Alexander Potapenko <glider@google.com> Acked-by: Douglas Gilbert <dgilbert@interlog.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-18 10:26:01 -04:00
Christoph Hellwig	b8b1483d79	sg: simplify procfs code Use remove_proc_subtree to remove the whole subtree on cleanup, and unwind the registration loop into individual calls. Switch to use proc_create_seq where applicable. Also don't bother handling proc_create* failures - the driver works perfectly fine without the proc files, and the cleanup will handle missing files gracefully. Signed-off-by: Christoph Hellwig <hch@lst.de>	2018-05-16 07:24:30 +02:00
Christoph Hellwig	f7680bec04	megaraid: simplify procfs code Use remove_proc_subtree to remove the whole subtree on cleanup, and unwind the registration loop into individual calls. Switch to use proc_create_single. Also don't bother handling proc_create* failures - the driver works perfectly fine without the proc files, and the cleanup will handle missing files gracefully. Signed-off-by: Christoph Hellwig <hch@lst.de>	2018-05-16 07:24:30 +02:00
Randy Dunlap	a406b0a069	scsi: core: clean up generated file scsi_devinfo_tbl.c "make clean" should remove the generated file "scsi_devinfo_tbl.c", so list it in the clean-files variable so that the file gets cleaned up. Fixes: 345e29608b4b ("scsi: scsi: Export blacklist flags to sysfs") Cc: Hannes Reinecke <hare@suse.de> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-14 22:48:03 -04:00
Wen Xiong	81471b07b7	scsi: ipr: new IOASC update This patch adds new adapter error log for P9 system with the new AZ SAS cable. Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com> Acked-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-14 22:42:53 -04:00
Colin Ian King	eaa6d0df67	scsi: esas2r: fix spelling mistake: "requestss" -> "requests" Trivial fix to spelling mistake in esas2r_debug message. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-14 22:40:30 -04:00
Kees Cook	97e066184b	scsi: libosd: Remove VLA usage On the quest to remove all VLAs from the kernel[1] this rearranges the code to avoid a VLA warning under -Wvla (gcc doesn't recognize "const" variables as not triggering VLA creation). Additionally cleans up variable naming to avoid 80 character column limit. [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Boaz Harrosh <ooo@electrozaur.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-14 22:32:45 -04:00
Christoph Hellwig	0eb0b63c1d	block: consistently use GFP_NOIO instead of __GFP_NORECLAIM Same numerical value (for now at least), but a much better documentation of intent. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-05-14 08:55:18 -06:00
Christoph Hellwig	4accf5fc79	block: pass an explicit gfp_t to get_request blk_old_get_request already has it at hand, and in blk_queue_bio, which is the fast path, it is constant. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-05-14 08:55:14 -06:00
Christoph Hellwig	ff005a0662	block: sanitize blk_get_request calling conventions Switch everyone to blk_get_request_flags, and then rename blk_get_request_flags to blk_get_request. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-05-14 08:55:12 -06:00
Christoph Hellwig	ac613e4566	scsi/osd: remove the gfp argument to osd_start_request Always GFP_KERNEL, and keeping it would cause serious complications for the next change. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-05-14 08:55:09 -06:00
Wenwen Wang	9899e4d352	scsi: 3w-xxxx: fix a missing-check bug In tw_chrdev_ioctl(), the length of the data buffer is firstly copied from the userspace pointer 'argp' and saved to the kernel object 'data_buffer_length'. Then a security check is performed on it to make sure that the length is not more than 'TW_MAX_IOCTL_SECTORS * 512'. Otherwise, an error code -EINVAL is returned. If the security check is passed, the entire ioctl command is copied again from the 'argp' pointer and saved to the kernel object 'tw_ioctl'. Then, various operations are performed on 'tw_ioctl' according to the 'cmd'. Given that the 'argp' pointer resides in userspace, a malicious userspace process can race to change the buffer length between the two copies. This way, the user can bypass the security check and inject invalid data buffer length. This can cause potential security issues in the following execution. This patch checks for capable(CAP_SYS_ADMIN) in tw_chrdev_open() to avoid the above issues. Signed-off-by: Wenwen Wang <wang6495@umn.edu> Acked-by: Adam Radford <aradford@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:32:43 -04:00
Wenwen Wang	c9318a3e02	scsi: 3w-9xxx: fix a missing-check bug In twa_chrdev_ioctl(), the ioctl driver command is firstly copied from the userspace pointer 'argp' and saved to the kernel object 'driver_command'. Then a security check is performed on the data buffer size indicated by 'driver_command', which is 'driver_command.buffer_length'. If the security check is passed, the entire ioctl command is copied again from the 'argp' pointer and saved to the kernel object 'tw_ioctl'. Then, various operations are performed on 'tw_ioctl' according to the 'cmd'. Given that the 'argp' pointer resides in userspace, a malicious userspace process can race to change the buffer size between the two copies. This way, the user can bypass the security check and inject invalid data buffer size. This can cause potential security issues in the following execution. This patch checks for capable(CAP_SYS_ADMIN) in twa_chrdev_open()t o avoid the above issues. Signed-off-by: Wenwen Wang <wang6495@umn.edu> Acked-by: Adam Radford <aradford@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:32:18 -04:00
Dan Carpenter	27e833daba	scsi: megaraid: silence a static checker bug If we had more than 32 megaraid cards then it would cause memory corruption. That's not likely, of course, but it's handy to enforce it and make the static checker happy. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:29:00 -04:00
Colin Ian King	7afc0ce912	scsi: lpfc: fix spelling mistakes: "mabilbox" and "maibox" Trivial fix to spelling mistakes in lpfc_printf_log log message "mabilbox" -> "mailbox" "maibox" -> "mailbox" Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:26:44 -04:00
Andrei Vagin	4b83cb8b06	scsi: qla2xxx: remove the unused tcm_qla2xxx_cmd_wq Signed-off-by: Andrei Vagin <avagin@openvz.org> Reviewed-by: Laurence Oberman <loberman@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:24:25 -04:00
Xiaofei Tan	f70c1251de	scsi: hisi_sas: workaround a v3 hw hilink bug There is an SoC bug of v3 hw development version. When hot- unplugging a directly attached disk, the PHY down interrupt may not happen. It is very easy to appear on some boards. When this issue occurs, the controller will receive many invalid dword frames, and the "alos" fields of register HILINK_ERR_DFX can indicate that disk was unplugged. As an workaround solution, this patch detects this issue in the channel interrupt, and workaround it by following steps: - Disable the PHY - Clear error code and interrupt - Enable the PHY Then the HW will reissue PHY down interrupt. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:10:44 -04:00
John Garry	9b8addf302	scsi: hisi_sas: add readl poll timeout helper wrappers It is common to use readl poll timeout helpers in the driver, so create custom wrappers. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:10:44 -04:00
Xiaofei Tan	bf081d5da4	scsi: hisi_sas: remove redundant handling to event95 for v3 Event95 is used for DFX purpose. The relevant bit for this interrupt in the ENT_INT_SRC_MSK3 register has been disabled, so remove the processing. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:10:44 -04:00
Xiang Chen	9413532788	scsi: hisi_sas: config ATA de-reset as an constrained command for v3 hw As a unconstrained command, a command can be sent to SATA disk even if SATA disk status is BUSY, ERR or DRQ. If an ATA reset assert is successful but ATA reset de-assert fails, then it will retry the reset de-assert. If reset de- assert retry is successful, we think it is okay to probe the device but actually it still has Err status. Apparently we need to retry the ATA reset assertion and de- assertion instead for this mentioned scenario. As such, we config ATA reset assert as a constrained command, if ATA reset de-assert fails, then ATA reset de-assert retry will also fail. Then we will retry the proper process of ATA reset assert and de-assert again. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:10:44 -04:00
Xiang Chen	c2c1d9ded0	scsi: hisi_sas: update PHY linkrate after a controller reset After the controller is reset, we currently may not honour the PHY max linkrate set via sysfs, in that after a reset we always revert to max linkrate of 12Gbps, ignoring the value set via sysfs. This patch modifies to policy to set the programmed PHY linkrate, honouring the max linkrate programmed via sysfs. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:10:44 -04:00
John Garry	6f7c32d605	scsi: hisi_sas: stop controller timer for reset We should only have the timer enabled after PHY up after controller reset, so disable prior to reset. Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:10:44 -04:00
Xiang Chen	c6ef895472	scsi: hisi_sas: check sas_dev gone earlier in hisi_sas_abort_task() It is possible to dereference a NULL-pointer in hisi_sas_abort_task() in special scenario when the device has been removed. If an SMP task times-out, it will call hisi_sas_abort_task() to recover. And currently there is a check in hisi_sas_abort_task() to avoid the situation of processing the abort for the removed device. However we have an ordering problem, in that we may reference a task for the removed device before checking if the device has been removed. Fix this by only referencing the sas_dev after we know it is still present. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:10:44 -04:00
Xiang Chen	a14da7a20d	scsi: hisi_sas: fix PI memory size There are 28 bytes of protection information record of SSP for v3 hw, 16 bytes for v2 hw, and probably 24 for v1 hw (forgotten now). So use a value big enough in hisi_sas_command_table_ssp.prot to cover all cases. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:10:44 -04:00
Xiang Chen	cd938e535e	scsi: hisi_sas: check host frozen before calling "done" function When the host is frozen in SCSI EH state, at any point after the LLDD sets SAS_TASK_STATE_DONE for the sas_task task state, libsas may free the task; see sas_scsi_find_task(). This puts the LLDD in a difficult position, in that once it sets SAS_TASK_STATE_DONE for the task state it should not reference the sas_task again. But the LLDD needs will check the sas_task indirectly in calling task->task_done()->sas_scsi_task_done() or sas_ata_task_done() (to check if the host is frozen state actually). And the LLDD cannot set SAS_TASK_STATE_DONE for the task state after task->task_done() is called (as the sas_task is free'd at this point). This situation would seem to be a problem made by libsas. To work around, check in the LLDD whether the host is in frozen state to ensure it is ok to call task->task_done() function. If in the frozen state, we rely on SCSI EH and libsas to free the sas_task directly. We do not do this for the following IO types: - SMP - they are managed in libsas directly, outside SCSI EH - Any internally originated IO, for similar reason Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:10:44 -04:00
Xiang Chen	b81b6cce58	scsi: hisi_sas: Add some checks to avoid free'ing a sas_task twice If the SCSI host enters EH, any pending IO will be processed by SCSI EH. However it is possible that SCSI EH will try to abort the IO and also at the same time the IO completes in the driver. In this situation there is a small chance of freeing the sas_task twice. Then if another IO re-uses freed sas_task before the second time of free'ing sas_task, it is possible to free incorrect sas_task. To avoid this situation, add some checks to increase reliability. The sas_task task state flag SAS_TASK_STATE_ABORTED is used to mutually protect the LLDD and libsas freeing the task. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:10:43 -04:00
Xiang Chen	24cf43612d	scsi: hisi_sas: optimise the usage of DQ locking In the DQ tasklet processing it is not necessary to take the DQ lock, as there is no contention between adding slots to the CQ and removing slots from the matching DQ. In addition, since we run each DQ in a separate tasklet context, there would be no possible contention between DQ processing running for the same queue in parallel. It is still necessary to take hisi_hba lock when free'ing slots. Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:10:43 -04:00
James Smart	3e21d1cb0f	scsi: lpfc: Comment cleanup regarding Broadcom copyright header Fix small formatting and wording nits in Broadcom copyright header Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:16 -04:00
James Smart	e65d8c3411	scsi: lpfc: update driver version to 12.0.0.3 Update the driver version to 12.0.0.3 Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:16 -04:00
James Smart	11f0e34ff4	scsi: lpfc: Enhance log messages when reporting CQE errors Enhance log messages for CQEs as they were not reporting certain fields. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:16 -04:00
James Smart	44c2757b76	scsi: lpfc: Fix up log messages and stats counters in IO submit code path Fix up log messages and add an fcp error stat counter in the IO submit code path to make diagnosing problems easier Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:16 -04:00
James Smart	d38f33b304	scsi: lpfc: Driver NVME load fails when CPU cnt > WQ resource cnt If the cpu count is larger than the number of WQ resources available, adapter attachment eventually failes due to a WQ_CREATE failure. Calculate the number of WQs desired (which initializes to cpu count) after accounting for the number of queues the adapter supports and the number allocated to SCSI and the control/ELS path, and scale down if necessary. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:16 -04:00
James Smart	23288b78a1	scsi: lpfc: Handle new link fault code returned by adapter firmware. The driver encounters a link event ACQE with a fault code it doesn't recognize, it logs an "Invalid" fault type and futher treats the unknown value as a mailbox command failure. First off, there is no "invalid" value, only values that are unknown. Secondly, the fault code doesn't indicate status - the rest of the ACQE contains that status so there is no reason to "fail the commands". Change the "Invalid" to "Unknown". There is no "invalid" code value. Separate fault code parsing and message genaration from any mbx handling status. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:16 -04:00
James Smart	a72d56b2a6	scsi: lpfc: Correct fw download error message In situations when the firmware image in inappropriate for the chip type, initial validation checks were light, allowing the checks to pass, thus allowing the firmware to be downloaded. Eventually, after the download, the chip rejects the firmware but it is logged as a generic firmware download error. Revise the initial checks to validate the image vs asic type so that the correct message is displayed and the download process is avoided. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:16 -04:00
James Smart	48f8fdb4b4	scsi: lpfc: enhance LE data structure copies to hardware The driver builds the control structures in host memory using definitions that are based on 32-bit words. After building the structure it is then written to the adapter. This patch slightly optimizes LE hosts by copying the structures via 64-bit copies. This is doable as the adapter interface is LE thus there is no byteswapping as the copy is performed. The same optimization would be nice on BE systems, but when byteswapping occurs, it swaps 32-bit words as well, thus trashing the control structure. Given amount of code that is dependent upon the 32-bit word definition, it was decided to not change things for the minor optimization. Thus PPC 64-bit systems sticks with doing 32-bit copies. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:15 -04:00
James Smart	cd2400715c	scsi: lpfc: Change IO submit return to EBUSY if remote port is recovering I/O submission paths in the lpfc nvme path are rejecting the io with an error code that reflects back to the callee as a hard io failure. Many of these conditions are transient and would likely resolve if retried. Correct by returning -EBUSY, which the FC transport triggers off of to return busy status codes to the blk-mq layer. Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com> Signed-off-by: James Smart <james.smart@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 01:03:15 -04:00
Chad Dupuis	ab7ad49d01	scsi: qedf: Update version number to 8.33.16.20 Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:11 -04:00
Chad Dupuis	5d1c8b5ba0	scsi: qedf: Update copyright for 2018 Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:11 -04:00
Chad Dupuis	f3690a89f9	scsi: qedf: Add more defensive checks for concurrent error conditions During an uplink toggle test all error handling is done via timeout and firmware error conditions which can occur concurrently: - SCSI layer timeouts - Error detect CQEs - Firmware detected underruns - ABTS timeouts All these concurrent events require more defensive checks in the driver including: - Check both internally and externally generated aborts to make sure the xid is not already been aborted in another context or in cleanup. - Check back pointers in qedf_cmd_timeout to verify the context of the io_req, fcport and qedf_ctx - Check rport state in host reset handler to not reset the whole host if the rport is already uploaded or in the process of relogin - Check to state for an fcport before initiating a middle path ELS request Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:11 -04:00
Chad Dupuis	4f4616ceeb	scsi: qedf: Set the UNLOADING flag when removing a vport Similar to what we do when we remove a PCI function, set the QEDF_UNLOADING flag to prevent any requests from being queued while a vport is being deleted. This prevents any requests from getting stuck in limbo when the vport is unloaded or deleted. Fixes the crash: PID: 106676 TASK: ffff9a436aa90000 CPU: 12 COMMAND: "multipathd" #0 [ffff9a43567d3550] machine_kexec+522 at ffffffffaca60b2a #1 [ffff9a43567d35b0] __crash_kexec+114 at ffffffffacb13512 #2 [ffff9a43567d3680] crash_kexec+48 at ffffffffacb13600 #3 [ffff9a43567d3698] oops_end+168 at ffffffffad117768 #4 [ffff9a43567d36c0] no_context+645 at ffffffffad106f52 #5 [ffff9a43567d3710] __bad_area_nosemaphore+116 at ffffffffad106fe9 #6 [ffff9a43567d3760] bad_area+70 at ffffffffad107379 #7 [ffff9a43567d3788] __do_page_fault+1247 at ffffffffad11a8cf #8 [ffff9a43567d37f0] do_page_fault+53 at ffffffffad11a915 #9 [ffff9a43567d3820] page_fault+40 at ffffffffad116768 [exception RIP: qedf_init_task+61] RIP: ffffffffc0e13c2d RSP: ffff9a43567d38d0 RFLAGS: 00010046 RAX: 0000000000000000 RBX: ffffbe920472c738 RCX: ffff9a434fa0e3e8 RDX: ffff9a434f695280 RSI: ffffbe920472c738 RDI: ffff9a43aa359c80 RBP: ffff9a43567d3950 R8: 0000000000000c15 R9: ffff9a3fb09b9880 R10: ffff9a434fa0e3e8 R11: ffff9a43567d35ce R12: 0000000000000000 R13: ffff9a434f695280 R14: ffff9a43aa359c80 R15: ffff9a3fb9e005c0 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:11 -04:00
Chad Dupuis	92bbccdf72	scsi: qedf: Add additional checks when restarting an rport due to ABTS timeout There are a couple of kernel cases when we restart a remote port due to ABTS timeout that we need to handle: 1. Flush any outstanding ABTS requests when flushing I/Os so that we do not hold up the eh_abort handler indefinitely causing process hangs. 2. Check if we are currently uploading a connection before issuing an ABTS. Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:11 -04:00
Chad Dupuis	96673e1e22	scsi: qedf: If qed fails to enable MSI-X fail PCI probe Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:11 -04:00
Chad Dupuis	65b7beca42	scsi: qedf: Honor default_prio module parameter even if DCBX does not converge Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:10 -04:00
Chad Dupuis	4b9b7fabb3	scsi: qedf: Improve firmware debug dump handling Get all firmware debug data instead of just a grc dump. Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:10 -04:00
Saurav Kashyap	f9a4a7f2c0	scsi: qedf: Remove setting DCBX pending during soft context reset PROBLEM DESCRIPTION: According to the logs, STAG was changing and it was triggering soft reset. In soft reset we used to virtual link down and up and also we were disabling DCBx flag. Since this was virtual link flap, DCBx never used to converge again. SOLUTION: Code change is to remove disabling DCBx flag from soft reset. Signed-off-by: Saurav Kashyap <saurav.kashyap@cavium.com> Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:10 -04:00
Chad Dupuis	8025c84208	scsi: qedf: Add task id to kref_get_unless_zero() debug messages when flushing requests Helps to corroborate which requests we can't get reference on and if it's real bug or not. Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:10 -04:00
Chad Dupuis	3f9de7f041	scsi: qedf: Check if link is already up when receiving a link up event from qed [mkp: typo] Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:10 -04:00
Chad Dupuis	a8f192bce1	scsi: qedf: Return request as DID_NO_CONNECT if MSI-X is not enabled Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:10 -04:00
Chad Dupuis	adf4884252	scsi: qedf: Release RRQ reference correctly when RRQ command times out When an RRQ request times out the reference is not getting decremented correctly as there are still ELS commands leftover when we flush any pending I/Os during offload: [ 281.788553] [0000:21:00.3]:[qedf_cmd_timeout:58]:4: ELS timeout, xid=0x96a. ... [ 281.788553] [0000:21:00.3]:[qedf_cmd_timeout:58]:4: ELS timeout, xid=0x96a. [ 281.788772] [0000:21:00.3]:[qedf_rrq_compl:182]:4: Entered. [ 281.788774] [0000:21:00.3]:[qedf_rrq_compl:200]:4: rrq_compl: orig io = ffffc90004c556f8, orig xid = 0x81b, rrq_xid = 0x96a, refcount=1 ... [ 331.448032] [0000:21:00.3]:[qedf_flush_els_req:1512]:4: Flushing ELS request xid=0x96a refcount=2. The fix is to call kref_put on the rrq_req in case of timeout as the timeout handler will call rrq_compl directly vs. a normal completion where it is call from els_compl. Signed-off-by: Chad Dupuis <chad.dupuis@cavium.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2018-05-08 00:57:10 -04:00

1 2 3 4 5 ...

16632 Commits