1043325 Commits

Author SHA1 Message Date
Kaige Fu
280bef5129 irqchip/gic-v3-its: Fix potential VPE leak on error
In its_vpe_irq_domain_alloc, when its_vpe_init() returns an error,
there is an off-by-one in the number of VPEs to be freed.

Fix it by simply passing the number of VPEs allocated, which is the
index of the loop iterating over the VPEs.

Fixes: 7d75bbb4bc1a ("irqchip/gic-v3-its: Add VPE irq domain allocation/teardown")
Signed-off-by: Kaige Fu <kaige.fu@linux.alibaba.com>
[maz: fixed commit message]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/d9e36dee512e63670287ed9eff884a5d8d6d27f2.1631672311.git.kaige.fu@linux.alibaba.com
2021-09-22 14:37:04 +01:00
Randy Dunlap
969ac78db7 irqchip/goldfish-pic: Select GENERIC_IRQ_CHIP to fix build
irq-goldfish-pic uses GENERIC_IRQ_CHIP interfaces so select that symbol
to fix build errors.

Fixes these build errors:

mips-linux-ld: drivers/irqchip/irq-goldfish-pic.o: in function `goldfish_pic_of_init':
irq-goldfish-pic.c:(.init.text+0xc0): undefined reference to `irq_alloc_generic_chip'
mips-linux-ld: irq-goldfish-pic.c:(.init.text+0xf4): undefined reference to `irq_gc_unmask_enable_reg'
mips-linux-ld: irq-goldfish-pic.c:(.init.text+0xf8): undefined reference to `irq_gc_unmask_enable_reg'
mips-linux-ld: irq-goldfish-pic.c:(.init.text+0x100): undefined reference to `irq_gc_mask_disable_reg'
mips-linux-ld: irq-goldfish-pic.c:(.init.text+0x104): undefined reference to `irq_gc_mask_disable_reg'
mips-linux-ld: irq-goldfish-pic.c:(.init.text+0x11c): undefined reference to `irq_setup_generic_chip'
mips-linux-ld: irq-goldfish-pic.c:(.init.text+0x168): undefined reference to `irq_remove_generic_chip'

Fixes: 4235ff50cf98 ("irqchip/irq-goldfish-pic: Add Goldfish PIC driver")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Miodrag Dinic <miodrag.dinic@mips.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Goran Ferenc <goran.ferenc@mips.com>
Cc: Aleksandar Markovic <aleksandar.markovic@mips.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210905162519.21507-1-rdunlap@infradead.org
2021-09-22 14:33:09 +01:00
Randy Dunlap
b999488361 irqchip/mbigen: Repair non-kernel-doc notation
Fix kernel-doc warnings in irq-mbigen.c:

irq-mbigen.c:29: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * In mbigen vector register
irq-mbigen.c:43: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * offset of clear register in mbigen node
irq-mbigen.c:50: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
 * offset of interrupt type register

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Jun Ma <majun258@huawei.com>
Cc: Yun Wu <wuyun.wu@huawei.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Aditya Srivastava <yashsri421@gmail.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210905033644.15988-1-rdunlap@infradead.org
2021-09-22 14:32:26 +01:00
Bixuan Cui
20c36ce216 irqdomain: Change the type of 'size' in __irq_domain_add() to be consistent
The 'size' is used in struct_size(domain, revmap, size) and its input
parameter type is 'size_t'(unsigned int).
Changing the size to 'unsigned int' to make the type consistent.

Signed-off-by: Bixuan Cui <cuibixuan@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210916025203.44841-1-cuibixuan@huawei.com
2021-09-22 14:29:32 +01:00
Marc Zyngier
2a7313dc81 irqchip/armada-370-xp: Fix ack/eoi breakage
When converting the driver to using handle_percpu_devid_irq,
we forgot to repaint the irq_eoi() callback into irq_ack(),
as handle_percpu_devid_fasteoi_ipi() was actually using EOI
really early in the handling. Yes this was a stupid idea.

Fix this by using the HW ack method as irq_ack().

Fixes: e52e73b7e9f7 ("irqchip/armada-370-xp: Make IPIs use handle_percpu_devid_irq()")
Reported-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Tested-by: Steffen Trumtrar <s.trumtrar@pengutronix.de>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: Valentin Schneider <valentin.schneider@arm.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87tuiexq5f.fsf@pengutronix.de
2021-09-22 14:24:49 +01:00
Dan Carpenter
372d1f3e1b ext2: fix sleeping in atomic bugs on error
The ext2_error() function syncs the filesystem so it sleeps.  The caller
is holding a spinlock so it's not allowed to sleep.

   ext2_statfs() <- disables preempt
   -> ext2_count_free_blocks()
      -> ext2_get_group_desc()

Fix this by using WARN() to print an error message and a stack trace
instead of using ext2_error().

Link: https://lore.kernel.org/r/20210921203233.GA16529@kili
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jan Kara <jack@suse.cz>
2021-09-22 13:05:23 +02:00
Heiko Stuebner
b22a4705e2 gpio/rockchip: fix get_direction value handling
The function uses the newly introduced rockchip_gpio_readl_bit()
which directly returns the actual value of the requeste bit.
So using the existing bit-wise check for the bit inside the value
will always return 0.

Fix this by dropping the bit manipulation on the result.

Fixes: 3bcbd1a85b68 ("gpio/rockchip: support next version gpio controller")
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2021-09-22 11:31:52 +02:00
Heiko Stuebner
0f562b7de9 gpio/rockchip: extended debounce support is only available on v2
The gpio driver runs into issues on v1 gpio blocks, as the db_clk
and the whole extended debounce support is only ever defined on v2.
So checking for the IS_ERR on the db_clk is not enough, as it will
be NULL on v1.

Fix this by adding the needed condition for v2 first before checking
the existence of the db_clk.

This caused my rk3288-veyron-pinky to enter a reboot loop when it
tried to enable the power-key as adc-key device.

Fixes: 3bcbd1a85b68 ("gpio/rockchip: support next version gpio controller")
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2021-09-22 11:31:43 +02:00
Steven Lee
f6c35df227 gpio: gpio-aspeed-sgpio: Fix wrong hwirq in irq handler.
The current hwirq is calculated based on the old GPIO pin order(input
GPIO range is from 0 to ngpios - 1).
It should be calculated based on the current GPIO input pin order(input
GPIOs are 0, 2, 4, ..., (ngpios - 1) * 2).

Signed-off-by: Steven Lee <steven_lee@aspeedtech.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2021-09-22 11:23:10 +02:00
Kunihiko Hayashi
2dd824cca3 gpio: uniphier: Fix void functions to remove return value
The return type of irq_chip.irq_mask() and irq_chip.irq_unmask() should
be void.

Fixes: dbe776c2ca54 ("gpio: uniphier: add UniPhier GPIO controller driver")
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2021-09-22 11:19:29 +02:00
Hans de Goede
cef0d022f5 gpiolib: acpi: Make set-debounce-timeout failures non fatal
Commit 8dcb7a15a585 ("gpiolib: acpi: Take into account debounce settings")
made the gpiolib-acpi code call gpio_set_debounce_timeout() when requesting
GPIOs.

This in itself is fine, but it also made gpio_set_debounce_timeout()
errors fatal, causing the requesting of the GPIO to fail. This is causing
regressions. E.g. on a HP ElitePad 1000 G2 various _AEI specified GPIO
ACPI event sources specify a debouncy timeout of 20 ms, but the
pinctrl-baytrail.c only supports certain fixed values, the closest
ones being 12 or 24 ms and pinctrl-baytrail.c responds with -EINVAL
when specified a value which is not one of the fixed values.

This is causing the acpi_request_own_gpiod() call to fail for 3
ACPI event sources on the HP ElitePad 1000 G2, which in turn is causing
e.g. the battery charging vs discharging status to never get updated,
even though a charger has been plugged-in or unplugged.

Make gpio_set_debounce_timeout() errors non fatal, warning about the
failure instead, to fix this regression.

Note we should probably also fix various pinctrl drivers to just
pick the first bigger discrete value rather then returning -EINVAL but
this will need to be done on a per driver basis, where as this fix
at least gets us back to where things were before and thus restores
functionality on devices where this was lost due to
gpio_set_debounce_timeout() errors.

Fixes: 8dcb7a15a585 ("gpiolib: acpi: Take into account debounce settings")
Depends-on: 2e2b496cebef ("gpiolib: acpi: Extract acpi_request_own_gpiod() helper")
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
2021-09-22 11:16:42 +02:00
Jiri Slaby
f7d848e0fd MAINTAINERS: usb, update Peter Korsgaard's entries
Peter's e-mail in MAINTAINERS is defunct:
This is the qmail-send program at a.mx.sunsite.dk.
<jacmet@sunsite.dk>:
      Sorry, no mailbox here by that name. (#5.1.1)

Peter says:
** Ahh yes, it should be changed to peter@korsgaard.com.

However he also says:
** I haven't had access to c67x00 hw for quite some years though, so maybe
** it should be marked Orphan instead?

So change as he wishes.

Cc: Peter Korsgaard <peter@korsgaard.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-usb@vger.kernel.org
Acked-by: Peter Korsgaard <peter@korsgaard.com>
Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Link: https://lore.kernel.org/r/20210922063008.25758-1-jslaby@suse.cz
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-22 09:11:49 +02:00
Wen Xiong
fbdac19e64 scsi: ses: Retry failed Send/Receive Diagnostic commands
Setting SCSI logging level with error=3, we saw some errors from enclosues:

[108017.360833] ses 0:0:9:0: tag#641 Done: NEEDS_RETRY Result: hostbyte=DID_ERROR driverbyte=DRIVER_OK cmd_age=0s
[108017.360838] ses 0:0:9:0: tag#641 CDB: Receive Diagnostic 1c 01 01 00 20 00
[108017.427778] ses 0:0:9:0: Power-on or device reset occurred
[108017.427784] ses 0:0:9:0: tag#641 Done: SUCCESS Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=0s
[108017.427788] ses 0:0:9:0: tag#641 CDB: Receive Diagnostic 1c 01 01 00 20 00
[108017.427791] ses 0:0:9:0: tag#641 Sense Key : Unit Attention [current]
[108017.427793] ses 0:0:9:0: tag#641 Add. Sense: Bus device reset function occurred
[108017.427801] ses 0:0:9:0: Failed to get diagnostic page 0x1
[108017.427804] ses 0:0:9:0: Failed to bind enclosure -19
[108017.427895] ses 0:0:10:0: Attached Enclosure device
[108017.427942] ses 0:0:10:0: Attached scsi generic sg18 type 13

Retry if the Send/Receive Diagnostic commands complete with a transient
error status (NOT_READY or UNIT_ATTENTION with ASC 0x29).

Link: https://lore.kernel.org/r/1631849061-10210-2-git-send-email-wenxiong@linux.ibm.com
Reviewed-by: Brian King <brking@linux.ibm.com>
Reviewed-by: James Bottomley <jejb@linux.ibm.com>
Signed-off-by: Wen Xiong <wenxiong@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-22 00:36:40 -04:00
Colin Ian King
9a8ef2c73c scsi: target: Fix spelling mistake "CONFLIFT" -> "CONFLICT"
There is a spelling mistake in a dev_err message. Fix it.

Link: https://lore.kernel.org/r/20210920183206.17477-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-22 00:17:29 -04:00
Arnd Bergmann
a38923f2d0 scsi: lpfc: Fix gcc -Wstringop-overread warning, again
I fixed a stringop-overread warning earlier this year, now a second copy of
the original code was added and the warning came back:

drivers/scsi/lpfc/lpfc_attr.c: In function 'lpfc_cmf_info_show':
drivers/scsi/lpfc/lpfc_attr.c:289:25: error: 'strnlen' specified bound 4095 exceeds source size 24 [-Werror=stringop-overread]
  289 |                         strnlen(LPFC_INFO_MORE_STR, PAGE_SIZE - 1),
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fix it the same way as the other copy.

Link: https://lore.kernel.org/r/20210920095628.1191676-1-arnd@kernel.org
Fixes: ada48ba70f6b ("scsi: lpfc: Fix gcc -Wstringop-overread warning")
Fixes: 74a7baa2a3ee ("scsi: lpfc: Add cmf_info sysfs entry")
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-22 00:16:38 -04:00
Dan Carpenter
6dacc371b7 scsi: lpfc: Use correct scnprintf() limit
The limit should be "PAGE_SIZE - len" instead of "PAGE_SIZE".  We're not
going to hit the limit so this fix will not affect runtime.

Link: https://lore.kernel.org/r/20210916132331.GE25094@kili
Fixes: 5b9e70b22cc5 ("scsi: lpfc: raise sg count for nvme to use available sg resources")
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-22 00:15:05 -04:00
Dan Carpenter
cdbc16c552 scsi: lpfc: Fix sprintf() overflow in lpfc_display_fpin_wwpn()
This scnprintf() uses the wrong limit.  It should be
"LPFC_FPIN_WWPN_LINE_SZ - len" instead of LPFC_FPIN_WWPN_LINE_SZ.

Link: https://lore.kernel.org/r/20210916132251.GD25094@kili
Fixes: 428569e66fa7 ("scsi: lpfc: Expand FPIN and RDF receive logging")
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-22 00:14:34 -04:00
Hannes Reinecke
a4869faf96 scsi: core: Remove 'current_tag'
The 'current_tag' field in struct scsi_device is unused now; remove it.

Link: https://lore.kernel.org/r/1631696835-136198-4-git-send-email-john.garry@huawei.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-22 00:13:38 -04:00
Hannes Reinecke
756fb6a895 scsi: acornscsi: Remove tagged queuing vestiges
The acornscsi driver has a config option to enable tagged queuing, but this
option gets disabled in the driver itself with the comment 'needs to be
debugged'.  As this is a _really_ old driver I doubt anyone will be wanting
to invest time here, so remove the tagged queue vestiges and make our lives
easier.

[jpg: Use scsi_cmd_to_rq()]

Link: https://lore.kernel.org/r/1631696835-136198-3-git-send-email-john.garry@huawei.com
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-22 00:13:38 -04:00
Hannes Reinecke
bc41fcbffd scsi: fas216: Kill scmd->tag
The driver is attempting to allocate a tag internally which is a no-go with
blk-mq. Switch the driver to use the request tag and kill usage of
scmd->tag and scmd->device->current_tag.

[jpg: Change to use scsi_cmd_to_rq()]

Link: https://lore.kernel.org/r/1631696835-136198-2-git-send-email-john.garry@huawei.com
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-22 00:13:38 -04:00
Dmitry Bogdanov
5f85790388 scsi: qla2xxx: Restore initiator in dual mode
In dual mode in case of disabling the target, the whole port goes offline
and initiator is turned off too.

Fix restoring initiator mode after disabling target in dual mode.

Link: https://lore.kernel.org/r/20210915153239.8035-1-d.bogdanov@yadro.com
Fixes: 0645cb8350cd ("scsi: qla2xxx: Add mode control for each physical port")
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Dmitry Bogdanov <d.bogdanov@yadro.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-21 23:58:57 -04:00
Bart Van Assche
d04a968c33 scsi: ufs: core: Unbreak the reset handler
A command tag is passed as the second argument of the
__ufshcd_transfer_req_compl() call in ufshcd_eh_device_reset_handler()
instead of a bitmask. Fix this by passing a bitmask as argument instead of
a command tag.

Link: https://lore.kernel.org/r/20210916175408.2260084-1-bvanassche@acm.org
Fixes: a45f937110fa ("scsi: ufs: Optimize host lock on transfer requests send/compl paths")
Cc: Can Guo <cang@codeaurora.org>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-21 23:56:19 -04:00
Bart Van Assche
1d479e6c9c scsi: sd_zbc: Support disks with more than 2**32 logical blocks
This patch addresses the following Coverity report about the zno *
sdkp->zone_blocks expression:

CID 1475514 (#1 of 1): Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN)
overflow_before_widen: Potentially overflowing expression zno *
sdkp->zone_blocks with type unsigned int (32 bits, unsigned) is evaluated
using 32-bit arithmetic, and then used in a context that expects an
expression of type sector_t (64 bits, unsigned).

Link: https://lore.kernel.org/r/20210917212314.2362324-1-bvanassche@acm.org
Fixes: 5795eb443060 ("scsi: sd_zbc: emulate ZONE_APPEND commands")
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Cc: Damien Le Moal <Damien.LeMoal@wdc.com>
Cc: Hannes Reinecke <hare@suse.de>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-21 23:52:55 -04:00
Adrian Hunter
88b099006d scsi: ufs: core: Revert "scsi: ufs: Synchronize SCSI and UFS error handling"
This reverts commit a113eaaf86373362b053279049907ff82b5df6c8.

There are a couple of issues with the commit:

 1. It causes deadlocks.

 2. It causes the shost->eh_cmd_q list of failed requests not to be
    processed, ever.

So revert it.

1. Deadlocks

The SCSI error handler runs with requests blocked beginning when
scsi_schedule_eh() sets SHOST_RECOVERY state, continuing through
scsi_error_handler() callback ->eh_strategy_handler() until
scsi_restart_operations() is called.  By setting eh_strategy_handler to
ufshcd_err_handler, the patch changed the UFS error handler to run with
requests blocked, including PM requests, for the entire run of the error
handler.

That conflicts with UFS error handler existing synchronization with UFS
device PM operations.  The UFS error handler synchronizes with runtime PM
by doing pm_runtime_get_sync() prior to blocking requests itself.  It
synchronizes with system PM by use of hba->host_sem, again before blocking
requests itself.  However, if requests are already blocked, then PM
operations will block.  So:

   the UFS error handler blocks waiting on PM
 + PM blocks waiting on SCSI PM requests to process or fail
 + PM requests are blocked waiting on error handling to finish
 =  deadlock

This happens both for runtime PM and system PM.

Prior to the patch, these deadlocks could not happen even if SCSI error
handling was running, because the presence of requests in shost->eh_cmd_q
would mean the queues could not be suspended, which would mean that, should
the UFS error handler run at the same time, it would not need to wait for
PM or vice versa.

Please note these scenarios are not just theoretical, they were found
during testing on a Samsung Galaxy Book S.

2. ->eh_strategy_handler() must process shost->eh_cmd_q list of failed
requests, as all other eh_strategy_handler's do except UFS error handler.
Refer for example: scsi_unjam_host(), ata_scsi_error() and
sas_scsi_recover_host().

Link: https://lore.kernel.org/r/20210917144349.14058-1-adrian.hunter@intel.com
Fixes: a113eaaf8637 ("scsi: ufs: Synchronize SCSI and UFS error handling")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-09-21 23:48:14 -04:00
Jakub Kicinski
b52d3161c2 Merge branch 's390-qeth-fixes-2021-09-21'
Julian Wiedmann says:

====================
s390/qeth: fixes 2021-09-21

This brings two fixes for deadlocks when a device is removed while it
has certain types of async work pending. And one additional fix for a
missing NULL check in an error case.
====================

Link: https://lore.kernel.org/r/20210921145217.1584654-1-jwi@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-21 20:02:28 -07:00
Alexandra Winter
d2b59bd4b0 s390/qeth: fix deadlock during failing recovery
Commit 0b9902c1fcc5 ("s390/qeth: fix deadlock during recovery") removed
taking discipline_mutex inside qeth_do_reset(), fixing potential
deadlocks. An error path was missed though, that still takes
discipline_mutex and thus has the original deadlock potential.

Intermittent deadlocks were seen when a qeth channel path is configured
offline, causing a race between qeth_do_reset and ccwgroup_remove.
Call qeth_set_offline() directly in the qeth_do_reset() error case and
then a new variant of ccwgroup_set_offline(), without taking
discipline_mutex.

Fixes: b41b554c1ee7 ("s390/qeth: fix locking for discipline setup / removal")
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-21 20:02:24 -07:00
Alexandra Winter
ee909d0b1d s390/qeth: Fix deadlock in remove_discipline
Problem: qeth_close_dev_handler is a worker that tries to acquire
card->discipline_mutex via drv->set_offline() in ccwgroup_set_offline().
Since commit b41b554c1ee7
("s390/qeth: fix locking for discipline setup / removal")
qeth_remove_discipline() is called under card->discipline_mutex and
cancels the work and waits for it to finish.

STOPLAN reception with reason code IPA_RC_VEPA_TO_VEB_TRANSITION is the
only situation that schedules close_dev_work. In that situation scheduling
qeth recovery will also result in an offline interface, when resetting the
isolation mode fails, if the external switch is still set to VEB.
And since commit 0b9902c1fcc5 ("s390/qeth: fix deadlock during recovery")
qeth recovery does not aquire card->discipline_mutex anymore.

So we accept the longer pathlength of qeth_schedule_recovery in this
error situation and re-use the existing function.

As a side-benefit this changes the hwtrap to behave like during recovery
instead of like during a user-triggered set_offline.

Fixes: b41b554c1ee7 ("s390/qeth: fix locking for discipline setup / removal")
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Acked-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-21 20:02:24 -07:00
Julian Wiedmann
248f064af2 s390/qeth: fix NULL deref in qeth_clear_working_pool_list()
When qeth_set_online() calls qeth_clear_working_pool_list() to roll
back after an error exit from qeth_hardsetup_card(), we are at risk of
accessing card->qdio.in_q before it was allocated by
qeth_alloc_qdio_queues() via qeth_mpc_initialize().

qeth_clear_working_pool_list() then dereferences NULL, and by writing to
queue->bufs[i].pool_entry scribbles all over the CPU's lowcore.
Resulting in a crash when those lowcore areas are used next (eg. on
the next machine-check interrupt).

Such a scenario would typically happen when the device is first set
online and its queues aren't allocated yet. An early IO error or certain
misconfigs (eg. mismatched transport mode, bad portno) then cause us to
error out from qeth_hardsetup_card() with card->qdio.in_q still being
NULL.

Fix it by checking the pointer for NULL before accessing it.

Note that we also have (rare) paths inside qeth_mpc_initialize() where
a configuration change can cause us to free the existing queues,
expecting that subsequent code will allocate them again. If we then
error out before that re-allocation happens, the same bug occurs.

Fixes: eff73e16ee11 ("s390/qeth: tolerate pre-filled RX buffer")
Reported-by: Stefan Raspl <raspl@linux.ibm.com>
Root-caused-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-21 20:02:23 -07:00
Dan Carpenter
e946d3c887 cifs: fix a sign extension bug
The problem is the mismatched types between "ctx->total_len" which is
an unsigned int, "rc" which is an int, and "ctx->rc" which is a
ssize_t.  The code does:

	ctx->rc = (rc == 0) ? ctx->total_len : rc;

We want "ctx->rc" to store the negative "rc" error code.  But what
happens is that "rc" is type promoted to a high unsigned int and
'ctx->rc" will store the high positive value instead of a negative
value.

The fix is to change "rc" from an int to a ssize_t.

Fixes: c610c4b619e5 ("CIFS: Add asynchronous write support through kernel AIO")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-21 20:51:50 -05:00
Mark Brown
96c8395e21
spi: Revert modalias changes
During the v5.13 cycle we updated the SPI subsystem to generate OF style
modaliases for SPI devices, replacing the old Linux style modalises we
used to generate based on spi_device_id which are the DT style name with
the vendor removed.  Unfortunately this means that we start only
reporting OF style modalises and not the old ones and there is nothing
that ensures that drivers list every possible OF compatible string in
their OF ID table.  The result is that there are systems which have been
relying on loading modules based on the old style that are now broken,
as found by Russell King with spi-nor on Macchiatobin.

spi-nor is a particularly problematic case for this, it only lists a
single generic DT compatible jedec,spi-nor in the driver but supports a
huge raft of device specific compatibles, with a large set of part
numbers many of which are offered by multiple vendors.  Russell's
searches of upstream device trees has turned up examples with vendor
names written in non-standard ways too.  To make matters worse up until
8ff16cf77ce3 ("Documentation: devicetree: m25p80: add "nor-jedec"
binding") the generic compatible was not part of the binding so there
are device trees out there written to that binding version which don't
list it all.  The sheer number of parts supported together with our
previous approach of ignoring the vendor ID makes robustly fixing this
by adding compatibles to the spi-nor driver seem problematic, the
current DT binding document does not list all the parts supported by the
driver at the minute (further patches will fix this).

I've also investigated supporting both formats of modalias
simultaneously but that doesn't seem possible, especially without
breaking our userspace ABI which is obviously not viable.

Instead revert the relevant changes for now:

e09f2ab8eecc ("spi: update modalias_show after of_device_uevent_modalias support")
3ce6c9e2617e ("spi: add of_device_uevent_modalias support")

This will unfortunately mean that any system which had started having
modules autoload based on the OF compatibles for drivers that list
things there but not in the spi_device_ids will now not have those
modules load which is itself a regression.  Since it affects a narrower
time window and the particularly problematic spi-nor driver may be
critical to system boot on smaller systems this seems the best of a
series of bad options.  I will start an audit of SPI drivers to identify
and fix cases where things won't autoload using spi_device_id, this is
not great but seems to be the best way forward that anyone has been able
to identify.

Thanks to Russell for both his report and the additional diagnostic and
analysis work he has done here, the detailed research above was his
work.

Fixes: e09f2ab8eecc ("spi: update modalias_show after of_device_uevent_modalias support")
Fixes: 3ce6c9e2617e ("spi: add of_device_uevent_modalias support")
Reported-by: Russell King (Oracle) <linux@armlinux.org.uk>
Suggested-by: Russell King (Oracle) <linux@armlinux.org.uk>
Signed-off-by: Mark Brown <broonie@kernel.org>
Tested-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Cc: Andreas Schwab <schwab@suse.de>
Cc: Marco Felsch <m.felsch@pengutronix.de>
2021-09-21 20:22:48 +01:00
Namjae Jeon
9f6323311c ksmbd: add default data stream name in FILE_STREAM_INFORMATION
Windows client expect to get default stream name(::DATA) in
FILE_STREAM_INFORMATION response even if there is no stream data in file.
This patch fix update failure when writing ppt or doc files.

Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
Reviewed-By: Tom Talpey <tom@talpey.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-21 12:53:23 -05:00
Steve French
e44fd5081c ksmbd: log that server is experimental at module load
While we are working through detailed security reviews
of ksmbd server code we should remind users that it is an
experimental module by adding a warning when the module
loads.  Currently the module shows as experimental
in Kconfig and is disabled by default, but we don't want
to confuse users.

Although ksmbd passes a wide variety of the
important functional tests (since initial focus had
been largely on functional testing such as smbtorture,
xfstests etc.), and ksmbd has added key security
features (e.g. GCM256 encryption, Kerberos support),
there are ongoing detailed reviews of the code base
for path processing and network buffer decoding, and
this patch reminds users that the module should be
considered "experimental."

Reviewed-by: Namjae Jeon <linkinjeon@kernel.org>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-09-21 12:35:20 -05:00
Cristian Marussi
0e3dbf765f kselftest/arm64: signal: Skip tests if required features are missing
During initialization of a signal testcase, features declared as required
are properly checked against the running system but no action is then taken
to effectively skip such a testcase.

Fix core signals test logic to abort initialization and report such a
testcase as skipped to the KSelfTest framework.

Fixes: f96bf4340316 ("kselftest: arm64: mangle_pstate_invalid_compat_toggle and common utils")
Signed-off-by: Cristian Marussi <cristian.marussi@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20210920121228.35368-1-cristian.marussi@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-09-21 18:12:03 +01:00
Linus Torvalds
92477dd1fa s390 eBPF JIT miscompilation issues fixes.
These issues can be used by an unprivileged local user to circumvent the verifier and gain root privileges.
 
 v4.1+
 s390/bpf: Fix optimizing out zero-extensions
 s390/bpf: Fix 64-bit subtraction of the -0x80000000 constant
 v5.5+
 s390/bpf: Fix branch shortening during codegen pass
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAmFDDAoACgkQjYWKoQLX
 FBiK/ggAi9LxNSrxNI4u83PY/ubZ+13XIkB59BtvShHnzSSw/e0Sdbr7dnedljaF
 qWv9MUF9s1P88DRlXw/gHIckvIH77uvT28i5Omg6JfUnBLB0LaVC4QiwhmfbqBbV
 psew9zV1z0/2YFvagCsc/XdF4QdlyLm3GKO/bEAeFXG/jDigkzBVB3clffIv7zm7
 mD4uXgELil8PNv9tliRonUwoGG5P+0tF7KviTxi3aU7NLXtZxJ1Qt0N+HjRaDpwR
 kxvSaukn61yWafUWlgbmGL8HGqhFqCk4SQzHSAZ8m87dl6KjxDiQUjvccn5xQPF2
 M7JQU7INFApvK2QZFgW66jnTeqvUcQ==
 =Qmcn
 -----END PGP SIGNATURE-----

Merge tag 's390-5.15-ebpf-jit-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 eBPF fixes from Vasily Gorbik:
 "Johan Almbladh has implemented a number of new testcases for eBPF [1],
  which uncovered three miscompilation issues in the s390 eBPF JIT"

Link: https://lore.kernel.org/bpf/20210902185229.1840281-1-johan.almbladh@anyfinetworks.com/ [1]

* tag 's390-5.15-ebpf-jit-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/bpf: Fix optimizing out zero-extensions
  s390/bpf: Fix 64-bit subtraction of the -0x80000000 constant
  s390/bpf: Fix branch shortening during codegen pass
2021-09-21 09:36:11 -07:00
Ian Abbott
bb509a6ffe comedi: Fix memory leak in compat_insnlist()
`compat_insnlist()` handles the 32-bit version of the `COMEDI_INSNLIST`
ioctl (whenwhen `CONFIG_COMPAT` is enabled).  It allocates memory to
temporarily hold an array of `struct comedi_insn` converted from the
32-bit version in user space.  This memory is only being freed if there
is a fault while filling the array, otherwise it is leaked.

Add a call to `kfree()` to fix the leak.

Fixes: b8d47d881305 ("comedi: get rid of compat_alloc_user_space() mess in COMEDI_INSNLIST compat")
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-staging@lists.linux.dev
Cc: <stable@vger.kernel.org> # 5.13+
Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Link: https://lore.kernel.org/r/20210916145023.157479-1-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-21 17:53:54 +02:00
Dan Carpenter
708c87168b ceph: fix off by one bugs in unsafe_request_wait()
The "> max" tests should be ">= max" to prevent an out of bounds access
on the next lines.

Fixes: e1a4541ec0b9 ("ceph: flush the mdlog before waiting on unsafe reqs")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2021-09-21 17:39:20 +02:00
Geert Uytterhoeven
7af526c740 nvmem: NVMEM_NINTENDO_OTP should depend on WII
The Nintendo Wii and Wii U OTP is only present on Nintendo Wii and Wii U
consoles.  Hence add a dependency on WII, to prevent asking the user
about this driver when configuring a kernel without Nintendo Wii and Wii
U console support.

Fixes: 3683b761fe3a10ad ("nvmem: nintendo-otp: Add new driver for the Wii and Wii U OTP")
Reviewed-by: Emmanuel Gil Peyrot <linkmauve@linkmauve.fr>
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/01318920709dddc4d85fe895e2083ca0eee234d8.1631611652.git.geert+renesas@glider.be
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-21 17:38:37 +02:00
Linus Torvalds
d5f6545934 qnx4: work around gcc false positive warning bug
In commit b7213ffa0e58 ("qnx4: avoid stringop-overread errors") I tried
to teach gcc about how the directory entry structure can be two
different things depending on a status flag.  It made the code clearer,
and it seemed to make gcc happy.

However, Arnd points to a gcc bug, where despite using two different
members of a union, gcc then gets confused, and uses the size of one of
the members to decide if a string overrun happens.  And not necessarily
the rigth one.

End result: with some configurations, gcc-11 will still complain about
the source buffer size being overread:

  fs/qnx4/dir.c: In function 'qnx4_readdir':
  fs/qnx4/dir.c:76:32: error: 'strnlen' specified bound [16, 48] exceeds source size 1 [-Werror=stringop-overread]
     76 |                         size = strnlen(name, size);
        |                                ^~~~~~~~~~~~~~~~~~~
  fs/qnx4/dir.c:26:22: note: source object declared here
     26 |                 char de_name;
        |                      ^~~~~~~

because gcc will get confused about which union member entry is actually
getting accessed, even when the source code is very clear about it.  Gcc
internally will have combined two "redundant" pointers (pointing to
different union elements that are at the same offset), and takes the
size checking from one or the other - not necessarily the right one.

This is clearly a gcc bug, but we can work around it fairly easily.  The
biggest thing here is the big honking comment about why we do what we
do.

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99578#c6
Reported-and-tested-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-21 08:36:48 -07:00
Dan Carpenter
517c7bf99b usb: musb: tusb6010: uninitialized data in tusb_fifo_write_unaligned()
This is writing to the first 1 - 3 bytes of "val" and then writing all
four bytes to musb_writel().  The last byte is always going to be
garbage.  Zero out the last bytes instead.

Fixes: 550a7375fe72 ("USB: Add MUSB and TUSB support")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210916135737.GI25094@kili
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-21 16:38:21 +02:00
Ondrej Zary
b55d37ef6b usb-storage: Add quirk for ScanLogic SL11R-IDE older than 2.6c
ScanLogic SL11R-IDE with firmware older than 2.6c (the latest one) has
broken tag handling, preventing the device from working at all:
usb 1-1: new full-speed USB device number 2 using uhci_hcd
usb 1-1: New USB device found, idVendor=04ce, idProduct=0002, bcdDevice= 2.60
usb 1-1: New USB device strings: Mfr=1, Product=1, SerialNumber=0
usb 1-1: Product: USB Device
usb 1-1: Manufacturer: USB Device
usb-storage 1-1:1.0: USB Mass Storage device detected
scsi host2: usb-storage 1-1:1.0
usbcore: registered new interface driver usb-storage
usb 1-1: reset full-speed USB device number 2 using uhci_hcd
usb 1-1: reset full-speed USB device number 2 using uhci_hcd
usb 1-1: reset full-speed USB device number 2 using uhci_hcd
usb 1-1: reset full-speed USB device number 2 using uhci_hcd

Add US_FL_BULK_IGNORE_TAG to fix it. Also update my e-mail address.

2.6c is the only firmware that claims Linux compatibility.
The firmware can be upgraded using ezotgdbg utility:
https://github.com/asciilifeform/ezotgdbg

Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Ondrej Zary <linux@zary.sk>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210913210106.12717-1-linux@zary.sk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-21 16:37:52 +02:00
Julian Sikorski
ce1c42b4da Re-enable UAS for LaCie Rugged USB3-FW with fk quirk
Further testing has revealed that LaCie Rugged USB3-FW does work with
uas as long as US_FL_NO_REPORT_OPCODES and US_FL_NO_SAME are enabled.

Link: https://lore.kernel.org/linux-usb/2167ea48-e273-a336-a4e0-10a4e883e75e@redhat.com/
Cc: stable <stable@vger.kernel.org>
Suggested-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Julian Sikorski <belegdol+github@gmail.com>
Link: https://lore.kernel.org/r/20210913181454.7365-1-belegdol+github@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-21 16:37:26 +02:00
Johan Hovold
d9d1232b48 misc: bcm-vk: fix tty registration race
Make sure to set the tty class-device driver data before registering the
tty to avoid having a racing open() dereference a NULL pointer.

Fixes: 91ca10d6fa07 ("misc: bcm-vk: add ttyVK support")
Cc: stable@vger.kernel.org      # 5.12
Signed-off-by: Johan Hovold <johan@kernel.org>
Link: https://lore.kernel.org/r/20210917115736.5816-1-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-09-21 16:17:15 +02:00
Robin Murphy
59a68d4138 arm64: Mitigate MTE issues with str{n}cmp()
As with strlen(), the patches importing the updated str{n}cmp()
implementations were originally developed and tested before the
advent of CONFIG_KASAN_HW_TAGS, and have subsequently revealed
not to be MTE-safe. Since in-kernel MTE is still a rather niche
case, let it temporarily fall back to the generic C versions for
correctness until we can figure out the best fix.

Fixes: 758602c04409 ("arm64: Import latest version of Cortex Strings' strcmp")
Fixes: 020b199bc70d ("arm64: Import latest version of Cortex Strings' strncmp")
Cc: <stable@vger.kernel.org> # 5.14.x
Reported-by: Branislav Rankov <branislav.rankov@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/34dc4d12eec0adae49b0ac927df642ed10089d40.1631890770.git.robin.murphy@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-09-21 14:50:19 +01:00
Tobias Jakobi
6f6aab1caf platform/x86: gigabyte-wmi: add support for B550I Aorus Pro AX
Tested with a AMD Ryzen 7 5800X.

Signed-off-by: Tobias Jakobi <tjakobi@math.uni-bielefeld.de>
Acked-by: Thomas Weißschuh <thomas@weissschuh.net>
Link: https://lore.kernel.org/r/20210921100702.3838-1-tjakobi@math.uni-bielefeld.de
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-09-21 15:49:23 +02:00
José Expósito
b201cb0ebe platform/x86/intel: hid: Add DMI switches allow list
Some devices, even non convertible ones, can send incorrect
SW_TABLET_MODE reports.

Add an allow list and accept such reports only from devices in it.

Bug reported for Dell XPS 17 9710 on:
https://gitlab.freedesktop.org/libinput/libinput/-/issues/662

Reported-by: Tobias Gurtzick <magic@wizardtales.com>
Suggested-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Tobias Gurtzick <magic@wizardtales.com>
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Link: https://lore.kernel.org/r/20210920160312.9787-1-jose.exposito89@gmail.com
[hdegoede@redhat.com: Check dmi_switches_auto_add_allow_list only once]
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-09-21 15:49:16 +02:00
Randy Dunlap
5b72dafaca platform/x86: dell: fix DELL_WMI_PRIVACY dependencies & build error
When DELL_WMI=y, DELL_WMI_PRIVACY=y, and LEDS_TRIGGER_AUDIO=m, there
is a linker error since the LEDS trigger code is built as a loadable
module. This happens because DELL_WMI_PRIVACY is a bool that depends
on a tristate (LEDS_TRIGGER_AUDIO=m), which can be dangerous.

ld: drivers/platform/x86/dell/dell-wmi-privacy.o: in function `dell_privacy_wmi_probe':
dell-wmi-privacy.c:(.text+0x3df): undefined reference to `ledtrig_audio_get'

Fixes: 8af9fa37b8a3 ("platform/x86: dell-privacy: Add support for Dell hardware privacy")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Perry Yuan <Perry.Yuan@dell.com>
Cc: Dell.Client.Kernel@dell.com
Cc: platform-driver-x86@vger.kernel.org
Cc: Hans de Goede <hdegoede@redhat.com>
Cc: Mark Gross <mgross@linux.intel.com>
Link: https://lore.kernel.org/r/20210918044829.19222-1-rdunlap@infradead.org
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2021-09-21 15:49:09 +02:00
Ansuel Smith
cf96921876 thermal/drivers/tsens: Fix wrong check for tzd in irq handlers
Some devices can have some thermal sensors disabled from the
factory. The current two irq handler functions check all the sensor by
default and the check if the sensor was actually registered is
wrong. The tzd is actually never set if the registration fails hence
the IS_ERR check is wrong.

Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com>
Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210907212543.20220-1-ansuelsmth@gmail.com
2021-09-21 15:17:11 +02:00
Dan Carpenter
1bb30b20b4 thermal/core: Potential buffer overflow in thermal_build_list_of_policies()
After printing the list of thermal governors, then this function prints
a newline character.  The problem is that "size" has not been updated
after printing the last governor.  This means that it can write one
character (the NUL terminator) beyond the end of the buffer.

Get rid of the "size" variable and just use "PAGE_SIZE - count" directly.

Fixes: 1b4f48494eb2 ("thermal: core: group functions related to governor handling")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20210916131342.GB25094@kili
2021-09-21 15:17:11 +02:00
David S. Miller
b3f98404bd Merge branch 'dsa-devres'
Vladimir Oltean says:

====================
Fix mdiobus users with devres

Commit ac3a68d56651 ("net: phy: don't abuse devres in
devm_mdiobus_register()") by Bartosz Golaszewski has introduced two
classes of potential bugs by making the devres callback of
devm_mdiobus_alloc stop calling mdiobus_unregister.

The exact buggy circumstances are presented in the individual commit
messages. I have searched the tree for other occurrences, but at the
moment:

- for issue (a) I have no concrete proof that other buses except SPI and
  I2C suffer from it, and the only SPI or I2C device drivers that call
  of_mdiobus_alloc are the DSA drivers that leave a NULL
  ds->slave_mii_bus and a non-NULL ds->ops->phy_read, aka ksz9477,
  ksz8795, lan9303_i2c, vsc73xx-spi.

- for issue (b), all drivers which call of_mdiobus_alloc either use
  of_mdiobus_register too, or call mdiobus_unregister sometime within
  the ->remove path.

Although at this point I've seen enough strangeness caused by this
"device_del during ->shutdown" that I'm just going to copy the SPI and
I2C subsystem maintainers to this patch series, to get their feedback
whether they've had reports about things like this before. I don't think
other buses behave in this way, it forces SPI and I2C devices to have to
protect themselves from a really strange set of issues.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-21 13:52:16 +01:00
Vladimir Oltean
74b6d7d133 net: dsa: realtek: register the MDIO bus under devres
The Linux device model permits both the ->shutdown and ->remove driver
methods to get called during a shutdown procedure. Example: a DSA switch
which sits on an SPI bus, and the SPI bus driver calls this on its
->shutdown method:

spi_unregister_controller
-> device_for_each_child(&ctlr->dev, NULL, __unregister);
   -> spi_unregister_device(to_spi_device(dev));
      -> device_del(&spi->dev);

So this is a simple pattern which can theoretically appear on any bus,
although the only other buses on which I've been able to find it are
I2C:

i2c_del_adapter
-> device_for_each_child(&adap->dev, NULL, __unregister_client);
   -> i2c_unregister_device(client);
      -> device_unregister(&client->dev);

The implication of this pattern is that devices on these buses can be
unregistered after having been shut down. The drivers for these devices
might choose to return early either from ->remove or ->shutdown if the
other callback has already run once, and they might choose that the
->shutdown method should only perform a subset of the teardown done by
->remove (to avoid unnecessary delays when rebooting).

So in other words, the device driver may choose on ->remove to not
do anything (therefore to not unregister an MDIO bus it has registered
on ->probe), because this ->remove is actually triggered by the
device_shutdown path, and its ->shutdown method has already run and done
the minimally required cleanup.

This used to be fine until the blamed commit, but now, the following
BUG_ON triggers:

void mdiobus_free(struct mii_bus *bus)
{
	/* For compatibility with error handling in drivers. */
	if (bus->state == MDIOBUS_ALLOCATED) {
		kfree(bus);
		return;
	}

	BUG_ON(bus->state != MDIOBUS_UNREGISTERED);
	bus->state = MDIOBUS_RELEASED;

	put_device(&bus->dev);
}

In other words, there is an attempt to free an MDIO bus which was not
unregistered. The attempt to free it comes from the devres release
callbacks of the SPI device, which are executed after the device is
unregistered.

I'm not saying that the fact that MDIO buses allocated using devres
would automatically get unregistered wasn't strange. I'm just saying
that the commit didn't care about auditing existing call paths in the
kernel, and now, the following code sequences are potentially buggy:

(a) devm_mdiobus_alloc followed by plain mdiobus_register, for a device
    located on a bus that unregisters its children on shutdown. After
    the blamed patch, either both the alloc and the register should use
    devres, or none should.

(b) devm_mdiobus_alloc followed by plain mdiobus_register, and then no
    mdiobus_unregister at all in the remove path. After the blamed
    patch, nobody unregisters the MDIO bus anymore, so this is even more
    buggy than the previous case which needs a specific bus
    configuration to be seen, this one is an unconditional bug.

In this case, the Realtek drivers fall under category (b). To solve it,
we can register the MDIO bus under devres too, which restores the
previous behavior.

Fixes: ac3a68d56651 ("net: phy: don't abuse devres in devm_mdiobus_register()")
Reported-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Reported-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-21 13:52:16 +01:00