commit fb312ac5ccb007e843f982b38d4d6886ba4b32f2 upstream.
I hit this crash several times while debugging a PCIe controller. It seems to
happen when the PCIe kernel code starts link retraining (as part of the ASPM
code) while, at the same time, the PCIe link goes down and ath9k probably
executes its hw reset procedure.
Currently I'm not able to reproduce this issue, as it looks like a race
condition between link training, ASPM, link down and the reset path. And as
always, race conditions which depend on several input parameters are hard
to reproduce because they depend on precise timing.
But it is clear that the pointers are zero in this case and should be
properly filled, as the same code pattern is used in the ath9k_stop() function.
Anyway, I was able to reproduce this crash by manually triggering the ath
reset worker prior to putting the card up. I created a simple patch to export
the reset functionality via debugfs and used it to "simulate" triggering a
reset. This proved that the NULL-pointer dereference issue is there.
Function ath9k_hw_reset() dereferences the chan structure pointer, so it
needs to be a non-NULL pointer.
Function ath9k_stop() already contains code which sets ah->curchan to a valid
non-NULL pointer prior to calling the ath9k_hw_reset() function.
Add the same code pattern to the ath_reset_internal() function to prevent a
kernel NULL pointer dereference in the ath9k_hw_reset() function.
This change fixes kernel NULL pointer dereference in ath9k_hw_reset() which
is caused by calling ath9k_hw_reset() from ath_reset_internal() with NULL
chan structure.
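The guard described above can be modelled in plain userspace C. This is only a minimal sketch, not the driver code: `struct hw`, `struct channel`, `hw_reset()` and `reset_internal()` are hypothetical stand-ins for the ath9k structures and functions, and using the first channel as the fallback is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's channel and hardware state. */
struct channel {
    int center_freq;
};

struct hw {
    struct channel channels[2];
    struct channel *curchan;    /* may still be NULL before the card is up */
};

/* Models a reset routine that dereferences its channel argument. */
static int hw_reset(struct hw *ah, struct channel *chan)
{
    assert(chan != NULL);       /* the kernel would oops here instead */
    ah->curchan = chan;
    return chan->center_freq;
}

/* Models the guarded reset path: make sure a valid channel is set first. */
static int reset_internal(struct hw *ah, struct channel *hchan)
{
    if (!ah->curchan)
        ah->curchan = &ah->channels[0];   /* assumed default channel */
    if (!hchan)
        hchan = ah->curchan;
    return hw_reset(ah, hchan);
}
```

Calling `reset_internal()` with a NULL channel on a freshly zeroed `struct hw` then resets to the fallback channel instead of dereferencing NULL.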
[ 45.334305] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
[ 45.344417] Mem abort info:
[ 45.347301] ESR = 0x96000005
[ 45.350448] EC = 0x25: DABT (current EL), IL = 32 bits
[ 45.356166] SET = 0, FnV = 0
[ 45.359350] EA = 0, S1PTW = 0
[ 45.362596] Data abort info:
[ 45.365756] ISV = 0, ISS = 0x00000005
[ 45.369735] CM = 0, WnR = 0
[ 45.372814] user pgtable: 4k pages, 39-bit VAs, pgdp=000000000685d000
[ 45.379663] [0000000000000008] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
[ 45.388856] Internal error: Oops: 96000005 [#1] SMP
[ 45.393897] Modules linked in: ath9k ath9k_common ath9k_hw
[ 45.399574] CPU: 1 PID: 309 Comm: kworker/u4:2 Not tainted 5.12.0-rc2-dirty #785
[ 45.414746] Workqueue: phy0 ath_reset_work [ath9k]
[ 45.419713] pstate: 40000005 (nZcv daif -PAN -UAO -TCO BTYPE=--)
[ 45.425910] pc : ath9k_hw_reset+0xc4/0x1c48 [ath9k_hw]
[ 45.431234] lr : ath9k_hw_reset+0xc0/0x1c48 [ath9k_hw]
[ 45.436548] sp : ffffffc0118dbca0
[ 45.439961] x29: ffffffc0118dbca0 x28: 0000000000000000
[ 45.445442] x27: ffffff800dee4080 x26: 0000000000000000
[ 45.450923] x25: ffffff800df9b9d8 x24: 0000000000000000
[ 45.456404] x23: ffffffc0115f6000 x22: ffffffc008d0d408
[ 45.461885] x21: ffffff800dee5080 x20: ffffff800df9b9d8
[ 45.467366] x19: 0000000000000000 x18: 0000000000000000
[ 45.472846] x17: 0000000000000000 x16: 0000000000000000
[ 45.478326] x15: 0000000000000010 x14: ffffffffffffffff
[ 45.483807] x13: ffffffc0918db94f x12: ffffffc011498720
[ 45.489289] x11: 0000000000000003 x10: ffffffc0114806e0
[ 45.494770] x9 : ffffffc01014b2ec x8 : 0000000000017fe8
[ 45.500251] x7 : c0000000ffffefff x6 : 0000000000000001
[ 45.505733] x5 : 0000000000000000 x4 : 0000000000000000
[ 45.511213] x3 : 0000000000000000 x2 : ffffff801fece870
[ 45.516693] x1 : ffffffc00eded000 x0 : 000000000000003f
[ 45.522174] Call trace:
[ 45.524695] ath9k_hw_reset+0xc4/0x1c48 [ath9k_hw]
[ 45.529653] ath_reset_internal+0x1a8/0x2b8 [ath9k]
[ 45.534696] ath_reset_work+0x2c/0x40 [ath9k]
[ 45.539198] process_one_work+0x210/0x480
[ 45.543339] worker_thread+0x5c/0x510
[ 45.547115] kthread+0x12c/0x130
[ 45.550445] ret_from_fork+0x10/0x1c
[ 45.554138] Code: 910922c2 9117e021 95ff0398 b4000294 (b9400a61)
[ 45.560430] ---[ end trace 566410ba90b50e8b ]---
[ 45.565193] Kernel panic - not syncing: Oops: Fatal exception in interrupt
[ 45.572282] SMP: stopping secondary CPUs
[ 45.576331] Kernel Offset: disabled
[ 45.579924] CPU features: 0x00040002,0000200c
[ 45.584416] Memory Limit: none
[ 45.587564] Rebooting in 3 seconds..
Signed-off-by: Pali Rohár <pali@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210402122653.24014-1-pali@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 11b1d881a90fc184cc7d06e9804eb288c24a2a0d upstream.
The GLOBETROTTER.cis entry in serial_cs matches more devices than
intended and breaks them. Remove it.
Example: # pccardctl info
PRODID_1="Option International
"
PRODID_2="GSM-Ready 56K/ISDN
"
PRODID_3="021
"
PRODID_4="A
"
MANFID=0013,0000
FUNCID=0
result:
pcmcia 0.0: Direct firmware load for cis/GLOBETROTTER.cis failed with error -2
The GLOBETROTTER.cis is nowhere to be found. There's GLOBETROTTER.cis.ihex at
https://netdev.vger.kernel.narkive.com/h4inqdxM/patch-axnet-cs-fix-phy-id-detection-for-bogus-asix-chip#post41
It's from a completely different card:
vers_1 4.1, "Option International", "GSM/GPRS GlobeTrotter", "001", "A"
Signed-off-by: Ondrej Zary <linux@zary.sk>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210611201940.23898-1-linux@zary.sk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 08a84410a04f05c7c1b8e833f552416d8eb9f6fe upstream.
Stop the dmaengine transfer in sci_stop_tx(). Otherwise, the following
message may be output when the system enters suspend while data is being
transferred, because clearing the TIE bit in SCSCR cannot stop a
dmaengine transfer.
sh-sci e6550000.serial: ttySC1: Unable to drain transmitter
Note that this driver already uses some #ifdef in the .c file, so this
patch also uses #ifdef to fix the issue. Otherwise, build errors happen
if CONFIG_SERIAL_SH_SCI_DMA is disabled.
Fixes: 73a19e4c0301 ("serial: sh-sci: Add DMA support.")
Cc: <stable@vger.kernel.org> # v4.9+
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Link: https://lore.kernel.org/r/20210610110806.277932-1-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 421a26f3d7a7c3ca43f3a9dc0f3cb0f562d5bd95 upstream.
The ltr559 chip uses only the lowest bit of the ALS_CONTR register to
configure between active and stand-by mode. In the original driver
BIT(1) is used, which does a software reset instead.
This patch fixes the problem by using BIT(0) as als_mode_active for
the ltr559 chip.
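As a small illustration of the bit mix-up, the sketch below encodes only what the text states: bit 0 selects active mode and bit 1 triggers a software reset. The macro and function names are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Layout as described in the text above; names are illustrative only. */
#define ALS_MODE_ACTIVE BIT(0)   /* active vs. stand-by */
#define ALS_SW_RESET    BIT(1)   /* software reset */

static_assert((ALS_MODE_ACTIVE & ALS_SW_RESET) == 0,
              "mode and reset bits must not overlap");

/* Returns the register value with active mode enabled. */
static uint8_t als_contr_set_active(uint8_t reg)
{
    return (uint8_t)(reg | ALS_MODE_ACTIVE);
}

/* Returns nonzero if writing this value would request a software reset. */
static int als_contr_requests_reset(uint8_t reg)
{
    return (reg & ALS_SW_RESET) != 0;
}
```

With this layout, writing `BIT(1)` to enter active mode would request a reset, which matches the misbehaviour the commit describes.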
Fixes: 8592a7eefa54 ("iio: ltr501: Add support for ltr559 chip")
Signed-off-by: Oliver Lang <Oliver.Lang@gossenmetrawatt.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Tested-by: Nikita Travkin <nikita@trvn.ru> # ltr559
Link: https://lore.kernel.org/r/20210610134619.2101372-3-mkl@pengutronix.de
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2ac0b029a04b673ce83b5089368f467c5dca720c upstream.
The regmap is configured for 8 bit registers, uses a RB-Tree cache and
marks several registers as volatile (i.e. do not cache).
The ALS and PS data registers in the chip are 16 bits wide and span
two regmap registers. In the current driver only the base register is
marked as volatile, so the upper register is read only once.
Further, the data sheet notes:
| When the I2C read operation starts, all four ALS data registers are
| locked until the I2C read operation of register 0x8B is completed.
As a result, the registers never update after the second read.
This patch fixes the problem by marking the upper 8 bits of the ALS
and PS registers as volatile, too.
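A volatile-register predicate covering both bytes of each 16-bit value could look roughly like the sketch below. It is a standalone model, not the driver's regmap callback: only register 0x8B is quoted in the text above, so the other addresses and all of the macro names are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative addresses: a base register plus the byte that follows it. */
#define ALS_DATA1        0x88u
#define ALS_DATA1_UPPER  0x89u
#define ALS_DATA0        0x8au
#define ALS_DATA0_UPPER  0x8bu
#define PS_DATA          0x8du
#define PS_DATA_UPPER    0x8eu

static_assert(ALS_DATA0_UPPER == 0x8bu,
              "last ALS data byte is the register quoted from the data sheet");

/* Returns true for registers that must never be served from the cache. */
static bool is_volatile_reg(unsigned int reg)
{
    switch (reg) {
    case ALS_DATA1:
    case ALS_DATA1_UPPER:
    case ALS_DATA0:
    case ALS_DATA0_UPPER:
    case PS_DATA:
    case PS_DATA_UPPER:
        return true;
    default:
        return false;
    }
}
```

Listing the upper bytes explicitly means every read of a 16-bit value reaches the hardware for both halves, including the read of 0x8B that releases the lock.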
Fixes: 2f2c96338afc ("iio: ltr501: Add regmap support.")
Reported-by: Oliver Lang <Oliver.Lang@gossenmetrawatt.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Tested-by: Nikita Travkin <nikita@trvn.ru> # ltr559
Link: https://lore.kernel.org/r/20210610134619.2101372-2-mkl@pengutronix.de
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7cd04c863f9e1655d607705455e7714f24451984 upstream.
Allocating an IRQ is conditional on the IRQ's existence, but freeing it
was not. If no IRQ was allocated, the driver would still try to free
IRQ 0. Add the missing checks.
This fixes the following trace when the driver is removed:
[ 100.667788] Trying to free already-free IRQ 0
[ 100.667793] WARNING: CPU: 0 PID: 2315 at kernel/irq/manage.c:1826 free_irq+0x1fd/0x370
...
[ 100.667914] Call Trace:
[ 100.667920] tcs3472_remove+0x3a/0x90 [tcs3472]
[ 100.667927] i2c_device_remove+0x2b/0xa0
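The symmetry between the request and free paths can be sketched in userspace C as follows. `struct sensor`, `sensor_probe()` and `sensor_remove()` are hypothetical names, and the boolean flag merely stands in for the real IRQ request and free calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device whose interrupt line is optional. */
struct sensor {
    int irq;             /* 0 means "no IRQ", as in the description above */
    bool irq_requested;
};

static void sensor_probe(struct sensor *s, int irq)
{
    assert(s != NULL);
    s->irq = irq;
    s->irq_requested = false;
    if (s->irq)
        s->irq_requested = true;   /* stands in for requesting the IRQ */
}

/* Returns true only if an IRQ was actually released. */
static bool sensor_remove(struct sensor *s)
{
    assert(s != NULL);
    if (!s->irq)
        return false;              /* nothing was requested, nothing to free */
    s->irq_requested = false;      /* stands in for freeing the IRQ */
    return true;
}
```

The remove path mirrors the probe path's condition, so a device without an interrupt line never reaches the free call.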
Signed-off-by: frank zago <frank@zago.net>
Link: https://lore.kernel.org/r/20210427022017.19314-2-frank@zago.net
Fixes: 9d2f715d592e ("iio: light: tcs3472: support out-of-threshold events")
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c749d8c018daf5fba6dfac7b6c5c78b27efd7d65 upstream.
Currently css_wait_for_slow_path() gets called while holding chp->lock.
The path-verification loop of the slow path inside this lock could lead to
a deadlock, as reported by the lockdep validator.
During a device-set-online operation, ccw_device_get_chp_desc() tries to
acquire the same 'chp->lock' to read chp->desc. This function can be
called in multiple scenarios, such as probing or setting a device online
manually. In some corner cases this could lead to the deadlock.
lockdep validator reported this as:
        CPU0                    CPU1
        ----                    ----
   lock(&chp->lock);
                                lock(kn->active#43);
                                lock(&chp->lock);
   lock((wq_completion)cio);
The chp->lock was introduced to serialize the access of struct
channel_path. This lock is not needed for the css_wait_for_slow_path()
function, so invoke the slow-path function outside this lock.
Fixes: b730f3a93395 ("[S390] cio: add lock to struct channel_path")
Cc: <stable@vger.kernel.org>
Reviewed-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Vineeth Vijayan <vneethv@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5483b904bf336948826594610af4c9bbb0d9e3aa upstream.
When picking a task from the wait queue to wake up, a non-privileged task
may be chosen rather than a privileged one. This may lead to a deadlock
similar to the one in commit dfe1fe75e00e ("NFSv4: Fix deadlock between nfs4_evict_inode()
and nfs4_opendata_get_inode()"):
A privileged delegreturn task is queued on the privileged list because all
the slots are assigned. If there are not enough slots to wake up the
non-privileged batch tasks (a session with fewer than 8 slots), then the
privileged delegreturn task may never be woken up, because the chosen task
can't get a slot while the session is draining.
So we should treat the privileged task as an emergency task and
execute it as soon as we can.
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 5fcdfacc01f3 ("NFSv4: Return delegations synchronously in evict_inode")
Cc: stable@vger.kernel.org
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fcb170a9d825d7db4a3fb870b0300f5a40a8d096 upstream.
The 'queue->nr' counter will wrap around from 0 to 255 when only the current
priority queue has tasks. This may lead to a deadlock similar to the one in commit
dfe1fe75e00e ("NFSv4: Fix deadlock between nfs4_evict_inode()
and nfs4_opendata_get_inode()"):
A privileged delegreturn task is queued on the privileged list because all
the slots are assigned. When a non-privileged task completes and releases
its slot, a non-privileged task may be picked. It may then fail to allocate
a slot while the session is draining.
If 'queue->nr' has wrapped around to 255 and there are not enough slots to
service it, then the privileged delegreturn task will never be woken up.
So we should avoid the wraparound on 'queue->nr'.
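The wraparound itself is easy to demonstrate with an 8-bit unsigned counter. The sketch below is a standalone illustration: the function names are hypothetical, and the guarded variant only shows one way to avoid the wrap, which is not necessarily how the patch does it.

```c
#include <assert.h>
#include <limits.h>

static_assert(UCHAR_MAX == 255, "sketch assumes an 8-bit unsigned char");

/* Unconditional decrement: a counter at 0 wraps around to 255. */
static unsigned char dec_unchecked(unsigned char nr)
{
    return (unsigned char)(nr - 1);
}

/* Guarded decrement: a counter at 0 stays at 0. */
static unsigned char dec_guarded(unsigned char nr)
{
    return nr ? (unsigned char)(nr - 1) : 0;
}
```

A counter that silently jumps to 255 looks like a large remaining budget, which is how the wrong queue can keep being serviced.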
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 5fcdfacc01f3 ("NFSv4: Return delegations synchronously in evict_inode")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b17233d385d0b6b43ecf81d43008cb1bbb008166 upstream.
Rather than just indicating that transmission can start, this patch
requires the explicit flushing of the network TX queue when the driver
is informed by the device that it can transmit, next to its
configuration.
In this way, if frames have already been written by the application,
they will actually be transmitted.
Fixes: ffd137f7043c ("can: peak/pcie_fd: remove useless code when interface starts")
Link: https://lore.kernel.org/r/20210623142600.149904-1-s.grosjean@peak-system.com
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit fb8696ab14adadb2e3f6c17c18ed26b3ecd96691 upstream.
can_can_gw_rcv() is called under RCU protection, so after calling
can_rx_unregister(), we have to call synchronize_rcu in order to wait
for any RCU read-side critical sections to finish before removing the
kmem_cache entry with the referenced gw job entry.
Link: https://lore.kernel.org/r/20210618173645.2238-1-socketcan@hartkopp.net
Fixes: c1aabdf379bc ("can-gw: add netlink based CAN routing")
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d5f9023fa61ee8b94f37a93f08e94b136cf1e463 upstream.
can_rx_register() callbacks may be called concurrently to the call to
can_rx_unregister(). The callbacks and callback data, though, are
protected by RCU and the struct sock reference count.
So the callback data is really attached to the life of sk, meaning
that it should be released on sk_destruct. However, bcm_remove_op()
calls tasklet_kill(), and RCU callbacks may be called under RCU
softirq, so that cannot be used on kernels before the introduction of
HRTIMER_MODE_SOFT.
However, bcm_rx_handler() is called under RCU protection, so after
calling can_rx_unregister(), we may call synchronize_rcu() in order to
wait for any RCU read-side critical sections to finish. That is,
bcm_rx_handler() won't be called anymore for those ops. So, we only
free them, after we do that synchronize_rcu().
Fixes: ffd980f976e7 ("[CAN]: Add broadcast manager (bcm) protocol")
Link: https://lore.kernel.org/r/20210619161813.2098382-1-cascardo@canonical.com
Cc: linux-stable <stable@vger.kernel.org>
Reported-by: syzbot+0f7e7e5e2f4f40fa89c0@syzkaller.appspotmail.com
Reported-by: Norbert Slusarek <nslusarek@gmx.net>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cd84bbbac12a173a381a64c6ec8b76a5277b87b5 upstream.
Commit 5d1b1b3f492f ("ext4: fix BUG when calling ext4_error with locked
block group") introduces ext4_grp_locked_error to handle unlocking a
group in error cases. Otherwise, there is a possibility of a sleep while
atomic. However, since 43c73221b3b1 ("ext4: replace BUG_ON with WARN_ON
in mb_find_extent()"), mb_find_extent() has contained an ext4_error()
call while a group spinlock is held. Replace this with
ext4_grp_locked_error.
Fixes: 43c73221b3b1 ("ext4: replace BUG_ON with WARN_ON in mb_find_extent()")
Cc: <stable@vger.kernel.org> # 4.14+
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Link: https://lore.kernel.org/r/20210623232114.34457-1-stephen.s.brennan@oracle.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c89849cc0259f3d33624cc3bd127685c3c0fa25d upstream.
The avefreec should be the average number of free clusters instead
of the average number of free blocks; otherwise Orlov's allocator
will not work properly when bigalloc is enabled.
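The unit mismatch can be shown with a little arithmetic. This is a sketch under the assumption that a cluster is a power-of-two number of blocks (`cluster_bits` is a hypothetical parameter name); it is not the ext4 code.

```c
#include <assert.h>
#include <stdint.h>

/* Converts a block count to clusters, assuming 2^cluster_bits blocks each. */
static uint64_t blocks_to_clusters(uint64_t blocks, unsigned int cluster_bits)
{
    assert(cluster_bits < 64);
    return blocks >> cluster_bits;
}

/* Average of a total over a number of groups; 0 if there are no groups. */
static uint64_t average_per_group(uint64_t total, uint64_t ngroups)
{
    return ngroups ? total / ngroups : 0;
}
```

For example, with 1048576 free blocks, 16 blocks per cluster and 8 groups, the per-group average is 8192 clusters but 131072 blocks. Comparing a per-group cluster count against the block-based average would make every group look far emptier than average.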
Cc: stable@kernel.org
Signed-off-by: Pan Dong <pandong.peter@bytedance.com>
Link: https://lore.kernel.org/r/20210525073656.31594-1-pandong.peter@bytedance.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e5e7010e5444d923e4091cafff61d05f2d19cada upstream.
After converting fs shrinkers to the new scan/count API, we no longer
pass a zero nr_to_scan parameter to detect the number of objects to free,
so just remove this check.
Fixes: 1ab6c4997e04 ("fs: convert fs shrinkers to new scan/count API")
Cc: stable@vger.kernel.org # 3.12+
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20210522103045.690103-2-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4fb7c70a889ead2e91e184895ac6e5354b759135 upstream.
The cache_cnt parameter of the ext4_es_shrink_exit tracepoint means the
remaining cache count after shrinking, but currently it is the cache count
before shrinking. Fix it by reading sbi->s_extent_cache_cnt again.
Fixes: 1ab6c4997e04 ("fs: convert fs shrinkers to new scan/count API")
Cc: stable@vger.kernel.org # 3.12+
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20210522103045.690103-3-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8f6840c4fd1e7bd715e403074fb161c1a04cda73 upstream.
After commit c89128a00838 ("ext4: handle errors on
ext4_commit_super"), 'ret' may be set to 0 before calling
ext4_fill_flex_info(). If ext4_fill_flex_info() then fails, ext4_mount()
doesn't return an error code, which leaves 'root' NULL and causes a crash
in legacy_get_tree().
Fixes: c89128a00838 ("ext4: handle errors on ext4_commit_super")
Reported-by: Hulk Robot <hulkci@huawei.com>
Cc: <stable@vger.kernel.org> # v4.18+
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Link: https://lore.kernel.org/r/20210510111051.55650-1-yangyingliang@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b9a037b7f3c401d3c63e0423e56aef606b1ffaaf upstream.
In ext4_orphan_cleanup(), if ext4_truncate() failed to get a transaction
handle, it didn't remove the inode from the in-core orphan list, which
may trigger the error dump below in ext4_destroy_inode() during the
final iput() and could lead to memory corruption on later orphan
list changes.
EXT4-fs (sda): Inode 6291467 (00000000b8247c67): orphan list check failed!
00000000b8247c67: 0001f30a 00000004 00000000 00000023 ............#...
00000000e24cde71: 00000006 014082a3 00000000 00000000 ......@.........
0000000072c6a5ee: 00000000 00000000 00000000 00000000 ................
...
This patch fixes this by cleaning up the in-core orphan list manually if
ext4_truncate() returns an error.
Cc: stable@kernel.org
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20210507071904.160808-1-yi.zhang@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 6819703f5a365c95488b07066a8744841bf14231 upstream.
The defrag loop processes leaves in batches, starting a transaction for
each. The whole defragmentation on a given root is protected by a bit,
but in case the transaction fails, the bit is not cleared.
If the transaction fails, the bit would prevent starting
defragmentation again, so make sure it's cleared.
CC: stable@vger.kernel.org # 4.4+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d8ac76cdd1755b21e8c008c28d0b7251c0b14986 upstream.
During an incremental send operation, when processing the new references
for the current inode, we might send an unlink operation for another inode
that has a conflicting path and has more than one hard link. However this
path was computed and cached before we processed previous new references
for the current inode. We may have orphanized a directory of that path
while processing a previous new reference, in which case the path will
be invalid and cause the receiver process to fail.
The following reproducer triggers the problem and explains how/why it
happens in its comments:
$ cat test-send-unlink.sh
#!/bin/bash
DEV=/dev/sdi
MNT=/mnt/sdi
mkfs.btrfs -f $DEV >/dev/null
mount $DEV $MNT
# Create our test files and directory. Inode 259 (file3) has two hard
# links.
touch $MNT/file1
touch $MNT/file2
touch $MNT/file3
mkdir $MNT/A
ln $MNT/file3 $MNT/A/hard_link
# Filesystem looks like:
#
# .                                     (ino 256)
# |----- file1                          (ino 257)
# |----- file2                          (ino 258)
# |----- file3                          (ino 259)
# |----- A/                             (ino 260)
#        |---- hard_link                (ino 259)
#
# Now create the base snapshot, which is going to be the parent snapshot
# for a later incremental send.
btrfs subvolume snapshot -r $MNT $MNT/snap1
btrfs send -f /tmp/snap1.send $MNT/snap1
# Move inode 257 into directory inode 260. This results in computing the
# path for inode 260 as "/A" and caching it.
mv $MNT/file1 $MNT/A/file1
# Move inode 258 (file2) into directory inode 260, with a name of
# "hard_link", moving first inode 259 away since it currently has that
# location and name.
mv $MNT/A/hard_link $MNT/tmp
mv $MNT/file2 $MNT/A/hard_link
# Now rename inode 260 to something else (B for example) and then create
# a hard link for inode 258 that has the old name and location of inode
# 260 ("/A").
mv $MNT/A $MNT/B
ln $MNT/B/hard_link $MNT/A
# Filesystem now looks like:
#
# .                                     (ino 256)
# |----- tmp                            (ino 259)
# |----- file3                          (ino 259)
# |----- B/                             (ino 260)
# |      |---- file1                    (ino 257)
# |      |---- hard_link                (ino 258)
# |
# |----- A                              (ino 258)
# Create another snapshot of our subvolume and use it for an incremental
# send.
btrfs subvolume snapshot -r $MNT $MNT/snap2
btrfs send -f /tmp/snap2.send -p $MNT/snap1 $MNT/snap2
# Now unmount the filesystem, create a new one, mount it and try to
# apply both send streams to recreate both snapshots.
umount $DEV
mkfs.btrfs -f $DEV >/dev/null
mount $DEV $MNT
# First add the first snapshot to the new filesystem by applying the
# first send stream.
btrfs receive -f /tmp/snap1.send $MNT
# The incremental receive operation below used to fail with the
# following error:
#
# ERROR: unlink A/hard_link failed: No such file or directory
#
# This is because when send is processing inode 257, it generates the
# path for inode 260 as "/A", since that inode is its parent in the send
# snapshot, and caches that path.
#
# Later when processing inode 258, it first processes its new reference
# that has the path of "/A", which results in orphanizing inode 260
# because there is a path collision. This results in issuing a rename
# operation from "/A" to "/o260-6-0".
#
# Finally when processing the new reference "B/hard_link" for inode 258,
# it notices that it collides with inode 259 (not yet processed, because
# it has a higher inode number), since that inode has the name
# "hard_link" under the directory inode 260. It also checks that inode
# 259 has two hardlinks, so it decides to issue an unlink operation for
# the name "hard_link" for inode 259. However the path passed to the
# unlink operation is "/A/hard_link", which is incorrect since currently
# "/A" does not exist, due to the orphanization of inode 260 mentioned
# before. The path is incorrect because it was computed and cached
# before the orphanization. This causes the receiver to fail with
# the above error.
btrfs receive -f /tmp/snap2.send $MNT
umount $MNT
When running the test, it fails like this:
$ ./test-send-unlink.sh
Create a readonly snapshot of '/mnt/sdi' in '/mnt/sdi/snap1'
At subvol /mnt/sdi/snap1
Create a readonly snapshot of '/mnt/sdi' in '/mnt/sdi/snap2'
At subvol /mnt/sdi/snap2
At subvol snap1
At snapshot snap2
ERROR: unlink A/hard_link failed: No such file or directory
Fix this by recomputing a path before issuing an unlink operation when
processing the new references for the current inode if we previously
have orphanized a directory.
A test case for fstests will follow soon.
CC: stable@vger.kernel.org # 4.4+
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f8f84af5da9ee04ef1d271528656dac42a090d00 upstream.
Even though we validate user-provided inputs we then traverse past
validated data when applying the new map. The issue was originally
discovered by Murray McAllister with this simple POC (if the following
is executed by an unprivileged user it will instantly panic the system):
int main(void) {
    int fd, ret;
    unsigned int buffer[10000];
    fd = open("/dev/input/js0", O_RDONLY);
    if (fd == -1)
        printf("Error opening file\n");
    ret = ioctl(fd, JSIOCSBTNMAP & ~IOCSIZE_MASK, &buffer);
    printf("%d\n", ret);
}
The solution is to traverse the internal buffer, which is guaranteed to
contain only valid data, when constructing the map.
Fixes: 182d679b2298 ("Input: joydev - prevent potential read overflow in ioctl")
Fixes: 999b874f4aa3 ("Input: joydev - validate axis/button maps before clobbering current ones")
Reported-by: Murray McAllister <murray.mcallister@gmail.com>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Alexander Larkin <avlarkin82@gmail.com>
Link: https://lore.kernel.org/r/20210620120030.1513655-1-avlarkin82@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0e8f0d67401589a141950856902c7d0ec8d9c985 upstream.
... and actually should just check it's given an iovec-backed iterator
in the first place.
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d98e4d95411bbde2220a7afa38dcc9c14d71acbe upstream.
When checking the file name attribute, we want to ensure that it fits
within the bounds of ATTR_RECORD. To do this, we should check that (attr
record + file name offset + file name length) < (attr record + attr record
length).
However, the original check did not include the file name offset in the
calculation. This means that corrupted on-disk metadata might not be caught
by the incorrect file name check, and could lead to an invalid memory access.
An example can be seen in the crash report of a memory corruption error
found by Syzbot:
https://syzkaller.appspot.com/bug?id=a1a1e379b225812688566745c3e2f7242bffc246
Adding the file name offset to the validity check fixes this error and
passes the Syzbot reproducer test.
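The difference between the two checks can be sketched with plain offsets instead of pointers. This is a simplified model with hypothetical function names, not the NTFS code; the strict comparison follows the wording of the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static_assert(sizeof(uint64_t) > sizeof(uint32_t),
              "the 64-bit sum of two 32-bit values cannot overflow");

/* Check as described above: offset plus length must stay inside the record. */
static bool name_in_bounds(uint32_t rec_len, uint32_t name_off,
                           uint32_t name_len)
{
    uint64_t end = (uint64_t)name_off + name_len;

    return end < rec_len;
}

/* Faulty variant for comparison: the offset is left out of the sum. */
static bool name_in_bounds_without_offset(uint32_t rec_len, uint32_t name_len)
{
    return name_len < rec_len;
}
```

For a 64-byte record with a name starting at offset 24, a 40-byte name is rejected by the first check but accepted by the second, even though it would extend past the end of the record.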
Link: https://lkml.kernel.org/r/20210614050540.289494-1-desmondcheongzx@gmail.com
Signed-off-by: Desmond Cheong Zhi Xi <desmondcheongzx@gmail.com>
Reported-by: syzbot+213ac8bb98f7f4420840@syzkaller.appspotmail.com
Tested-by: syzbot+213ac8bb98f7f4420840@syzkaller.appspotmail.com
Acked-by: Anton Altaparmakov <anton@tuxera.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b31d9d6d7abbf6483b871b6370bc31c930d53f54 upstream.
When the system goes through S4, xhci_resume may proceed as follows:
1. xhci_mem_cleanup
2. xhci_init->xhci_mem_init->xhci_mem_cleanup (when there is not enough memory).
xhci_mem_cleanup will be executed twice when the system is out of memory.
xhci->port_caps is freed in xhci_mem_cleanup, but it isn't set to NULL.
It will be freed twice when xhci_mem_cleanup is called the second time.
We got the following bug when the system resumed from S4:
kernel BUG at mm/slub.c:309!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
CPU: 0 PID: 5929 Tainted: G S W 5.4.96-arm64-desktop #1
pc : __slab_free+0x5c/0x424
lr : kfree+0x30c/0x32c
Call trace:
__slab_free+0x5c/0x424
kfree+0x30c/0x32c
xhci_mem_cleanup+0x394/0x3cc
xhci_mem_init+0x9ac/0x1070
xhci_init+0x8c/0x1d0
xhci_resume+0x1cc/0x5fc
xhci_plat_resume+0x64/0x70
platform_pm_thaw+0x28/0x60
dpm_run_callback+0x54/0x24c
device_resume+0xd0/0x200
async_resume+0x24/0x60
async_run_entry_fn+0x44/0x110
process_one_work+0x1f0/0x490
worker_thread+0x5c/0x450
kthread+0x158/0x160
ret_from_fork+0x10/0x24
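The usual remedy for this kind of double free is to clear the pointer right after freeing it, so that a second cleanup pass becomes harmless. The sketch below is a userspace model with hypothetical names (`struct ctrl`, `ctrl_alloc()`, `ctrl_cleanup()`), using `free()` in place of the kernel allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical controller state holding one dynamically allocated array. */
struct ctrl {
    int *port_caps;
    unsigned int num_port_caps;
};

static int ctrl_alloc(struct ctrl *c, unsigned int n)
{
    assert(c != NULL);
    c->port_caps = calloc(n, sizeof(*c->port_caps));
    if (!c->port_caps)
        return -1;
    c->num_port_caps = n;
    return 0;
}

/* Safe to call more than once: the pointer is cleared after the free. */
static void ctrl_cleanup(struct ctrl *c)
{
    assert(c != NULL);
    free(c->port_caps);       /* free(NULL) is a no-op */
    c->port_caps = NULL;
    c->num_port_caps = 0;
}
```

Calling `ctrl_cleanup()` twice in a row, as in the resume path described above, then frees the array only once.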
The original patch that caused this issue was backported to 4.4 stable,
so this should be backported to 4.4 stable as well.
Fixes: cf0ee7c60c89 ("xhci: Fix memory leak when caching protocol extended capability PSI tables - take 2")
Cc: stable@vger.kernel.org # v4.4+
Signed-off-by: Jiantao Zhang <water.zhangjiantao@huawei.com>
Signed-off-by: Tao Xue <xuetao09@huawei.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Link: https://lore.kernel.org/r/20210617150354.1512157-5-mathias.nyman@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 03026197bb657d784220b040c6173267a0375741 upstream.
typec_register_altmode() fails to call altmode_id_remove() in an error
path. Add the missing function call to fix it.
Fixes: 8a37d87d72f0 ("usb: typec: Bus type for alternate modes")
Cc: stable <stable@vger.kernel.org>
Acked-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Link: https://lore.kernel.org/r/20210617073226.47599-1-jingxiangfeng@huawei.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 84524d1232ecca7cf8678e851b254f05cff4040a upstream.
The EP debugfs entries are created before the debugfs folder for the dwc3
device is created. As a result, the EP debugfs entries are created in
'/sys/kernel/debug' instead of '/sys/kernel/debug/usb/dwc3.1.auto'.
Move the dwc3_debugfs_init() function call before the call to
dwc3_core_init_mode() so that the dwc3 debugfs parent is created before
the EP debugfs entries.
Fixes: 8d396bb0a5b6 ("usb: dwc3: debugfs: Add and remove endpoint dirs dynamically")
Cc: stable <stable@vger.kernel.org>
Reviewed-by: Jack Pham <jackp@codeaurora.org>
Signed-off-by: Minas Harutyunyan <Minas.Harutyunyan@synopsys.com>
Link: https://lore.kernel.org/r/01fafb5b2d8335e98e6eadbac61fc796bdf3ec1a.1623948457.git.Minas.Harutyunyan@synopsys.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4897807753e078655a78de39ed76044d784f3e63 upstream.
The device (32a7:0000 Heimann Sensor GmbH USB appset demo) claims to be
a CDC-ACM device in its descriptors but in fact is not. If it is run
with echo disabled it returns garbled data, probably due to something
that happens in the TTY layer. And when run with echo enabled (the
default), it will mess up the calibration data of the sensor the first
time any data is sent to the device.
In short, I had a bad time after connecting the sensor and trying to get
it to work. I hope blacklisting it in the cdc-acm driver will save
someone else a bit of trouble.
Signed-off-by: Hannu Hartikainen <hannu@hrtk.in>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210622141454.337948-1-hannu@hrtk.in
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4249d6fbc10fd997abdf8a1ea49c0389a0edf706 upstream.
When an EEM echo command is received, the driver sends a response, but
it queues that response on a USB request allocated from the gadget
device's endpoint zero and then transmits the request on the IN
endpoint of the EEM interface.
On a dwc3 gadget, this triggers the following warning in
__dwc3_gadget_ep_queue():
	if (WARN(req->dep != dep, "request %pK belongs to '%s'\n",
				&req->request, req->dep->name))
		return -EINVAL;
Fix it by allocating the USB request from the IN endpoint of the EEM
interface and transmitting it on that same IN endpoint.
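The ownership rule behind that warning can be sketched in a few lines of user-space C. The demo_* types and functions are hypothetical and only model the idea that a request remembers the endpoint it was allocated from; they are not the gadget API.

```c
#include <errno.h>

/* Hypothetical endpoint and request types, for illustration only. */
struct demo_ep {
	const char *name;
};

struct demo_request {
	const struct demo_ep *owner;	/* endpoint the request was allocated from */
};

/* Allocate a request bound to the given endpoint. */
static struct demo_request demo_alloc_request(const struct demo_ep *ep)
{
	struct demo_request req = { .owner = ep };

	return req;
}

/* Queue a request; reject it if it belongs to a different endpoint. */
static int demo_ep_queue(const struct demo_ep *ep, const struct demo_request *req)
{
	if (req->owner != ep)
		return -EINVAL;
	return 0;
}
```

In this model, a request allocated from one endpoint and queued on another is rejected, while a request allocated from and queued on the same endpoint is accepted.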
Signed-off-by: Linyu Yuan <linyyuan@codeaurora.com>
Cc: stable <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210616115142.34075-1-linyyuan@codeaurora.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 41e81022a04a0294c55cfa7e366bc14b9634c66e upstream.
The direction of the pipe argument must match the request-type direction
bit or control requests may fail depending on the host-controller-driver
implementation.
Fix the four control requests which erroneously used usb_rcvctrlpipe().
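As a rough illustration of the rule, the sketch below checks that the direction encoded in a pipe agrees with bit 7 of the request type (0x80 means device-to-host). The demo_* helpers and the pipe encoding are simplified assumptions for this example and do not reproduce the kernel's pipe macros.

```c
#include <stdbool.h>
#include <stdint.h>

/* Direction bit as used in bmRequestType: set means device-to-host (IN). */
#define DEMO_DIR_OUT 0x00u
#define DEMO_DIR_IN  0x80u

/* Hypothetical pipe encoding: endpoint number shifted up, direction in bit 7. */
static unsigned int demo_snd_ctrl_pipe(unsigned int endpoint)
{
	return (endpoint << 15) | DEMO_DIR_OUT;
}

static unsigned int demo_rcv_ctrl_pipe(unsigned int endpoint)
{
	return (endpoint << 15) | DEMO_DIR_IN;
}

/* True when the pipe direction agrees with the request-type direction bit. */
static bool demo_pipe_matches(unsigned int pipe, uint8_t request_type)
{
	return (pipe & DEMO_DIR_IN) == (request_type & DEMO_DIR_IN);
}
```

With this model, a vendor OUT request (request type 0x40) sent on a receive pipe is flagged as a mismatch, which is the class of error the fix addresses.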
Fixes: 1d3e20236d7a ("[PATCH] USB: usbtouchscreen: unified USB touchscreen driver")
Fixes: 24ced062a296 ("usbtouchscreen: add support for DMC TSC-10/25 devices")
Fixes: 9e3b25837a20 ("Input: usbtouchscreen - add support for e2i touchscreen controller")
Signed-off-by: Johan Hovold <johan@kernel.org>
Cc: stable@vger.kernel.org # 2.6.17
Link: https://lore.kernel.org/r/20210524092048.4443-1-johan@kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c680ed46e418e9c785d76cf44eb33bfd1e8cf3f6 upstream.
syzbot reported a WARNING in vmalloc. The problem was a zero size
passed to vmalloc.
The root cause was a wrong cxusb_bluebird_lgz201_properties definition:
the adapter array has only 1 entry, but num_adapters was 2.
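A small sketch of how such a mismatch leads to a zero size: assuming the adapter array has a fixed capacity and unfilled entries are zero-initialised, a count larger than the number of filled entries makes the code read an all-zero entry. The demo_* structures below are hypothetical and much smaller than the real dvb-usb properties.

```c
#include <stddef.h>

#define DEMO_MAX_ADAPTERS 2

/* Hypothetical, heavily reduced adapter description. */
struct demo_adapter {
	size_t buffer_size;
};

struct demo_properties {
	int num_adapters;
	struct demo_adapter adapter[DEMO_MAX_ADAPTERS];
};

/* Return the index of the first counted adapter with a zero size, or -1. */
static int demo_first_empty_adapter(const struct demo_properties *p)
{
	for (int i = 0; i < p->num_adapters && i < DEMO_MAX_ADAPTERS; i++) {
		if (p->adapter[i].buffer_size == 0)
			return i;
	}
	return -1;
}

/* One entry filled in, but the count claims two: entry 1 is all zeros. */
static const struct demo_properties demo_broken = {
	.num_adapters = 2,
	.adapter = { { .buffer_size = 4096 } },
};

/* Count corrected to match the single filled entry. */
static const struct demo_properties demo_fixed = {
	.num_adapters = 1,
	.adapter = { { .buffer_size = 4096 } },
};
```

Here demo_first_empty_adapter() reports index 1 for the broken definition and -1 for the corrected one.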
Call Trace:
__vmalloc_node mm/vmalloc.c:2963 [inline]
vmalloc+0x67/0x80 mm/vmalloc.c:2996
dvb_dmx_init+0xe4/0xb90 drivers/media/dvb-core/dvb_demux.c:1251
dvb_usb_adapter_dvb_init+0x564/0x860 drivers/media/usb/dvb-usb/dvb-usb-dvb.c:184
dvb_usb_adapter_init drivers/media/usb/dvb-usb/dvb-usb-init.c:86 [inline]
dvb_usb_init drivers/media/usb/dvb-usb/dvb-usb-init.c:184 [inline]
dvb_usb_device_init.cold+0xc94/0x146e drivers/media/usb/dvb-usb/dvb-usb-init.c:308
cxusb_probe+0x159/0x5e0 drivers/media/usb/dvb-usb/cxusb.c:1634
Fixes: 4d43e13f723e ("V4L/DVB (4643): Multi-input patch for DVB-USB device")
Cc: stable@vger.kernel.org
Reported-by: syzbot+7336195c02c1bd2f64e1@syzkaller.appspotmail.com
Signed-off-by: Pavel Skripkin <paskripkin@gmail.com>
Signed-off-by: Sean Young <sean@mess.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 362372ceb6556f338e230f2d90af27b47f82365a upstream.
When extending the available mixer values for 32bit types, we forgot to
add the corresponding entries for the format dump in the proc output.
This may result in OOB access. Add the missing entries.
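One way to keep a name table and its enum in step is sketched below. The enum, the table and the lookup helper are hypothetical names for illustration, not the usb-audio definitions; the compile-time check and the bounds test are additions of this sketch rather than part of the fix.

```c
#include <stddef.h>

/* Hypothetical value types; the last enumerator counts the real ones. */
enum demo_val_type {
	DEMO_BOOLEAN,
	DEMO_S16,
	DEMO_U16,
	DEMO_S32,
	DEMO_U32,
	DEMO_VAL_TYPE_COUNT
};

/* Name table indexed by value type, using designated initialisers. */
static const char *const demo_val_names[] = {
	[DEMO_BOOLEAN] = "BOOLEAN",
	[DEMO_S16] = "S16",
	[DEMO_U16] = "U16",
	[DEMO_S32] = "S32",
	[DEMO_U32] = "U32",
};

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Fail the build if a new type is added without a matching name. */
_Static_assert(DEMO_ARRAY_SIZE(demo_val_names) == DEMO_VAL_TYPE_COUNT,
	       "demo_val_names must cover every demo_val_type");

/* Bounds-checked lookup: never index past the end of the table. */
static const char *demo_val_name(unsigned int type)
{
	if (type >= DEMO_ARRAY_SIZE(demo_val_names) || !demo_val_names[type])
		return "?";
	return demo_val_names[type];
}
```

If the S32 and U32 entries were missing, an unchecked lookup with those types would read past the end of the table, which is the out-of-bounds access described above.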
Fixes: bc18e31c3042 ("ALSA: usb-audio: Fix parameter block size for UAC2 control requests")
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210622090647.14021-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 104739aca4488909175e9e31d5cd7d75b82a2046 upstream.
If the device is power-cycled, it takes time for the initiator to transmit
the periodic NOTIFY (ENABLE SPINUP) SAS primitive, and for the device to
respond to the primitive to become ACTIVE. Retry the I/O request to allow
the device time to become ACTIVE.
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210629155826.48441-1-quat.le@oracle.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Quat Le <quat.le@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 25de4ce5ed02994aea8bc111d133308f6fd62566 upstream.
There is a timer wrap issue on dra7 for the ARM architected timer.
In a typical clock configuration the timer fails to wrap after 388 days.
To work around the issue, we need to use timer-ti-dm percpu timers instead.
Let's configure dmtimer3 and 4 as percpu timers by default, and warn about
the issue if the dtb is not configured properly.
For more information, please see the errata for "AM572x Sitara Processors
Silicon Revisions 1.1, 2.0":
https://www.ti.com/lit/er/sprz429m/sprz429m.pdf
The concept is based on earlier reference patches done by Tero Kristo and
Keerthy.
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Keerthy <j-keerthy@ti.com>
Cc: Tero Kristo <kristo@kernel.org>
[tony@atomide.com: backported to 4.19.y]
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3efe7a878a11c13b5297057bfc1e5639ce1241ce upstream.
There is a timer wrap issue on dra7 for the ARM architected timer.
In a typical clock configuration the timer fails to wrap after 388 days.
To work around the issue, we need to use timer-ti-dm timers instead.
Let's prepare for adding support for percpu timers by adding a common
dmtimer_clkevt_init_common() and call it from __omap_sync32k_timer_init().
This patch makes no intentional functional changes.
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Keerthy <j-keerthy@ti.com>
Cc: Tero Kristo <kristo@kernel.org>
[tony@atomide.com: backported to 4.19.y]
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 52762fbd1c4778ac9b173624ca0faacd22ef4724 upstream.
We can move the TI dmtimer clockevent and clocksource to live under
drivers/clocksource if we rely only on the clock framework, and handle
the module configuration directly in the clocksource driver based on the
device tree data.
This removes the early dependency with system timers to the interconnect
related code, and we can probe pretty much everything else later on at
the module_init level.
Let's first add a new driver for timer-ti-dm-systimer based on existing
arch/arm/mach-omap2/timer.c. Then let's start moving SoCs to probe with
device tree data while still keeping the old timer.c. And eventually we
can just drop the old timer.c.
Let's take the opportunity to switch to use readl/writel as pointed out
by Daniel Lezcano <daniel.lezcano@linaro.org>. This allows further
clean-up of the timer-ti-dm code, as a lot of the shared helpers can
just become static to the non-systimer related code.
Note the boards can optionally configure different timer source clocks
if needed with assigned-clocks and assigned-clock-parents.
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Keerthy <j-keerthy@ti.com>
Cc: Tero Kristo <kristo@kernel.org>
[tony@atomide.com: backported to 4.19.y]
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b75ca5217743e4d7076cf65e044e88389e44318d upstream.
request_irq() is preferred over setup_irq(). Invocations of setup_irq()
occur after memory allocators are ready.
Per tglx[1], setup_irq() existed in olden days when allocators were not
ready by the time early interrupts were initialized.
Hence replace setup_irq() by request_irq().
[1] https://lkml.kernel.org/r/alpine.DEB.2.20.1710191609480.1971@nanos
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Keerthy <j-keerthy@ti.com>
Cc: Tero Kristo <kristo@kernel.org>
Signed-off-by: afzal mohammed <afzal.mohd.ma@gmail.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 934002cd660b035b926438244b4294e647507e13 upstream.
Send the SEV_CMD_DECOMMISSION command to the PSP firmware if ASID binding
fails. If a failure happens after a successful LAUNCH_START command,
a decommission command should be executed. Otherwise, the guest context
is never freed inside the AMD SP. Once the firmware has no memory left
to allocate more SEV guest contexts, the LAUNCH_START command will
begin to fail with the SEV_RET_RESOURCE_LIMIT error.
The existing code calls decommission inside sev_unbind_asid, but it is
not called if a failure happens before guest activation succeeds. If
sev_bind_asid fails, decommission is never called. PSP firmware has a
limit for the number of guests. If sev_bind_asid fails many times,
PSP firmware will not have resources to create another guest context.
Cc: stable@vger.kernel.org
Fixes: 59414c989220 ("KVM: SVM: Add support for KVM_SEV_LAUNCH_START command")
Reported-by: Peter Gonda <pgonda@google.com>
Signed-off-by: Alper Gun <alpergun@google.com>
Reviewed-by: Marc Orr <marcorr@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210610174604.2554090-1-alpergun@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3de218ff39b9e3f0d453fe3154f12a174de44b25 upstream.
In order to avoid a race condition for user events when changing
cpu affinity, reset the active flag only when EOI-ing the event.
This is working fine as all user events are lateeoi events. Note that
lateeoi_ack_mask_dynirq() is not modified as there is no explicit call
to xen_irq_lateeoi() expected later.
Cc: stable@vger.kernel.org
Reported-by: Julien Grall <julien@xen.org>
Fixes: b6622798bc50b62 ("xen/events: avoid handling the same event on two cpus at the same time")
Tested-by: Julien Grall <julien@xen.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrvsky@oracle.com>
Link: https://lore.kernel.org/r/20210623130913.9405-1-jgross@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5fa54346caf67b4b1b10b1f390316ae466da4d53 upstream.
The system might hang with the following backtrace:
schedule+0x80/0x100
schedule_timeout+0x48/0x138
wait_for_common+0xa4/0x134
wait_for_completion+0x1c/0x2c
kthread_flush_work+0x114/0x1cc
kthread_cancel_work_sync.llvm.16514401384283632983+0xe8/0x144
kthread_cancel_delayed_work_sync+0x18/0x2c
xxxx_pm_notify+0xb0/0xd8
blocking_notifier_call_chain_robust+0x80/0x194
pm_notifier_call_chain_robust+0x28/0x4c
suspend_prepare+0x40/0x260
enter_state+0x80/0x3f4
pm_suspend+0x60/0xdc
state_store+0x108/0x144
kobj_attr_store+0x38/0x88
sysfs_kf_write+0x64/0xc0
kernfs_fop_write_iter+0x108/0x1d0
vfs_write+0x2f4/0x368
ksys_write+0x7c/0xec
It is caused by the following race between kthread_mod_delayed_work()
and kthread_cancel_delayed_work_sync():
CPU0                                    CPU1
Context: Thread A                       Context: Thread B
kthread_mod_delayed_work()
  spin_lock()
  __kthread_cancel_work()
    spin_unlock()
    del_timer_sync()
                                        kthread_cancel_delayed_work_sync()
                                          spin_lock()
                                          __kthread_cancel_work()
                                            spin_unlock()
                                            del_timer_sync()
                                            spin_lock()
                                          work->canceling++
                                          spin_unlock
    spin_lock()
  queue_delayed_work()
    // dwork is put into the worker->delayed_work_list
  spin_unlock()
                                          kthread_flush_work()
                                            // flush_work is put at the tail of the dwork
                                            wait_for_completion()
Context: IRQ
  kthread_delayed_work_timer_fn()
    spin_lock()
    list_del_init(&work->node);
    spin_unlock()
BANG: flush_work is no longer linked and will never be processed.
The problem is that kthread_mod_delayed_work() checks work->canceling
flag before canceling the timer.
A simple solution is to (re)check work->canceling after
__kthread_cancel_work(). But then it is not clear what should be
returned when __kthread_cancel_work() removed the work from the queue
(list) and it can't queue it again with the new @delay.
The return value might be used for reference counting. The caller has
to know whether a new work has been queued or an existing one was
replaced.
The proper solution is that kthread_mod_delayed_work() will remove the
work from the queue (list) _only_ when work->canceling is not set. The
flag must be checked after the timer is stopped and the remaining
operations can be done under worker->lock.
Note that kthread_mod_delayed_work() could remove the timer and then
bail out. It is fine. The other canceling caller needs to cancel the
timer as well. The important thing is that the queue (list)
manipulation is done atomically under worker->lock.
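The ordering described above can be modelled in a single-threaded sketch. This is not the kthread_worker code: the demo_* names are hypothetical, locking is omitted, and the concurrent canceller is simulated by a callback that runs in the window where the real code drops the lock to stop the timer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced delayed-work state. */
struct demo_work {
	int canceling;		/* non-zero while someone is cancelling the work */
	bool queued;		/* work sits on the delayed list */
	bool timer_pending;	/* its timer is armed */
};

typedef void (*demo_racer_fn)(struct demo_work *);

/*
 * Stop the timer. The real code releases the lock here, so another
 * context may run; the optional racer callback models that window.
 */
static void demo_cancel_timer(struct demo_work *w, demo_racer_fn racer)
{
	w->timer_pending = false;
	if (racer)
		racer(w);
}

/* Models a concurrent cancel that marks the work as being cancelled. */
static void demo_concurrent_cancel(struct demo_work *w)
{
	w->canceling++;
}

/*
 * Modify the delay: stop the timer first, then check the canceling flag,
 * and touch the queue only when nobody else is cancelling the work.
 * Returns true when a previously queued work was replaced.
 */
static bool demo_mod_delayed_work(struct demo_work *w, demo_racer_fn racer)
{
	bool was_queued;

	demo_cancel_timer(w, racer);
	if (w->canceling)
		return false;	/* bail out; the canceller owns the queue */

	was_queued = w->queued;
	w->queued = true;	/* requeue with the new delay */
	w->timer_pending = true;
	return was_queued;
}
```

When the simulated canceller runs in the window, the function bails out with the timer stopped and the queue untouched, so the work is not requeued behind the canceller's back; without a racer it requeues the work and reports that an existing one was replaced.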
Link: https://lkml.kernel.org/r/20210610133051.15337-3-pmladek@suse.com
Fixes: 9a6b06c8d9a220860468a ("kthread: allow to modify delayed kthread work")
Signed-off-by: Petr Mladek <pmladek@suse.com>
Reported-by: Martin Liu <liumartin@google.com>
Cc: <jenhaochen@google.com>
Cc: Minchan Kim <minchan@google.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 34b3d5344719d14fd2185b2d9459b3abcb8cf9d8 upstream.
Patch series "kthread_worker: Fix race between kthread_mod_delayed_work()
and kthread_cancel_delayed_work_sync()".
This patchset fixes the race between kthread_mod_delayed_work() and
kthread_cancel_delayed_work_sync() including proper return value
handling.
This patch (of 2):
Simple code refactoring as a preparation step for fixing a race between
kthread_mod_delayed_work() and kthread_cancel_delayed_work_sync().
It does not modify the existing behavior.
Link: https://lkml.kernel.org/r/20210610133051.15337-2-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Cc: <jenhaochen@google.com>
Cc: Martin Liu <liumartin@google.com>
Cc: Minchan Kim <minchan@google.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4521de30fbb3f5be0db58de93582ebce72c9d44f upstream.
The vdd3p0 LDO's input should come directly from the external USB VBUS,
NOT from the PMIC's power supply. The vdd3p0 LDO's target output voltage
can be controlled by SW, and it requires the input voltage to be high
enough. With an incorrect power supply assigned, if the power supply's
voltage is lower than the LDO target output voltage, the request fails
and the LDO voltage adjustment is skipped. Remove the power supply
assignment for vdd3p0 to avoid this scenario.
Fixes: 93385546ba36 ("ARM: dts: imx6qdl-sabresd: Assign corresponding power supply for LDOs")
Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
Signed-off-by: Nobuhiro Iwamatsu (CIP) <nobuhiro1.iwamatsu@toshiba.co.jp>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>