linux

iv/linux

Author	SHA1	Message	Date
Wei Yongjun	6f3511eb86	net: ieee802154: fix error return code in dgram_bind() commit 444d8ad4916edec8a9fc684e841287db9b1e999f upstream. Fix to return error code -EINVAL from the error handling case instead of 0, as done elsewhere in this function. Fixes: 94160108a70c ("net/ieee802154: fix uninit value bug in dgram_sendmsg") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Link: https://lore.kernel.org/r/20220919160830.1436109-1-weiyongjun@huaweicloud.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:54 +09:00
Rik van Riel	11993652d0	mm,hugetlb: take hugetlb_lock before decrementing h->resv_huge_pages commit 12df140f0bdfae5dcfc81800970dd7f6f632e00c upstream. The h->*_huge_pages counters are protected by the hugetlb_lock, but alloc_huge_page has a corner case where it can decrement the counter outside of the lock. This could lead to a corrupted value of h->resv_huge_pages, which we have observed on our systems. Take the hugetlb_lock before decrementing h->resv_huge_pages to avoid a potential race. Link: https://lkml.kernel.org/r/20221017202505.0e6a4fcd@imladris.surriel.com Fixes: a88c76954804 ("mm: hugetlb: fix hugepage memory leak caused by wrong reserve count") Signed-off-by: Rik van Riel <riel@surriel.com> Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Glen McCready <gkmccready@meta.com> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Muchun Song <songmuchun@bytedance.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:54 +09:00
Chen Zhou	5a2d7c93d9	cgroup-v1: add disabled controller check in cgroup1_parse_param() commit 61e960b07b637f0295308ad91268501d744c21b5 upstream. When mounting a cgroup hierarchy with disabled controller in cgroup v1, all available controllers will be attached. For example, boot with cgroup_no_v1=cpu or cgroup_disable=cpu, and then mount with "mount -t cgroup -ocpu cpu /sys/fs/cgroup/cpu", then all enabled controllers will be attached except cpu. Fix this by adding disabled controller check in cgroup1_parse_param(). If the specified controller is disabled, just return error with information "Disabled controller xx" rather than attaching all the other enabled controllers. Fixes: f5dfb5315d34 ("cgroup: take options parsing into ->parse_monolithic()") Signed-off-by: Chen Zhou <chenzhou10@huawei.com> Reviewed-by: Zefan Li <lizefan.x@bytedance.com> Reviewed-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Luiz Capitulino <luizcap@amazon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:54 +09:00
M. Vefa Bicakci	3d056d81b9	xen/gntdev: Prevent leaking grants commit 0991028cd49567d7016d1b224fe0117c35059f86 upstream. Prior to this commit, if a grant mapping operation failed partially, some of the entries in the map_ops array would be invalid, whereas all of the entries in the kmap_ops array would be valid. This in turn would cause the following logic in gntdev_map_grant_pages to become invalid: for (i = 0; i < map->count; i++) { if (map->map_ops[i].status == GNTST_okay) { map->unmap_ops[i].handle = map->map_ops[i].handle; if (!use_ptemod) alloced++; } if (use_ptemod) { if (map->kmap_ops[i].status == GNTST_okay) { if (map->map_ops[i].status == GNTST_okay) alloced++; map->kunmap_ops[i].handle = map->kmap_ops[i].handle; } } } ... atomic_add(alloced, &map->live_grants); Assume that use_ptemod is true (i.e., the domain mapping the granted pages is a paravirtualized domain). In the code excerpt above, note that the "alloced" variable is only incremented when both kmap_ops[i].status and map_ops[i].status are set to GNTST_okay (i.e., both mapping operations are successful). However, as also noted above, there are cases where a grant mapping operation fails partially, breaking the assumption of the code excerpt above. The aforementioned causes map->live_grants to be incorrectly set. In some cases, all of the map_ops mappings fail, but all of the kmap_ops mappings succeed, meaning that live_grants may remain zero. This in turn makes it impossible to unmap the successfully grant-mapped pages pointed to by kmap_ops, because unmap_grant_pages has the following snippet of code at its beginning: if (atomic_read(&map->live_grants) == 0) return; /* Nothing to do */ In other cases where only some of the map_ops mappings fail but all kmap_ops mappings succeed, live_grants is made positive, but when the user requests unmapping the grant-mapped pages, __unmap_grant_pages_done will then make map->live_grants negative, because the latter function does not check if all of the pages that were requested to be unmapped were actually unmapped, and the same function unconditionally subtracts "data->count" (i.e., a value that can be greater than map->live_grants) from map->live_grants. The side effects of a negative live_grants value have not been studied. The net effect of all of this is that grant references are leaked in one of the above conditions. In Qubes OS v4.1 (which uses Xen's grant mechanism extensively for X11 GUI isolation), this issue manifests itself with warning messages like the following to be printed out by the Linux kernel in the VM that had granted pages (that contain X11 GUI window data) to dom0: "g.e. 0x1234 still pending", especially after the user rapidly resizes GUI VM windows (causing some grant-mapping operations to partially or completely fail, due to the fact that the VM unshares some of the pages as part of the window resizing, making the pages impossible to grant-map from dom0). The fix for this issue involves counting all successful map_ops and kmap_ops mappings separately, and then adding the sum to live_grants. During unmapping, only the number of successfully unmapped grants is subtracted from live_grants. The code is also modified to check for negative live_grants values after the subtraction and warn the user. Link: https://github.com/QubesOS/qubes-issues/issues/7631 Fixes: dbe97cff7dd9 ("xen/gntdev: Avoid blocking in unmap_grant_pages()") Cc: stable@vger.kernel.org Signed-off-by: M. Vefa Bicakci <m.v.b@runbox.com> Acked-by: Demi Marie Obenour <demi@invisiblethingslab.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20221002222006.2077-2-m.v.b@runbox.com Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:54 +09:00
Jan Beulich	8f589b5c0e	Xen/gntdev: don't ignore kernel unmapping error commit f28347cc66395e96712f5c2db0a302ee75bafce6 upstream. While working on XSA-361 and its follow-ups, I failed to spot another place where the kernel mapping part of an operation was not treated the same as the user space part. Detect and propagate errors and add a 2nd pr_debug(). Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/c2513395-74dc-aea3-9192-fd265aa44e35@suse.com Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com> Co-authored-by: Demi Marie Obenour <demi@invisiblethingslab.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:54 +09:00
Chandan Babu R	f45ee20384	xfs: force the log after remapping a synchronous-writes file From: "Darrick J. Wong" <darrick.wong@oracle.com> commit 5ffce3cc22a0e89813ed0c7162a68b639aef9ab6 upstream. Commit 5833112df7e9 tried to make it so that a remap operation would force the log out to disk if the filesystem is mounted with mandatory synchronous writes. Unfortunately, that commit failed to handle the case where the inode or the file descriptor require mandatory synchronous writes. Refactor the check into into a helper that will look for all three conditions, and now we can treat reflink just like any other synchronous write. Fixes: 5833112df7e9 ("xfs: reflink should force the log out if mounted with wsync") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:54 +09:00
Chandan Babu R	102de7717d	xfs: clear XFS_DQ_FREEING if we can't lock the dquot buffer to flush From: "Darrick J. Wong" <darrick.wong@oracle.com> commit c97738a960a86081a147e7d436138e6481757445 upstream. In commit 8d3d7e2b35ea, we changed xfs_qm_dqpurge to bail out if we can't lock the dquot buf to flush the dquot. This prevents the AIL from blocking on the dquot, but it also forgets to clear the FREEING flag on its way out. A subsequent purge attempt will see the FREEING flag is set and bail out, which leads to dqpurge_all failing to purge all the dquots. (copy-pasting from Dave Chinner's identical patch) This was found by inspection after having xfs/305 hang 1 in ~50 iterations in a quotaoff operation: [ 8872.301115] xfs_quota D13888 92262 91813 0x00004002 [ 8872.302538] Call Trace: [ 8872.303193] __schedule+0x2d2/0x780 [ 8872.304108] ? do_raw_spin_unlock+0x57/0xd0 [ 8872.305198] schedule+0x6e/0xe0 [ 8872.306021] schedule_timeout+0x14d/0x300 [ 8872.307060] ? __next_timer_interrupt+0xe0/0xe0 [ 8872.308231] ? xfs_qm_dqusage_adjust+0x200/0x200 [ 8872.309422] schedule_timeout_uninterruptible+0x2a/0x30 [ 8872.310759] xfs_qm_dquot_walk.isra.0+0x15a/0x1b0 [ 8872.311971] xfs_qm_dqpurge_all+0x7f/0x90 [ 8872.313022] xfs_qm_scall_quotaoff+0x18d/0x2b0 [ 8872.314163] xfs_quota_disable+0x3a/0x60 [ 8872.315179] kernel_quotactl+0x7e2/0x8d0 [ 8872.316196] ? __do_sys_newstat+0x51/0x80 [ 8872.317238] __x64_sys_quotactl+0x1e/0x30 [ 8872.318266] do_syscall_64+0x46/0x90 [ 8872.319193] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 8872.320490] RIP: 0033:0x7f46b5490f2a [ 8872.321414] Code: Bad RIP value. Returning -EAGAIN from xfs_qm_dqpurge() without clearing the XFS_DQ_FREEING flag means the xfs_qm_dqpurge_all() code can never free the dquot, and we loop forever waiting for the XFS_DQ_FREEING flag to go away on the dquot that leaked it via -EAGAIN. Fixes: 8d3d7e2b35ea ("xfs: trylock underlying buffer on dquot flush") Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Allison Collins <allison.henderson@oracle.com> Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:54 +09:00
Chandan Babu R	03b449a880	xfs: finish dfops on every insert range shift iteration From: Brian Foster <bfoster@redhat.com> commit 9c516e0e4554e8f26ab73d46cbc789d7d8db664d upstream. The recent change to make insert range an atomic operation used the incorrect transaction rolling mechanism. The explicit transaction roll does not finish deferred operations. This means that intents for rmapbt updates caused by extent shifts are not logged until the final transaction commits. Thus if a crash occurs during an insert range, log recovery might leave the rmapbt in an inconsistent state. This was discovered by repeated runs of generic/455. Update insert range to finish dfops on every shift iteration. This is similar to collapse range and ensures that intents are logged with the transactions that make associated changes. Fixes: dd87f87d87fa ("xfs: rework insert range into an atomic operation") Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Acked-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandan.babu@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:54 +09:00
Heiko Carstens	3d295076ba	s390/pci: add missing EX_TABLE entries to __pcistg_mio_inuser()/__pcilg_mio_inuser() commit 6ec803025cf3173a57222e4411097166bd06fa98 upstream. For some exception types the instruction address points behind the instruction that caused the exception. Take that into account and add the missing exception table entry. Cc: <stable@vger.kernel.org> Fixes: f058599e22d5 ("s390/pci: Fix s390_mmio_read/write with MIO") Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:54 +09:00
Heiko Carstens	344e1cb0ba	s390/futex: add missing EX_TABLE entry to __futex_atomic_op() commit a262d3ad6a433e4080cecd0a8841104a5906355e upstream. For some exception types the instruction address points behind the instruction that caused the exception. Take that into account and add the missing exception table entry. Cc: <stable@vger.kernel.org> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:54 +09:00
Adrian Hunter	4f969d0753	perf auxtrace: Fix address filter symbol name match for modules commit cba04f3136b658583adb191556f99d087589c1cc upstream. For modules, names from kallsyms__parse() contain the module name which meant that module symbols did not match exactly by name. Fix by matching the name string up to the separating tab character. Fixes: 1b36c03e356936d6 ("perf record: Add support for using symbols in address filters") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221026072736.2982-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:54 +09:00
Christian A. Ehrhardt	c78b0dc6fb	kernfs: fix use-after-free in __kernfs_remove commit 4abc99652812a2ddf932f137515d5c5a04723538 upstream. Syzkaller managed to trigger concurrent calls to kernfs_remove_by_name_ns() for the same file resulting in a KASAN detected use-after-free. The race occurs when the root node is freed during kernfs_drain(). To prevent this acquire an additional reference for the root of the tree that is removed before calling __kernfs_remove(). Found by syzkaller with the following reproducer (slab_nomerge is required): syz_mount_image$ext4(0x0, &(0x7f0000000100)='./file0\x00', 0x100000, 0x0, 0x0, 0x0, 0x0) r0 = openat(0xffffffffffffff9c, &(0x7f0000000080)='/proc/self/exe\x00', 0x0, 0x0) close(r0) pipe2(&(0x7f0000000140)={0xffffffffffffffff, <r1=>0xffffffffffffffff}, 0x800) mount$9p_fd(0x0, &(0x7f0000000040)='./file0\x00', &(0x7f00000000c0), 0x408, &(0x7f0000000280)={'trans=fd,', {'rfdno', 0x3d, r0}, 0x2c, {'wfdno', 0x3d, r1}, 0x2c, {[{@cache_loose}, {@mmap}, {@loose}, {@loose}, {@mmap}], [{@mask={'mask', 0x3d, '^MAY_EXEC'}}, {@fsmagic={'fsmagic', 0x3d, 0x10001}}, {@dont_hash}]}}) Sample report: ================================================================== BUG: KASAN: use-after-free in kernfs_type include/linux/kernfs.h:335 [inline] BUG: KASAN: use-after-free in kernfs_leftmost_descendant fs/kernfs/dir.c:1261 [inline] BUG: KASAN: use-after-free in __kernfs_remove.part.0+0x843/0x960 fs/kernfs/dir.c:1369 Read of size 2 at addr ffff8880088807f0 by task syz-executor.2/857 CPU: 0 PID: 857 Comm: syz-executor.2 Not tainted 6.0.0-rc3-00363-g7726d4c3e60b #5 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x6e/0x91 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:317 [inline] print_report.cold+0x5e/0x5e5 mm/kasan/report.c:433 kasan_report+0xa3/0x130 mm/kasan/report.c:495 kernfs_type include/linux/kernfs.h:335 [inline] kernfs_leftmost_descendant fs/kernfs/dir.c:1261 [inline] __kernfs_remove.part.0+0x843/0x960 fs/kernfs/dir.c:1369 __kernfs_remove fs/kernfs/dir.c:1356 [inline] kernfs_remove_by_name_ns+0x108/0x190 fs/kernfs/dir.c:1589 sysfs_slab_add+0x133/0x1e0 mm/slub.c:5943 __kmem_cache_create+0x3e0/0x550 mm/slub.c:4899 create_cache mm/slab_common.c:229 [inline] kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335 p9_client_create+0xd4d/0x1190 net/9p/client.c:993 v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408 v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126 legacy_get_tree+0xf1/0x200 fs/fs_context.c:610 vfs_get_tree+0x85/0x2e0 fs/super.c:1530 do_new_mount fs/namespace.c:3040 [inline] path_mount+0x675/0x1d00 fs/namespace.c:3370 do_mount fs/namespace.c:3383 [inline] __do_sys_mount fs/namespace.c:3591 [inline] __se_sys_mount fs/namespace.c:3568 [inline] __x64_sys_mount+0x282/0x300 fs/namespace.c:3568 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f725f983aed Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f725f0f7028 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 00007f725faa3f80 RCX: 00007f725f983aed RDX: 00000000200000c0 RSI: 0000000020000040 RDI: 0000000000000000 RBP: 00007f725f9f419c R08: 0000000020000280 R09: 0000000000000000 R10: 0000000000000408 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000006 R14: 00007f725faa3f80 R15: 00007f725f0d7000 </TASK> Allocated by task 855: kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38 kasan_set_track mm/kasan/common.c:45 [inline] set_alloc_info mm/kasan/common.c:437 [inline] __kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:470 kasan_slab_alloc include/linux/kasan.h:224 [inline] slab_post_alloc_hook mm/slab.h:727 [inline] slab_alloc_node mm/slub.c:3243 [inline] slab_alloc mm/slub.c:3251 [inline] __kmem_cache_alloc_lru mm/slub.c:3258 [inline] kmem_cache_alloc+0xbf/0x200 mm/slub.c:3268 kmem_cache_zalloc include/linux/slab.h:723 [inline] __kernfs_new_node+0xd4/0x680 fs/kernfs/dir.c:593 kernfs_new_node fs/kernfs/dir.c:655 [inline] kernfs_create_dir_ns+0x9c/0x220 fs/kernfs/dir.c:1010 sysfs_create_dir_ns+0x127/0x290 fs/sysfs/dir.c:59 create_dir lib/kobject.c:63 [inline] kobject_add_internal+0x24a/0x8d0 lib/kobject.c:223 kobject_add_varg lib/kobject.c:358 [inline] kobject_init_and_add+0x101/0x160 lib/kobject.c:441 sysfs_slab_add+0x156/0x1e0 mm/slub.c:5954 __kmem_cache_create+0x3e0/0x550 mm/slub.c:4899 create_cache mm/slab_common.c:229 [inline] kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335 p9_client_create+0xd4d/0x1190 net/9p/client.c:993 v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408 v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126 legacy_get_tree+0xf1/0x200 fs/fs_context.c:610 vfs_get_tree+0x85/0x2e0 fs/super.c:1530 do_new_mount fs/namespace.c:3040 [inline] path_mount+0x675/0x1d00 fs/namespace.c:3370 do_mount fs/namespace.c:3383 [inline] __do_sys_mount fs/namespace.c:3591 [inline] __se_sys_mount fs/namespace.c:3568 [inline] __x64_sys_mount+0x282/0x300 fs/namespace.c:3568 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Freed by task 857: kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38 kasan_set_track+0x21/0x30 mm/kasan/common.c:45 kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:370 ____kasan_slab_free mm/kasan/common.c:367 [inline] ____kasan_slab_free mm/kasan/common.c:329 [inline] __kasan_slab_free+0x108/0x190 mm/kasan/common.c:375 kasan_slab_free include/linux/kasan.h:200 [inline] slab_free_hook mm/slub.c:1754 [inline] slab_free_freelist_hook mm/slub.c:1780 [inline] slab_free mm/slub.c:3534 [inline] kmem_cache_free+0x9c/0x340 mm/slub.c:3551 kernfs_put.part.0+0x2b2/0x520 fs/kernfs/dir.c:547 kernfs_put+0x42/0x50 fs/kernfs/dir.c:521 __kernfs_remove.part.0+0x72d/0x960 fs/kernfs/dir.c:1407 __kernfs_remove fs/kernfs/dir.c:1356 [inline] kernfs_remove_by_name_ns+0x108/0x190 fs/kernfs/dir.c:1589 sysfs_slab_add+0x133/0x1e0 mm/slub.c:5943 __kmem_cache_create+0x3e0/0x550 mm/slub.c:4899 create_cache mm/slab_common.c:229 [inline] kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335 p9_client_create+0xd4d/0x1190 net/9p/client.c:993 v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408 v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126 legacy_get_tree+0xf1/0x200 fs/fs_context.c:610 vfs_get_tree+0x85/0x2e0 fs/super.c:1530 do_new_mount fs/namespace.c:3040 [inline] path_mount+0x675/0x1d00 fs/namespace.c:3370 do_mount fs/namespace.c:3383 [inline] __do_sys_mount fs/namespace.c:3591 [inline] __se_sys_mount fs/namespace.c:3568 [inline] __x64_sys_mount+0x282/0x300 fs/namespace.c:3568 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd The buggy address belongs to the object at ffff888008880780 which belongs to the cache kernfs_node_cache of size 128 The buggy address is located 112 bytes inside of 128-byte region [ffff888008880780, ffff888008880800) The buggy address belongs to the physical page: page:00000000732833f8 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8880 flags: 0x100000000000200(slab\|node=0\|zone=1) raw: 0100000000000200 0000000000000000 dead000000000122 ffff888001147280 raw: 0000000000000000 0000000000150015 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff888008880680: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb ffff888008880700: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc >ffff888008880780: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff888008880800: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb ffff888008880880: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc ================================================================== Acked-by: Tejun Heo <tj@kernel.org> Cc: stable <stable@kernel.org> # -rc3 Signed-off-by: Christian A. Ehrhardt <lk@c--e.de> Link: https://lore.kernel.org/r/20220913121723.691454-1-lk@c--e.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:54 +09:00
Matthew Ma	7a09c64b7d	mmc: core: Fix kernel panic when remove non-standard SDIO card commit 9972e6b404884adae9eec7463e30d9b3c9a70b18 upstream. SDIO tuple is only allocated for standard SDIO card, especially it causes memory corruption issues when the non-standard SDIO card has removed, which is because the card device's reference counter does not increase for it at sdio_init_func(), but all SDIO card device reference counter gets decreased at sdio_release_func(). Fixes: 6f51be3d37df ("sdio: allow non-standard SDIO cards") Signed-off-by: Matthew Ma <mahongwei@zeku.com> Reviewed-by: Weizhao Ouyang <ouyangweizhao@zeku.com> Reviewed-by: John Wang <wangdayu@zeku.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20221014034951.2300386-1-ouyangweizhao@zeku.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:54 +09:00
Johan Hovold	ed7f1ff87a	drm/msm/hdmi: fix memory corruption with too many bridges commit 4c1294da6aed1f16d47a417dcfe6602833c3c95c upstream. Add the missing sanity check on the bridge counter to avoid corrupting data beyond the fixed-sized bridge array in case there are ever more than eight bridges. Fixes: a3376e3ec81c ("drm/msm: convert to drm_bridge") Cc: stable@vger.kernel.org # 3.12 Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Tested-by: Kuogee Hsieh <quic_khsieh@quicinc.com> Reviewed-by: Kuogee Hsieh <quic_khsieh@quicinc.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/502670/ Link: https://lore.kernel.org/r/20220913085320.8577-5-johan+linaro@kernel.org Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:53 +09:00
Johan Hovold	f649ed0e1b	drm/msm/dsi: fix memory corruption with too many bridges commit 2e786eb2f9cebb07e317226b60054df510b60c65 upstream. Add the missing sanity check on the bridge counter to avoid corrupting data beyond the fixed-sized bridge array in case there are ever more than eight bridges. Fixes: a689554ba6ed ("drm/msm: Initial add DSI connector support") Cc: stable@vger.kernel.org # 4.1 Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Tested-by: Kuogee Hsieh <quic_khsieh@quicinc.com> Reviewed-by: Kuogee Hsieh <quic_khsieh@quicinc.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/502668/ Link: https://lore.kernel.org/r/20220913085320.8577-4-johan+linaro@kernel.org Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:53 +09:00
Miquel Raynal	e7348308f6	mac802154: Fix LQI recording commit 5a5c4e06fd03b595542d5590f2bc05a6b7fc5c2b upstream. Back in 2014, the LQI was saved in the skb control buffer (skb->cb, or mac_cb(skb)) without any actual reset of this area prior to its use. As part of a useful rework of the use of this region, 32edc40ae65c ("ieee802154: change _cb handling slightly") introduced mac_cb_init() to basically memset the cb field to 0. In particular, this new function got called at the beginning of mac802154_parse_frame_start(), right before the location where the buffer got actually filled. What went through unnoticed however, is the fact that the very first helper called by device drivers in the receive path already used this area to save the LQI value for later extraction. Resetting the cb field "so late" led to systematically zeroing the LQI. If we consider the reset of the cb field needed, we can make it as soon as we get an skb from a device driver, right before storing the LQI, as is the very first time we need to write something there. Cc: stable@vger.kernel.org Fixes: 32edc40ae65c ("ieee802154: change _cb handling slightly") Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Acked-by: Alexander Aring <aahringo@redhat.com> Link: https://lore.kernel.org/r/20221020142535.1038885-1-miquel.raynal@bootlin.com Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:53 +09:00
Hyunwoo Kim	5385af2f89	fbdev: smscufx: Fix several use-after-free bugs commit cc67482c9e5f2c80d62f623bcc347c29f9f648e1 upstream. Several types of UAFs can occur when physically removing a USB device. Adds ufx_ops_destroy() function to .fb_destroy of fb_ops, and in this function, there is kref_put() that finally calls ufx_free(). This fix prevents multiple UAFs. Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com> Link: https://lore.kernel.org/linux-fbdev/20221011153436.GA4446@ubuntu/ Cc: <stable@vger.kernel.org> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:53 +09:00
Shreeya Patel	07ef3be6ca	iio: light: tsl2583: Fix module unloading commit 0dec4d2f2636b9e54d9d29f17afc7687c5407f78 upstream. tsl2583 probe() uses devm_iio_device_register() and calling iio_device_unregister() causes the unregister to occur twice. s Switch to iio_device_register() instead of devm_iio_device_register() in probe to avoid the device managed cleanup. Fixes: 371894f5d1a0 ("iio: tsl2583: add runtime power management support") Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com> Link: https://lore.kernel.org/r/20220826122352.288438-1-shreeya.patel@collabora.com Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:53 +09:00
Matti Vaittinen	cb972e6d01	tools: iio: iio_utils: fix digit calculation commit 72b2aa38191bcba28389b0e20bf6b4f15017ff2b upstream. The iio_utils uses a digit calculation in order to know length of the file name containing a buffer number. The digit calculation does not work for number 0. This leads to allocation of one character too small buffer for the file-name when file name contains value '0'. (Eg. buffer0). Fix digit calculation by returning one digit to be present for number '0'. Fixes: 096f9b862e60 ("tools:iio:iio_utils: implement digit calculation") Signed-off-by: Matti Vaittinen <mazziesaccount@gmail.com> Link: https://lore.kernel.org/r/Y0f+tKCz+ZAIoroQ@dc75zzyyyyyyyyyyyyycy-3.rev.dnainternet.fi Cc: <Stable@vger.kernel.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:53 +09:00
Mathias Nyman	8f1cd9633d	xhci: Remove device endpoints from bandwidth list when freeing the device commit 5aed5b7c2430ce318a8e62f752f181e66f0d1053 upstream. Endpoints are normally deleted from the bandwidth list when they are dropped, before the virt device is freed. If xHC host is dying or being removed then the endpoints aren't dropped cleanly due to functions returning early to avoid interacting with a non-accessible host controller. So check and delete endpoints that are still on the bandwidth list when freeing the virt device. Solves a list_del corruption kernel crash when unbinding xhci-pci, caused by xhci_mem_cleanup() when it later tried to delete already freed endpoints from the bandwidth list. This only affects hosts that use software bandwidth checking, which currenty is only the xHC in intel Panther Point PCH (Ivy Bridge) Cc: stable@vger.kernel.org Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20221024142720.4122053-5-mathias.nyman@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:53 +09:00
Tony O'Brien	914704e0d2	mtd: rawnand: marvell: Use correct logic for nand-keep-config commit ce107713b722af57c4b7f2477594d445b496420e upstream. Originally the absence of the marvell,nand-keep-config property caused the setup_data_interface function to be provided. However when setup_data_interface was moved into nand_controller_ops the logic was unintentionally inverted. Update the logic so that only if the marvell,nand-keep-config property is present the bootloader NAND config kept. Cc: stable@vger.kernel.org Fixes: 7a08dbaedd36 ("mtd: rawnand: Move ->setup_data_interface() to nand_controller_ops") Signed-off-by: Tony O'Brien <tony.obrien@alliedtelesis.co.nz> Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20220927024728.28447-1-chris.packham@alliedtelesis.co.nz Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:53 +09:00
Jens Glathe	5d36037b22	usb: xhci: add XHCI_SPURIOUS_SUCCESS to ASM1042 despite being a V0.96 controller commit 4f547472380136718b56064ea5689a61e135f904 upstream. This appears to fix the error: "xhci_hcd <address>; ERROR Transfer event TRB DMA ptr not part of current TD ep_index 2 comp_code 13" that appear spuriously (or pretty often) when using a r8152 USB3 ethernet adapter with integrated hub. ASM1042 reports as a 0.96 controller, but appears to behave more like 1.0 Inspired by this email thread: https://markmail.org/thread/7vzqbe7t6du6qsw3 Cc: stable@vger.kernel.org Signed-off-by: Jens Glathe <jens.glathe@oldschoolsolutions.biz> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20221024142720.4122053-2-mathias.nyman@intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:53 +09:00
Justin Chen	7b7a0d5433	usb: bdc: change state when port disconnected commit fb8f60dd1b67520e0e0d7978ef17d015690acfc1 upstream. When port is connected and then disconnected, the state stays as configured. Which is incorrect as the port is no longer configured, but in a not attached state. Signed-off-by: Justin Chen <justinpopo6@gmail.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Fixes: efed421a94e6 ("usb: gadget: Add UDC driver for Broadcom USB3.0 device controller IP BDC") Cc: stable <stable@kernel.org> Link: https://lore.kernel.org/r/1664997235-18198-1-git-send-email-justinpopo6@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:53 +09:00
Thinh Nguyen	6827b58a95	usb: dwc3: gadget: Don't set IMI for no_interrupt commit 308c316d16cbad99bb834767382baa693ac42169 upstream. The gadget driver may have a certain expectation of how the request completion flow should be from to its configuration. Make sure the controller driver respect that. That is, don't set IMI (Interrupt on Missed Isoc) when usb_request->no_interrupt is set. Also, the driver should only set IMI to the last TRB of a chain. Fixes: 72246da40f37 ("usb: Introduce DesignWare USB3 DRD Driver") Cc: stable@vger.kernel.org Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Reviewed-by: Jeff Vanhoof <jdv1029@gmail.com> Tested-by: Jeff Vanhoof <jdv1029@gmail.com> Link: https://lore.kernel.org/r/ced336c84434571340c07994e3667a0ee284fefe.1666735451.git.Thinh.Nguyen@synopsys.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:53 +09:00
Thinh Nguyen	9aa0254303	usb: dwc3: gadget: Stop processing more requests on IMI commit f78961f8380b940e0cfc7e549336c21a2ad44f4d upstream. When servicing a transfer completion event, the dwc3 driver will reclaim TRBs of started requests up to the request associated with the interrupt event. Currently we don't check for interrupt due to missed isoc, and the driver may attempt to reclaim TRBs beyond the associated event. This causes invalid memory access when the hardware still owns the TRB. If there's a missed isoc TRB with IMI (interrupt on missed isoc), make sure to stop servicing further. Note that only the last TRB of chained TRBs has its status updated with missed isoc. Fixes: 72246da40f37 ("usb: Introduce DesignWare USB3 DRD Driver") Cc: stable@vger.kernel.org Reported-by: Jeff Vanhoof <jdv1029@gmail.com> Reported-by: Dan Vacura <w36195@motorola.com> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Reviewed-by: Jeff Vanhoof <jdv1029@gmail.com> Tested-by: Jeff Vanhoof <jdv1029@gmail.com> Link: https://lore.kernel.org/r/b29acbeab531b666095dfdafd8cb5c7654fbb3e1.1666735451.git.Thinh.Nguyen@synopsys.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:53 +09:00
Hannu Hartikainen	035dda2bfd	USB: add RESET_RESUME quirk for NVIDIA Jetson devices in RCM commit fc4ade55c617dc73c7e9756b57f3230b4ff24540 upstream. NVIDIA Jetson devices in Force Recovery mode (RCM) do not support suspending, ie. flashing fails if the device has been suspended. The devices are still visible in lsusb and seem to work otherwise, making the issue hard to debug. This has been discovered in various forum posts, eg. [1]. The patch has been tested on NVIDIA Jetson AGX Xavier, but I'm adding all the Jetson models listed in [2] on the assumption that they all behave similarly. [1]: https://forums.developer.nvidia.com/t/flashing-not-working/72365 [2]: https://docs.nvidia.com/jetson/archives/l4t-archived/l4t-3271/index.html#page/Tegra%20Linux%20Driver%20Package%20Development%20Guide/quick_start.html Signed-off-by: Hannu Hartikainen <hannu@hrtk.in> Cc: stable <stable@kernel.org> # after 6.1-rc3 Link: https://lore.kernel.org/r/20220919171610.30484-1-hannu@hrtk.in Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:53 +09:00
Jason A. Donenfeld	e4045fbcd9	ALSA: au88x0: use explicitly signed char commit ee03c0f200eb0d9f22dd8732d9fb7956d91019c2 upstream. With char becoming unsigned by default, and with `char` alone being ambiguous and based on architecture, signed chars need to be marked explicitly as such. This fixes warnings like: sound/pci/au88x0/au88x0_core.c:2029 vortex_adb_checkinout() warn: signedness bug returning '(-22)' sound/pci/au88x0/au88x0_core.c:2046 vortex_adb_checkinout() warn: signedness bug returning '(-12)' sound/pci/au88x0/au88x0_core.c:2125 vortex_adb_allocroute() warn: 'vortex_adb_checkinout(vortex, (0), en, 0)' is unsigned sound/pci/au88x0/au88x0_core.c:2170 vortex_adb_allocroute() warn: 'vortex_adb_checkinout(vortex, stream->resources, en, 4)' is unsigned As well, since one function returns errnos, return an `int` rather than a `signed char`. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20221024162929.536004-1-Jason@zx2c4.com Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:53 +09:00
Steven Rostedt (Google)	d853b43808	ALSA: Use del_timer_sync() before freeing timer commit f0a868788fcbf63cdab51f5adcf73b271ede8164 upstream. The current code for freeing the emux timer is extremely dangerous: CPU0 CPU1 ---- ---- snd_emux_timer_callback() snd_emux_free() spin_lock(&emu->voice_lock) del_timer(&emu->tlist); <-- returns immediately spin_unlock(&emu->voice_lock); [..] kfree(emu); spin_lock(&emu->voice_lock); [BOOM!] Instead just use del_timer_sync() which will wait for the timer to finish before continuing. No need to check if the timer is active or not when doing so. This doesn't fix the race of a possible re-arming of the timer, but at least it won't use the data that has just been freed. [ Fixed unused variable warning by tiwai ] Cc: stable@vger.kernel.org Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Link: https://lore.kernel.org/r/20221026231236.6834b551@gandalf.local.home Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:53 +09:00
Anssi Hannula	caea5b20ef	can: kvaser_usb: Fix possible completions during init_completion commit 2871edb32f4622c3a25ce4b3977bad9050b91974 upstream. kvaser_usb uses completions to signal when a response event is received for outgoing commands. However, it uses init_completion() to reinitialize the start_comp and stop_comp completions before sending the start/stop commands. In case the device sends the corresponding response just before the actual command is sent, complete() may be called concurrently with init_completion() which is not safe. This might be triggerable even with a properly functioning device by stopping the interface (CMD_STOP_CHIP) just after it goes bus-off (which also causes the driver to send CMD_STOP_CHIP when restart-ms is off), but that was not tested. Fix the issue by using reinit_completion() instead. Fixes: 080f40a6fa28 ("can: kvaser_usb: Add support for Kvaser CAN/USB devices") Tested-by: Jimmy Assarsson <extja@kvaser.com> Signed-off-by: Anssi Hannula <anssi.hannula@bitwise.fi> Signed-off-by: Jimmy Assarsson <extja@kvaser.com> Link: https://lore.kernel.org/all/20221010185237.319219-2-extja@kvaser.com Cc: stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:53 +09:00
Yang Yingliang	5437642f91	can: j1939: transport: j1939_session_skb_drop_old(): spin_unlock_irqrestore() before kfree_skb() commit c3c06c61890da80494bb196f75d89b791adda87f upstream. It is not allowed to call kfree_skb() from hardware interrupt context or with interrupts being disabled. The skb is unlinked from the queue, so it can be freed after spin_unlock_irqrestore(). Fixes: 9d71dd0c7009 ("can: add support of SAE J1939 protocol") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://lore.kernel.org/all/20221027091237.2290111-1-yangyingliang@huawei.com Cc: stable@vger.kernel.org [mkl: adjust subject] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-03 23:56:52 +09:00
Greg Kroah-Hartman	5282d4de78	Linux 5.4.222 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> v5.4.222	2022-11-01 19:06:42 +01:00
Greg Kroah-Hartman	59f89518f5	once: fix section mismatch on clang builds On older kernels (5.4 and older), building the kernel with clang can cause the section name to end up with "" in them, which can cause lots of runtime issues as that is not normally a valid portion of the string. This was fixed up in newer kernels with commit 33def8498fdd ("treewide: Convert macro and uses of __section(foo) to __section("foo")") but that's too heavy-handed for older kernels. So for now, fix up the problem that commit 62c07983bef9 ("once: add DO_ONCE_SLOW() for sleepable contexts") caused by being backported by removing the "" characters from the section definition. Reported-by: Oleksandr Tymoshenko <ovt@google.com> Reported-by: Yongqin Liu <yongqin.liu@linaro.org> Tested-by: Yongqin Liu <yongqin.liu@linaro.org> Cc: Naresh Kamboju <naresh.kamboju@linaro.org> Link: https://lore.kernel.org/r/20221029011211.4049810-1-ovt@google.com Link: https://lore.kernel.org/r/CAMSo37XApZ_F5nSQYWFsSqKdMv_gBpfdKG3KN1TDB+QNXqSh0A@mail.gmail.com Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Eric Dumazet <edumazet@google.com> Cc: Willy Tarreau <w@1wt.eu> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David S. Miller <davem@davemloft.net> Cc: Sasha Levin <sashal@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-11-01 19:06:42 +01:00
Greg Kroah-Hartman	b70bfeb986	Linux 5.4.221 Link: https://lore.kernel.org/r/20221027165049.817124510@linuxfoundation.org Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> v5.4.221	2022-10-29 10:20:36 +02:00
Seth Jenkins	6bb8769326	mm: /proc/pid/smaps_rollup: fix no vma's null-deref Commit 258f669e7e88 ("mm: /proc/pid/smaps_rollup: convert to single value seq_file") introduced a null-deref if there are no vma's in the task in show_smaps_rollup. Fixes: 258f669e7e88 ("mm: /proc/pid/smaps_rollup: convert to single value seq_file") Signed-off-by: Seth Jenkins <sethjenkins@google.com> Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com> Tested-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-10-29 10:20:36 +02:00
Gaurav Kohli	a351077e58	hv_netvsc: Fix race between VF offering and VF association message from host commit 365e1ececb2905f94cc10a5817c5b644a32a3ae2 upstream. During vm boot, there might be possibility that vf registration call comes before the vf association from host to vm. And this might break netvsc vf path, To prevent the same block vf registration until vf bind message comes from host. Cc: stable@vger.kernel.org Fixes: 00d7ddba11436 ("hv_netvsc: pair VF based on serial number") Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: Gaurav Kohli <gauravkohli@linux.microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-10-29 10:20:36 +02:00
Nick Desaulniers	2f1b3377b6	Makefile.debug: re-enable debug info for .S files This is _not_ an upstream commit and just for 5.4.y only. It is based on commit 32ef9e5054ec0321b9336058c58ec749e9c6b0fe upstream. Alexey reported that the fraction of unknown filename instances in kallsyms grew from ~0.3% to ~10% recently; Bill and Greg tracked it down to assembler defined symbols, which regressed as a result of: commit b8a9092330da ("Kbuild: do not emit debug info for assembly with LLVM_IAS=1") In that commit, I allude to restoring debug info for assembler defined symbols in a follow up patch, but it seems I forgot to do so in commit a66049e2cf0e ("Kbuild: make DWARF version a choice") Fixes: b8a9092330da ("Kbuild: do not emit debug info for assembly with LLVM_IAS=1") Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-10-29 10:20:36 +02:00
Werner Sembach	9220881831	ACPI: video: Force backlight native for more TongFang devices commit 3dbc80a3e4c55c4a5b89ef207bed7b7de36157b4 upstream. This commit is very different from the upstream commit! It fixes the same issue by adding more quirks, rather then the general fix from the 6.1 kernel, because the general fix from the 6.1 kernel is part of a larger refactoring of the backlight code which is not suitable for the stable series. As described in "ACPI: video: Drop NL5x?U, PF4NU1F and PF5?U?? acpi_backlight=native quirks" (10212754a0d2) the upstream commit "ACPI: video: Make backlight class device registration a separate step (v2)" (3dbc80a3e4c5) makes these quirks unnecessary. However as mentioned in this bugtracker ticket https://bugzilla.kernel.org/show_bug.cgi?id=215683#c17 the upstream fix is part of a larger patchset that is overall too complex for stable. The TongFang GKxNRxx, GMxNGxx, GMxZGxx, and GMxRGxx / TUXEDO Stellaris/Polaris Gen 1-4, have the same problem as the Clevo NL5xRU and NL5xNU / TUXEDO Aura 15 Gen1 and Gen2: They have a working native and video interface for screen backlight. However the default detection mechanism first registers the video interface before unregistering it again and switching to the native interface during boot. This results in a dangling SBIOS request for backlight change for some reason, causing the backlight to switch to ~2% once per boot on the first power cord connect or disconnect event. Setting the native interface explicitly circumvents this buggy behaviour by avoiding the unregistering process. Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Werner Sembach <wse@tuxedocomputers.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-10-29 10:20:36 +02:00
Conor Dooley	8ad8fc82ee	riscv: topology: fix default topology reporting commit fbd92809997a391f28075f1c8b5ee314c225557c upstream. RISC-V has no sane defaults to fall back on where there is no cpu-map in the devicetree. Without sane defaults, the package, core and thread IDs are all set to -1. This causes user-visible inaccuracies for tools like hwloc/lstopo which rely on the sysfs cpu topology files to detect a system's topology. On a PolarFire SoC, which should have 4 harts with a thread each, lstopo currently reports: Machine (793MB total) Package L#0 NUMANode L#0 (P#0 793MB) Core L#0 L1d L#0 (32KB) + L1i L#0 (32KB) + PU L#0 (P#0) L1d L#1 (32KB) + L1i L#1 (32KB) + PU L#1 (P#1) L1d L#2 (32KB) + L1i L#2 (32KB) + PU L#2 (P#2) L1d L#3 (32KB) + L1i L#3 (32KB) + PU L#3 (P#3) Adding calls to store_cpu_topology() in {boot,smp} hart bringup code results in the correct topolgy being reported: Machine (793MB total) Package L#0 NUMANode L#0 (P#0 793MB) L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0 + PU L#0 (P#0) L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1 + PU L#1 (P#1) L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2 + PU L#2 (P#2) L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3 + PU L#3 (P#3) CC: stable@vger.kernel.org # 456797da792f: arm64: topology: move store_cpu_topology() to shared code Fixes: 03f11f03dbfe ("RISC-V: Parse cpu topology during boot.") Reported-by: Brice Goglin <Brice.Goglin@inria.fr> Link: https://github.com/open-mpi/hwloc/issues/536 Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-10-29 10:20:36 +02:00
Conor Dooley	60dd3dc2ac	arm64: topology: move store_cpu_topology() to shared code commit 456797da792fa7cbf6698febf275fe9b36691f78 upstream. arm64's method of defining a default cpu topology requires only minimal changes to apply to RISC-V also. The current arm64 implementation exits early in a uniprocessor configuration by reading MPIDR & claiming that uniprocessor can rely on the default values. This is appears to be a hangover from prior to '3102bc0e6ac7 ("arm64: topology: Stop using MPIDR for topology information")', because the current code just assigns default values for multiprocessor systems. With the MPIDR references removed, store_cpu_topolgy() can be moved to the common arch_topology code. Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-10-29 10:20:36 +02:00
Jerry Snitselaar	724483b585	iommu/vt-d: Clean up si_domain in the init_dmars() error path [ Upstream commit 620bf9f981365c18cc2766c53d92bf8131c63f32 ] A splat from kmem_cache_destroy() was seen with a kernel prior to commit ee2653bbe89d ("iommu/vt-d: Remove domain and devinfo mempool") when there was a failure in init_dmars(), because the iommu_domain cache still had objects. While the mempool code is now gone, there still is a leak of the si_domain memory if init_dmars() fails. So clean up si_domain in the init_dmars() error path. Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Will Deacon <will@kernel.org> Cc: Robin Murphy <robin.murphy@arm.com> Fixes: 86080ccc223a ("iommu/vt-d: Allocate si_domain in init_dmars()") Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com> Link: https://lore.kernel.org/r/20221010144842.308890-1-jsnitsel@redhat.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-29 10:20:35 +02:00
Yang Yingliang	dfc0337c6d	net: hns: fix possible memory leak in hnae_ae_register() [ Upstream commit ff2f5ec5d009844ec28f171123f9e58750cef4bf ] Inject fault while probing module, if device_register() fails, but the refcount of kobject is not decreased to 0, the name allocated in dev_set_name() is leaked. Fix this by calling put_device(), so that name can be freed in callback function kobject_cleanup(). unreferenced object 0xffff00c01aba2100 (size 128): comm "systemd-udevd", pid 1259, jiffies 4294903284 (age 294.152s) hex dump (first 32 bytes): 68 6e 61 65 30 00 00 00 18 21 ba 1a c0 00 ff ff hnae0....!...... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<0000000034783f26>] slab_post_alloc_hook+0xa0/0x3e0 [<00000000748188f2>] __kmem_cache_alloc_node+0x164/0x2b0 [<00000000ab0743e8>] __kmalloc_node_track_caller+0x6c/0x390 [<000000006c0ffb13>] kvasprintf+0x8c/0x118 [<00000000fa27bfe1>] kvasprintf_const+0x60/0xc8 [<0000000083e10ed7>] kobject_set_name_vargs+0x3c/0xc0 [<000000000b87affc>] dev_set_name+0x7c/0xa0 [<000000003fd8fe26>] hnae_ae_register+0xcc/0x190 [hnae] [<00000000fe97edc9>] hns_dsaf_ae_init+0x9c/0x108 [hns_dsaf] [<00000000c36ff1eb>] hns_dsaf_probe+0x548/0x748 [hns_dsaf] Fixes: 6fe6611ff275 ("net: add Hisilicon Network Subsystem hnae framework support") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20221018122451.1749171-1-yangyingliang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-29 10:20:35 +02:00
Zhengchao Shao	bc8301ea7e	net: sched: cake: fix null pointer access issue when cake_init() fails [ Upstream commit 51f9a8921ceacd7bf0d3f47fa867a64988ba1dcb ] When the default qdisc is cake, if the qdisc of dev_queue fails to be inited during mqprio_init(), cake_reset() is invoked to clear resources. In this case, the tins is NULL, and it will cause gpf issue. The process is as follows: qdisc_create_dflt() cake_init() q->tins = kvcalloc(...) --->failed, q->tins is NULL ... qdisc_put() ... cake_reset() ... cake_dequeue_one() b = &q->tins[...] --->q->tins is NULL The following is the Call Trace information: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] RIP: 0010:cake_dequeue_one+0xc9/0x3c0 Call Trace: <TASK> cake_reset+0xb1/0x140 qdisc_reset+0xed/0x6f0 qdisc_destroy+0x82/0x4c0 qdisc_put+0x9e/0xb0 qdisc_create_dflt+0x2c3/0x4a0 mqprio_init+0xa71/0x1760 qdisc_create+0x3eb/0x1000 tc_modify_qdisc+0x408/0x1720 rtnetlink_rcv_msg+0x38e/0xac0 netlink_rcv_skb+0x12d/0x3a0 netlink_unicast+0x4a2/0x740 netlink_sendmsg+0x826/0xcc0 sock_sendmsg+0xc5/0x100 ____sys_sendmsg+0x583/0x690 ___sys_sendmsg+0xe8/0x160 __sys_sendmsg+0xbf/0x160 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7f89e5122d04 </TASK> Fixes: 046f6fd5daef ("sched: Add Common Applications Kept Enhanced (cake) qdisc") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-29 10:20:35 +02:00
Harini Katakam	b87f88d58f	net: phy: dp83867: Extend RX strap quirk for SGMII mode [ Upstream commit 0c9efbd5c50c64ead434960a404c9c9a097b0403 ] When RX strap in HW is not set to MODE 3 or 4, bit 7 and 8 in CF4 register should be set. The former is already handled in dp83867_config_init; add the latter in SGMII specific initialization. Fixes: 2a10154abcb7 ("net: phy: dp83867: Add TI dp83867 phy") Signed-off-by: Harini Katakam <harini.katakam@amd.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-29 10:20:35 +02:00
Xiaobo Liu	6453077a00	net/atm: fix proc_mpc_write incorrect return value [ Upstream commit d8bde3bf7f82dac5fc68a62c2816793a12cafa2a ] Then the input contains '\0' or '\n', proc_mpc_write has read them, so the return value needs +1. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Xiaobo Liu <cppcoffee@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-29 10:20:35 +02:00
José Expósito	4258c473ee	HID: magicmouse: Do not set BTN_MOUSE on double report [ Upstream commit bb5f0c855dcfc893ae5ed90e4c646bde9e4498bf ] Under certain conditions the Magic Trackpad can group 2 reports in a single packet. The packet is split and the raw event function is invoked recursively for each part. However, after processing each part, the BTN_MOUSE status is updated, sending multiple click events. [1] Return after processing double reports to avoid this issue. Link: https://gitlab.freedesktop.org/libinput/libinput/-/issues/811 # [1] Fixes: a462230e16ac ("HID: magicmouse: enable Magic Trackpad support") Reported-by: Nulo <git@nulo.in> Signed-off-by: José Expósito <jose.exposito89@gmail.com> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Link: https://lore.kernel.org/r/20221009182747.90730-1-jose.exposito89@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-29 10:20:35 +02:00
Alexander Potapenko	567f8de358	tipc: fix an information leak in tipc_topsrv_kern_subscr [ Upstream commit 777ecaabd614d47c482a5c9031579e66da13989a ] Use a 8-byte write to initialize sub.usr_handle in tipc_topsrv_kern_subscr(), otherwise four bytes remain uninitialized when issuing setsockopt(..., SOL_TIPC, ...). This resulted in an infoleak reported by KMSAN when the packet was received: ===================================================== BUG: KMSAN: kernel-infoleak in copyout+0xbc/0x100 lib/iov_iter.c:169 instrument_copy_to_user ./include/linux/instrumented.h:121 copyout+0xbc/0x100 lib/iov_iter.c:169 _copy_to_iter+0x5c0/0x20a0 lib/iov_iter.c:527 copy_to_iter ./include/linux/uio.h:176 simple_copy_to_iter+0x64/0xa0 net/core/datagram.c:513 __skb_datagram_iter+0x123/0xdc0 net/core/datagram.c:419 skb_copy_datagram_iter+0x58/0x200 net/core/datagram.c:527 skb_copy_datagram_msg ./include/linux/skbuff.h:3903 packet_recvmsg+0x521/0x1e70 net/packet/af_packet.c:3469 ____sys_recvmsg+0x2c4/0x810 net/socket.c:? ___sys_recvmsg+0x217/0x840 net/socket.c:2743 __sys_recvmsg net/socket.c:2773 __do_sys_recvmsg net/socket.c:2783 __se_sys_recvmsg net/socket.c:2780 __x64_sys_recvmsg+0x364/0x540 net/socket.c:2780 do_syscall_x64 arch/x86/entry/common.c:50 do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd arch/x86/entry/entry_64.S:120 ... Uninit was stored to memory at: tipc_sub_subscribe+0x42d/0xb50 net/tipc/subscr.c:156 tipc_conn_rcv_sub+0x246/0x620 net/tipc/topsrv.c:375 tipc_topsrv_kern_subscr+0x2e8/0x400 net/tipc/topsrv.c:579 tipc_group_create+0x4e7/0x7d0 net/tipc/group.c:190 tipc_sk_join+0x2a8/0x770 net/tipc/socket.c:3084 tipc_setsockopt+0xae5/0xe40 net/tipc/socket.c:3201 __sys_setsockopt+0x87f/0xdc0 net/socket.c:2252 __do_sys_setsockopt net/socket.c:2263 __se_sys_setsockopt net/socket.c:2260 __x64_sys_setsockopt+0xe0/0x160 net/socket.c:2260 do_syscall_x64 arch/x86/entry/common.c:50 do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd arch/x86/entry/entry_64.S:120 Local variable sub created at: tipc_topsrv_kern_subscr+0x57/0x400 net/tipc/topsrv.c:562 tipc_group_create+0x4e7/0x7d0 net/tipc/group.c:190 Bytes 84-87 of 88 are uninitialized Memory access of size 88 starts at ffff88801ed57cd0 Data copied to user address 0000000020000400 ... ===================================================== Signed-off-by: Alexander Potapenko <glider@google.com> Fixes: 026321c6d056a5 ("tipc: rename tipc_server to tipc_topsrv") Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-29 10:20:35 +02:00
Mark Tomlinson	27ee73c119	tipc: Fix recognition of trial period [ Upstream commit 28be7ca4fcfd69a2d52aaa331adbf9dbe91f9e6e ] The trial period exists until jiffies is after addr_trial_end. But as jiffies will eventually overflow, just using time_after will eventually give incorrect results. As the node address is set once the trial period ends, this can be used to know that we are not in the trial period. Fixes: e415577f57f4 ("tipc: correct discovery message handling during address trial period") Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-29 10:20:35 +02:00
Tony Luck	fa0676d94f	ACPI: extlog: Handle multiple records [ Upstream commit f6ec01da40e4139b41179f046044ee7c4f6370dc ] If there is no user space consumer of extlog_mem trace records, then Linux properly handles multiple error records in an ELOG block extlog_print() print_extlog_rcd() __print_extlog_rcd() cper_estatus_print() apei_estatus_for_each_section() But the other code path hard codes looking for a single record to output a trace record. Fix by using the same apei_estatus_for_each_section() iterator to step over all records. Fixes: 2dfb7d51a61d ("trace, RAS: Add eMCA trace event interface") Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-29 10:20:35 +02:00
Filipe Manana	13a2719ec8	btrfs: fix processing of delayed tree block refs during backref walking [ Upstream commit 943553ef9b51db303ab2b955c1025261abfdf6fb ] During backref walking, when processing a delayed reference with a type of BTRFS_TREE_BLOCK_REF_KEY, we have two bugs there: 1) We are accessing the delayed references extent_op, and its key, without the protection of the delayed ref head's lock; 2) If there's no extent op for the delayed ref head, we end up with an uninitialized key in the stack, variable 'tmp_op_key', and then pass it to add_indirect_ref(), which adds the reference to the indirect refs rb tree. This is wrong, because indirect references should have a NULL key when we don't have access to the key, and in that case they should be added to the indirect_missing_keys rb tree and not to the indirect rb tree. This means that if have BTRFS_TREE_BLOCK_REF_KEY delayed ref resulting from freeing an extent buffer, therefore with a count of -1, it will not cancel out the corresponding reference we have in the extent tree (with a count of 1), since both references end up in different rb trees. When using fiemap, where we often need to check if extents are shared through shared subtrees resulting from snapshots, it means we can incorrectly report an extent as shared when it's no longer shared. However this is temporary because after the transaction is committed the extent is no longer reported as shared, as running the delayed reference results in deleting the tree block reference from the extent tree. Outside the fiemap context, the result is unpredictable, as the key was not initialized but it's used when navigating the rb trees to insert and search for references (prelim_ref_compare()), and we expect all references in the indirect rb tree to have valid keys. The following reproducer triggers the second bug: $ cat test.sh #!/bin/bash DEV=/dev/sdj MNT=/mnt/sdj mkfs.btrfs -f $DEV mount -o compress $DEV $MNT # With a compressed 128M file we get a tree height of 2 (level 1 root). xfs_io -f -c "pwrite -b 1M 0 128M" $MNT/foo btrfs subvolume snapshot $MNT $MNT/snap # Fiemap should output 0x2008 in the flags column. # 0x2000 means shared extent # 0x8 means encoded extent (because it's compressed) echo echo "fiemap after snapshot, range [120M, 120M + 128K):" xfs_io -c "fiemap -v 120M 128K" $MNT/foo echo # Overwrite one extent and fsync to flush delalloc and COW a new path # in the snapshot's tree. # # After this we have a BTRFS_DROP_DELAYED_REF delayed ref of type # BTRFS_TREE_BLOCK_REF_KEY with a count of -1 for every COWed extent # buffer in the path. # # In the extent tree we have inline references of type # BTRFS_TREE_BLOCK_REF_KEY, with a count of 1, for the same extent # buffers, so they should cancel each other, and the extent buffers in # the fs tree should no longer be considered as shared. # echo "Overwriting file range [120M, 120M + 128K)..." xfs_io -c "pwrite -b 128K 120M 128K" $MNT/snap/foo xfs_io -c "fsync" $MNT/snap/foo # Fiemap should output 0x8 in the flags column. The extent in the range # [120M, 120M + 128K) is no longer shared, it's now exclusive to the fs # tree. echo echo "fiemap after overwrite range [120M, 120M + 128K):" xfs_io -c "fiemap -v 120M 128K" $MNT/foo echo umount $MNT Running it before this patch: $ ./test.sh (...) wrote 134217728/134217728 bytes at offset 0 128 MiB, 128 ops; 0.1152 sec (1.085 GiB/sec and 1110.5809 ops/sec) Create a snapshot of '/mnt/sdj' in '/mnt/sdj/snap' fiemap after snapshot, range [120M, 120M + 128K): /mnt/sdj/foo: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [245760..246015]: 34304..34559 256 0x2008 Overwriting file range [120M, 120M + 128K)... wrote 131072/131072 bytes at offset 125829120 128 KiB, 1 ops; 0.0001 sec (683.060 MiB/sec and 5464.4809 ops/sec) fiemap after overwrite range [120M, 120M + 128K): /mnt/sdj/foo: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [245760..246015]: 34304..34559 256 0x2008 The extent in the range [120M, 120M + 128K) is still reported as shared (0x2000 bit set) after overwriting that range and flushing delalloc, which is not correct - an entire path was COWed in the snapshot's tree and the extent is now only referenced by the original fs tree. Running it after this patch: $ ./test.sh (...) wrote 134217728/134217728 bytes at offset 0 128 MiB, 128 ops; 0.1198 sec (1.043 GiB/sec and 1068.2067 ops/sec) Create a snapshot of '/mnt/sdj' in '/mnt/sdj/snap' fiemap after snapshot, range [120M, 120M + 128K): /mnt/sdj/foo: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [245760..246015]: 34304..34559 256 0x2008 Overwriting file range [120M, 120M + 128K)... wrote 131072/131072 bytes at offset 125829120 128 KiB, 1 ops; 0.0001 sec (694.444 MiB/sec and 5555.5556 ops/sec) fiemap after overwrite range [120M, 120M + 128K): /mnt/sdj/foo: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [245760..246015]: 34304..34559 256 0x8 Now the extent is not reported as shared anymore. So fix this by passing a NULL key pointer to add_indirect_ref() when processing a delayed reference for a tree block if there's no extent op for our delayed ref head with a defined key. Also access the extent op only after locking the delayed ref head's lock. The reproducer will be converted later to a test case for fstests. Fixes: 86d5f994425252 ("btrfs: convert prelimary reference tracking to use rbtrees") Fixes: a6dbceafb915e8 ("btrfs: Remove unused op_key var from add_delayed_refs") Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-29 10:20:35 +02:00
Filipe Manana	b397ce3477	btrfs: fix processing of delayed data refs during backref walking [ Upstream commit 4fc7b57228243d09c0d878873bf24fa64a90fa01 ] When processing delayed data references during backref walking and we are using a share context (we are being called through fiemap), whenever we find a delayed data reference for an inode different from the one we are interested in, then we immediately exit and consider the data extent as shared. This is wrong, because: 1) This might be a DROP reference that will cancel out a reference in the extent tree; 2) Even if it's an ADD reference, it may be followed by a DROP reference that cancels it out. In either case we should not exit immediately. Fix this by never exiting when we find a delayed data reference for another inode - instead add the reference and if it does not cancel out other delayed reference, we will exit early when we call extent_is_shared() after processing all delayed references. If we find a drop reference, then signal the code that processes references from the extent tree (add_inline_refs() and add_keyed_refs()) to not exit immediately if it finds there a reference for another inode, since we have delayed drop references that may cancel it out. In this later case we exit once we don't have references in the rb trees that cancel out each other and have two references for different inodes. Example reproducer for case 1): $ cat test-1.sh #!/bin/bash DEV=/dev/sdj MNT=/mnt/sdj mkfs.btrfs -f $DEV mount $DEV $MNT xfs_io -f -c "pwrite 0 64K" $MNT/foo cp --reflink=always $MNT/foo $MNT/bar echo echo "fiemap after cloning:" xfs_io -c "fiemap -v" $MNT/foo rm -f $MNT/bar echo echo "fiemap after removing file bar:" xfs_io -c "fiemap -v" $MNT/foo umount $MNT Running it before this patch, the extent is still listed as shared, it has the flag 0x2000 (FIEMAP_EXTENT_SHARED) set: $ ./test-1.sh fiemap after cloning: /mnt/sdj/foo: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..127]: 26624..26751 128 0x2001 fiemap after removing file bar: /mnt/sdj/foo: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..127]: 26624..26751 128 0x2001 Example reproducer for case 2): $ cat test-2.sh #!/bin/bash DEV=/dev/sdj MNT=/mnt/sdj mkfs.btrfs -f $DEV mount $DEV $MNT xfs_io -f -c "pwrite 0 64K" $MNT/foo cp --reflink=always $MNT/foo $MNT/bar # Flush delayed references to the extent tree and commit current # transaction. sync echo echo "fiemap after cloning:" xfs_io -c "fiemap -v" $MNT/foo rm -f $MNT/bar echo echo "fiemap after removing file bar:" xfs_io -c "fiemap -v" $MNT/foo umount $MNT Running it before this patch, the extent is still listed as shared, it has the flag 0x2000 (FIEMAP_EXTENT_SHARED) set: $ ./test-2.sh fiemap after cloning: /mnt/sdj/foo: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..127]: 26624..26751 128 0x2001 fiemap after removing file bar: /mnt/sdj/foo: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..127]: 26624..26751 128 0x2001 After this patch, after deleting bar in both tests, the extent is not reported with the 0x2000 flag anymore, it gets only the flag 0x1 (which is FIEMAP_EXTENT_LAST): $ ./test-1.sh fiemap after cloning: /mnt/sdj/foo: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..127]: 26624..26751 128 0x2001 fiemap after removing file bar: /mnt/sdj/foo: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..127]: 26624..26751 128 0x1 $ ./test-2.sh fiemap after cloning: /mnt/sdj/foo: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..127]: 26624..26751 128 0x2001 fiemap after removing file bar: /mnt/sdj/foo: EXT: FILE-OFFSET BLOCK-RANGE TOTAL FLAGS 0: [0..127]: 26624..26751 128 0x1 These tests will later be converted to a test case for fstests. Fixes: dc046b10c8b7d4 ("Btrfs: make fiemap not blow when you have lots of snapshots") Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-10-29 10:20:35 +02:00

1 2 3 4 5 ...

894295 Commits