linux

iv/linux

Author	SHA1	Message	Date
Nicholas Piggin	f4b2bfed80	KVM: do not allow mapping valid but non-reference-counted pages commit f8be156be163a052a067306417cd0ff679068c97 upstream. It's possible to create a region which maps valid but non-refcounted pages (e.g., tail pages of non-compound higher order allocations). These host pages can then be returned by gfn_to_page, gfn_to_pfn, etc., family of APIs, which take a reference to the page, which takes it from 0 to 1. When the reference is dropped, this will free the page incorrectly. Fix this by only taking a reference on valid pages if it was non-zero, which indicates it is participating in normal refcounting (and can be released with put_page). This addresses CVE-2021-22543. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Tested-by: Paolo Bonzini <pbonzini@redhat.com> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:43 +01:00
Sean Christopherson	29efa6b0ba	KVM: Use kvm_pfn_t for local PFN variable in hva_to_pfn_remapped() commit a9545779ee9e9e103648f6f2552e73cfe808d0f4 upstream. Use kvm_pfn_t, a.k.a. u64, for the local 'pfn' variable when retrieving a so called "remapped" hva/pfn pair. In theory, the hva could resolve to a pfn in high memory on a 32-bit kernel. This bug was inadvertantly exposed by commit bd2fae8da794 ("KVM: do not assume PTE is writable after follow_pfn"), which added an error PFN value to the mix, causing gcc to comlain about overflowing the unsigned long. arch/x86/kvm/../../../virt/kvm/kvm_main.c: In function ‘hva_to_pfn_remapped’: include/linux/kvm_host.h:89:30: error: conversion from ‘long long unsigned int’ to ‘long unsigned int’ changes value from ‘9218868437227405314’ to ‘2’ [-Werror=overflow] 89 \| #define KVM_PFN_ERR_RO_FAULT (KVM_PFN_ERR_MASK + 2) \| ^ virt/kvm/kvm_main.c:1935:9: note: in expansion of macro ‘KVM_PFN_ERR_RO_FAULT’ Cc: stable@vger.kernel.org Fixes: add6a0cd1c5b ("KVM: MMU: try to fix up page faults before giving up") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210208201940.1258328-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:43 +01:00
Paolo Bonzini	854a6e01f0	KVM: do not assume PTE is writable after follow_pfn commit bd2fae8da794b55bf2ac02632da3a151b10e664c upstream. In order to convert an HVA to a PFN, KVM usually tries to use the get_user_pages family of functinso. This however is not possible for VM_IO vmas; in that case, KVM instead uses follow_pfn. In doing this however KVM loses the information on whether the PFN is writable. That is usually not a problem because the main use of VM_IO vmas with KVM is for BARs in PCI device assignment, however it is a bug. To fix it, use follow_pte and check pte_write while under the protection of the PTE lock. The information can be used to fail hva_to_pfn_remapped or passed back to the caller via *writable. Usage of follow_pfn was introduced in commit add6a0cd1c5b ("KVM: MMU: try to fix up page faults before giving up", 2016-07-05); however, even older version have the same issue, all the way back to commit 2e2e3738af33 ("KVM: Handle vma regions with no backing page", 2008-07-20), as they also did not check whether the PFN was writable. Fixes: 2e2e3738af33 ("KVM: Handle vma regions with no backing page") Reported-by: David Stevens <stevensd@google.com> Cc: 3pvd@google.com Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: stable@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> [OP: backport to 4.19, adjust follow_pte() -> follow_pte_pmd()] Signed-off-by: Ovidiu Panait <ovidiu.panait@windriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backport to 4.9: follow_pte_pmd() does not take start or end parameters] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:43 +01:00
Ross Zwisler	0dd4d649a4	mm: add follow_pte_pmd() commit 097963959594c5eccaba42510f7033f703211bda upstream. Patch series "Write protect DAX PMDs in sync path". Currently dax_mapping_entry_mkclean() fails to clean and write protect the pmd_t of a DAX PMD entry during an sync operation. This can result in data loss, as detailed in patch 2. This series is based on Dan's "libnvdimm-pending" branch, which is the current home for Jan's "dax: Page invalidation fixes" series. You can find a working tree here: https://git.kernel.org/cgit/linux/kernel/git/zwisler/linux.git/log/?h=dax_pmd_clean This patch (of 2): Similar to follow_pte(), follow_pte_pmd() allows either a PTE leaf or a huge page PMD leaf to be found and returned. Link: http://lkml.kernel.org/r/1482272586-21177-2-git-send-email-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Suggested-by: Dave Hansen <dave.hansen@intel.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [bwh: Backported to 4.9: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:43 +01:00
Davidlohr Bueso	ef2e64035f	lib/timerqueue: Rely on rbtree semantics for next timer commit 511885d7061eda3eb1faf3f57dcc936ff75863f1 upstream. Simplify the timerqueue code by using cached rbtrees and rely on the tree leftmost node semantics to get the timer with earliest expiration time. This is a drop in conversion, and therefore semantics remain untouched. The runtime overhead of cached rbtrees is be pretty much the same as the current head->next method, noting that when removing the leftmost node, a common operation for the timerqueue, the rb_next(leftmost) is O(1) as well, so the next timer will either be the right node or its parent. Therefore no extra pointer chasing. Finally, the size of the struct timerqueue_head remains the same. Passes several hours of rcutorture. Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190724152323.bojciei3muvfxalm@linux-r8p5 [bwh: While this was supposed to be just refactoring, it also fixed a security flaw (CVE-2021-20317). Backported to 4.9: - Deleted code in timerqueue_del() is different before commit d852d39432f5 "timerqueue: Use rb_entry_safe() instead of open-coding it" - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:42 +01:00
Davidlohr Bueso	c89a768024	rbtree: cache leftmost node internally commit cd9e61ed1eebbcd5dfad59475d41ec58d9b64b6a upstream. Patch series "rbtree: Cache leftmost node internally", v4. A series to extending rbtrees to internally cache the leftmost node such that we can have fast overlap check optimization for all interval tree users[1]. The benefits of this series are that: (i) Unify users that do internal leftmost node caching. (ii) Optimize all interval tree users. (iii) Convert at least two new users (epoll and procfs) to the new interface. This patch (of 16): Red-black tree semantics imply that nodes with smaller or greater (or equal for duplicates) keys always be to the left and right, respectively. For the kernel this is extremely evident when considering our rb_first() semantics. Enabling lookups for the smallest node in the tree in O(1) can save a good chunk of cycles in not having to walk down the tree each time. To this end there are a few core users that explicitly do this, such as the scheduler and rtmutexes. There is also the desire for interval trees to have this optimization allowing faster overlap checking. This patch introduces a new 'struct rb_root_cached' which is just the root with a cached pointer to the leftmost node. The reason why the regular rb_root was not extended instead of adding a new structure was that this allows the user to have the choice between memory footprint and actual tree performance. The new wrappers on top of the regular rb_root calls are: - rb_first_cached(cached_root) -- which is a fast replacement for rb_first. - rb_insert_color_cached(node, cached_root, new) - rb_erase_cached(node, cached_root) In addition, augmented cached interfaces are also added for basic insertion and deletion operations; which becomes important for the interval tree changes. With the exception of the inserts, which adds a bool for updating the new leftmost, the interfaces are kept the same. To this end, porting rb users to the cached version becomes really trivial, and keeping current rbtree semantics for users that don't care about the optimization requires zero overhead. Link: http://lkml.kernel.org/r/20170719014603.19029-2-dave@stgolabs.net Signed-off-by: Davidlohr Bueso <dbueso@suse.de> Reviewed-by: Jan Kara <jack@suse.cz> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:42 +01:00
Paul Moore	f49f0e65a9	cipso,calipso: resolve a number of problems with the DOI refcounts commit ad5d07f4a9cd671233ae20983848874731102c08 upstream. The current CIPSO and CALIPSO refcounting scheme for the DOI definitions is a bit flawed in that we: 1. Don't correctly match gets/puts in netlbl_cipsov4_list(). 2. Decrement the refcount on each attempt to remove the DOI from the DOI list, only removing it from the list once the refcount drops to zero. This patch fixes these problems by adding the missing "puts" to netlbl_cipsov4_list() and introduces a more conventional, i.e. not-buggy, refcounting mechanism to the DOI definitions. Upon the addition of a DOI to the DOI list, it is initialized with a refcount of one, removing a DOI from the list removes it from the list and drops the refcount by one; "gets" and "puts" behave as expected with respect to refcounts, increasing and decreasing the DOI's refcount by one. Fixes: b1edeb102397 ("netlabel: Replace protocol/NetLabel linking with refrerence counts") Fixes: d7cce01504a0 ("netlabel: Add support for removing a CALIPSO DOI.") Reported-by: syzbot+9ec037722d2603a9f52e@syzkaller.appspotmail.com Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: David S. Miller <davem@davemloft.net> [bwh: Backported to 4.9: adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:42 +01:00
Michael Braun	2cf34285e6	gianfar: fix jumbo packets+napi+rx overrun crash commit d8861bab48b6c1fc3cdbcab8ff9d1eaea43afe7f upstream. When using jumbo packets and overrunning rx queue with napi enabled, the following sequence is observed in gfar_add_rx_frag: \| lstatus \| \| skb \| t \| lstatus, size, flags \| first \| len, data_len, *ptr \| ---+--------------------------------------+-------+-----------------------+ 13 \| 18002348, 9032, INTERRUPT LAST \| 0 \| 9600, 8000, f554c12e \| 12 \| 10000640, 1600, INTERRUPT \| 0 \| 8000, 6400, f554c12e \| 11 \| 10000640, 1600, INTERRUPT \| 0 \| 6400, 4800, f554c12e \| 10 \| 10000640, 1600, INTERRUPT \| 0 \| 4800, 3200, f554c12e \| 09 \| 10000640, 1600, INTERRUPT \| 0 \| 3200, 1600, f554c12e \| 08 \| 14000640, 1600, INTERRUPT FIRST \| 0 \| 1600, 0, f554c12e \| 07 \| 14000640, 1600, INTERRUPT FIRST \| 1 \| 0, 0, f554c12e \| 06 \| 1c000080, 128, INTERRUPT LAST FIRST \| 1 \| 0, 0, abf3bd6e \| 05 \| 18002348, 9032, INTERRUPT LAST \| 0 \| 8000, 6400, c5a57780 \| 04 \| 10000640, 1600, INTERRUPT \| 0 \| 6400, 4800, c5a57780 \| 03 \| 10000640, 1600, INTERRUPT \| 0 \| 4800, 3200, c5a57780 \| 02 \| 10000640, 1600, INTERRUPT \| 0 \| 3200, 1600, c5a57780 \| 01 \| 10000640, 1600, INTERRUPT \| 0 \| 1600, 0, c5a57780 \| 00 \| 14000640, 1600, INTERRUPT FIRST \| 1 \| 0, 0, c5a57780 \| So at t=7 a new packets is started but not finished, probably due to rx overrun - but rx overrun is not indicated in the flags. Instead a new packets starts at t=8. This results in skb->len to exceed size for the LAST fragment at t=13 and thus a negative fragment size added to the skb. This then crashes: kernel BUG at include/linux/skbuff.h:2277! Oops: Exception in kernel mode, sig: 5 [#1] ... NIP [c04689f4] skb_pull+0x2c/0x48 LR [c03f62ac] gfar_clean_rx_ring+0x2e4/0x844 Call Trace: [ec4bfd38] [c06a84c4] _raw_spin_unlock_irqrestore+0x60/0x7c (unreliable) [ec4bfda8] [c03f6a44] gfar_poll_rx_sq+0x48/0xe4 [ec4bfdc8] [c048d504] __napi_poll+0x54/0x26c [ec4bfdf8] [c048d908] net_rx_action+0x138/0x2c0 [ec4bfe68] [c06a8f34] __do_softirq+0x3a4/0x4fc [ec4bfed8] [c0040150] run_ksoftirqd+0x58/0x70 [ec4bfee8] [c0066ecc] smpboot_thread_fn+0x184/0x1cc [ec4bff08] [c0062718] kthread+0x140/0x144 [ec4bff38] [c0012350] ret_from_kernel_thread+0x14/0x1c This patch fixes this by checking for computed LAST fragment size, so a negative sized fragment is never added. In order to prevent the newer rx frame from getting corrupted, the FIRST flag is checked to discard the incomplete older frame. Signed-off-by: Michael Braun <michael-dev@fami-braun.de> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:42 +01:00
Andy Spencer	8d18509bd3	gianfar: simplify FCS handling and fix memory leak commit d903ec77118c09f93a610b384d83a6df33a64fe6 upstream. Previously, buffer descriptors containing only the frame check sequence (FCS) were skipped and not added to the skb. However, the page reference count was still incremented, leading to a memory leak. Fixing this inside gfar_add_rx_frag() is difficult due to reserved memory handling and page reuse. Instead, move the FCS handling to gfar_process_frame() and trim off the FCS before passing the skb up the networking stack. Signed-off-by: Andy Spencer <aspencer@spacex.com> Signed-off-by: Jim Gruen <jgruen@spacex.com> Signed-off-by: David S. Miller <davem@davemloft.net> Cc: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:42 +01:00
Dave Airlie	70f44dfbde	drm/ttm/nouveau: don't call tt destroy callback on alloc failure. commit 5de5b6ecf97a021f29403aa272cb4e03318ef586 upstream. This is confusing, and from my reading of all the drivers only nouveau got this right. Just make the API act under driver control of it's own allocation failing, and don't call destroy, if the page table fails to create there is nothing to cleanup here. (I'm willing to believe I've missed something here, so please review deeply). Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200728041736.20689-1-airlied@gmail.com [bwh: Backported to 4.14: - Drop change in ttm_sg_tt_init() - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:42 +01:00
Linus Torvalds	0c29640bde	gup: document and work around "COW can break either way" issue commit 9bbd42e79720122334226afad9ddcac1c3e6d373 upstream. Doing a "get_user_pages()" on a copy-on-write page for reading can be ambiguous: the page can be COW'ed at any time afterwards, and the direction of a COW event isn't defined. Yes, whoever writes to it will generally do the COW, but if the thread that did the get_user_pages() unmapped the page before the write (and that could happen due to memory pressure in addition to any outright action), the writer could also just take over the old page instead. End result: the get_user_pages() call might result in a page pointer that is no longer associated with the original VM, and is associated with - and controlled by - another VM having taken it over instead. So when doing a get_user_pages() on a COW mapping, the only really safe thing to do would be to break the COW when getting the page, even when only getting it for reading. At the same time, some users simply don't even care. For example, the perf code wants to look up the page not because it cares about the page, but because the code simply wants to look up the physical address of the access for informational purposes, and doesn't really care about races when a page might be unmapped and remapped elsewhere. This adds logic to force a COW event by setting FOLL_WRITE on any copy-on-write mapping when FOLL_GET (or FOLL_PIN) is used to get a page pointer as a result. The current semantics end up being: - __get_user_pages_fast(): no change. If you don't ask for a write, you won't break COW. You'd better know what you're doing. - get_user_pages_fast(): the fast-case "look it up in the page tables without anything getting mmap_sem" now refuses to follow a read-only page, since it might need COW breaking. Which happens in the slow path - the fast path doesn't know if the memory might be COW or not. - get_user_pages() (including the slow-path fallback for gup_fast()): for a COW mapping, turn on FOLL_WRITE for FOLL_GET/FOLL_PIN, with very similar semantics to FOLL_FORCE. If it turns out that we want finer granularity (ie "only break COW when it might actually matter" - things like the zero page are special and don't need to be broken) we might need to push these semantics deeper into the lookup fault path. So if people care enough, it's possible that we might end up adding a new internal FOLL_BREAK_COW flag to go with the internal FOLL_COW flag we already have for tracking "I had a COW". Alternatively, if it turns out that different callers might want to explicitly control the forced COW break behavior, we might even want to make such a flag visible to the users of get_user_pages() instead of using the above default semantics. But for now, this is mostly commentary on the issue (this commit message being a lot bigger than the patch, and that patch in turn is almost all comments), with that minimal "enable COW breaking early" logic using the existing FOLL_WRITE behavior. [ It might be worth noting that we've always had this ambiguity, and it could arguably be seen as a user-space issue. You only get private COW mappings that could break either way in situations where user space is doing cooperative things (ie fork() before an execve() etc), but it _is_ surprising and very subtle, and fork() is supposed to give you independent address spaces. So let's treat this as a kernel issue and make the semantics of get_user_pages() easier to understand. Note that obviously a true shared mapping will still get a page that can change under us, so this does _not_ mean that get_user_pages() somehow returns any "stable" page ] [surenb: backport notes Replaced (gup_flags \| FOLL_WRITE) with write=1 in gup_pgd_range. Removed FOLL_PIN usage in should_force_cow_break since it's missing in the earlier kernels.] Reported-by: Jann Horn <jannh@google.com> Tested-by: Christoph Hellwig <hch@lst.de> Acked-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Kirill Shutemov <kirill@shutemov.name> Acked-by: Jan Kara <jack@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> [surenb: backport to 4.19 kernel] Cc: stable@vger.kernel.org # 4.19.x Signed-off-by: Suren Baghdasaryan <surenb@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> [bwh: Backported to 4.9: - Generic get_user_pages_fast() calls __get_user_pages_fast() here, so make it pass write=1 - Various architectures have their own implementations of get_user_pages_fast(), so apply the corresponding change there - Adjust context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:42 +01:00
Ben Hutchings	6fbb838388	Revert "gup: document and work around "COW can break either way" issue" This reverts commit 9bbd42e79720122334226afad9ddcac1c3e6d373, which was commit 17839856fd588f4ab6b789f482ed3ffd7c403e1f upstream. The backport was incorrect and incomplete: * It forced the write flag on in the generic __get_user_pages_fast(), whereas only get_user_pages_fast() was supposed to do that. * It only fixed the generic RCU-based implementation used by arm, arm64, and powerpc. Before Linux 4.13, several other architectures had their own implementations: mips, s390, sparc, sh, and x86. This will be followed by a (hopefully) correct backport. Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Cc: Suren Baghdasaryan <surenb@google.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:42 +01:00
Miaoqian Lin	31f961674d	lib82596: Fix IRQ check in sni_82596_probe commit 99218cbf81bf21355a3de61cd46a706d36e900e6 upstream. platform_get_irq() returns negative error number instead 0 on failure. And the doc of platform_get_irq() provides a usage example: int irq = platform_get_irq(pdev, 0); if (irq < 0) return irq; Fix the check of return value to catch errors correctly. Fixes: 115978859272 ("i825xx: Move the Intel 82586/82593/82596 based drivers") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:42 +01:00
Matthias Schiffer	0febebd36e	scripts/dtc: dtx_diff: remove broken example from help text commit d8adf5b92a9d2205620874d498c39923ecea8749 upstream. dtx_diff suggests to use <(...) syntax to pipe two inputs into it, but this has never worked: The /proc/self/fds/... paths passed by the shell will fail the `[ -f "${dtx}" ] && [ -r "${dtx}" ]` check in compile_to_dts, but even with this check removed, the function cannot work: hexdump will eat up the DTB magic, making the subsequent dtc call fail, as a pipe cannot be rewound. Simply remove this broken example, as there is already an alternative one that works fine. Fixes: 10eadc253ddf ("dtc: create tool to diff device trees") Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Reviewed-by: Frank Rowand <frank.rowand@sony.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20220113081918.10387-1-matthias.schiffer@ew.tq-group.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:42 +01:00
Sergey Shtylyov	5d48641222	bcmgenet: add WOL IRQ check commit 9deb48b53e7f4056c2eaa2dc2ee3338df619e4f6 upstream. The driver neglects to check the result of platform_get_irq_optional()'s call and blithely passes the negative error codes to devm_request_irq() (which takes unsigned IRQ #), causing it to fail with -EINVAL. Stop calling devm_request_irq() with the invalid IRQ #s. Fixes: 8562056f267d ("net: bcmgenet: request Wake-on-LAN interrupt") Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:42 +01:00
Kevin Bracey	019f0458c5	net_sched: restore "mpu xxx" handling commit fb80445c438c78b40b547d12b8d56596ce4ccfeb upstream. commit 56b765b79e9a ("htb: improved accuracy at high rates") broke "overhead X", "linklayer atm" and "mpu X" attributes. "overhead X" and "linklayer atm" have already been fixed. This restores the "mpu X" handling, as might be used by DOCSIS or Ethernet shaping: tc class add ... htb rate X overhead 4 mpu 64 The code being fixed is used by htb, tbf and act_police. Cake has its own mpu handling. qdisc_calculate_pkt_len still uses the size table containing values adjusted for mpu by user space. iproute2 tc has always passed mpu into the kernel via a tc_ratespec structure, but the kernel never directly acted on it, merely stored it so that it could be read back by `tc class show`. Rather, tc would generate length-to-time tables that included the mpu (and linklayer) in their construction, and the kernel used those tables. Since v3.7, the tables were no longer used. Along with "mpu", this also broke "overhead" and "linklayer" which were fixed in 01cb71d2d47b ("net_sched: restore "overhead xxx" handling", v3.10) and 8a8e3d84b171 ("net_sched: restore "linklayer atm" handling", v3.11). "overhead" was fixed by simply restoring use of tc_ratespec::overhead - this had originally been used by the kernel but was initially omitted from the new non-table-based calculations. "linklayer" had been handled in the table like "mpu", but the mode was not originally passed in tc_ratespec. The new implementation was made to handle it by getting new versions of tc to pass the mode in an extended tc_ratespec, and for older versions of tc the table contents were analysed at load time to deduce linklayer. As "mpu" has always been given to the kernel in tc_ratespec, accompanying the mpu-based table, we can restore system functionality with no userspace change by making the kernel act on the tc_ratespec value. Fixes: 56b765b79e9a ("htb: improved accuracy at high rates") Signed-off-by: Kevin Bracey <kevin@bracey.fi> Cc: Eric Dumazet <edumazet@google.com> Cc: Jiri Pirko <jiri@resnulli.us> Cc: Vimalkumar <j.vimal@gmail.com> Link: https://lore.kernel.org/r/20220112170210.1014351-1-kevin@bracey.fi Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:42 +01:00
Tudor Ambarus	94ca32fe45	dmaengine: at_xdmac: Fix at_xdmac_lld struct definition commit 912f7c6f7fac273f40e621447cf17d14b50d6e5b upstream. The hardware channel next descriptor view structure contains just fields of 32 bits, while dma_addr_t can be of type u64 or u32 depending on CONFIG_ARCH_DMA_ADDR_T_64BIT. Force u32 to comply with what the hardware expects. Fixes: e1f7c9eee707 ("dmaengine: at_xdmac: creation of the atmel eXtended DMA Controller driver") Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Link: https://lore.kernel.org/r/20211215110115.191749-11-tudor.ambarus@microchip.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:41 +01:00
Tudor Ambarus	bc88245308	dmaengine: at_xdmac: Fix lld view setting commit 1385eb4d14d447cc5d744bc2ac34f43be66c9963 upstream. AT_XDMAC_CNDC_NDVIEW_NDV3 was set even for AT_XDMAC_MBR_UBC_NDV2, because of the wrong bit handling. Fix it. Fixes: ee0fe35c8dcd ("dmaengine: xdmac: Handle descriptor's view 3 registers") Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Link: https://lore.kernel.org/r/20211215110115.191749-10-tudor.ambarus@microchip.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:41 +01:00
Tudor Ambarus	3e279f7aa0	dmaengine: at_xdmac: Print debug message after realeasing the lock commit 5edc24ac876a928f36f407a0fcdb33b94a3a210f upstream. It is desirable to do the prints without the lock held if possible, so move the print after the lock is released. Fixes: e1f7c9eee707 ("dmaengine: at_xdmac: creation of the atmel eXtended DMA Controller driver") Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Link: https://lore.kernel.org/r/20211215110115.191749-4-tudor.ambarus@microchip.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:41 +01:00
Tudor Ambarus	d7ab44f5fb	dmaengine: at_xdmac: Don't start transactions at tx_submit level commit bccfb96b59179d4f96cbbd1ddff8fac6d335eae4 upstream. tx_submit is supposed to push the current transaction descriptor to a pending queue, waiting for issue_pending() to be called. issue_pending() must start the transfer, not tx_submit(), thus remove at_xdmac_start_xfer() from at_xdmac_tx_submit(). Clients of at_xdmac that assume that tx_submit() starts the transfer must be updated and call dma_async_issue_pending() if they miss to call it (one example is atmel_serial). As the at_xdmac_start_xfer() is now called only from at_xdmac_advance_work() when !at_xdmac_chan_is_enabled(), the at_xdmac_chan_is_enabled() check is no longer needed in at_xdmac_start_xfer(), thus remove it. Fixes: e1f7c9eee707 ("dmaengine: at_xdmac: creation of the atmel eXtended DMA Controller driver") Signed-off-by: Tudor Ambarus <tudor.ambarus@microchip.com> Link: https://lore.kernel.org/r/20211215110115.191749-2-tudor.ambarus@microchip.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:41 +01:00
Guillaume Nault	6272a314ef	libcxgb: Don't accidentally set RTO_ONLINK in cxgb_find_route() commit a915deaa9abe4fb3a440312c954253a6a733608e upstream. Mask the ECN bits before calling ip_route_output_ports(). The tos variable might be passed directly from an IPv4 header, so it may have the last ECN bit set. This interferes with the route lookup process as ip_route_output_key_hash() interpretes this bit specially (to restrict the route scope). Found by code inspection, compile tested only. Fixes: 804c2f3e36ef ("libcxgb,iw_cxgb4,cxgbit: add cxgb_find_route()") Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:41 +01:00
Eric Dumazet	b2fd2514d8	netns: add schedule point in ops_exit_list() commit 2836615aa22de55b8fca5e32fe1b27a67cda625e upstream. When under stress, cleanup_net() can have to dismantle netns in big numbers. ops_exit_list() currently calls many helpers [1] that have no schedule point, and we can end up with soft lockups, particularly on hosts with many cpus. Even for moderate amount of netns processed by cleanup_net() this patch avoids latency spikes. [1] Some of these helpers like fib_sync_up() and fib_sync_down_dev() are very slow because net/ipv4/fib_semantics.c uses host-wide hash tables, and ifindex is used as the only input of two hash functions. ifindexes tend to be the same for all netns (lo.ifindex==1 per instance) This will be fixed in a separate patch. Fixes: 72ad937abd0a ("net: Add support for batching network namespace cleanups") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:41 +01:00
Robert Hancock	81fb2351f0	net: axienet: fix number of TX ring slots for available check commit aba57a823d2985a2cc8c74a2535f3a88e68d9424 upstream. The check for the number of available TX ring slots was off by 1 since a slot is required for the skb header as well as each fragment. This could result in overwriting a TX ring slot that was still in use. Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver") Signed-off-by: Robert Hancock <robert.hancock@calian.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:41 +01:00
Robert Hancock	9f1a3c1334	net: axienet: Wait for PhyRstCmplt after core reset commit b400c2f4f4c53c86594dd57098970d97d488bfde upstream. When resetting the device, wait for the PhyRstCmplt bit to be set in the interrupt status register before continuing initialization, to ensure that the core is actually ready. When using an external PHY, this also ensures we do not start trying to access the PHY while it is still in reset. The PHY reset is initiated by the core reset which is triggered just above, but remains asserted for 5ms after the core is reset according to the documentation. The MgtRdy bit could also be waited for, but unfortunately when using 7-series devices, the bit does not appear to work as documented (it seems to behave as some sort of link state indication and not just an indication the transceiver is ready) so it can't really be relied on for this purpose. Fixes: 8a3b7a252dca9 ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver") Signed-off-by: Robert Hancock <robert.hancock@calian.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:41 +01:00
Eric Dumazet	f87f80d8af	af_unix: annote lockless accesses to unix_tot_inflight & gc_in_progress commit 9d6d7f1cb67cdee15f1a0e85aacfb924e0e02435 upstream. wait_for_unix_gc() reads unix_tot_inflight & gc_in_progress without synchronization. Adds READ_ONCE()/WRITE_ONCE() and their associated comments to better document the intent. BUG: KCSAN: data-race in unix_inflight / wait_for_unix_gc write to 0xffffffff86e2b7c0 of 4 bytes by task 9380 on cpu 0: unix_inflight+0x1e8/0x260 net/unix/scm.c:63 unix_attach_fds+0x10c/0x1e0 net/unix/scm.c:121 unix_scm_to_skb net/unix/af_unix.c:1674 [inline] unix_dgram_sendmsg+0x679/0x16b0 net/unix/af_unix.c:1817 unix_seqpacket_sendmsg+0xcc/0x110 net/unix/af_unix.c:2258 sock_sendmsg_nosec net/socket.c:704 [inline] sock_sendmsg net/socket.c:724 [inline] ____sys_sendmsg+0x39a/0x510 net/socket.c:2409 ___sys_sendmsg net/socket.c:2463 [inline] __sys_sendmmsg+0x267/0x4c0 net/socket.c:2549 __do_sys_sendmmsg net/socket.c:2578 [inline] __se_sys_sendmmsg net/socket.c:2575 [inline] __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2575 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae read to 0xffffffff86e2b7c0 of 4 bytes by task 9375 on cpu 1: wait_for_unix_gc+0x24/0x160 net/unix/garbage.c:196 unix_dgram_sendmsg+0x8e/0x16b0 net/unix/af_unix.c:1772 unix_seqpacket_sendmsg+0xcc/0x110 net/unix/af_unix.c:2258 sock_sendmsg_nosec net/socket.c:704 [inline] sock_sendmsg net/socket.c:724 [inline] ____sys_sendmsg+0x39a/0x510 net/socket.c:2409 ___sys_sendmsg net/socket.c:2463 [inline] __sys_sendmmsg+0x267/0x4c0 net/socket.c:2549 __do_sys_sendmmsg net/socket.c:2578 [inline] __se_sys_sendmmsg net/socket.c:2575 [inline] __x64_sys_sendmmsg+0x53/0x60 net/socket.c:2575 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae value changed: 0x00000002 -> 0x00000004 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 9375 Comm: syz-executor.1 Not tainted 5.16.0-rc7-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: 9915672d4127 ("af_unix: limit unix_tot_inflight") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Link: https://lore.kernel.org/r/20220114164328.2038499-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:41 +01:00
Miaoqian Lin	46e9607193	parisc: pdc_stable: Fix memory leak in pdcs_register_pathentries commit d24846a4246b6e61ecbd036880a4adf61681d241 upstream. kobject_init_and_add() takes reference even when it fails. According to the doc of kobject_init_and_add()： If this function returns an error, kobject_put() must be called to properly clean up the memory associated with the object. Fix memory leak by calling kobject_put(). Fixes: 73f368cf679b ("Kobject: change drivers/parisc/pdc_stable.c to use kobject_init_and_add") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:41 +01:00
Tobias Waldekranz	738f88c92e	net/fsl: xgmac_mdio: Fix incorrect iounmap when removing module commit 3f7c239c7844d2044ed399399d97a5f1c6008e1b upstream. As reported by sparse: In the remove path, the driver would attempt to unmap its own priv pointer - instead of the io memory that it mapped in probe. Fixes: 9f35a7342cff ("net/fsl: introduce Freescale 10G MDIO driver") Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:41 +01:00
Tobias Waldekranz	68a03b3d00	powerpc/fsl/dts: Enable WA for erratum A-009885 on fman3l MDIO buses commit 0d375d610fa96524e2ee2b46830a46a7bfa92a9f upstream. This block is used in (at least) T1024 and T1040, including their variants like T1023 etc. Fixes: d55ad2967d89 ("powerpc/mpc85xx: Create dts components for the FSL QorIQ DPAA FMan") Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:41 +01:00
Chengguang Xu	a046a42fe0	RDMA/rxe: Fix a typo in opcode name commit 8d1cfb884e881efd69a3be4ef10772c71cb22216 upstream. There is a redundant ']' in the name of opcode IB_OPCODE_RC_SEND_MIDDLE, so just fix it. Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20211218112320.3558770-1-cgxu519@mykernel.net Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com> Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:41 +01:00
Yixing Liu	902650b01a	RDMA/hns: Modify the mapping attribute of doorbell to device commit 39d5534b1302189c809e90641ffae8cbdc42a8fc upstream. It is more general for ARM device drivers to use the device attribute to map PCI BAR spaces. Fixes: 9a4435375cd1 ("IB/hns: Add driver files for hns RoCE driver") Link: https://lore.kernel.org/r/20211206133652.27476-1-liangwenpeng@huawei.com Signed-off-by: Yixing Liu <liuyixing1@huawei.com> Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:40 +01:00
Christian König	47ed5eedd7	drm/radeon: fix error handling in radeon_driver_open_kms commit 4722f463896cc0ef1a6f1c3cb2e171e949831249 upstream. The return value was never initialized so the cleanup code executed when it isn't even necessary. Just add proper error handling. Fixes: ab50cb9df889 ("drm/radeon/radeon_kms: Fix a NULL pointer dereference in radeon_driver_open_kms()") Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Jan Stancek <jstancek@redhat.com> Tested-by: Borislav Petkov <bp@suse.de> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:40 +01:00
Amir Goldstein	fde32bbe9a	fuse: fix live lock in fuse_iget() commit 775c5033a0d164622d9d10dd0f0a5531639ed3ed upstream. Commit 5d069dbe8aaf ("fuse: fix bad inode") replaced make_bad_inode() in fuse_iget() with a private implementation fuse_make_bad(). The private implementation fails to remove the bad inode from inode cache, so the retry loop with iget5_locked() finds the same bad inode and marks it bad forever. kmsg snip: [ ] rcu: INFO: rcu_sched self-detected stall on CPU ... [ ] ? bit_wait_io+0x50/0x50 [ ] ? fuse_init_file_inode+0x70/0x70 [ ] ? find_inode.isra.32+0x60/0xb0 [ ] ? fuse_init_file_inode+0x70/0x70 [ ] ilookup5_nowait+0x65/0x90 [ ] ? fuse_init_file_inode+0x70/0x70 [ ] ilookup5.part.36+0x2e/0x80 [ ] ? fuse_init_file_inode+0x70/0x70 [ ] ? fuse_inode_eq+0x20/0x20 [ ] iget5_locked+0x21/0x80 [ ] ? fuse_inode_eq+0x20/0x20 [ ] fuse_iget+0x96/0x1b0 Fixes: 5d069dbe8aaf ("fuse: fix bad inode") Cc: stable@vger.kernel.org # 5.10+ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:40 +01:00
Miklos Szeredi	3a2f8823aa	fuse: fix bad inode commit 5d069dbe8aaf2a197142558b6fb2978189ba3454 upstream. Jan Kara's analysis of the syzbot report (edited): The reproducer opens a directory on FUSE filesystem, it then attaches dnotify mark to the open directory. After that a fuse_do_getattr() call finds that attributes returned by the server are inconsistent, and calls make_bad_inode() which, among other things does: inode->i_mode = S_IFREG; This then confuses dnotify which doesn't tear down its structures properly and eventually crashes. Avoid calling make_bad_inode() on a live inode: switch to a private flag on the fuse inode. Also add the test to ops which the bad_inode_ops would have caught. This bug goes back to the initial merge of fuse in 2.6.14... Reported-by: syzbot+f427adf9324b92652ccc@syzkaller.appspotmail.com Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Tested-by: Jan Kara <jack@suse.cz> Cc: <stable@vger.kernel.org> [bwh: Backported to 4.9: - Drop changes in fuse_dir_fsync(), fuse_readahead(), fuse_evict_inode() - In fuse_get_link(), return ERR_PTR(-EIO) for bad inodes - Convert some additional calls to is_bad_inode() - Adjust filename, context] Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:40 +01:00
Theodore Ts'o	ab5edcdd0e	ext4: don't use the orphan list when migrating an inode commit 6eeaf88fd586f05aaf1d48cb3a139d2a5c6eb055 upstream. We probably want to remove the indirect block to extents migration feature after a deprecation window, but until then, let's fix a potential data loss problem caused by the fact that we put the tmp_inode on the orphan list. In the unlikely case where we crash and do a journal recovery, the data blocks belonging to the inode being migrated are also represented in the tmp_inode on the orphan list --- and so its data blocks will get marked unallocated, and available for reuse. Instead, stop putting the tmp_inode on the oprhan list. So in the case where we crash while migrating the inode, we'll leak an inode, which is not a disaster. It will be easily fixed the next time we run fsck, and it's better than potentially having blocks getting claimed by two different files, and losing data as a result. Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Lukas Czerner <lczerner@redhat.com> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:40 +01:00
Ye Bin	581658a611	ext4: Fix BUG_ON in ext4_bread when write quota data commit 380a0091cab482489e9b19e07f2a166ad2b76d5c upstream. We got issue as follows when run syzkaller: [ 167.936972] EXT4-fs error (device loop0): __ext4_remount:6314: comm rep: Abort forced by user [ 167.938306] EXT4-fs (loop0): Remounting filesystem read-only [ 167.981637] Assertion failure in ext4_getblk() at fs/ext4/inode.c:847: '(EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) \|\| handle != NULL \|\| create == 0' [ 167.983601] ------------[ cut here ]------------ [ 167.984245] kernel BUG at fs/ext4/inode.c:847! [ 167.984882] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI [ 167.985624] CPU: 7 PID: 2290 Comm: rep Tainted: G B 5.16.0-rc5-next-20211217+ #123 [ 167.986823] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014 [ 167.988590] RIP: 0010:ext4_getblk+0x17e/0x504 [ 167.989189] Code: c6 01 74 28 49 c7 c0 a0 a3 5c 9b b9 4f 03 00 00 48 c7 c2 80 9c 5c 9b 48 c7 c6 40 b6 5c 9b 48 c7 c7 20 a4 5c 9b e8 77 e3 fd ff <0f> 0b 8b 04 244 [ 167.991679] RSP: 0018:ffff8881736f7398 EFLAGS: 00010282 [ 167.992385] RAX: 0000000000000094 RBX: 1ffff1102e6dee75 RCX: 0000000000000000 [ 167.993337] RDX: 0000000000000001 RSI: ffffffff9b6e29e0 RDI: ffffed102e6dee66 [ 167.994292] RBP: ffff88816a076210 R08: 0000000000000094 R09: ffffed107363fa09 [ 167.995252] R10: ffff88839b1fd047 R11: ffffed107363fa08 R12: ffff88816a0761e8 [ 167.996205] R13: 0000000000000000 R14: 0000000000000021 R15: 0000000000000001 [ 167.997158] FS: 00007f6a1428c740(0000) GS:ffff88839b000000(0000) knlGS:0000000000000000 [ 167.998238] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 167.999025] CR2: 00007f6a140716c8 CR3: 0000000133216000 CR4: 00000000000006e0 [ 167.999987] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 168.000944] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 168.001899] Call Trace: [ 168.002235] <TASK> [ 168.007167] ext4_bread+0xd/0x53 [ 168.007612] ext4_quota_write+0x20c/0x5c0 [ 168.010457] write_blk+0x100/0x220 [ 168.010944] remove_free_dqentry+0x1c6/0x440 [ 168.011525] free_dqentry.isra.0+0x565/0x830 [ 168.012133] remove_tree+0x318/0x6d0 [ 168.014744] remove_tree+0x1eb/0x6d0 [ 168.017346] remove_tree+0x1eb/0x6d0 [ 168.019969] remove_tree+0x1eb/0x6d0 [ 168.022128] qtree_release_dquot+0x291/0x340 [ 168.023297] v2_release_dquot+0xce/0x120 [ 168.023847] dquot_release+0x197/0x3e0 [ 168.024358] ext4_release_dquot+0x22a/0x2d0 [ 168.024932] dqput.part.0+0x1c9/0x900 [ 168.025430] __dquot_drop+0x120/0x190 [ 168.025942] ext4_clear_inode+0x86/0x220 [ 168.026472] ext4_evict_inode+0x9e8/0xa22 [ 168.028200] evict+0x29e/0x4f0 [ 168.028625] dispose_list+0x102/0x1f0 [ 168.029148] evict_inodes+0x2c1/0x3e0 [ 168.030188] generic_shutdown_super+0xa4/0x3b0 [ 168.030817] kill_block_super+0x95/0xd0 [ 168.031360] deactivate_locked_super+0x85/0xd0 [ 168.031977] cleanup_mnt+0x2bc/0x480 [ 168.033062] task_work_run+0xd1/0x170 [ 168.033565] do_exit+0xa4f/0x2b50 [ 168.037155] do_group_exit+0xef/0x2d0 [ 168.037666] __x64_sys_exit_group+0x3a/0x50 [ 168.038237] do_syscall_64+0x3b/0x90 [ 168.038751] entry_SYSCALL_64_after_hwframe+0x44/0xae In order to reproduce this problem, the following conditions need to be met: 1. Ext4 filesystem with no journal; 2. Filesystem image with incorrect quota data; 3. Abort filesystem forced by user; 4. umount filesystem; As in ext4_quota_write: ... if (EXT4_SB(sb)->s_journal && !handle) { ext4_msg(sb, KERN_WARNING, "Quota write (off=%llu, len=%llu)" " cancelled because transaction is not started", (unsigned long long)off, (unsigned long long)len); return -EIO; } ... We only check handle if NULL when filesystem has journal. There is need check handle if NULL even when filesystem has no journal. Signed-off-by: Ye Bin <yebin10@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20211223015506.297766-1-yebin10@huawei.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:40 +01:00
Luís Henriques	0626ef106a	ext4: set csum seed in tmp inode while migrating to extents commit e81c9302a6c3c008f5c30beb73b38adb0170ff2d upstream. When migrating to extents, the temporary inode will have it's own checksum seed. This means that, when swapping the inodes data, the inode checksums will be incorrect. This can be fixed by recalculating the extents checksums again. Or simply by copying the seed into the temporary inode. Link: https://bugzilla.kernel.org/show_bug.cgi?id=213357 Reported-by: Jeroen van Wolffelaar <jeroen@wolffelaar.nl> Signed-off-by: Luís Henriques <lhenriques@suse.de> Link: https://lore.kernel.org/r/20211214175058.19511-1-lhenriques@suse.de Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:40 +01:00
Ilan Peer	e0b2eba835	iwlwifi: mvm: Increase the scan timeout guard to 30 seconds commit ced50f1133af12f7521bb777fcf4046ca908fb77 upstream. With the introduction of 6GHz channels the scan guard timeout should be adjusted to account for the following extreme case: - All 6GHz channels are scanned passively: 58 channels. - The scan is fragmented with the following parameters: 3 fragments, 95 TUs suspend time, 44 TUs maximal out of channel time. The above would result with scan time of more than 24 seconds. Thus, set the timeout to 30 seconds. Cc: stable@vger.kernel.org Signed-off-by: Ilan Peer <ilan.peer@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Link: https://lore.kernel.org/r/iwlwifi.20211210090244.3c851b93aef5.I346fa2e1d79220a6770496e773c6f87a2ad9e6c4@changeid Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:40 +01:00
Petr Cvachoucek	ab76022f1b	ubifs: Error path in ubifs_remount_rw() seems to wrongly free write buffers commit 3fea4d9d160186617ff40490ae01f4f4f36b28ff upstream. it seems freeing the write buffers in the error path of the ubifs_remount_rw() is wrong. It leads later to a kernel oops like this: [10016.431274] UBIFS (ubi0:0): start fixing up free space [10090.810042] UBIFS (ubi0:0): free space fixup complete [10090.814623] UBIFS error (ubi0:0 pid 512): ubifs_remount_fs: cannot spawn "ubifs_bgt0_0", error -4 [10101.915108] UBIFS (ubi0:0): background thread "ubifs_bgt0_0" started, PID 517 [10105.275498] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000030 [10105.284352] Mem abort info: [10105.287160] ESR = 0x96000006 [10105.290252] EC = 0x25: DABT (current EL), IL = 32 bits [10105.295592] SET = 0, FnV = 0 [10105.298652] EA = 0, S1PTW = 0 [10105.301848] Data abort info: [10105.304723] ISV = 0, ISS = 0x00000006 [10105.308573] CM = 0, WnR = 0 [10105.311564] user pgtable: 4k pages, 48-bit VAs, pgdp=00000000f03d1000 [10105.318034] [0000000000000030] pgd=00000000f6cee003, pud=00000000f4884003, pmd=0000000000000000 [10105.326783] Internal error: Oops: 96000006 [#1] PREEMPT SMP [10105.332355] Modules linked in: ath10k_pci ath10k_core ath mac80211 libarc4 cfg80211 nvme nvme_core cryptodev(O) [10105.342468] CPU: 3 PID: 518 Comm: touch Tainted: G O 5.4.3 #1 [10105.349517] Hardware name: HYPEX CPU (DT) [10105.353525] pstate: 40000005 (nZcv daif -PAN -UAO) [10105.358324] pc : atomic64_try_cmpxchg_acquire.constprop.22+0x8/0x34 [10105.364596] lr : mutex_lock+0x1c/0x34 [10105.368253] sp : ffff000075633aa0 [10105.371563] x29: ffff000075633aa0 x28: 0000000000000001 [10105.376874] x27: ffff000076fa80c8 x26: 0000000000000004 [10105.382185] x25: 0000000000000030 x24: 0000000000000000 [10105.387495] x23: 0000000000000000 x22: 0000000000000038 [10105.392807] x21: 000000000000000c x20: ffff000076fa80c8 [10105.398119] x19: ffff000076fa8000 x18: 0000000000000000 [10105.403429] x17: 0000000000000000 x16: 0000000000000000 [10105.408741] x15: 0000000000000000 x14: fefefefefefefeff [10105.414052] x13: 0000000000000000 x12: 0000000000000fe0 [10105.419364] x11: 0000000000000fe0 x10: ffff000076709020 [10105.424675] x9 : 0000000000000000 x8 : 00000000000000a0 [10105.429986] x7 : ffff000076fa80f4 x6 : 0000000000000030 [10105.435297] x5 : 0000000000000000 x4 : 0000000000000000 [10105.440609] x3 : 0000000000000000 x2 : ffff00006f276040 [10105.445920] x1 : ffff000075633ab8 x0 : 0000000000000030 [10105.451232] Call trace: [10105.453676] atomic64_try_cmpxchg_acquire.constprop.22+0x8/0x34 [10105.459600] ubifs_garbage_collect+0xb4/0x334 [10105.463956] ubifs_budget_space+0x398/0x458 [10105.468139] ubifs_create+0x50/0x180 [10105.471712] path_openat+0x6a0/0x9b0 [10105.475284] do_filp_open+0x34/0x7c [10105.478771] do_sys_open+0x78/0xe4 [10105.482170] __arm64_sys_openat+0x1c/0x24 [10105.486180] el0_svc_handler+0x84/0xc8 [10105.489928] el0_svc+0x8/0xc [10105.492808] Code: 52800013 17fffffb d2800003 f9800011 (c85ffc05) [10105.498903] ---[ end trace 46b721d93267a586 ]--- To reproduce the problem: 1. Filesystem initially mounted read-only, free space fixup flag set. 2. mount -o remount,rw <mountpoint> 3. it takes some time (free space fixup running) ... try to terminate running mount by CTRL-C ... does not respond, only after free space fixup is complete ... then "ubifs_remount_fs: cannot spawn "ubifs_bgt0_0", error -4" 4. mount -o remount,rw <mountpoint> ... now finished instantly (fixup already done). 5. Create file or just unmount the filesystem and we get the oops. Cc: <stable@vger.kernel.org> Fixes: b50b9f408502 ("UBIFS: do not free write-buffers when in R/O mode") Signed-off-by: Petr Cvachoucek <cvachoucek@gmail.com> Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-01-27 08:47:40 +01:00
Yauhen Kharuzhy	032e2829de	power: bq25890: Enable continuous conversion for ADC at charging [ Upstream commit 80211be1b9dec04cc2805d3d81e2091ecac289a1 ] Instead of one shot run of ADC at beginning of charging, run continuous conversion to ensure that all charging-related values are monitored properly (input voltage, input current, themperature etc.). Signed-off-by: Yauhen Kharuzhy <jekhor@gmail.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-01-27 08:47:40 +01:00
Tzung-Bi Shih	c7f1edc924	ASoC: mediatek: mt8173: fix device_node leak [ Upstream commit 493433785df0075afc0c106ab65f10a605d0b35d ] Fixes the device_node leak. Signed-off-by: Tzung-Bi Shih <tzungbi@google.com> Link: https://lore.kernel.org/r/20211224064719.2031210-2-tzungbi@google.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-01-27 08:47:40 +01:00
Christoph Hellwig	235b697f5f	scsi: sr: Don't use GFP_DMA [ Upstream commit d94d94969a4ba07a43d62429c60372320519c391 ] The allocated buffers are used as a command payload, for which the block layer and/or DMA API do the proper bounce buffering if needed. Link: https://lore.kernel.org/r/20211222090842.920724-1-hch@lst.de Reported-by: Baoquan He <bhe@redhat.com> Reviewed-by: Baoquan He <bhe@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-01-27 08:47:40 +01:00
Tianjia Zhang	0ba4af0351	MIPS: Octeon: Fix build errors using clang [ Upstream commit 95339b70677dc6f9a2d669c4716058e71b8dc1c7 ] A large number of the following errors is reported when compiling with clang: cvmx-bootinfo.h:326:3: error: adding 'int' to a string does not append to the string [-Werror,-Wstring-plus-int] ENUM_BRD_TYPE_CASE(CVMX_BOARD_TYPE_NULL) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ cvmx-bootinfo.h:321:20: note: expanded from macro 'ENUM_BRD_TYPE_CASE' case x: return(#x + 16); /* Skip CVMX_BOARD_TYPE_ / ~~~^~~~ cvmx-bootinfo.h:326:3: note: use array indexing to silence this warning cvmx-bootinfo.h:321:20: note: expanded from macro 'ENUM_BRD_TYPE_CASE' case x: return(#x + 16); / Skip CVMX_BOARD_TYPE_ */ ^ Follow the prompts to use the address operator '&' to fix this error. Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-01-27 08:47:39 +01:00
Lakshmi Sowjanya D	8eabbce94f	i2c: designware-pci: Fix to change data types of hcnt and lcnt parameters [ Upstream commit d52097010078c1844348dc0e467305e5f90fd317 ] The data type of hcnt and lcnt in the struct dw_i2c_dev is of type u16. It's better to have same data type in struct dw_scl_sda_cfg as well. Reported-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Lakshmi Sowjanya D <lakshmi.sowjanya.d@intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-01-27 08:47:39 +01:00
Takashi Iwai	6f0bdfcd04	ALSA: seq: Set upper limit of processed events [ Upstream commit 6fadb494a638d8b8a55864ecc6ac58194f03f327 ] Currently ALSA sequencer core tries to process the queued events as much as possible when they become dispatchable. If applications try to queue too massive events to be processed at the very same timing, the sequencer core would still try to process such all events, either in the interrupt context or via some notifier; in either away, it might be a cause of RCU stall or such problems. As a potential workaround for those problems, this patch adds the upper limit of the amount of events to be processed. The remaining events are processed in the next batch, so they won't be lost. For the time being, it's limited up to 1000 events per queue, which should be high enough for any normal usages. Reported-by: Zqiang <qiang.zhang1211@gmail.com> Reported-by: syzbot+bb950e68b400ab4f65f8@syzkaller.appspotmail.com Link: https://lore.kernel.org/r/20211102033222.3849-1-qiang.zhang1211@gmail.com Link: https://lore.kernel.org/r/20211207165146.2888-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-01-27 08:47:39 +01:00
Christophe Leroy	8d4a8bb7f8	w1: Misuse of get_user()/put_user() reported by sparse [ Upstream commit 33dc3e3e99e626ce51f462d883b05856c6c30b1d ] sparse warnings: (new ones prefixed by >>) >> drivers/w1/slaves/w1_ds28e04.c:342:13: sparse: sparse: incorrect type in initializer (different address spaces) @@ expected char [noderef] __user _pu_addr @@ got char buf @@ drivers/w1/slaves/w1_ds28e04.c:342:13: sparse: expected char [noderef] __user _pu_addr drivers/w1/slaves/w1_ds28e04.c:342:13: sparse: got char buf >> drivers/w1/slaves/w1_ds28e04.c:356:13: sparse: sparse: incorrect type in initializer (different address spaces) @@ expected char const [noderef] __user _gu_addr @@ got char const buf @@ drivers/w1/slaves/w1_ds28e04.c:356:13: sparse: expected char const [noderef] __user _gu_addr drivers/w1/slaves/w1_ds28e04.c:356:13: sparse: got char const buf The buffer buf is a failsafe buffer in kernel space, it's not user memory hence doesn't deserve the use of get_user() or put_user(). Access 'buf' content directly. Link: https://lore.kernel.org/lkml/202111190526.K5vb7NWC-lkp@intel.com/T/ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Link: https://lore.kernel.org/r/d14ed8d71ad4372e6839ae427f91441d3ba0e94d.1637946316.git.christophe.leroy@csgroup.eu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-01-27 08:47:39 +01:00
Joakim Tjernlund	6671b609eb	i2c: mpc: Correct I2C reset procedure [ Upstream commit ebe82cf92cd4825c3029434cabfcd2f1780e64be ] Current I2C reset procedure is broken in two ways: 1) It only generate 1 START instead of 9 STARTs and STOP. 2) It leaves the bus Busy so every I2C xfer after the first fixup calls the reset routine again, for every xfer there after. This fixes both errors. Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com> Acked-by: Scott Wood <oss@buserror.net> Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-01-27 08:47:39 +01:00
Michael Ellerman	a55db63e0b	powerpc/smp: Move setup_profiling_timer() under CONFIG_PROFILING [ Upstream commit a4ac0d249a5db80e79d573db9e4ad29354b643a8 ] setup_profiling_timer() is only needed when CONFIG_PROFILING is enabled. Fixes the following W=1 warning when CONFIG_PROFILING=n: linux/arch/powerpc/kernel/smp.c:1638:5: error: no previous prototype for ‘setup_profiling_timer’ Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211124093254.1054750-5-mpe@ellerman.id.au Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-01-27 08:47:39 +01:00
Heiner Kallweit	a126a8c3dd	i2c: i801: Don't silently correct invalid transfer size [ Upstream commit effa453168a7eeb8a562ff4edc1dbf9067360a61 ] If an invalid block size is provided, reject it instead of silently changing it to a supported value. Especially critical I see the case of a write transfer with block length 0. In this case we have no guarantee that the byte we would write is valid. When silently reducing a read to 32 bytes then we don't return an error and the caller may falsely assume that we returned the full requested data. If this change should break any (broken) caller, then I think we should fix the caller. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Jean Delvare <jdelvare@suse.de> Signed-off-by: Wolfram Sang <wsa@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-01-27 08:47:39 +01:00
Julia Lawall	dc4b1e0b75	powerpc/btext: add missing of_node_put [ Upstream commit a1d2b210ffa52d60acabbf7b6af3ef7e1e69cda0 ] for_each_node_by_type performs an of_node_get on each iteration, so a break out of the loop requires an of_node_put. A simplified version of the semantic patch that fixes this problem is as follows (http://coccinelle.lip6.fr): // <smpl> @@ local idexpression n; expression e; @@ for_each_node_by_type(n,...) { ... ( of_node_put(n); \| e = n \| + of_node_put(n); ? break; ) ... } ... when != n // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1448051604-25256-6-git-send-email-Julia.Lawall@lip6.fr Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-01-27 08:47:39 +01:00
Julia Lawall	75a6eb9520	powerpc/cell: add missing of_node_put [ Upstream commit a841fd009e51c8c0a8f07c942e9ab6bb48da8858 ] for_each_node_by_name performs an of_node_get on each iteration, so a break out of the loop requires an of_node_put. A simplified version of the semantic patch that fixes this problem is as follows (http://coccinelle.lip6.fr): // <smpl> @@ expression e,e1; local idexpression n; @@ for_each_node_by_name(n, e1) { ... when != of_node_put(n) when != e = n ( return n; \| + of_node_put(n); ? return ...; ) ... } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1448051604-25256-7-git-send-email-Julia.Lawall@lip6.fr Signed-off-by: Sasha Levin <sashal@kernel.org>	2022-01-27 08:47:39 +01:00

1 2 3 4 5 ...

656772 Commits