1186102 Commits

Author SHA1 Message Date
Huang Ying
124abced64 migrate_pages_batch: simplify retrying and failure counting of large folios
After recent changes to the retrying and failure counting in
migrate_pages_batch(), it was found that it's unnecessary to count
retrying and failure for normal, large, and THP folios separately. 
Because we don't use retrying and failure number of large folios directly.
So, in this patch, we simplified retrying and failure counting of large
folios via counting retrying and failure of normal and large folios
together.  This results in the reduced line number.

Previously, in migrate_pages_batch we need to track whether the source
folio is large/THP before splitting.  So is_large is used to cache
folio_test_large() result.  Now, we don't need that variable any more
because we don't count retrying and failure of large folios (only counting
that of THP folios).  So, in this patch, is_large is removed to simplify
the code.

This is just code cleanup, no functionality changes are expected.

Link: https://lkml.kernel.org/r/20230510031829.11513-1-ying.huang@intel.com
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:18 -07:00
Rick Wertenbroek
501350459b mm: memory_hotplug: fix format string in warnings
The format string in __add_pages and __remove_pages has a typo and prints
e.g., "Misaligned __add_pages start: 0xfc605 end: #fc609" instead of
"Misaligned __add_pages start: 0xfc605 end: 0xfc609" Fix "#%lx" => "%#lx"

Link: https://lkml.kernel.org/r/20230510090758.3537242-1-rick.wertenbroek@gmail.com
Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:18 -07:00
Pankaj Raghav
c963901197 filemap: remove page_endio()
page_endio() is not used anymore. Remove it.

Link: https://lkml.kernel.org/r/20230510124716.73655-1-p.raghav@samsung.com
Signed-off-by: Pankaj Raghav <p.raghav@samsung.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Acked-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:18 -07:00
Peng Zhang
cd00dd2585 maple_tree: fix potential out-of-bounds access in mas_wr_end_piv()
Check the write offset end bounds before using it as the offset into the
pivot array.  This avoids a possible out-of-bounds access on the pivot
array if the write extends to the last slot in the node, in which case the
node maximum should be used as the end pivot.

akpm: this doesn't affect any current callers, but new users of mapletree
may encounter this problem if backported into earlier kernels, so let's
fix it in -stable kernels in case of this.

Link: https://lkml.kernel.org/r/20230506024752.2550-1-zhangpeng.00@bytedance.com
Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:18 -07:00
Lorenzo Stoakes
311150343e mm/gup: add missing gup_must_unshare() check to gup_huge_pgd()
All other instances of gup_huge_pXd() perform the unshare check, so update
the PGD-specific function to do so as well.

While checking pgd_write() might seem unusual, this function already
performs such a check via pgd_access_permitted() so this is in line with
the existing implementation.

David said:

: This change makes unshare handling across all GUP-fast variants
: consistent, which is desirable as GUP-fast is complicated enough
: already even when consistent.
: 
: This function was the only one I seemed to have missed (or left out and
: forgot why -- maybe because it's really dead code for now).  The COW
: selftest would identify the problem, so far there was no report. 
: Either the selftest wasn't run on corresponding architectures with that
: hugetlb size, or that code is still dead code and unused by
: architectures.
: 
: the original commit(s) that added unsharing explain why we care about
: these checks:
: 
: a7f226604170acd6 ("mm/gup: trigger FAULT_FLAG_UNSHARE when R/O-pinning a possibly shared anonymous page")
: 84209e87c6963f92 ("mm/gup: reliable R/O long-term pinning in COW mappings")

Link: https://lkml.kernel.org/r/cb971ac8dd315df97058ea69442ecc007b9a364a.1683381545.git.lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:17 -07:00
Keith Busch
9f297db356 dmapool: create/destroy cleanup
Set the 'empty' bool directly from the result of the function that
determines its value instead of adding additional logic.

Link: https://lkml.kernel.org/r/20230126215125.4069751-13-kbusch@meta.com
Fixes: 2d55c16c0c54 ("dmapool: create/destroy cleanup")
Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Tony Battersby <tonyb@cybernetics.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:17 -07:00
Ackerley Tng
adef080382 fs: hugetlbfs: set vma policy only when needed for allocating folio
Calling hugetlb_set_vma_policy() later avoids setting the vma policy
and then dropping it on a page cache hit.

Link: https://lkml.kernel.org/r/20230502235622.3652586-1-ackerleytng@google.com
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Vishal Annapurve <vannapurve@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:17 -07:00
Nhat Pham
88537aac0b selftests: add selftests for cachestat
Test cachestat on a newly created file, /dev/ files, /proc/ files and a
directory.  Also test on a shmem file (which can also be tested with
huge pages since tmpfs supports huge pages).

[colin.i.king@gmail.com: fix spelling mistake "trucate" -> "truncate"]
  Link: https://lkml.kernel.org/r/20230505110855.2493457-1-colin.i.king@gmail.com
[mpe@ellerman.id.au: avoid excessive stack allocation]
  Link: https://lkml.kernel.org/r/877ctfa6yv.fsf@mail.lhotse
Link: https://lkml.kernel.org/r/20230503013608.2431726-4-nphamcs@gmail.com
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Brian Foster <bfoster@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:17 -07:00
Nhat Pham
946e697c69 cachestat: wire up cachestat for other architectures
cachestat is previously only wired in for x86 (and architectures using
the generic unistd.h table):

https://lore.kernel.org/lkml/20230503013608.2431726-1-nphamcs@gmail.com/

This patch wires cachestat in for all the other architectures.

[nphamcs@gmail.com: wire up cachestat for arm64]
  Link: https://lkml.kernel.org/r/20230511092843.3896327-1-nphamcs@gmail.com
Link: https://lkml.kernel.org/r/20230510195806.2902878-1-nphamcs@gmail.com
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Tested-by: Michael Ellerman <mpe@ellerman.id.au>	[powerpc]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Heiko Carstens <hca@linux.ibm.com>		[s390]
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Chris Zankel <chris@zankel.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Helge Deller <deller@gmx.de>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:16 -07:00
Nhat Pham
cf264e1329 cachestat: implement cachestat syscall
There is currently no good way to query the page cache state of large file
sets and directory trees.  There is mincore(), but it scales poorly: the
kernel writes out a lot of bitmap data that userspace has to aggregate,
when the user really doesn not care about per-page information in that
case.  The user also needs to mmap and unmap each file as it goes along,
which can be quite slow as well.

Some use cases where this information could come in handy:
  * Allowing database to decide whether to perform an index scan or
    direct table queries based on the in-memory cache state of the
    index.
  * Visibility into the writeback algorithm, for performance issues
    diagnostic.
  * Workload-aware writeback pacing: estimating IO fulfilled by page
    cache (and IO to be done) within a range of a file, allowing for
    more frequent syncing when and where there is IO capacity, and
    batching when there is not.
  * Computing memory usage of large files/directory trees, analogous to
    the du tool for disk usage.

More information about these use cases could be found in the following
thread:

https://lore.kernel.org/lkml/20230315170934.GA97793@cmpxchg.org/

This patch implements a new syscall that queries cache state of a file and
summarizes the number of cached pages, number of dirty pages, number of
pages marked for writeback, number of (recently) evicted pages, etc.  in a
given range.  Currently, the syscall is only wired in for x86
architecture.

NAME
    cachestat - query the page cache statistics of a file.

SYNOPSIS
    #include <sys/mman.h>

    struct cachestat_range {
        __u64 off;
        __u64 len;
    };

    struct cachestat {
        __u64 nr_cache;
        __u64 nr_dirty;
        __u64 nr_writeback;
        __u64 nr_evicted;
        __u64 nr_recently_evicted;
    };

    int cachestat(unsigned int fd, struct cachestat_range *cstat_range,
        struct cachestat *cstat, unsigned int flags);

DESCRIPTION
    cachestat() queries the number of cached pages, number of dirty
    pages, number of pages marked for writeback, number of evicted
    pages, number of recently evicted pages, in the bytes range given by
    `off` and `len`.

    An evicted page is a page that is previously in the page cache but
    has been evicted since. A page is recently evicted if its last
    eviction was recent enough that its reentry to the cache would
    indicate that it is actively being used by the system, and that
    there is memory pressure on the system.

    These values are returned in a cachestat struct, whose address is
    given by the `cstat` argument.

    The `off` and `len` arguments must be non-negative integers. If
    `len` > 0, the queried range is [`off`, `off` + `len`]. If `len` ==
    0, we will query in the range from `off` to the end of the file.

    The `flags` argument is unused for now, but is included for future
    extensibility. User should pass 0 (i.e no flag specified).

    Currently, hugetlbfs is not supported.

    Because the status of a page can change after cachestat() checks it
    but before it returns to the application, the returned values may
    contain stale information.

RETURN VALUE
    On success, cachestat returns 0. On error, -1 is returned, and errno
    is set to indicate the error.

ERRORS
    EFAULT cstat or cstat_args points to an invalid address.

    EINVAL invalid flags.

    EBADF  invalid file descriptor.

    EOPNOTSUPP file descriptor is of a hugetlbfs file

[nphamcs@gmail.com: replace rounddown logic with the existing helper]
  Link: https://lkml.kernel.org/r/20230504022044.3675469-1-nphamcs@gmail.com
Link: https://lkml.kernel.org/r/20230503013608.2431726-3-nphamcs@gmail.com
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Brian Foster <bfoster@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:16 -07:00
Nhat Pham
ffcb5f5262 workingset: refactor LRU refault to expose refault recency check
Patch series "cachestat: a new syscall for page cache state of files",
v13.

There is currently no good way to query the page cache statistics of large
files and directory trees.  There is mincore(), but it scales poorly: the
kernel writes out a lot of bitmap data that userspace has to aggregate,
when the user really does not care about per-page information in that
case.  The user also needs to mmap and unmap each file as it goes along,
which can be quite slow as well.

Some use cases where this information could come in handy:
  * Allowing database to decide whether to perform an index scan or direct
    table queries based on the in-memory cache state of the index.
  * Visibility into the writeback algorithm, for performance issues
    diagnostic.
  * Workload-aware writeback pacing: estimating IO fulfilled by page cache
    (and IO to be done) within a range of a file, allowing for more
    frequent syncing when and where there is IO capacity, and batching
    when there is not.
  * Computing memory usage of large files/directory trees, analogous to
    the du tool for disk usage.

More information about these use cases could be found in this thread:
https://lore.kernel.org/lkml/20230315170934.GA97793@cmpxchg.org/

This series of patches introduces a new system call, cachestat, that
summarizes the page cache statistics (number of cached pages, dirty pages,
pages marked for writeback, evicted pages etc.) of a file, in a specified
range of bytes.  It also include a selftest suite that tests some typical
usage.  Currently, the syscall is only wired in for x86 architecture.

This interface is inspired by past discussion and concerns with fincore,
which has a similar design (and as a result, issues) as mincore.  Relevant
links:

https://lkml.indiana.edu/hypermail/linux/kernel/1302.1/04207.html
https://lkml.indiana.edu/hypermail/linux/kernel/1302.1/04209.html


I have also developed a small tool that computes the memory usage of files
and directories, analogous to the du utility.  User can choose between
mincore or cachestat (with cachestat exporting more information than
mincore).  To compare the performance of these two options, I benchmarked
the tool on the root directory of a Meta's server machine, each for five
runs:

Using cachestat
real -- Median: 33.377s, Average: 33.475s, Standard Deviation: 0.3602
user -- Median: 4.08s, Average: 4.1078s, Standard Deviation: 0.0742
sys -- Median: 28.823s, Average: 28.8866s, Standard Deviation: 0.2689

Using mincore:
real -- Median: 102.352s, Average: 102.3442s, Standard Deviation: 0.2059
user -- Median: 10.149s, Average: 10.1482s, Standard Deviation: 0.0162
sys -- Median: 91.186s, Average: 91.2084s, Standard Deviation: 0.2046

I also ran both syscalls on a 2TB sparse file:

Using cachestat:
real    0m0.009s
user    0m0.000s
sys     0m0.009s

Using mincore:
real    0m37.510s
user    0m2.934s
sys     0m34.558s

Very large files like this are the pathological case for mincore.  In
fact, to compute the stats for a single 2TB file, mincore takes as long as
cachestat takes to compute the stats for the entire tree!  This could
easily happen inadvertently when we run it on subdirectories.  Mincore is
clearly not suitable for a general-purpose command line tool.

Regarding security concerns, cachestat() should not pose any additional
issues.  The caller already has read permission to the file itself (since
they need an fd to that file to call cachestat).  This means that the
caller can access the underlying data in its entirety, which is a much
greater source of information (and as a result, a much greater security
risk) than the cache status itself.

The latest API change (in v13 of the patch series) is suggested by Jens
Axboe.  It allows for 64-bit length argument, even on 32-bit architecture
(which is previously not possible due to the limit on the number of
syscall arguments).  Furthermore, it eliminates the need for compatibility
handling - every user can use the same ABI.


This patch (of 4):

In preparation for computing recently evicted pages in cachestat, refactor
workingset_refault and lru_gen_refault to expose a helper function that
would test if an evicted page is recently evicted.

[penguin-kernel@I-love.SAKURA.ne.jp: add missing rcu_read_unlock() in lru_gen_refault()]
  Link: https://lkml.kernel.org/r/610781bc-cf11-fc89-a46f-87cb8235d439@I-love.SAKURA.ne.jp
Link: https://lkml.kernel.org/r/20230503013608.2431726-1-nphamcs@gmail.com
Link: https://lkml.kernel.org/r/20230503013608.2431726-2-nphamcs@gmail.com
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Brian Foster <bfoster@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:16 -07:00
Haifeng Xu
18b1d18bc2 memcg, oom: remove explicit wakeup in mem_cgroup_oom_synchronize()
Before commit 29ef680ae7c2 ("memcg, oom: move out_of_memory back to the
charge path"), all memcg oom killers were delayed to page fault path.  And
the explicit wakeup is used in this case:

thread A:
        ...
        if (locked) {           // complete oom-kill, hold the lock
                mem_cgroup_oom_unlock(memcg);
                ...
        }
        ...

thread B:
        ...

        if (locked && !memcg->oom_kill_disable) {
                ...
        } else {
                schedule();     // can't acquire the lock
                ...
        }
        ...

The reason is that thread A kicks off the OOM-killer, which leads to
wakeups from the uncharges of the exiting task.  But thread B is not
guaranteed to see them if it enters the OOM path after the OOM kills but
before thread A releases the lock.

Now only oom_kill_disable case is handled from the #PF path.  In that case
it is userspace to trigger the wake up not the #PF path itself.  All
potential paths to free some charges are responsible to call
memcg_oom_recover() , so the explicit wakeup is not needed in the
mem_cgroup_oom_synchronize() path which doesn't release any memory itself.

Link: https://lkml.kernel.org/r/20230419030739.115845-2-haifeng.xu@shopee.com
Signed-off-by: Haifeng Xu <haifeng.xu@shopee.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:16 -07:00
Haifeng Xu
857f21397f memcg, oom: remove unnecessary check in mem_cgroup_oom_synchronize()
mem_cgroup_oom_synchronize() is only used when the memcg oom handling is
handed over to the edge of the #PF path.  Since commit 29ef680ae7c2
("memcg, oom: move out_of_memory back to the charge path") this is the
case only when the kernel memcg oom killer is disabled
(current->memcg_in_oom is only set if memcg->oom_kill_disable).  Therefore
a check for oom_kill_disable in mem_cgroup_oom_synchronize() is not
required.

Link: https://lkml.kernel.org/r/20230419030739.115845-1-haifeng.xu@shopee.com
Signed-off-by: Haifeng Xu <haifeng.xu@shopee.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:15 -07:00
Yosry Ahmed
0a2dc6ac33 cgroup: remove cgroup_rstat_flush_atomic()
Previous patches removed the only caller of cgroup_rstat_flush_atomic(). 
Remove the function and simplify the code.

Link: https://lkml.kernel.org/r/20230421174020.2994750-6-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:15 -07:00
Yosry Ahmed
35822fdae3 memcg: remove mem_cgroup_flush_stats_atomic()
Previous patches removed all callers of mem_cgroup_flush_stats_atomic(). 
Remove the function and simplify the code.

Link: https://lkml.kernel.org/r/20230421174020.2994750-5-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:15 -07:00
Yosry Ahmed
f82a7a86db memcg: calculate root usage from global state
Currently, we approximate the root usage by adding the memcg stats for
anon, file, and conditionally swap (for memsw).  To read the memcg stats
we need to invoke an rstat flush.  rstat flushes can be expensive, they
scale with the number of cpus and cgroups on the system.

mem_cgroup_usage() is called by memcg_events()->mem_cgroup_threshold()
with irqs disabled, so such an expensive operation with irqs disabled can
cause problems.

Instead, approximate the root usage from global state.  This is not 100%
accurate, but the root usage has always been ill-defined anyway.

Link: https://lkml.kernel.org/r/20230421174020.2994750-4-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:15 -07:00
Yosry Ahmed
190409caaf memcg: flush stats non-atomically in mem_cgroup_wb_stats()
The previous patch moved the wb_over_bg_thresh()->mem_cgroup_wb_stats()
code path in wb_writeback() outside the lock section.  We no longer need
to flush the stats atomically.  Flush the stats non-atomically.

Link: https://lkml.kernel.org/r/20230421174020.2994750-3-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:14 -07:00
Yosry Ahmed
2816ea2abf writeback: move wb_over_bg_thresh() call outside lock section
Patch series "cgroup: eliminate atomic rstat flushing", v5.

A previous patch series [1] changed most atomic rstat flushing contexts to
become non-atomic.  This was done to avoid an expensive operation that
scales with # cgroups and # cpus to happen with irqs disabled and
scheduling not permitted.  There were two remaining atomic flushing
contexts after that series.  This series tries to eliminate them as well,
eliminating atomic rstat flushing completely.

The two remaining atomic flushing contexts are:
(a) wb_over_bg_thresh()->mem_cgroup_wb_stats()
(b) mem_cgroup_threshold()->mem_cgroup_usage()

For (a), flushing needs to be atomic as wb_writeback() calls
wb_over_bg_thresh() with a spinlock held.  However, it seems like the call
to wb_over_bg_thresh() doesn't need to be protected by that spinlock, so
this series proposes a refactoring that moves the call outside the lock
criticial section and makes the stats flushing in mem_cgroup_wb_stats()
non-atomic.

For (b), flushing needs to be atomic as mem_cgroup_threshold() is called
with irqs disabled.  We only flush the stats when calculating the root
usage, as it is approximated as the sum of some memcg stats (file, anon,
and optionally swap) instead of the conventional page counter.  This
series proposes changing this calculation to use the global stats instead,
eliminating the need for a memcg stat flush.

After these 2 contexts are eliminated, we no longer need
mem_cgroup_flush_stats_atomic() or cgroup_rstat_flush_atomic().  We can
remove them and simplify the code.

[1] https://lore.kernel.org/linux-mm/20230330191801.1967435-1-yosryahmed@google.com/


This patch (of 5):

wb_over_bg_thresh() calls mem_cgroup_wb_stats() which invokes an rstat
flush, which can be expensive on large systems. Currently,
wb_writeback() calls wb_over_bg_thresh() within a lock section, so we
have to do the rstat flush atomically. On systems with a lot of
cpus and/or cgroups, this can cause us to disable irqs for a long time,
potentially causing problems.

Move the call to wb_over_bg_thresh() outside the lock section in
preparation to make the rstat flush in mem_cgroup_wb_stats() non-atomic.
The list_empty(&wb->work_list) check should be okay outside the lock
section of wb->list_lock as it is protected by a separate lock
(wb->work_lock), and wb_over_bg_thresh() doesn't seem like it is
modifying any of wb->b_* lists the wb->list_lock is protecting.
Also, the loop seems to be already releasing and reacquring the
lock, so this refactoring looks safe.

Link: https://lkml.kernel.org/r/20230421174020.2994750-1-yosryahmed@google.com
Link: https://lkml.kernel.org/r/20230421174020.2994750-2-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:14 -07:00
Baolin Wang
3c4322c94b mm/page_alloc: drop the unnecessary pfn_valid() for start pfn
__pageblock_pfn_to_page() currently performs both pfn_valid check and
pfn_to_online_page().  The former one is redundant because the latter is a
stronger check.  Drop pfn_valid().

Link: https://lkml.kernel.org/r/c3868b58c6714c09a43440d7d02c7b4eed6e03f6.1682342634.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:14 -07:00
Wen Yang
8b9167cd9e mm: compaction: optimize compact_memory to comply with the admin-guide
For the /proc/sys/vm/compact_memory file, the admin-guide states: When 1
is written to the file, all zones are compacted such that free memory is
available in contiguous blocks where possible.  This can be important for
example in the allocation of huge pages although processes will also
directly compact memory as required

But it was not strictly followed, writing any value would cause all zones
to be compacted.

It has been slightly optimized to comply with the admin-guide.  Enforce
the 1 on the unlikely chance that the sysctl handler is ever extended to
do something different.

Commit ef4984384172 ("mm/compaction: remove unused variable
sysctl_compact_memory") has also been optimized a bit here, as the
declaration in the external header file has been eliminated, and
sysctl_compact_memory also needs to be verified.

[akpm@linux-foundation.org: add __read_mostly, per Mel]
Link: https://lkml.kernel.org/r/tencent_DFF54DB2A60F3333F97D3F6B5441519B050A@qq.com
Signed-off-by: Wen Yang <wenyang.linux@foxmail.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: William Lam <william.lam@bytedance.com>
Cc: Pintu Kumar <pintu@codeaurora.org>
Cc: Fu Wei <wefu@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:14 -07:00
Yosry Ahmed
dddb44ffa0 memcg: dump memory.stat during cgroup OOM for v1
Patch series "memcg: OOM log improvements", v2.

This short patch series brings back some cgroup v1 stats in OOM logs
that were unnecessarily changed before. It also makes memcg OOM logs
less reliant on printk() internals.


This patch (of 2):

Commit c8713d0b2312 ("mm: memcontrol: dump memory.stat during cgroup OOM")
made sure we dump all the stats in memory.stat during a cgroup OOM, but it
also introduced a slight behavioral change.  The code used to print the
non-hierarchical v1 cgroup stats for the entire cgroup subtree, now it
only prints the v2 cgroup stats for the cgroup under OOM.

For cgroup v1 users, this introduces a few problems:

(a) The non-hierarchical stats of the memcg under OOM are no longer
    shown.

(b) A couple of v1-only stats (e.g.  pgpgin, pgpgout) are no longer
    shown.

(c) We show the list of cgroup v2 stats, even in cgroup v1.  This list
    of stats is not tracked with v1 in mind.  While most of the stats seem
    to be working on v1, there may be some stats that are not fully or
    correctly tracked.

Although OOM log is not set in stone, we should not change it for no
reason.  When upgrading the kernel version to a version including commit
c8713d0b2312 ("mm: memcontrol: dump memory.stat during cgroup OOM"), these
behavioral changes are noticed in cgroup v1.

The fix is simple.  Commit c8713d0b2312 ("mm: memcontrol: dump memory.stat
during cgroup OOM") separated stats formatting from stats display for v2,
to reuse the stats formatting in the OOM logs.  Do the same for v1.

Move the v2 specific formatting from memory_stat_format() to
memcg_stat_format(), add memcg1_stat_format() for v1, and make
memory_stat_format() select between them based on cgroup version.  Since
memory_stat_show() now works for both v1 & v2, drop memcg_stat_show().

Link: https://lkml.kernel.org/r/20230428132406.2540811-1-yosryahmed@google.com
Link: https://lkml.kernel.org/r/20230428132406.2540811-3-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:14 -07:00
Yosry Ahmed
5b42360c73 memcg: use seq_buf_do_printk() with mem_cgroup_print_oom_meminfo()
Currently, we format all the memcg stats into a buffer in
mem_cgroup_print_oom_meminfo() and use pr_info() to dump it to the logs. 
However, this buffer is large in size.  Although it is currently working
as intended, ther is a dependency between the memcg stats buffer and the
printk record size limit.

If we add more stats in the future and the buffer becomes larger than the
printk record size limit, or if the prink record size limit is reduced,
the logs may be truncated.

It is safer to use seq_buf_do_printk(), which will automatically break up
the buffer at line breaks and issue small printk() calls.

Refactor the code to move the seq_buf from memory_stat_format() to its
callers, and use seq_buf_do_printk() to print the seq_buf in
mem_cgroup_print_oom_meminfo().

Link: https://lkml.kernel.org/r/20230428132406.2540811-2-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Acked-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:13 -07:00
Douglas Anderson
4bb6dc79d9 migrate_pages: avoid blocking for IO in MIGRATE_SYNC_LIGHT
The MIGRATE_SYNC_LIGHT mode is intended to block for things that will
finish quickly but not for things that will take a long time.  Exactly how
long is too long is not well defined, but waits of tens of milliseconds is
likely non-ideal.

When putting a Chromebook under memory pressure (opening over 90 tabs on a
4GB machine) it was fairly easy to see delays waiting for some locks in
the kcompactd code path of > 100 ms.  While the laptop wasn't amazingly
usable in this state, it was still limping along and this state isn't
something artificial.  Sometimes we simply end up with a lot of memory
pressure.

Putting the same Chromebook under memory pressure while it was running
Android apps (though not stressing them) showed a much worse result (NOTE:
this was on a older kernel but the codepaths here are similar).  Android
apps on ChromeOS currently run from a 128K-block, zlib-compressed,
loopback-mounted squashfs disk.  If we get a page fault from something
backed by the squashfs filesystem we could end up holding a folio lock
while reading enough from disk to decompress 128K (and then decompressing
it using the somewhat slow zlib algorithms).  That reading goes through
the ext4 subsystem (because it's a loopback mount) before eventually
ending up in the block subsystem.  This extra jaunt adds extra overhead. 
Without much work I could see cases where we ended up blocked on a folio
lock for over a second.  With more extreme memory pressure I could see up
to 25 seconds.

We considered adding a timeout in the case of MIGRATE_SYNC_LIGHT for the
two locks that were seen to be slow [1] and that generated much
discussion.  After discussion, it was decided that we should avoid waiting
for the two locks during MIGRATE_SYNC_LIGHT if they were being held for
IO.  We'll continue with the unbounded wait for the more full SYNC modes.

With this change, I couldn't see any slow waits on these locks with my
previous testcases.

NOTE: The reason I stated digging into this originally isn't because some
benchmark had gone awry, but because we've received in-the-field crash
reports where we have a hung task waiting on the page lock (which is the
equivalent code path on old kernels).  While the root cause of those
crashes is likely unrelated and won't be fixed by this patch, analyzing
those crash reports did point out these very long waits seemed like
something good to fix.  With this patch we should no longer hang waiting
on these locks, but presumably the system will still be in a bad shape and
hang somewhere else.

[1] https://lore.kernel.org/r/20230421151135.v2.1.I2b71e11264c5c214bc59744b9e13e4c353bc5714@changeid

Link: https://lkml.kernel.org/r/20230428135414.v3.1.Ia86ccac02a303154a0b8bc60567e7a95d34c96d3@changeid
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Gao Xiang <hsiangkao@linux.alibaba.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Gao Xiang <hsiangkao@linux.alibaba.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:13 -07:00
Roman Gushchin
f785a8f21a mm: memcg: use READ_ONCE()/WRITE_ONCE() to access stock->cached
A memcg pointer in the percpu stock can be accessed by drain_all_stock()
from another cpu in a lockless way.  In theory it might lead to an issue,
similar to the one which has been discovered with stock->cached_objcg,
where the pointer was zeroed between the check for being NULL and
dereferencing.  In this case the issue is unlikely a real problem, but to
make it bulletproof and similar to stock->cached_objcg, let's annotate all
accesses to stock->cached with READ_ONCE()/WTRITE_ONCE().

Link: https://lkml.kernel.org/r/20230502160839.361544-2-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Acked-by: Shakeel Butt <shakeelb@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:13 -07:00
Roman Gushchin
3b8abb3239 mm: kmem: fix a NULL pointer dereference in obj_stock_flush_required()
KCSAN found an issue in obj_stock_flush_required():
stock->cached_objcg can be reset between the check and dereference:

==================================================================
BUG: KCSAN: data-race in drain_all_stock / drain_obj_stock

write to 0xffff888237c2a2f8 of 8 bytes by task 19625 on cpu 0:
 drain_obj_stock+0x408/0x4e0 mm/memcontrol.c:3306
 refill_obj_stock+0x9c/0x1e0 mm/memcontrol.c:3340
 obj_cgroup_uncharge+0xe/0x10 mm/memcontrol.c:3408
 memcg_slab_free_hook mm/slab.h:587 [inline]
 __cache_free mm/slab.c:3373 [inline]
 __do_kmem_cache_free mm/slab.c:3577 [inline]
 kmem_cache_free+0x105/0x280 mm/slab.c:3602
 __d_free fs/dcache.c:298 [inline]
 dentry_free fs/dcache.c:375 [inline]
 __dentry_kill+0x422/0x4a0 fs/dcache.c:621
 dentry_kill+0x8d/0x1e0
 dput+0x118/0x1f0 fs/dcache.c:913
 __fput+0x3bf/0x570 fs/file_table.c:329
 ____fput+0x15/0x20 fs/file_table.c:349
 task_work_run+0x123/0x160 kernel/task_work.c:179
 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
 exit_to_user_mode_loop+0xcf/0xe0 kernel/entry/common.c:171
 exit_to_user_mode_prepare+0x6a/0xa0 kernel/entry/common.c:203
 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
 syscall_exit_to_user_mode+0x26/0x140 kernel/entry/common.c:296
 do_syscall_64+0x4d/0xc0 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

read to 0xffff888237c2a2f8 of 8 bytes by task 19632 on cpu 1:
 obj_stock_flush_required mm/memcontrol.c:3319 [inline]
 drain_all_stock+0x174/0x2a0 mm/memcontrol.c:2361
 try_charge_memcg+0x6d0/0xd10 mm/memcontrol.c:2703
 try_charge mm/memcontrol.c:2837 [inline]
 mem_cgroup_charge_skmem+0x51/0x140 mm/memcontrol.c:7290
 sock_reserve_memory+0xb1/0x390 net/core/sock.c:1025
 sk_setsockopt+0x800/0x1e70 net/core/sock.c:1525
 udp_lib_setsockopt+0x99/0x6c0 net/ipv4/udp.c:2692
 udp_setsockopt+0x73/0xa0 net/ipv4/udp.c:2817
 sock_common_setsockopt+0x61/0x70 net/core/sock.c:3668
 __sys_setsockopt+0x1c3/0x230 net/socket.c:2271
 __do_sys_setsockopt net/socket.c:2282 [inline]
 __se_sys_setsockopt net/socket.c:2279 [inline]
 __x64_sys_setsockopt+0x66/0x80 net/socket.c:2279
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd

value changed: 0xffff8881382d52c0 -> 0xffff888138893740

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 19632 Comm: syz-executor.0 Not tainted 6.3.0-rc2-syzkaller-00387-g534293368afa #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/02/2023

Fix it by using READ_ONCE()/WRITE_ONCE() for all accesses to
stock->cached_objcg.

Link: https://lkml.kernel.org/r/20230502160839.361544-1-roman.gushchin@linux.dev
Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API")
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Reported-by: syzbot+774c29891415ab0fd29d@syzkaller.appspotmail.com
Reported-by: Dmitry Vyukov <dvyukov@google.com>
  Link: https://lore.kernel.org/linux-mm/CACT4Y+ZfucZhM60YPphWiCLJr6+SGFhT+jjm8k1P-a_8Kkxsjg@mail.gmail.com/T/#t
Reviewed-by: Yosry Ahmed <yosryahmed@google.com>
Acked-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-06-09 16:25:13 -07:00
Linus Torvalds
7877cb91f1 Linux 6.4-rc4 v6.4-rc4 2023-05-28 07:49:00 -04:00
Linus Torvalds
f8b2507c26 A single fix for x86:
- Prevent a bogus setting for the number of HT siblings, which is caused
    by the CPUID evaluation trainwreck of X86. That recomputes the value
    for each CPU, so the last CPU "wins". That can cause completely bogus
    sibling values.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmRzBxMTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYodfbEACp+oRBD2zIcmf8kpakMxIbH2CkDAxl
 AgqGHYVKAvsqKPQZhYmOFEsk+0isMzssuVaub9gMeLRmpMQxUqv7K20pluvWmuQm
 h7MTYEAvCZpBU2BGB1mixp57tQ/gpLSB4uaJU2Q5hh9po6uImvalIdqThf7ArxgA
 mm3BhtbFObLGwRaV2981zdsEs8Cjs9RusnAOrX5LbBBb6ibcnrqWy7DAhySMA0zr
 7RfaGyG/T9z6/BgX54FBQ7aDNU/RkZ8YYQoMj0kAFXdM3V+sa3oPfMRQpACPHKm4
 kwskJfWVdMq06kgOD7N9+WrOfBAgIsmzasT6VhTuGAGNo4CiEOthLqXDHwf4NLhN
 hj0XReHVyBiQ1KuI2lGNJ71c30KRkh7SCdGNiEFtnlpteXnS4bRnzWtfMf/+2SCC
 WOziRRYRzuAvvoTwoQS1BYJrDAB9f5jOX8FTYbC12rEk/RPOB6t4iFNLUzid7u1H
 yPYphoftXlQlli7EajVc5pSZCftMHRCzWernptiPFU5tMvyagYppcg25JY54rquC
 CTJQqoa/HPAEZhBjgZElwO9QBgu+VyJUq3R/Z+LbkQuz5z+PVDfqeDuDWhaqbcz1
 WGtdGdNo4bqVhwqgwyHjgKCFe05mF6tPuotnYZE/VSkXUCvLkjGEVg8HQ7WiPhJd
 IYDqxkIhX5h1Mg==
 =l43p
 -----END PGP SIGNATURE-----

Merge tag 'x86-urgent-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 cpu fix from Thomas Gleixner:
 "A single fix for x86:

   - Prevent a bogus setting for the number of HT siblings, which is
     caused by the CPUID evaluation trainwreck of X86. That recomputes
     the value for each CPU, so the last CPU "wins". That can cause
     completely bogus sibling values"

* tag 'x86-urgent-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/topology: Fix erroneous smp_num_siblings on Intel Hybrid platforms
2023-05-28 07:42:05 -04:00
Linus Torvalds
2d5438f4c6 A small set of perf fixes:
- Make the CHA discovery based on MSR readout to work around broken
    discovery tables in some SPR firmwares.
 
  - Prevent saving PEBS configuration which has software bits set that
    cause a crash when restored into the relevant MSR.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmRzBlETHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYofUsEAC6VMmscn6MDmCpO71+o80IhFUeN/Pa
 /Qu4kKCQmeT/gonMG+bKCYSNobgdi4a7KNDRZypTjzMORTDmS3lT4aTbzy3ry+eI
 7rUTolAo3HiY0ZwPhU/evrGDf0O2d/19nd9pXiLZFinH9WsMf7mKsXhAJvlq6SKu
 SliGCQlB7+0bicdYgBkJqWPUbKY7QvcCcXiPx1Ujxg7CbMTcuj7MLzu/TVYmwZYH
 Je/k+vBnhl54PF9nGiqNYZGBPh4cBgkpHEFGFrvwplBRBdwHrjapNxhIRcA2N66u
 mr+f7Cl6StOP16Bn4dzG9nHLVTDz5NVgsbtPStVTcOoIzfOhch7mOL+tu/pafXdt
 zRL7U6usfbQqt1rcBBtPoeFlEifQwHoz3ZGFSyIZn5IRfXtXlDJHyPxlRYMB5/uz
 WjR10GzE0tZFLetvW4PdlMHcwwrOZNU+crvADmCk4KzzVDip9QLbomiGfxHVHI2J
 Fjstd6ZaNkuDHRy/r76YfwRDDzQWPvtsaWRLzD4aFQQxYdHHZOabBittK9UvENvu
 MjLRO+1ymN52Ap6TdcK9ZMNpb8ol/Xn5aYivNRlHsUjJ8RxJj94sElIlNHLI2yoa
 hICdMcGSCd7H+FIkXdgT2O8OBxEsf139xSrG2oSTOZFCjkR1BSxtyqNY47MFVSiA
 KBtbnJMkde48pg==
 =Z3GU
 -----END PGP SIGNATURE-----

Merge tag 'perf-urgent-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Thomas Gleixner:
 "A small set of perf fixes:

   - Make the MSR-readout based CHA discovery work around broken
     discovery tables in some SPR firmwares.

   - Prevent saving PEBS configuration which has software bits set that
     cause a crash when restored into the relevant MSR"

* tag 'perf-urgent-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/uncore: Correct the number of CHAs on SPR
  perf/x86/intel: Save/restore cpuc->active_pebs_data_cfg when using guest PEBS
2023-05-28 07:37:23 -04:00
Linus Torvalds
abbf7fa15b A set of unwinder and tooling fixes:
- Ensure that the stack pointer on x86 is aligned again so that the
     unwinder does not read past the end of the stack
 
   - Discard .note.gnu.property section which has a pointlessly different
     alignment than the other note sections. That confuses tooling of all
     sorts including readelf, libbpf and pahole.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmRzBW0THHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoczHD/4vCopMPBFsT5w5HhSTQSX5Kglx3UQL
 g+qa0Z/uJKLpMgGkBzWBQGXHH/UqyY4sk5T3DDOkVQjdSfD+zmJu4R44wAonEU+y
 SDW2MeBciyp1WtjwPNWNY9QgDUQj7Cr2dSLTVqBR9akladTcj7EKHApq0pgaqsbi
 MnmfXzPyH17PRfyW8FbsBjdQvhpkYZSZntQ0cU+W5VtsUtZvg2w4I8IU76ri9L0E
 OYBbk2zV8RabGVJBv79fnm4fXb78+Zkp/xvs5sTBRKRS+3T5FNNJjo8tNp1AtxdF
 eJMlOeY7PjSFyAL6+/+/uRqYNCnH4Rw2MlMqJJIhzKYCw7BFo0FAhi+mro63Z85b
 byGuriv1g7suOgOCK8IzTl7e1nTDP4YCZ0cMDDvcRceIiqR6Rrk3C/B5cBVz0Bng
 iFYFUJSlLT4G+tGZwupV2UUGsCwS5+vVCG/FVWes8Mz4g4zr2Dmrh6YMLzrcDshE
 eAY2yKMypD/JD9cJa4KOVJuFARaSDJDiBpsO1BjRC6MxLFaw1biB0mE2hkwn3LJe
 fLVvU/sWnIQ+dQoy7+GBzm7Pi5SmKBuLDsgOt/ocbCwfUwUQMGf+I7bNg+08UOOS
 dNzM6/agdd8rLZTS/Ma7EvnA367ZwVOb5COVMqwxAoKZbPkvgz1xvRGkFQCHYapx
 BwMILF9FIM0kvg==
 =TPwY
 -----END PGP SIGNATURE-----

Merge tag 'objtool-urgent-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull unwinder fixes from Thomas Gleixner:
 "A set of unwinder and tooling fixes:

   - Ensure that the stack pointer on x86 is aligned again so that the
     unwinder does not read past the end of the stack

   - Discard .note.gnu.property section which has a pointlessly
     different alignment than the other note sections. That confuses
     tooling of all sorts including readelf, libbpf and pahole"

* tag 'objtool-urgent-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/show_trace_log_lvl: Ensure stack pointer is aligned, again
  vmlinux.lds.h: Discard .note.gnu.property section
2023-05-28 07:33:29 -04:00
Linus Torvalds
d8f14b84fe Two fixes for debugobjects:
- Prevent that the allocation path wakes up kswapd. That's a long
     standing issue due to the GFP_ATOMIC allocation flag. As debug objects
     can be invoked from pretty much any context waking kswapd can end up
     in arbitrary lock chains versus the waitqueue lock.
 
   - Correct the explicit lockdep wait-type violation in
     debug_object_fill_pool().
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmRzCBQTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoa8FD/sFaHGSVtNTYgkV75umETMWbx+nR0Sp
 Y/i62MswIWU/DWmD9IKaBxlHpBByHgopBAozDnUix6RfQvf8V/GSU6PWa9HAR2QH
 rYwQCN/2/e8yQNAFv+9AiYGzPU3fRI/z7rYgfhhiWoLjivMFUCXypjBG0BAiCBxC
 pYKZDMhBeySIUjtEL6xjcflA8XXKuLUPGy1WeKBxRgJeNvM0GlbifNXoy0JaXBso
 NK+1FOG7zm05r2RqZjN0rAVRrrdgA4JYygpYC8YmzePoFQVXLeUnlbjjW9uYX+hz
 MoLuVeF+rKk9NHNu3NoD4kFgrNp3NXAAAzH1MJwIADy9THtsyWAeEgyUkkie9aiX
 Oa8eSjpJQjUv5h+VRKpMhh2RAAAhCYDuX/QC2FLImLy+GRF3dMhsAmuYgKXN2kHa
 CFkM84vStMiMVxKhwtLpxVE7VOrxzXxbqMO65kMrCXYxK1SfKtEZr8FrORvUjU7G
 MmH+D9sB034nkCBU+oGMsMYAAzB4rLp5Cw9qqvwWLfJvWLcUoPxjgUV6hLR6mNXx
 6+2133Tf68Fz4TgyEDN9XhQ7QEsKKGTTDMJ5JYolnrRe54sUJSsX+44khrbocSde
 WcEfcwhR+mjDDx0eVB2oT9bedxMf639mqPNn//EqJkzS4s+sECC8OiHbdvL3ArUq
 S92nrMxvyMB42Q==
 =7B4m
 -----END PGP SIGNATURE-----

Merge tag 'core-debugobjects-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull debugobjects fixes from Thomas Gleixner:
 "Two fixes for debugobjects:

   - Prevent the allocation path from waking up kswapd.

     That's a long standing issue due to the GFP_ATOMIC allocation flag.
     As debug objects can be invoked from pretty much any context waking
     kswapd can end up in arbitrary lock chains versus the waitqueue
     lock

   - Correct the explicit lockdep wait-type violation in
     debug_object_fill_pool()"

* tag 'core-debugobjects-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  debugobjects: Don't wake up kswapd from fill_pool()
  debugobjects,locking: Annotate debug_object_fill_pool() wait type violation
2023-05-28 07:15:33 -04:00
Linus Torvalds
9bd5386c65 A set of fixes for interrupt chip drivers:
- Prevent loss of state in the MIPS GIC interrupt controller.
 
   - Disable pseudo NMIs on Mediatek based Chromebooks as they have
     firmware issues which cause instantenous chrashes and freezes
     wen pseudo NMIs are used.
 
   - Fix the error handling path in the MBIGEN driver and a defined but not
     used warning in the meson-gpio interrupt chip driver.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmRzBAQTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoT/7D/9CDePULfT8t58VwpS/ZXGB1pAYhyWX
 FyQnxGeFz+H0NEylhB5LhqkmCFq60IZ+fBIYS9LmBU9GSJIvVXnp62SOPDmFmMwj
 fAj6y5JfpOxVZNb8SJ+JGVvwm2henRvOgeYKB4R/APk37dJcWzPruv/E64J7z0BC
 IeqFdq2cvkTnEDgE3Fnt6kZ2/iS8gd+Vtp/O/+pzqbt3u3vcogygSpNE8WE8Y+WE
 fF2+97EZx4oQRwIltNjaLBeW63ycQzx2+UCMy/84QYsTfi/wlquGcKWrBYwx+CDf
 XqrYMK6dMXW22o3VyrMxM6Jrd4raU7KsxWWOpqhClNabQjNbNgFdnMH56lJPFoqx
 84tr2+RnxL1bGHQDiiQtCgBfRe0BGm82qUlQkKDXwcmBIDNm2rfeE1xGG6fHsylk
 lPT3dQBR3DvG5EkxKIe3BF6ZPlwhmCe76zsU1Dcf2tVoDZhN66Ck+/7VboTS1Ibb
 d/iR5gId0NUc3KFGQCPzo4roY2p+sp/PqZNm+U5aLZbfmFHsDPCQoh/N17y3Trh/
 A4UuVwIy1JklosXvozb040M/rPIPZoei12nEQARYCR2VxafLO3d3OuE9Kp5tzOwb
 jCD4Img53VzYcf5KTcj0FgHcPNL18+dRZ3/vn1X27BlPWKRjfTTVOcQbs6cyZuo4
 TmjaMYwX2vR7tQ==
 =OHtk
 -----END PGP SIGNATURE-----

Merge tag 'irq-urgent-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
 "A set of fixes for interrupt chip drivers:

   - Prevent loss of state in the MIPS GIC interrupt controller

   - Disable pseudo NMIs on Mediatek based Chromebooks as they have
     firmware issues which cause instantenous chrashes and freezes wen
     pseudo NMIs are used

   - Fix the error handling path in the MBIGEN driver and a defined but
     not used warning in the meson-gpio interrupt chip driver"

* tag 'irq-urgent-2023-05-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/mbigen: Unify the error handling in mbigen_of_create_domain()
  irqchip/meson-gpio: Mark OF related data as maybe unused
  irqchip/mips-gic: Use raw spinlock for gic_lock
  irqchip/mips-gic: Don't touch vl_map if a local interrupt is not routable
  irqchip/gic-v3: Disable pseudo NMIs on Mediatek devices w/ firmware issues
  dt-bindings: interrupt-controller: arm,gic-v3: Add quirk for Mediatek SoCs w/ broken FW
2023-05-28 07:12:21 -04:00
Linus Torvalds
045049cb38 - fixes to get alchemy platform back in shape
- fix for initrd detection
 -----BEGIN PGP SIGNATURE-----
 
 iQJOBAABCAA4FiEEbt46xwy6kEcDOXoUeZbBVTGwZHAFAmRzEeEaHHRzYm9nZW5k
 QGFscGhhLmZyYW5rZW4uZGUACgkQeZbBVTGwZHC2Ng//VlCCXU2pWG/bEMFsZlXo
 TdsrDhpoLHcJoXvl5/CLIWqDc2j4zCD13PbIJovYK3+7dPLMuPdTsljOKD9Yjg7P
 UrjzbM2k5Kb47EXq8kp7TWxP2+MiN87B7EAHbsFdbW8Alu6bgmVcyV5fAxo7ddxi
 b28qijQA9sOW6gZSl4lQ8oMvRsGTRPaDyO43xj+K03ONHKQFJN7i7e/H8/9wXNSs
 YWhA38HRvIDUFwHm5PeRTuxKApYP+FNH97pHPDDeZBO9135ISqTn2TkpubsRCz4D
 3eRT+vqzBsYAaL3Qz1cdtXdjo873vTrJPbu/md0rJE9kwDNEtEpoWiYVknDmNWiv
 WA747I/euH5VD0Mir855LJ2rLZOytxGSGU8cMcJoQd0qg+t40GY4DH+3mJHxXi3l
 ruj2eOQ8KnSQfHWI3ugv8s+kOPWQBo3vhIIBAlxOeVrhGvKsnIwd/t7ZDHxtsjSi
 V1zdR17XjWPlmS6mCSAqDI5JjidWL0LakyoYGOkaSpzohSoL4TUxpUHfOl6uL/cQ
 nju2OEM8f7G1FCQR1AdFikdSC29B3WxyuhoqE4DOorQy9POMDSYMUIpIOw5jIIDK
 8Tt3HuepRd37b/JngyFrXBYggJRzB6Zzmrr07lVpb7dmSPK9PRTFACw1Xlkw7NNj
 5G0xJ0QrDFIvGEEFdX1lBnQ=
 =HNFm
 -----END PGP SIGNATURE-----

Merge tag 'mips-fixes_6.4_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux

Pull MIPS fixes from Thomas Bogendoerfer:

 - fixes to get alchemy platform back in shape

 - fix for initrd detection

* tag 'mips-fixes_6.4_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  mips: Move initrd_start check after initrd address sanitisation.
  MIPS: Alchemy: fix dbdma2
  MIPS: Restore Au1300 support
  MIPS: unhide PATA_PLATFORM
2023-05-28 07:08:52 -04:00
Linus Torvalds
416839029e powerpc fixes for 6.4 #3
- Reinstate ARCH_FORCE_MAX_ORDER ranges to fix various breakage.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmRym44THG1wZUBlbGxl
 cm1hbi5pZC5hdQAKCRBR6+o8yOGlgBxxD/kBOh9LYZELxc2tQ/5FZErM18Hb6Rcs
 XAl/VrBYR4T0CsZ7hNyZblBga/10xAyqnB63B+R00wJkDJ1ZefETbRQJNSnpDzju
 lThwBeP7y5qOyHj+pywa+zXoNqICeHPHH4tX/VsdCOIKeNy7GSyUyq0IxtXdYii5
 zc2O9Bz8pPhQi9fa6qAw5pv6rbvw0bitEGMThDoA7NVmwldyYdSy7/3/VWDikndU
 dVmpvLTUYXf8QClZ/dg3VVvucssip8oYhdLEKALoKnEcmp+YJAXidF5A7ExK+wmu
 OeREP/EZd6IzhRpwPcCYpyaqEkCD5/psbiWA67FfBiUFiB9zEFZJWjojojNCkohB
 mOFucO3117NTzc1ix6f7c/BSQGK28Tn6bYeaUG71SRQdFAlNi9zFDF7+VMbQpFi9
 a5iM5v2TvQ2Kry3pkFULQ9Wl/nM5kj9Wnk2AEcTg5DsquU6fIEGARPbgbijlCftm
 mNfC5SHQELNuPQOjA8Ml+OaiD88H8ylaMT1QZXpsVnvv30vpTNC0+USBmnlIOSJs
 1AnFcWZ4wNagivSy93AsjoemyEPiWtUDhlZWZ+9Yoi5vAqAU8W2B7cyEtaJDdtc/
 xF0++2lJ4qOiTy2drVdAv8HIMqYP8qZuid4R8jj6YcOKGe0g3IwmSbn/BGYq8vlc
 udDd3Cf8/lab8w==
 =SHY+
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc fix from Michael Ellerman:

 - Reinstate ARCH_FORCE_MAX_ORDER ranges to fix various breakage

* tag 'powerpc-6.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/mm: Reinstate ARCH_FORCE_MAX_ORDER ranges
2023-05-27 18:09:18 -07:00
Linus Torvalds
4e893b5aa4 xen: branch for v6.4-rc4
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCZHGVRAAKCRCAXGG7T9hj
 vqtJAQDizKasLE7tSnfs/FrZ/4xPaDLe3bQifMx2C1dtYCjRcAD/ciZSa1L0WzZP
 dpEZnlYRzsR3bwLktQEMQFOvlbh1SwE=
 =K860
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-6.4-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen fixes from Juergen Gross:

 - a double free fix in the Xen pvcalls backend driver

 - a fix for a regression causing the MSI related sysfs entries to not
   being created in Xen PV guests

 - a fix in the Xen blkfront driver for handling insane input data
   better

* tag 'for-linus-6.4-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  x86/pci/xen: populate MSI sysfs entries
  xen/pvcalls-back: fix double frees with pvcalls_new_active_socket()
  xen/blkfront: Only check REQ_FUA for writes
2023-05-27 09:42:56 -07:00
Linus Torvalds
957f3f8e53 Char/Misc fixes for 6.4-rc4
Here are some small driver fixes for 6.4-rc4.  They are just two
 different types:
   - binder fixes and reverts for reported problems and regressions in
     the binder "driver".
   - coresight driver fixes for reported problems.
 
 All of these have been in linux-next for over a week with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZHG/lQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ylPvACgx7Rq0HOFp4k984U/Z+cGzX2wSPIAn2W+6SKx
 0Wgv1hIeXQb/p4LO+U2j
 =LyzR
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc fixes from Greg KH:
 "Here are some small driver fixes for 6.4-rc4. They are just two
  different types:

   - binder fixes and reverts for reported problems and regressions in
     the binder "driver".

   - coresight driver fixes for reported problems.

  All of these have been in linux-next for over a week with no reported
  problems"

* tag 'char-misc-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  binder: fix UAF of alloc->vma in race with munmap()
  binder: add lockless binder_alloc_(set|get)_vma()
  Revert "android: binder: stop saving a pointer to the VMA"
  Revert "binder_alloc: add missing mmap_lock calls when using the VMA"
  binder: fix UAF caused by faulty buffer cleanup
  coresight: perf: Release Coresight path when alloc trace id failed
  coresight: Fix signedness bug in tmc_etr_buf_insert_barrier_packet()
2023-05-27 09:14:43 -07:00
Linus Torvalds
49572d5361 cxl fixes for v6.4-rc4
- Stop trusting capacity data before the "media ready" indication
 
 - Add missing HDM decoder capability enable for the cold-plug case
 
 - Fix a debug message induced crash
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCZHFLXgAKCRDfioYZHlFs
 Z2XxAQDaJl6CVwalMooLpnDN15eiGH2CFDRec9FK62ZH+m27CgD/Zx5QMZUp6YBC
 b9A3Tmx1MzIekO1CYRq9vi4ybXMCAAg=
 =cmO2
 -----END PGP SIGNATURE-----

Merge tag 'cxl-fixes-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull compute express link fixes from Dan Williams:
 "The 'media ready' series prevents the driver from acting on bad
  capacity information, and it moves some checks earlier in the init
  sequence which impacts topics in the queue for 6.5.

  Additional hotplug testing uncovered a missing enable for memory
  decode. A debug crash fix is also included.

  Summary:

   - Stop trusting capacity data before the "media ready" indication

   - Add missing HDM decoder capability enable for the cold-plug case

   - Fix a debug message induced crash"

* tag 'cxl-fixes-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
  cxl: Explicitly initialize resources when media is not ready
  cxl/port: Fix NULL pointer access in devm_cxl_add_port()
  cxl: Move cxl_await_media_ready() to before capacity info retrieval
  cxl: Wait Memory_Info_Valid before access memory related info
  cxl/port: Enable the HDM decoder capability for switch ports
2023-05-26 17:45:24 -07:00
Linus Torvalds
18713e8a68 ARM: SoC fixes for 6.4
There have not been a lot of fixes for for the soc tree in 6.4, but
 these have been sitting here for too long.
 
 For the devicetree side, there is one minor warning fix for vexpress,
 the rest all all for the the NXP i.MX platforms: SoC specific bugfixes
 for the iMX8 clocks and its USB-3.0 gadget device, as well as board
 specific fixes for regulators and the phy on some of the i.MX boards.
 
 The microchip risc-v and arm32 maintainers now also add a shared
 maintainer file entry for the arm64 parts.
 
 The remaining fixes are all for firmware drivers, addressing mistakes in
 the optee, scmi and ff-a firmware driver implementation, mostly in the
 error handling code, incorrect use of the alloc_workqueue() interface in
 SCMI, and compatibility with corner cases of the firmware implementation.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmRxHzQACgkQYKtH/8kJ
 UidW5g//W9PQIG70mWZr1uDIluUyyEnSp4gsuPHgDSSrQS7paSud3E8FF7ofmiKQ
 57Tuv+y9vVJtmzfWlR+L1VybPrOSGAsuzqag9/inwcfew7yPlOU9Z9Dse1v2/ClA
 GWuMsWjcQmCBgmluSMBupt6ZWbz3cZaLcJAuGmIYzUbeL/Cx45EnMnjc3ESpa6Ca
 ow3t/uoAk+KSgjMLyf2bec8BUiueRCJrqQmYHAg4utvehfTUl98epinRbtoi0VDR
 5HYlYvYLi0u0em14Nx1CaNVgvM13C5/LOoJNFlDcV+x2LCW9aPCqiRi5vfalOGwj
 r0NiA3cPIizJc4oiiK3+PqT5PGfBlkjJx/C6LfCZRsRnu6XegXo/prCNELOJPesD
 ecRU9NXAjWup5PLSRPNRz3ekI4yySbM6JnoNEe+I7+djrSkmtjGBggwX6Is+fSY0
 KkwhKEml6pI9w9zXh0f63R8QvKo91SUPIe/PgV65p4yeRf5JJgxeJRDk/dyzDfj8
 rViloBfTndS762mI7l46xEs3wWY0+CcxRoh53VFe50pqHbrIMY21pjarMEhMSI9V
 dFnkXcIIJwkt8TcoLOPMD0xSb0g3C2bEnAK253EeNtBkCdp7eLGYmUdTle+cOjyw
 0wafZSrbtBStLby+9F4zxYbrbArqxH5GrKanuX6jyiLtYefAa2U=
 =XeTU
 -----END PGP SIGNATURE-----

Merge tag 'arm-fixes-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
 "There have not been a lot of fixes for for the soc tree in 6.4, but
  these have been sitting here for too long.

  For the devicetree side, there is one minor warning fix for vexpress,
  the rest all all for the the NXP i.MX platforms: SoC specific bugfixes
  for the iMX8 clocks and its USB-3.0 gadget device, as well as board
  specific fixes for regulators and the phy on some of the i.MX boards.

  The microchip risc-v and arm32 maintainers now also add a shared
  maintainer file entry for the arm64 parts.

  The remaining fixes are all for firmware drivers, addressing mistakes
  in the optee, scmi and ff-a firmware driver implementation, mostly in
  the error handling code, incorrect use of the alloc_workqueue()
  interface in SCMI, and compatibility with corner cases of the firmware
  implementation"

* tag 'arm-fixes-6.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  MAINTAINERS: update arm64 Microchip entries
  arm64: dts: imx8: fix USB 3.0 Gadget Failure in QM & QXPB0 at super speed
  dt-binding: cdns,usb3: Fix cdns,on-chip-buff-size type
  arm64: dts: colibri-imx8x: delete adc1 and dsp
  arm64: dts: colibri-imx8x: fix iris pinctrl configuration
  arm64: dts: colibri-imx8x: move pinctrl property from SoM to eval board
  arm64: dts: colibri-imx8x: fix eval board pin configuration
  arm64: dts: imx8mp: Fix video clock parents
  ARM: dts: imx6qdl-mba6: Add missing pvcie-supply regulator
  ARM: dts: imx6ull-dhcor: Set and limit the mode for PMIC buck 1, 2 and 3
  arm64: dts: imx8mn-var-som: fix PHY detection bug by adding deassert delay
  arm64: dts: imx8mn: Fix video clock parents
  firmware: arm_ffa: Set reserved/MBZ fields to zero in the memory descriptors
  firmware: arm_ffa: Fix FFA device names for logical partitions
  firmware: arm_ffa: Fix usage of partition info get count flag
  firmware: arm_ffa: Check if ffa_driver remove is present before executing
  arm64: dts: arm: add missing cache properties
  ARM: dts: vexpress: add missing cache properties
  firmware: arm_scmi: Fix incorrect alloc_workqueue() invocation
  optee: fix uninited async notif value
2023-05-26 16:17:56 -07:00
Linus Torvalds
96f15fc6cc pci-v6.4-fixes-1
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmRxJuAUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vyjow/+N93KmaTcv8dD3WDuuY82fw8O0lMP
 iysEe3iUzdeLf7hB5cJ/xOuQrNVzAi8ZgbeSabIePd7vmfKe39vhWgDPjIR9nUd/
 /20kV+z2RYcIUyl1IXOLzjUQmzrxP+KxT4xa8DkLnffiMuQAxQPtdcEjt1/60yQ6
 8tEoGPEI/5d5H5nbpAcSMVaC5CLKSvXqdgfgOo4GF7g+C5dmy/dVcOKp2Jl3s+GF
 9A/Iipg/iKicIeoQ1V6rFG7rE/iq83saAwBVfn4lJJi8oNP7CARgL8ebwSbNTpsc
 2VyKoKH+sYderLhTlCycleL5IAmoJBtAL/KhNs8rVSTBtY9dO9MwrXi2+gFkUElv
 MtqswsbXWDZifOnekWXNeRDK781wDHhqEY8PdDbm+YSYfilV5Rd4GZCBCqVd2vvv
 1OUt3lgaeul84tHXC/wRJOSh3yBACldcMas9vwi4lGoAXzaLkewsbxmhMoOXmaT3
 GLCxziEbGKICBc8EqYvxE759CZzCzclPzPH0Lc4VgdHXdvDCjtbkomIi8LgVLtSG
 g2gpYkWkG7/UseOjzxFtQeg8yvzU2U1IEWE1hZtlh0cNWJ7qAdmSuBZHBEKXD6Z0
 /11y0VBTcQ2q6uEDE0Tl6JDJ0JTAm8t7dnSFk1tYhKDbDmArF94Lz8zF0fpPXobH
 1FVCR5Ush+d0rQg=
 =NYnq
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.4-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull PCI fix from Bjorn Helgaas:

 - Quirk Ice Lake Root Ports to work around DPC log size issue (Mika
   Westerberg)

* tag 'pci-v6.4-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI/DPC: Quirk PIO log size for Intel Ice Lake Root Ports
2023-05-26 16:06:57 -07:00
Linus Torvalds
8846af7547 VFIO fix for v6.4-rc4
- Test for and return error for invalid pfns through the pin pages
    interface. (Yan Zhao)
 -----BEGIN PGP SIGNATURE-----
 
 iQJPBAABCAA5FiEEQvbATlQL0amee4qQI5ubbjuwiyIFAmRxEkgbHGFsZXgud2ls
 bGlhbXNvbkByZWRoYXQuY29tAAoJECObm247sIsi7OwP/RLQRC9fWJwJYAg+owh5
 ExpvVvNUH+2wGKH63rjUNdfIClOG1ScvTKb6InDJYpiOuJw05T/9PAZQNUTEbFn3
 /hQe7SRtpeylRAL5mG+q0hP8ZSuACVwB3Zi10kggfdEsEd47+CyhciJfPSbMtNU5
 onyK19z8MBXtEy64r5rMISRTuaY5zBBzQ5PrOMDsucKoNr8d0EOionIuisDQws9Z
 n5lSziCA9ie7EJAo94PYR4hVTEGBXWFWJp4T8RXPKJWw93cYvidIoPsspIW+tV+5
 MXkxBDULm9y0fX9y1d04RCK2LRIRefmRx9Oa5gWKK7EFU0rrW3Uh2bScQiDj9jn8
 qTsbQ2vC2/NlHrNIiQx817cIbPosGp8TeAWdbg3elM6aUTeMD0KslNo9Rj7CXxi9
 GWPZYb4O4BiggR5evw9e3dC8iKrP76wuUKvXOtwXpBQ9ScbMnhpUVZQNe21K+wtu
 hjBYgI+SdqSMgulbBuXRJ2u/hSAA0ybGprR3oyfA8ekZBP6tbG88CRmEvrEvV3IQ
 +dMasXL08m4zvjT6Ka1BD+RPfq7r+VoX66g0+IeVCS3/LG45lYCl9wxfU2NLbWnR
 bufIVqOtyZ+HQIugtUI4DxiBjXyOsbdmYaiDHKnyhDjWwdk0nIvBAU/VG5yCjgBL
 nt1um8xoXemjzuYZfl4s6Y6g
 =0s58
 -----END PGP SIGNATURE-----

Merge tag 'vfio-v6.4-rc4' of https://github.com/awilliam/linux-vfio

Pull VFIO fix from Alex Williamson:

 - Test for and return error for invalid pfns through the pin pages
   interface (Yan Zhao)

* tag 'vfio-v6.4-rc4' of https://github.com/awilliam/linux-vfio:
  vfio/type1: check pfn valid before converting to struct page
2023-05-26 15:57:14 -07:00
Linus Torvalds
a92c9ab69f block-6.4-2023-05-26
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmRw6C8QHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpm0aD/9h4AAwmFAhVYxcr2Lk4AXwTz1sMDnQ58P3
 /OhwvogCsAV9K4venjuDYPL0IdHOq+Txh2S51DnhqGlJlU9xRxPMbTjCToUMqfCw
 VjUo0aF9XfQdqPWR2/zh9Zi1YwZd0rC+KBRRvlEdWszoze0H3L8rbxBNllSzdeKl
 hECLNpq8z45nzPAYateISs7Cv9OvGuqkaPbjCdHqGQ60Tdt4QIcnnPUUPfntcZHI
 tjHnAc3uN4WyrEmMPlTd6MtVV6VCQKAKz65L9bvvNlESverXNv923p8TPzEdb8tR
 utSD995Wxh8rBOC3+WA5+d1pYI+C1cUADmgE6QN/8jqu04No4l5Ob3+8ImPmrnGW
 a0VSX/BKt63QnXgFFYfn5yBMVMkjtoFtD87WAlLK50fYJTDn+ouVzqhcTsBJ9f/5
 xzyFvof8vVp83/Xlkwv/NumDTcyauBobiN6mI1mTBk0hakmxqNgb5xEDvbf5Q/M0
 jpp/9nBqmnFJTZRhRVu/G1RdeqDMDN1+/ws3uc3v+K+aeDMGx44l6g3N7f7kAPor
 Xk/Cky4x0KirxvpgJzhvS2mOSXbpnAKqYGK2+nIFS00dtbvUX9bERJoxTZbyihC+
 AkcPpO92UvxR7nsjgQY7R/euzqD4HKEYExzImT9c3Iug9z3Mtot0FncY6Jb8CEHW
 Vkh6wxsg6A==
 =MvS9
 -----END PGP SIGNATURE-----

Merge tag 'block-6.4-2023-05-26' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:
 "A few fixes for the storage side of things:

   - Fix bio caching condition for passthrough IO (Anuj)

   - end-of-device check fix for zero sized devices (Christoph)

   - Update Paolo's email address

   - NVMe pull request via Keith with a single quirk addition

   - Fix regression in how wbt enablement is done (Yu)

   - Fix race in active queue accounting (Tian)"

* tag 'block-6.4-2023-05-26' of git://git.kernel.dk/linux:
  NVMe: Add MAXIO 1602 to bogus nid list.
  block: make bio_check_eod work for zero sized devices
  block: fix bio-cache for passthru IO
  block, bfq: update Paolo's address in maintainer list
  blk-mq: fix race condition in active queue accounting
  blk-wbt: fix that wbt can't be disabled by default
2023-05-26 15:04:54 -07:00
Linus Torvalds
6fae9129b1 io_uring-6.4-2023-05-26
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmRw6B0QHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpmudD/96eBoqSNKXwMYnByOpWTst9JdMCaIuqV50
 LXJlyht+ti3jJR+K/DrFZS7o5SSPwx2nFEdpsIYmusrowusHdzrHDyX1xoVXWhwF
 Gt1xVF8d8MxJ+/JkS+qSp9d1VrCApXPBruINwBHYHGjanmBUWPkkal6Dn9XIbSPg
 dU6nMOIUZQdKpx/ThVjIYp+pcOo4fajVIpOJV6x+yFsF2/ySgL6bsVX0AE1XV5O4
 jH3K2JtI+xkWXVfBHytPd9oK/UbyUgjVj01Ykxo6bBjjVAX0Yy2a8L8aSayK5o0J
 KC1/wX2U493xhjsjGeHCmxNoJzK5yOmdLFsALyPnbLlEtdaXVCK3YPlroxgGu2hJ
 UZzmZMYFEMmAXHcc3rXY6KTWHjP9idL0kx+8ncs8BoibLPJeVM3jAIbljaVeN+fU
 yvQMN5TWEWypwzHtSCSm/JQzR1NxF42vw/cPoKZ1yub7B9lWtexzZqjH69S6oiuy
 SL1HWSYd+LTP5m70uHCKVziONDANMKUBSUVzn63Uwpl/15PuZkx3RAe08k8LO8Eh
 S0Z82lvH2qFpNGA0EiGxQviOExH0Ym76lonMLMGEdENv6eDn1mUP4hejHyCJ0Dai
 +6cCiO0p+Pala4qxGevHrFn8WQ01+4ffAj6aTgyQnzcR7Cv1nmDJ+yCG69FlQj4D
 JnANAsG/EA==
 =Ixp+
 -----END PGP SIGNATURE-----

Merge tag 'io_uring-6.4-2023-05-26' of git://git.kernel.dk/linux

Pull io_uring fix from Jens Axboe:
 "Just a single fix for the conditional schedule with the SQPOLL thread,
  dropping the uring_lock if we do need to reschedule"

* tag 'io_uring-6.4-2023-05-26' of git://git.kernel.dk/linux:
  io_uring: unlock sqd->lock before sq thread release CPU
2023-05-26 15:00:04 -07:00
Linus Torvalds
77af1f2b9a Thermal control fix for 6.4-rc4
Fix a regression introduced inadvertently during the 6.3 cycle by a
 commit making the Intel int340x thermal driver use sysfs_emit_at()
 instead of scnprintf() (Srinivas Pandruvada).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmRw3GoSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxd/8P/3/0nPD8CaK5Oa6fBQFwiR8unCTfQSh7
 7Tdut0HV47W9m8gj58S3okCXT4iR3Sv9M8wWYbH81oSXrrRKCS1BOVSK65jvHq8Q
 3qbnft0xOBaYzcP0zlKQ6D0YkJKTjS8c6uqdiv0xMyvYd7kdbcPACaW+9HutIp9o
 k92sSWo/sbxi4aKr9BjoGULwT2AcwbW+4gchrokHh01SFqIDDWFV/lR/hjPkwMkG
 HX8LjjOboy1bzy8Vdam5Q41MNKY3jdKz2qabd9kgBgXVd/rTH9D1cfYud07Hdi7A
 qpkvQSaPA08jt+Kh+rAZWj4GUU058EEm2hKFg0oqc4oRMxgRcB9MlRu61XTjzhDY
 XrPFboOHHD1XWU8hCV0yh3QmjxyVh3njaoSPeosPJLtZ7RFCp90T0SHxSKYCvo7C
 d0TEW1JDdVWvoqZl2JB8rjSixKIZX1neKLDjuZc9Fdgx0Y2N70eR7SZTiu3wohSO
 muYDHyeJ6SV9DwKWPrdVQGvjNfmMdFay7Z7WdaugTFVI475rLRF05iZyijrie3b0
 ntvy4LtqEB4Lj3mdl5I6j7b72YcHMzZCuwl7DHMWwoPo4AsePKYxslpJI/wrf0CD
 oUqhz8H/5ftA2VMdWjaIDB10u1pS1qVtNgLahguVc2HM5cCk7TPVw3ZjDlenXofw
 GTGRPMJnGtxP
 =Qxds
 -----END PGP SIGNATURE-----

Merge tag 'thermal-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control fix from Rafael Wysocki:
 "Fix a regression introduced inadvertently during the 6.3 cycle by a
  commit making the Intel int340x thermal driver use sysfs_emit_at()
  instead of scnprintf() (Srinivas Pandruvada)"

* tag 'thermal-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: intel: int340x: Add new line for UUID display
2023-05-26 13:55:46 -07:00
Linus Torvalds
c551afcdc3 Power management fixes for 6.4-rc4
Fix 3 issues related to the ->fast_switch callback in the AMD
 P-state cpufreq driver (Gautham R. Shenoy and Wyes Karny).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmRw2+cSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxW68P/2W9mcDxlepkfN9KEUgZ6ianmoRdHqSU
 KpL8uB4Snavd0QQYnRUbNPeDht5cAMZT1cEzmcdkGAzGz2SpD/44LH0T8zFO24aM
 SAdlo6j72VWiRmR5POvEZbKPRczCWFPlpqKkQyh227Msb5tGayo7fRa1VpRqo/AH
 YwYWaW5Ur5XfU0c8E5K+P3Rj2dz+RsFzZ4chb5yd14ae2Qr8f6EnOVte0ja0bFUF
 uPneqJ27U4MoDW1s/bkud32lJTeGLi8BwgIIyP6L7a8m5QdWxv+pTZMsa/NW7a+v
 3+3GrXoLZp6xeBGdt0z6/04BwpcRUh6ZpVlloGeYoSMpnQpGleqeJaVepWZCAKDm
 K0Gc7nLTLQUMO877oOz17cDLdDgXbIDmngw9ubEW0N7g5yOGzSCTfQyYYItN0dky
 mnTMXU5TM2LktH3V1y34HTTZTZU5TDZkLzoFNAgHqmlFhEqEdaN6l5AJJ6rPt9IM
 ZwJNpLz3R4rQ+XwGofebQGSlwpQVAzUrigiHJ2VJAbFxWqw4iohYkgD+1e8B/VzG
 e5DygskN1FFTJ7oMXlPSDWqulCRghOillqtUWocdHJAbnkglkLpexxqUAsoNahDt
 qA553sM5/lp29uNlpuz6AbqHscjOto94aC3uK3kxNUGsDy6CRcbn5VVIN5ceDVk+
 SnzKHfnh65vv
 =6XJv
 -----END PGP SIGNATURE-----

Merge tag 'pm-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "Fix three issues related to the ->fast_switch callback in the AMD
  P-state cpufreq driver (Gautham R. Shenoy and Wyes Karny)"

* tag 'pm-6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: amd-pstate: Update policy->cur in amd_pstate_adjust_perf()
  cpufreq: amd-pstate: Remove fast_switch_possible flag from active driver
  cpufreq: amd-pstate: Add ->fast_switch() callback
2023-05-26 13:45:43 -07:00
Dave Jiang
793a539ac7 cxl: Explicitly initialize resources when media is not ready
When media is not ready do not assume that the capacity information from
the identify command is valid, i.e. ->total_bytes
->partition_align_bytes ->{volatile,persistent}_only_bytes. Explicitly
zero out the capacity resources and exit early.

Given zero-init of those fields this patch is functionally equivalent to
the prior state, but it improves readability and robustness going
forward.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/168506118166.3004974.13523455340007852589.stgit@djiang5-mobl3
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2023-05-26 13:34:39 -07:00
Linus Torvalds
91a304340a gpio fixes for v6.4-rc4
- fix incorrect output in in-tree gpio tools
 - fix a shell coding issue in gpio-sim selftests
 - correctly set the permissions for debugfs attributes exposed by gpio-mockup
 - fix chip name and pin count in gpio-f7188x for one of the supported models
 - fix numberspace pollution when using dynamically and statically allocated
   GPIOs together
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEFp3rbAvDxGAT0sefEacuoBRx13IFAmRws3YACgkQEacuoBRx
 13JHlhAAkyUNjtvkcTdO9buB2AXB3e/JH31MgSEEzoAi9kr54n70D+mtmSz5iX0X
 4a1KjZrd5Gd7nSGA3AHll7jxL1Rfa4vJsTID/khfhIPB0arGl7brZ0J2fItRUGqz
 oxAvpH3OpSWI+QWQMzp/NEZV0F5rfWBme93w6OqYItzGNr4LfP+3x49zPDUngOjp
 d3EsDzQbgxezYvrmWjPvZuZvk3RYud+dfU8bjvtnI6TJZqIxzePmcqEAFgqsnDJF
 VLEJQFPGpLaX7U+ZepcAQAs2s5ggk6Mb0JdGKp1w9hQ9w91TClBUgUURG6hzdLmZ
 w6/X3iGaFGaHugA4DmWhYd4FWE7H58dWazw/RVK4RM2ZGZ9cT8R8OVQlDsBhesyG
 tOTY4RJlU1DgMt66nlUvf8r3ECdm2ayvFq7qSN9xRDQ0W4Ti3+QU2+/re3s8jevR
 VYAo1BHtVai28+lCMjopK+fYBe8G4DqsGx9Uh3y0RxYIFvMKdo4EQF9Ku3yTEcq3
 3FNFO00KU5ktKgnx7z6bnO7+F3OvHUeSRh44U8O6pm/odfdkUn7CHjFVi2dM6yO5
 763UKwMZq9rDT2owdwljolIRECb8IAN0siBhhHYYW4oAFTD2w2vTQ/y2KYrxbntv
 CptwauQMM9vjzg9bxMP+4R5YVZhZhtJ4I1oa2kQx7eXIPBy/6OM=
 =R4dV
 -----END PGP SIGNATURE-----

Merge tag 'gpio-fixes-for-v6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux

Pull gpio fixes from Bartosz Golaszewski:

 - fix incorrect output in in-tree gpio tools

 - fix a shell coding issue in gpio-sim selftests

 - correctly set the permissions for debugfs attributes exposed by
   gpio-mockup

 - fix chip name and pin count in gpio-f7188x for one of the supported
   models

 - fix numberspace pollution when using dynamically and statically
   allocated GPIOs together

* tag 'gpio-fixes-for-v6.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpio-f7188x: fix chip name and pin count on Nuvoton chip
  gpiolib: fix allocation of mixed dynamic/static GPIOs
  gpio: mockup: Fix mode of debugfs files
  selftests: gpio: gpio-sim: Fix BUG: test FAILED due to recent change
  tools: gpio: fix debounce_period_us output of lsgpio
2023-05-26 13:29:16 -07:00
Linus Torvalds
b158dd941b for-6.4-rc3-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmRwqRsACgkQxWXV+ddt
 WDuCPQ//T8JVY6usnGF/Fw/3zbtDNvrdQLDfp3HovIg7gmLIBda0bT05w4Q46FUU
 l4BV0bHyTUNWPlXUmrrSmt8HipRe2z4Wjwc16azdLmSs5zf0FO1LbsCKDmM8Ncid
 LTi2jzyyb3E44ZzC/i7RCaBt+vYRb2ZmtZ/glh3K4H0GgTAYl1GxZoAoYgBnvmlG
 nvmlWWDaM2cRKaUREm75il37LKLIlW5jvdUFQrqwWNgUH72ay5/7SZxHywlk8x6b
 qwhhp+s6bMUNzi6CqE2SLnESjI9yl0l/0gLebhDXVulo0BiCrti+YLpueP4eQs1B
 yYXX3PvHOXhoN4tUQ4yDF9G57To4Gw1aiQOnWOOLcbyGG1ZgyekpoRRXh6r74LKt
 FDyWT+u/xd78by1km3VzqmvKtqHnRFNMYfP+MMDIhyhy5prKCWeVo7bC+2FP+89o
 kv9+0Z0w0lkLycFfLaewZkEv0/WY8GMuT7kptHQ2Ao6ulAvG+j97sgVBFGXJjeCr
 B1OAGdeTF79IV139bCxPA62cat87Zrh15mZN+y7U32Vs2JkOqbT0LTQGKoVs/TCI
 AyHCDb8oOfGiebibnEDrDNtubz7NFCq4ntZRmuv5FJ+l2d1wl6ZvsI+DoYP7Zide
 DLR7ZtPs1Yvm27xDjs+fVmMx4nuNGikEbPZPxJro1CjLVzCEt7k=
 =elHB
 -----END PGP SIGNATURE-----

Merge tag 'for-6.4-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - handle memory allocation error in checksumming helper (reported by
   syzbot)

 - fix lockdep splat when aborting a transaction, add NOFS protection
   around invalidate_inode_pages2 that could allocate with GFP_KERNEL

 - reduce chances to hit an ENOSPC during scrub with RAID56 profiles

* tag 'for-6.4-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: use nofs when cleaning up aborted transactions
  btrfs: handle memory allocation failure in btrfs_csum_one_bio
  btrfs: scrub: try harder to mark RAID56 block groups read-only
2023-05-26 13:21:38 -07:00
Linus Torvalds
b83ac44e02 drm fixes for 6.4-rc4
core:
 - fix drmm_mutex_init lock class
 
 mgag200:
 - fix gamma lut initialisation
 
 pl111:
 - fix FB depth on IMPD-1 framebuffer
 
 amdgpu:
 - Fix missing BO unlocking in KIQ error path
 - Avoid spurious secure display error messages
 - SMU13 fix
 - Fix an OD regression
 - GPU reset display IRQ warning fix
 - MST fix
 
 radeon:
 - Fix a DP regression
 
 i915:
 - PIPEDMC disabling fix for bigjoiner config
 
 panel:
 - fix aya neo air plus quirk
 
 sched:
 - remove redundant NULL check
 
 qaic:
 - fix NNC message corruption
 - Grab ch_lock during QAIC_ATTACH_SLICE_BO
 - Flush the transfer list again
 - Validate if BO is sliced before slicing
 - Validate user data before grabbing any lock
 - initialize ret variable to 0
 - silence some uninitialized variable warnings
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmRwSkgACgkQDHTzWXnE
 hr7biw//U0n2qc0rBv+G2qdCu7zF6kuJZ83gStBRiybTm0qynUJA+wppwSyqUEDg
 EQqoIy0hMiKpcAxFQuXpLA3tk2F7UGHxEV2aaKEMhwrU+HBCDrpUqVCZjhe6DQbT
 bwj+SMvPCfrHhGfPbiGw7Vsw/qPYbC5ulrsn2th9vvwDuxL1wwtYIpJzA/xhV7eO
 sWjUKYVtuXSrI/oqZLzoZU6YCjf7G2wShkkSHTna/zjmHdWbYEiyqZ6huL70LJBD
 fZWI5w9HM8ABMMrE4W95TubMZOwDvsG7mb6G0WnS7P4Lq6Aab5CT1I6IkBB86pYz
 kSW1VMUpPIE9+E7m3OBL66L/i6xCmBZ5nEHqjsK9DYvpHX/L2IG26SdE/e6Ai29Y
 aMLxi4hM45FNfAOIB9Z36JJ9gaYCXf2hL/KAmrWovVlpT1vLkH/6DZ3nWNiNuhvT
 5PKKTM4w3zOHzv0R/xrb3dfg+0idSPlEd4+0bsjM5STAMc1pyp6PhEHtEI2HqFwM
 vkJ+RqK/lVXnDXyjgZnjbGsDaUQ3Rcw8/G3l+79j8mJcfeRYknGw/698BGPDBcpG
 9hMxxIk8ttXYqhwmTCGTy1D/xf3lPOLmdTTACokvOEyd6VxijaeI2apbV0HYVnca
 mPvSdxpKs+xx6nACSuxOG02E6Z4k85/kdNZeCmHEV9KGJACLIpw=
 =IDNm
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2023-05-26' of git://anongit.freedesktop.org/drm/drm

Pull drm fixes from Dave Airlie:
 "This week's collection is pretty spread out, accel/qaic has a bunch of
  fixes, amdgpu, then lots of single fixes across a bunch of places.

  core:
   - fix drmm_mutex_init lock class

  mgag200:
   - fix gamma lut initialisation

  pl111:
   - fix FB depth on IMPD-1 framebuffer

  amdgpu:
   - Fix missing BO unlocking in KIQ error path
   - Avoid spurious secure display error messages
   - SMU13 fix
   - Fix an OD regression
   - GPU reset display IRQ warning fix
   - MST fix

  radeon:
   - Fix a DP regression

  i915:
   - PIPEDMC disabling fix for bigjoiner config

  panel:
   - fix aya neo air plus quirk

  sched:
   - remove redundant NULL check

  qaic:
   - fix NNC message corruption
   - Grab ch_lock during QAIC_ATTACH_SLICE_BO
   - Flush the transfer list again
   - Validate if BO is sliced before slicing
   - Validate user data before grabbing any lock
   - initialize ret variable to 0
   - silence some uninitialized variable warnings"

* tag 'drm-fixes-2023-05-26' of git://anongit.freedesktop.org/drm/drm:
  drm/amd/display: Have Payload Properly Created After Resume
  drm/amd/display: Fix warning in disabling vblank irq
  drm/amd/pm: Fix output of pp_od_clk_voltage
  drm/amd/pm: add missing NotifyPowerSource message mapping for SMU13.0.7
  drm/radeon: reintroduce radeon_dp_work_func content
  drm/amdgpu: don't enable secure display on incompatible platforms
  drm:amd:amdgpu: Fix missing buffer object unlock in failure path
  accel/qaic: Fix NNC message corruption
  accel/qaic: Grab ch_lock during QAIC_ATTACH_SLICE_BO
  accel/qaic: Flush the transfer list again
  accel/qaic: Validate if BO is sliced before slicing
  accel/qaic: Validate user data before grabbing any lock
  accel/qaic: initialize ret variable to 0
  drm/i915: Fix PIPEDMC disabling for a bigjoiner configuration
  drm: fix drmm_mutex_init()
  drm/sched: Remove redundant check
  drm: panel-orientation-quirks: Change Air's quirk to support Air Plus
  accel/qaic: silence some uninitialized variable warnings
  drm/pl111: Fix FB depth on IMPD-1 framebuffer
  drm/mgag200: Fix gamma lut not initialized.
2023-05-26 13:11:41 -07:00
Linus Torvalds
47ee3f1dd9 x86: re-introduce support for ERMS copies for user space accesses
I tried to streamline our user memory copy code fairly aggressively in
commit adfcf4231b8c ("x86: don't use REP_GOOD or ERMS for user memory
copies"), in order to then be able to clean up the code and inline the
modern FSRM case in commit 577e6a7fd50d ("x86: inline the 'rep movs' in
user copies for the FSRM case").

We had reports [1] of that causing regressions earlier with blogbench,
but that turned out to be a horrible benchmark for that case, and not a
sufficient reason for re-instating "rep movsb" on older machines.

However, now Eric Dumazet reported [2] a regression in performance that
seems to be a rather more real benchmark, where due to the removal of
"rep movs" a TCP stream over a 100Gbps network no longer reaches line
speed.

And it turns out that with the simplified the calling convention for the
non-FSRM case in commit 427fda2c8a49 ("x86: improve on the non-rep
'copy_user' function"), re-introducing the ERMS case is actually fairly
simple.

Of course, that "fairly simple" is glossing over several missteps due to
having to fight our assembler alternative code.  This code really wanted
to rewrite a conditional branch to have two different targets, but that
made objtool sufficiently unhappy that this instead just ended up doing
a choice between "jump to the unrolled loop, or use 'rep movsb'
directly".

Let's see if somebody finds a case where the kernel memory copies also
care (see commit 68674f94ffc9: "x86: don't use REP_GOOD or ERMS for
small memory copies").  But Eric does argue that the user copies are
special because networking tries to copy up to 32KB at a time, if
order-3 pages allocations are possible.

In-kernel memory copies are typically small, unless they are the special
"copy pages at a time" kind that still use "rep movs".

Link: https://lore.kernel.org/lkml/202305041446.71d46724-yujie.liu@intel.com/ [1]
Link: https://lore.kernel.org/lkml/CANn89iKUbyrJ=r2+_kK+sb2ZSSHifFZ7QkPLDpAtkJ8v4WUumA@mail.gmail.com/ [2]
Reported-and-tested-by: Eric Dumazet <edumazet@google.com>
Fixes: adfcf4231b8c ("x86: don't use REP_GOOD or ERMS for user memory copies")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-05-26 12:34:20 -07:00
Jens Axboe
9491d01fbc Merge tag 'nvme-6.4-2023-05-26' of git://git.infradead.org/nvme into block-6.4
Pull NVMe fix from Keith:

"nvme fixes for 6.4

 One nvme quirk (Tatsuki)"

* tag 'nvme-6.4-2023-05-26' of git://git.infradead.org/nvme:
  NVMe: Add MAXIO 1602 to bogus nid list.
2023-05-26 09:46:01 -06:00
Tatsuki Sugiura
a3a9d63dcd NVMe: Add MAXIO 1602 to bogus nid list.
HIKSEMI FUTURE M.2 SSD uses the same dummy nguid and eui64.
I confirmed it with my two devices.

This patch marks the controller as NVME_QUIRK_BOGUS_NID.

---------------------------------------------------------
sugi@tempest:~% sudo nvme id-ctrl /dev/nvme0
NVME Identify Controller:
vid       : 0x1e4b
ssvid     : 0x1e4b
sn        : 30096022612
mn        : HS-SSD-FUTURE 2048G
fr        : SN10542
rab       : 0
ieee      : 000000
cmic      : 0
mdts      : 7
cntlid    : 0
ver       : 0x10400
rtd3r     : 0x7a120
rtd3e     : 0x1e8480
oaes      : 0x200
ctratt    : 0x2
rrls      : 0
cntrltype : 1
fguid     : 00000000-0000-0000-0000-000000000000
<snip...>
---------------------------------------------------------

---------------------------------------------------------
sugi@tempest:~% sudo nvme id-ns /dev/nvme0n1
NVME Identify Namespace 1:
<snip...>
nguid   : 00000000000000000000000000000000
eui64   : 0000000000000002
lbaf  0 : ms:0   lbads:9  rp:0 (in use)
---------------------------------------------------------

Signed-off-by: Tatsuki Sugiura <sugi@nemui.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Keith Busch <kbusch@kernel.org>
2023-05-26 08:21:50 -07:00