IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
The forward declaration for shmem_should_replace_page() and
shmem_replace_page() is unnecessary. Remove them.
Link: https://lkml.kernel.org/r/20210812120350.49801-4-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mfill_atomic_install_pte() is introduced to install pte and update mmu
cache since commit bf6ebd97aba0 ("userfaultfd/shmem: modify
shmem_mfill_atomic_pte to use install_pte()"). So we should remove
tlbflush.h as update_mmu_cache() is not called here now.
Link: https://lkml.kernel.org/r/20210812120350.49801-3-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "Cleanups for shmem".
This series contains cleanups to remove unneeded variable, header file,
function forward declaration and so on. More details can be found in the
respective changelogs.
This patch (of 4):
The local variable ret is always equal to -ENOMEM and never touched. So
remove it and return -ENOMEM directly to simplify the code.
Link: https://lkml.kernel.org/r/20210812120350.49801-1-linmiaohe@huawei.com
Link: https://lkml.kernel.org/r/20210812120350.49801-2-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Each CPU has SHMEM_INO_BATCH inodes available in `->ino_batch' which is
per-CPU. Access here is serialized by disabling preemption. If the pool
is empty, it gets reloaded from `->next_ino'. Access here is serialized
by ->stat_lock which is a spinlock_t and can not be acquired with disabled
preemption.
One way around it would make per-CPU ino_batch struct containing the inode
number a local_lock_t.
Another solution is to promote ->stat_lock to a raw_spinlock_t. The
critical sections are short. The mpol_put() must be moved outside of the
critical section to avoid invoking the destructor with disabled
preemption.
Link: https://lkml.kernel.org/r/20210806142916.jdwkb5bx62q5fwfo@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
get_kernel_page() was added in 2012 by [1]. It was used for a while for
NFS, but then in 2014, a refactoring [2] removed all callers, and it has
apparently not been used since.
Remove get_kernel_page() because it has no callers.
[1] commit 18022c5d8627 ("mm: add get_kernel_page[s] for pinning of
kernel addresses for I/O")
[2] commit 91f79c43d1b5 ("new helper: iov_iter_get_pages_alloc()")
Link: https://lkml.kernel.org/r/20210729221847.1165665-1-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Eric Paris <eparis@redhat.com>
Cc: James Morris <jmorris@namei.org>
Cc: Mike Christie <michaelc@cs.wisc.edu>
Cc: Neil Brown <neilb@suse.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: Xiaotian Feng <dfeng@redhat.com>
Cc: Mark Salter <msalter@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We had a recurring situation in which admin procedures setting up
swapfiles would race with test preparation clearing away swapfiles; and
just occasionally that got stuck on a swapfile "(deleted)" which could
never be swapped off. That is not supposed to be possible.
2.6.28 commit f9454548e17c ("don't unlink an active swapfile") admitted
that it was leaving a race window open: now close it.
may_delete() makes the IS_SWAPFILE check (amongst many others) before
inode_lock has been taken on target: now repeat just that simple check in
vfs_unlink() and vfs_rename(), after taking inode_lock.
Which goes most of the way to fixing the race, but swapon() must also
check after it acquires inode_lock, that the file just opened has not
already been unlinked.
Link: https://lkml.kernel.org/r/e17b91ad-a578-9a15-5e3-4989e0f999b5@google.com
Fixes: f9454548e17c ("don't unlink an active swapfile")
Signed-off-by: Hugh Dickins <hughd@google.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
try_get_page() is very similar to try_get_compound_head(), and in fact
try_get_page() has fallen a little behind in terms of maintenance:
try_get_compound_head() handles speculative page references more
thoroughly.
There are only two try_get_page() callsites, so just call
try_get_compound_head() directly from those, and remove try_get_page()
entirely.
Also, seeing as how this changes try_get_compound_head() into a non-static
function, provide some kerneldoc documentation for it.
Link: https://lkml.kernel.org/r/20210813044133.1536842-4-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
try_grab_page() does the same thing as try_grab_compound_head(..., refs=1,
...), just with a different API. So there is a lot of code duplication
there.
Change try_grab_page() to call try_grab_compound_head(), while keeping the
API contract identical for callers.
Also, now that try_grab_compound_head() always has a caller, remove the
__maybe_unused annotation.
Link: https://lkml.kernel.org/r/20210813044133.1536842-3-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "A few gup refactorings and documentation updates", v3.
While reviewing some of the other things going on around gup.c, I noticed
that the documentation was wrong for a few of the routines that I wrote.
And then I noticed that there was some significant code duplication too.
So this fixes those issues.
This is not entirely risk-free, but after looking closely at this, I think
it's actually a useful improvement, getting rid of the code duplication
here.
This patch (of 3):
The documentation for try_grab_compound_head() and try_grab_page() has
fallen a little out of date. Update and clarify a few points.
Also make it kerneldoc-correct, by adding @args documentation.
Link: https://lkml.kernel.org/r/20210813044133.1536842-1-jhubbard@nvidia.com
Link: https://lkml.kernel.org/r/20210813044133.1536842-2-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use helper PAGE_ALIGNED to check if address is aligned to PAGE_SIZE.
Minor readability improvement.
Link: https://lkml.kernel.org/r/20210807093620.21347-6-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When failed to try_grab_page, put_dev_pagemap() is missed. So pgmap
refcnt will leak in this case. Also we remove the check for pgmap against
NULL as it's also checked inside the put_dev_pagemap().
[akpm@linux-foundation.org: simplify, cleanup]
[akpm@linux-foundation.org: fix return value]
Link: https://lkml.kernel.org/r/20210807093620.21347-5-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Fixes: 3faa52c03f44 ("mm/gup: track FOLL_PIN pages")
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Indeed, this BUG_ON couldn't catch anything useful. We are sure ret == 0
here because we would already bail out if ret != 0 and ret is untouched
till here.
Link: https://lkml.kernel.org/r/20210807093620.21347-4-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Remove unneed local variable orig_refs since refs is unchanged now.
Link: https://lkml.kernel.org/r/20210807093620.21347-3-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "Cleanups and fixup for gup".
This series contains cleanups to remove unneeded variable, useless BUG_ON
and use helper to improve readability. Also we fix a potential pgmap
refcnt leak. More details can be found in the respective changelogs.
This patch (of 5):
Since commit a2beb5f1efed ("mm: clean up the last pieces of page fault
accountings"), the local variable major is unused. Remove it.
Link: https://lkml.kernel.org/r/20210807093620.21347-1-linmiaohe@huawei.com
Link: https://lkml.kernel.org/r/20210807093620.21347-2-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently cgroup_writeback_by_id calls mem_cgroup_wb_stats() to get dirty
pages for a memcg. However mem_cgroup_wb_stats() does a lot more than
just get the number of dirty pages. Just directly get the number of dirty
pages instead of calling mem_cgroup_wb_stats(). Also
cgroup_writeback_by_id() is only called for best-effort dirty flushing, so
remove the unused 'nr' parameter and no need to explicitly flush memcg
stats.
Link: https://lkml.kernel.org/r/20210722182627.2267368-1-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
pginodesteal is supposed to capture the impact that inode reclaim has on
the page cache state. Currently, it doesn't consider shadow pages that
get dropped this way, even though this can have a significant impact on
paging behavior, memory pressure calculations etc.
To improve visibility into these effects, make sure shadow pages get
counted when they get dropped through inode reclaim.
This changes the return value semantics of invalidate_mapping_pages()
semantics slightly, but the only two users are the inode shrinker itsel
and a usb driver that logs it for debugging purposes.
Link: https://lkml.kernel.org/r/20210614211904.14420-3-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The page cache deletion paths all have interrupts enabled, so no need to
use irqsafe/irqrestore locking variants.
They used to have irqs disabled by the memcg lock added in commit
c4843a7593a9 ("memcg: add per cgroup dirty page accounting"), but that has
since been replaced by memcg taking the page lock instead, commit
0a31bc97c80c ("mm: memcontrol: rewrite uncharge AP").
Link: https://lkml.kernel.org/r/20210614211904.14420-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We do some unlocked reads of writeback statistics like
avg_write_bandwidth, dirty_ratelimit, or bw_time_stamp. Generally we are
fine with getting somewhat out-of-date values but actually getting
different values in various parts of the functions because the compiler
decided to reload value from original memory location could confuse
calculations. Use READ_ONCE for these unlocked accesses and WRITE_ONCE
for the updates to be on the safe side.
Link: https://lkml.kernel.org/r/20210713104716.22868-5-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Michael Stapelberg <stapelberg+linux@google.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rename domain_update_bandwidth() to domain_update_dirty_limit(). The
original name is a misnomer. The function has nothing to do with a
bandwidth, it updates dirty limits.
Link: https://lkml.kernel.org/r/20210713104716.22868-4-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Michael Stapelberg <stapelberg+linux@google.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Michael Stapelberg has reported that for workload with short big spikes of
writes (GCC linker seem to trigger this frequently) the write throughput
is heavily underestimated and tends to steadily sink until it reaches
zero. This has rather bad impact on writeback throttling (causing
stalls). The problem is that writeback throughput estimate gets updated
at most once per 200 ms. One update happens early after we submit pages
for writeback (at that point writeout of only small fraction of pages is
completed and thus observed throughput is tiny). Next update happens only
during the next write spike (updates happen only from inode writeback and
dirty throttling code) and if that is more than 1s after previous spike,
we decide system was idle and just ignore whatever was written until this
moment.
Fix the problem by making sure writeback throughput estimate is also
updated shortly after writeback completes to get reasonable estimate of
throughput for spiky workloads.
[jack@suse.cz: avoid division by 0 in wb_update_dirty_ratelimit()]
Link: https://lore.kernel.org/lkml/20210617095309.3542373-1-stapelberg+linux@google.com
Link: https://lkml.kernel.org/r/20210713104716.22868-3-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Reported-by: Michael Stapelberg <stapelberg+linux@google.com>
Tested-by: Michael Stapelberg <stapelberg+linux@google.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently we trigger writeback bandwidth estimation from
balance_dirty_pages() and from wb_writeback(). However neither of these
need to trigger when the system is relatively idle and writeback is
triggered e.g. from fsync(2). Make sure writeback estimates happen
reliably by triggering them from do_writepages().
Link: https://lkml.kernel.org/r/20210713104716.22868-2-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Michael Stapelberg <stapelberg+linux@google.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "writeback: Fix bandwidth estimates", v4.
Fix estimate of writeback throughput when device is not fully busy doing
writeback. Michael Stapelberg has reported that such workload (e.g.
generated by linking) tends to push estimated throughput down to 0 and as
a result writeback on the device is practically stalled.
The first three patches fix the reported issue, the remaining two patches
are unrelated cleanups of problems I've noticed when reading the code.
This patch (of 4):
Track number of inodes under writeback for each bdi_writeback structure.
We will use this to decide whether wb does any IO and so we can estimate
its writeback throughput. In principle we could use number of pages under
writeback (WB_WRITEBACK counter) for this however normal percpu counter
reads are too inaccurate for our purposes and summing the counter is too
expensive.
Link: https://lkml.kernel.org/r/20210713104519.16394-1-jack@suse.cz
Link: https://lkml.kernel.org/r/20210713104716.22868-1-jack@suse.cz
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Michael Stapelberg <stapelberg+linux@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Print NR_KERNEL_MISC_RECLAIMABLE stat from show_free_areas() so users can
check whether the shrinker is working correctly and to show the current
memory usage.
Link: https://lkml.kernel.org/r/20210813104725.4562-1-liuhailong@oppo.com
Signed-off-by: liuhailong <liuhailong@oppo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
A recent lockdep report included these lines:
[ 96.177910] 3 locks held by containerd/770:
[ 96.177934] #0: ffff88810815ea28 (&mm->mmap_lock#2){++++}-{3:3},
at: do_user_addr_fault+0x115/0x770
[ 96.177999] #1: ffffffff82915020 (rcu_read_lock){....}-{1:2}, at:
get_swap_device+0x33/0x140
[ 96.178057] #2: ffffffff82955ba0 (fs_reclaim){+.+.}-{0:0}, at:
__fs_reclaim_acquire+0x5/0x30
While it was not useful to that bug report to know where the reclaim lock
had been acquired, it might be useful under other circumstances. Allow
the caller of __fs_reclaim_acquire to specify the instruction pointer to
use.
Link: https://lkml.kernel.org/r/20210719185709.1755149-1-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Omar Sandoval <osandov@fb.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In page table entry modifying tests, set_xxx_at() are used to populate
the page table entries. On ARM64, PG_arch_1 (PG_dcache_clean) flag is
set to the target page flag if execution permission is given. The logic
exits since commit 4f04d8f00545 ("arm64: MMU definitions"). The page
flag is kept when the page is free'd to buddy's free area list. However,
it will trigger page checking failure when it's pulled from the buddy's
free area list, as the following warning messages indicate.
BUG: Bad page state in process memhog pfn:08000
page:0000000015c0a628 refcount:0 mapcount:0 \
mapping:0000000000000000 index:0x1 pfn:0x8000
flags: 0x7ffff8000000800(arch_1|node=0|zone=0|lastcpupid=0xfffff)
raw: 07ffff8000000800 dead000000000100 dead000000000122 0000000000000000
raw: 0000000000000001 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_PREP flag(s) set
This fixes the issue by clearing PG_arch_1 through flush_dcache_page()
after set_xxx_at() is called. For architectures other than ARM64, the
unexpected overhead of cache flushing is acceptable.
Link: https://lkml.kernel.org/r/20210809092631.1888748-13-gshan@redhat.com
Fixes: a5c3b9ffb0f4 ("mm/debug_vm_pgtable: add tests validating advanced arch page table helpers")
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> [powerpc 8xx]
Tested-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> [s390]
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chunyu Hu <chuhu@redhat.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This uses struct pgtable_debug_args in PUD modifying tests. The allocated
huge page is used when set_pud_at() is used. The corresponding tests are
skipped if the huge page doesn't exist. Besides, the following unused
variables in debug_vm_pgtable() are dropped: @prot, @paddr, @pud_aligned.
Link: https://lkml.kernel.org/r/20210809092631.1888748-10-gshan@redhat.com
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> [powerpc 8xx]
Tested-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> [s390]
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chunyu Hu <chuhu@redhat.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This uses struct pgtable_debug_args in PTE modifying tests. The allocated
page is used as set_pte_at() is used there. The tests are skipped if the
allocated page doesn't exist. It's notable that args->ptep need to be
mapped before the tests. The reason why we don't map args->ptep at the
beginning is PTE entry is only mapped and accessible in atomic context
when CONFIG_HIGHPTE is enabled. So we avoid to do that so that atomic
context is only enabled if needed.
Besides, the unused variable @pte_aligned and @ptep in debug_vm_pgtable()
are dropped.
Link: https://lkml.kernel.org/r/20210809092631.1888748-8-gshan@redhat.com
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> [powerpc 8xx]
Tested-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> [s390]
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chunyu Hu <chuhu@redhat.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This uses struct pgtable_debug_args in the migration and thp test
functions. It's notable that the pre-allocated page is used in
swap_migration_tests() as set_pte_at() is used there.
Link: https://lkml.kernel.org/r/20210809092631.1888748-7-gshan@redhat.com
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> [powerpc 8xx]
Tested-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> [s390]
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chunyu Hu <chuhu@redhat.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Patch series "mm/debug_vm_pgtable: Enhancements", v6.
There are a couple of issues with current implementations and this series
tries to resolve the issues:
(a) All needed information are scattered in variables, passed to various
test functions. The code is organized in pretty much relaxed fashion.
(b) The page isn't allocated from buddy during page table entry modifying
tests. The page can be invalid, conflicting to the implementations
of set_xxx_at() on ARM64. The target page is accessed so that the
iCache can be flushed when execution permission is given on ARM64.
Besides, the target page can be unmapped and accessing to it causes
kernel crash.
"struct pgtable_debug_args" is introduced to address issue (a). For issue
(b), the used page is allocated from buddy in page table entry modifying
tests. The corresponding tets will be skipped if we fail to allocate the
(huge) page. For other test cases, the original page around to kernel
symbol (@start_kernel) is still used.
The patches are organized as below. PATCH[2-10] could be combined to one
patch, but it will make the review harder:
PATCH[1] introduces "struct pgtable_debug_args" as place holder of all
needed information. With it, the old and new implementation
can coexist.
PATCH[2-10] uses "struct pgtable_debug_args" in various test functions.
PATCH[11] removes the unused code for old implementation.
PATCH[12] fixes the issue of corrupted page flag for ARM64
This patch (of 6):
In debug_vm_pgtable(), there are many local variables introduced to track
the needed information and they are passed to the functions for various
test cases. It'd better to introduce a struct as place holder for these
information. With it, what the tests functions need is the struct. In
this way, the code is simplified and easier to be maintained.
Besides, set_xxx_at() could access the data on the corresponding pages in
the page table modifying tests. So the accessed pages in the tests should
have been allocated from buddy. Otherwise, we're accessing pages that
aren't owned by us. This causes issues like page flag corruption or
kernel crash on accessing unmapped page when CONFIG_DEBUG_PAGEALLOC is
enabled.
This introduces "struct pgtable_debug_args". The struct is initialized
and destroyed, but the information in the struct isn't used yet. It will
be used in subsequent patches.
Link: https://lkml.kernel.org/r/20210809092631.1888748-1-gshan@redhat.com
Link: https://lkml.kernel.org/r/20210809092631.1888748-2-gshan@redhat.com
Signed-off-by: Gavin Shan <gshan@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> [powerpc 8xx]
Tested-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> [s390]
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Qian Cai <cai@lca.pw>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Chunyu Hu <chuhu@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Let's also remove masking off MAP_DENYWRITE from ksys_mmap_pgoff():
the last in-tree occurrence of MAP_DENYWRITE is now in LEGACY_MAP_MASK,
which accepts the flag e.g., for MAP_SHARED_VALIDATE; however, the flag
is ignored throughout the kernel now.
Add a comment to LEGACY_MAP_MASK stating that MAP_DENYWRITE is ignored.
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
All in-tree users of MAP_DENYWRITE are gone. MAP_DENYWRITE cannot be
set from user space, so all users are gone; let's remove it.
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
'kvmalloc()' is a convenience function for people who want to do a
kmalloc() but fall back on vmalloc() if there aren't enough physically
contiguous pages, or if the allocation is larger than what kmalloc()
supports.
However, let's make sure it doesn't get _too_ easy to do crazy things
with it. In particular, don't allow big allocations that could be due
to integer overflow or underflow. So make sure the allocation size fits
in an 'int', to protect against trivial integer conversion issues.
Acked-by: Willy Tarreau <w@1wt.eu>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Fix a potential log livelock on busy filesystems when there's so much
work going on that we can't finish a quotaoff before filling up the log
by removing the ability to disable quota accounting.
- Introduce the ability to use per-CPU data structures in XFS so that
we can do a better job of maintaining CPU locality for certain
operations.
- Defer inode inactivation work to per-CPU lists, which will help us
batch that processing. Deletions of large sparse files will *appear*
to run faster, but all that means is that we've moved the work to the
backend.
- Drop the EXPERIMENTAL warnings from the y2038+ support and the inode
btree counters, since it's been nearly a year and no complaints have
come in.
- Remove more of our bespoke kmem* variants in favor of using the
standard Linux calls.
- Prepare for the addition of log incompat features in upcoming cycles
by actually adding code to support this.
- Small cleanups of the xattr code in preparation for landing support
for full logging of extended attribute updates in a future cycle.
- Replace the various log shutdown state and flag code all over xfs
with a single atomic bit flag.
- Fix a serious log recovery bug where log item replay can be skipped
based on the start lsn of a transaction even though the transaction
commit lsn is the key data point for that by enforcing start lsns to
appear in the log in the same order as commit lsns.
- Enable pipelining in the code that pushes log items to disk.
- Drop ->writepage.
- Fix some bugs in GETFSMAP where the last fsmap record reported for a
device could extend beyond the end of the device, and a separate bug
where query keys for one device could be applied to another.
- Don't let GETFSMAP query functions edit their input parameters.
- Small cleanups to the scrub code's handling of perag structures.
- Small cleanups to the incore inode tree walk code.
- Constify btree function parameters that aren't changed, so that there
will never again be confusion about range query functions changing
their input parameters.
- Standardize the format and names of tracepoint data attributes.
- Clean up all the mount state and feature flags to use wrapped bitset
functions instead of inconsistently open-coded flag checks.
- Fix some confusion between xfs_buf hash table key variable vs. block
number.
- Fix a mis-interaction with iomap where we reported shared delalloc
cow fork extents to iomap, which would cause the iomap unshare
operation to return IO errors unnecessarily.
- Fix DONTCACHE behavior.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAmEnwqcACgkQ+H93GTRK
tOtpZg/9G1RD9oDbVhKJy67bxkeLPX990dUtQFhcVjL3AMMyCJez2PBTqkQY3tL9
WDQveIF0UL5TjP5QUO2/6fncIXBmf5yXtinkfeQwkvkStb/yxs10zlpn2ZDEvJ7H
EUWwkV3cBY6Q+ftJIfXJmNW6eCcaxYs6KFiBwodbcoBxy2dIx6KFBQuqwtxOA97s
ZYfv1mPGOIg6AVJN9oxFWtF36qM8loFDNQeZj1ATfCsP25VNHbQf7YOFnJEnwLOB
rzz2zKQ3lP0hWavA6M2lX+IGymDphngx7qe4lZYcjAsh2BzL0IZf0QmFrXGQKuY/
kD0dWeStM8OHQbqCdkYx4XxcjucvJ7qmIYCtrWdpFqrrrQHygaJW6nI8LgsNTdvb
OPXpPPz58jdGY3ATaRYX/IFmpJExj655ZHUfpkeVGacBTa5KCVDykYKv1eYOfNsk
Aj+bZ4g++bx3dlGFHGsPScRn+hwg5h/+UyQJpAYupuaUsq3rpBhH/bhAJNyPUsYu
ej8LIeAWB3EPLozT4ewop8G0WWDBOe0MlYeO5gQho2AfFZzFInf15cSR62KZqx+v
XTZgITnnp0ND4wzgqAhgdU4USS9z5MtHGvhSkuYejg85R/bKirrwRu2P0n681sHv
UioiIVbXGWSAJqDQicfSjncafS3POIAUmMt4tgmDI33/3mTKwZQ=
=HPJr
-----END PGP SIGNATURE-----
Merge tag 'xfs-5.15-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs updates from Darrick Wong:
"There's a lot in this cycle.
Starting with bug fixes: To avoid livelocks between the logging code
and the quota code, we've disabled the ability of quotaoff to turn off
quota accounting. (Admins can still disable quota enforcement, but
truly turning off accounting requires a remount.) We've tried to do
this in a careful enough way that there shouldn't be any user visible
effects aside from quotaoff no longer randomly hanging the system.
We've also fixed some bugs in runtime log behavior that could trip up
log recovery if (otherwise unrelated) transactions manage to start and
commit concurrently; some bugs in the GETFSMAP ioctl where we would
incorrectly restrict the range of records output if the two xfs
devices are of different sizes; a bug that resulted in fallocate
funshare failing unnecessarily; and broken behavior in the xfs inode
cache when DONTCACHE is in play.
As for new features: we now batch inode inactivations in percpu
background threads, which sharply decreases frontend thread wait time
when performing file deletions and should improve overall directory
tree deletion times. This eliminates both the problem where closing an
unlinked file (especially on a frozen fs) can stall for a long time,
and should also ease complaints about direct reclaim bogging down on
unlinked file cleanup.
Starting with this release, we've enabled pipelining of the XFS log.
On workloads with high rates of metadata updates to different shards
of the filesystem, multiple threads can be used to format committed
log updates into log checkpoints.
Lastly, with this release, two new features have graduated to
supported status: inode btree counters (for faster mounts), and
support for dates beyond Y2038. Expect these to be enabled by default
in a future release of xfsprogs.
Summary:
- Fix a potential log livelock on busy filesystems when there's so
much work going on that we can't finish a quotaoff before filling
up the log by removing the ability to disable quota accounting.
- Introduce the ability to use per-CPU data structures in XFS so that
we can do a better job of maintaining CPU locality for certain
operations.
- Defer inode inactivation work to per-CPU lists, which will help us
batch that processing. Deletions of large sparse files will
*appear* to run faster, but all that means is that we've moved the
work to the backend.
- Drop the EXPERIMENTAL warnings from the y2038+ support and the
inode btree counters, since it's been nearly a year and no
complaints have come in.
- Remove more of our bespoke kmem* variants in favor of using the
standard Linux calls.
- Prepare for the addition of log incompat features in upcoming
cycles by actually adding code to support this.
- Small cleanups of the xattr code in preparation for landing support
for full logging of extended attribute updates in a future cycle.
- Replace the various log shutdown state and flag code all over xfs
with a single atomic bit flag.
- Fix a serious log recovery bug where log item replay can be skipped
based on the start lsn of a transaction even though the transaction
commit lsn is the key data point for that by enforcing start lsns
to appear in the log in the same order as commit lsns.
- Enable pipelining in the code that pushes log items to disk.
- Drop ->writepage.
- Fix some bugs in GETFSMAP where the last fsmap record reported for
a device could extend beyond the end of the device, and a separate
bug where query keys for one device could be applied to another.
- Don't let GETFSMAP query functions edit their input parameters.
- Small cleanups to the scrub code's handling of perag structures.
- Small cleanups to the incore inode tree walk code.
- Constify btree function parameters that aren't changed, so that
there will never again be confusion about range query functions
changing their input parameters.
- Standardize the format and names of tracepoint data attributes.
- Clean up all the mount state and feature flags to use wrapped
bitset functions instead of inconsistently open-coded flag checks.
- Fix some confusion between xfs_buf hash table key variable vs.
block number.
- Fix a mis-interaction with iomap where we reported shared delalloc
cow fork extents to iomap, which would cause the iomap unshare
operation to return IO errors unnecessarily.
- Fix DONTCACHE behavior"
* tag 'xfs-5.15-merge-6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (103 commits)
xfs: fix I_DONTCACHE
xfs: only set IOMAP_F_SHARED when providing a srcmap to a write
xfs: fix perag structure refcounting error when scrub fails
xfs: rename buffer cache index variable b_bn
xfs: convert bp->b_bn references to xfs_buf_daddr()
xfs: introduce xfs_buf_daddr()
xfs: kill xfs_sb_version_has_v3inode()
xfs: introduce xfs_sb_is_v5 helper
xfs: remove unused xfs_sb_version_has wrappers
xfs: convert xfs_sb_version_has checks to use mount features
xfs: convert scrub to use mount-based feature checks
xfs: open code sb verifier feature checks
xfs: convert xfs_fs_geometry to use mount feature checks
xfs: replace XFS_FORCED_SHUTDOWN with xfs_is_shutdown
xfs: convert remaining mount flags to state flags
xfs: convert mount flags to features
xfs: consolidate mount option features in m_features
xfs: replace xfs_sb_version checks with feature flag checks
xfs: reflect sb features in xfs_mount
xfs: rework attr2 feature and mount options
...
- Support for 32-bit tasks on asymmetric AArch32 systems (on top of the
scheduler changes merged via the tip tree).
- More entry.S clean-ups and conversion to C.
- MTE updates: allow a preferred tag checking mode to be set per CPU
(the overhead of synchronous mode is smaller for some CPUs than
others); optimisations for kernel entry/exit path; optionally disable
MTE on the kernel command line.
- Kselftest improvements for SVE and signal handling, PtrAuth.
- Fix unlikely race where a TLBI could use stale ASID on an ASID
roll-over (found by inspection).
- Miscellaneous fixes: disable trapping of PMSNEVFR_EL1 to higher
exception levels; drop unnecessary sigdelsetmask() call in the
signal32 handling; remove BUG_ON when failing to allocate SVE state
(just signal the process); SYM_CODE annotations.
- Other trivial clean-ups: use macros instead of magic numbers, remove
redundant returns, typos.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAmEuYkoACgkQa9axLQDI
XvEWVw/9HSWbccLrQ68ulaqZkL4r6lL2RqvZ2p6fkIRW7bX1JS4UJjWe3+VBg5Ed
DQ1A5cHC5ZndQ4gCRsUhcq7IMXBSj3twMzK7yxBk3zh8tbhVrIOONsKMurMw1NyM
OmoyTJ01i2ZrkDs0OU3fBlvIHPxBjKbOZqykOJHjrB2rwBSbsyUw2KvpM7ha8DOf
O7gKViDrdAhumdIL9rsMvSiIPoJLCxvqeu55c3saVu1JrUR6ENu7lMu3jt4WrfK3
m5gf76IFbgxXvlLiC8RJW7OYaXZ+COb7RA/yP/lK+Y0ug9PwqTpzXDwqvAp8nBIv
y7DK0umcBwfDWmwnRO+ZzNPjOGTHnOnjC07WNBPn3v03pMeJ8v8RnvzHkliek31P
r6uFWBxWO/O0sBbSpR+4tzgNfir0RkMajwL5pxQCEMoPCucStYQQl8zIeJeJecpT
DKIyKzfFw6O59gdhE6dCj2wXH8YmKUoSUPCAXpKGzK/oYVOGVQTZSZjIC++ydFWv
AOXz77etPidk3/Tl15Ena7fkkMkxX9UM8dTjOFS64mSWlEyzE6FtfAgm2rIEOaG7
ps6IjVzVves39SC+yry8T2L6gsxPnanRfwKKCWHkovQzNFgs5Qt51Fd5eIeI1jZ0
uEZhd19FN4136QhjWJOeXL/eyj0bv1WLX/mUln95sHnKyf4je9w=
=X6Wm
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
- Support for 32-bit tasks on asymmetric AArch32 systems (on top of the
scheduler changes merged via the tip tree).
- More entry.S clean-ups and conversion to C.
- MTE updates: allow a preferred tag checking mode to be set per CPU
(the overhead of synchronous mode is smaller for some CPUs than
others); optimisations for kernel entry/exit path; optionally disable
MTE on the kernel command line.
- Kselftest improvements for SVE and signal handling, PtrAuth.
- Fix unlikely race where a TLBI could use stale ASID on an ASID
roll-over (found by inspection).
- Miscellaneous fixes: disable trapping of PMSNEVFR_EL1 to higher
exception levels; drop unnecessary sigdelsetmask() call in the
signal32 handling; remove BUG_ON when failing to allocate SVE state
(just signal the process); SYM_CODE annotations.
- Other trivial clean-ups: use macros instead of magic numbers, remove
redundant returns, typos.
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (56 commits)
arm64: Do not trap PMSNEVFR_EL1
arm64: mm: fix comment typo of pud_offset_phys()
arm64: signal32: Drop pointless call to sigdelsetmask()
arm64/sve: Better handle failure to allocate SVE register storage
arm64: Document the requirement for SCR_EL3.HCE
arm64: head: avoid over-mapping in map_memory
arm64/sve: Add a comment documenting the binutils needed for SVE asm
arm64/sve: Add some comments for sve_save/load_state()
kselftest/arm64: signal: Add a TODO list for signal handling tests
kselftest/arm64: signal: Add test case for SVE register state in signals
kselftest/arm64: signal: Verify that signals can't change the SVE vector length
kselftest/arm64: signal: Check SVE signal frame shows expected vector length
kselftest/arm64: signal: Support signal frames with SVE register data
kselftest/arm64: signal: Add SVE to the set of features we can check for
arm64: replace in_irq() with in_hardirq()
kselftest/arm64: pac: Fix skipping of tests on systems without PAC
Documentation: arm64: describe asymmetric 32-bit support
arm64: Remove logic to kill 32-bit tasks on 64-bit-only cores
arm64: Hook up cmdline parameter to allow mismatched 32-bit EL0
arm64: Advertise CPUs capable of running 32-bit applications in sysfs
...
- Enable memcg accounting for various networking objects.
BPF:
- Introduce bpf timers.
- Add perf link and opaque bpf_cookie which the program can read
out again, to be used in libbpf-based USDT library.
- Add bpf_task_pt_regs() helper to access user space pt_regs
in kprobes, to help user space stack unwinding.
- Add support for UNIX sockets for BPF sockmap.
- Extend BPF iterator support for UNIX domain sockets.
- Allow BPF TCP congestion control progs and bpf iterators to call
bpf_setsockopt(), e.g. to switch to another congestion control
algorithm.
Protocols:
- Support IOAM Pre-allocated Trace with IPv6.
- Support Management Component Transport Protocol.
- bridge: multicast: add vlan support.
- netfilter: add hooks for the SRv6 lightweight tunnel driver.
- tcp:
- enable mid-stream window clamping (by user space or BPF)
- allow data-less, empty-cookie SYN with TFO_SERVER_COOKIE_NOT_REQD
- more accurate DSACK processing for RACK-TLP
- mptcp:
- add full mesh path manager option
- add partial support for MP_FAIL
- improve use of backup subflows
- optimize option processing
- af_unix: add OOB notification support.
- ipv6: add IFLA_INET6_RA_MTU to expose MTU value advertised by
the router.
- mac80211: Target Wake Time support in AP mode.
- can: j1939: extend UAPI to notify about RX status.
Driver APIs:
- Add page frag support in page pool API.
- Many improvements to the DSA (distributed switch) APIs.
- ethtool: extend IRQ coalesce uAPI with timer reset modes.
- devlink: control which auxiliary devices are created.
- Support CAN PHYs via the generic PHY subsystem.
- Proper cross-chip support for tag_8021q.
- Allow TX forwarding for the software bridge data path to be
offloaded to capable devices.
Drivers:
- veth: more flexible channels number configuration.
- openvswitch: introduce per-cpu upcall dispatch.
- Add internet mix (IMIX) mode to pktgen.
- Transparently handle XDP operations in the bonding driver.
- Add LiteETH network driver.
- Renesas (ravb):
- support Gigabit Ethernet IP
- NXP Ethernet switch (sja1105)
- fast aging support
- support for "H" switch topologies
- traffic termination for ports under VLAN-aware bridge
- Intel 1G Ethernet
- support getcrosststamp() with PCIe PTM (Precision Time
Measurement) for better time sync
- support Credit-Based Shaper (CBS) offload, enabling HW traffic
prioritization and bandwidth reservation
- Broadcom Ethernet (bnxt)
- support pulse-per-second output
- support larger Rx rings
- Mellanox Ethernet (mlx5)
- support ethtool RSS contexts and MQPRIO channel mode
- support LAG offload with bridging
- support devlink rate limit API
- support packet sampling on tunnels
- Huawei Ethernet (hns3):
- basic devlink support
- add extended IRQ coalescing support
- report extended link state
- Netronome Ethernet (nfp):
- add conntrack offload support
- Broadcom WiFi (brcmfmac):
- add WPA3 Personal with FT to supported cipher suites
- support 43752 SDIO device
- Intel WiFi (iwlwifi):
- support scanning hidden 6GHz networks
- support for a new hardware family (Bz)
- Xen pv driver:
- harden netfront against malicious backends
- Qualcomm mobile
- ipa: refactor power management and enable automatic suspend
- mhi: move MBIM to WWAN subsystem interfaces
Refactor:
- Ambient BPF run context and cgroup storage cleanup.
- Compat rework for ndo_ioctl.
Old code removal:
- prism54 remove the obsoleted driver, deprecated by the p54 driver.
- wan: remove sbni/granch driver.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmEukBYACgkQMUZtbf5S
IrsyHA//TO8dw18NYts4n9LmlJT2naJ7yBUUSSXK/M+DtW0MQ9nnHhqzPm5uJdRl
IgQTNJrW3dYzRwgqaWZqEwO1t5/FI+f87ND1Nsekg7x9tF66a6ov5WxU26TwwSba
U+si/inQ/4chuQ+LxMQobqCDxaLE46I2dIoRl+YfndJ24DRzYSwAEYIPPbSdfyU+
+/l+3s4GaxO4k/hLciPAiOniyxLoUNiGUTNh+2yqRBXelSRJRKVnl+V22ANFrxRW
nTEiplfVKhlPU1e4iLuRtaxDDiePHhw9I3j/lMHhfeFU2P/gKJIvz4QpGV0CAZg2
1VvDU32WEx1GQLXJbKm0KwoNRUq1QSjOyyFti+BO7ugGaYAR4gKhShOqlSYLzUtB
tbtzQhSNLWOGqgmSJOztZb5kFDm2EdRSll5/lP2uyFlPkIsIp0QbscJVzNTnS74b
Xz15ZOw41Z4TfWPEMWgfrx6Zkm7pPWkly+7WfUkPcHa1gftNz6tzXXxSXcXIBPdi
yQ5JCzzxrM5573YHuk5YedwZpn6PiAt4A/muFGk9C6aXP60TQAOS/ppaUzZdnk4D
NfOk9mj06WEULjYjPcKEuT3GGWE6kmjb8Pu0QZWKOchv7vr6oZly1EkVZqYlXELP
AfhcrFeuufie8mqm0jdb4LnYaAnqyLzlb1J4Zxh9F+/IX7G3yoc=
=JDGD
-----END PGP SIGNATURE-----
Merge tag 'net-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- Enable memcg accounting for various networking objects.
BPF:
- Introduce bpf timers.
- Add perf link and opaque bpf_cookie which the program can read out
again, to be used in libbpf-based USDT library.
- Add bpf_task_pt_regs() helper to access user space pt_regs in
kprobes, to help user space stack unwinding.
- Add support for UNIX sockets for BPF sockmap.
- Extend BPF iterator support for UNIX domain sockets.
- Allow BPF TCP congestion control progs and bpf iterators to call
bpf_setsockopt(), e.g. to switch to another congestion control
algorithm.
Protocols:
- Support IOAM Pre-allocated Trace with IPv6.
- Support Management Component Transport Protocol.
- bridge: multicast: add vlan support.
- netfilter: add hooks for the SRv6 lightweight tunnel driver.
- tcp:
- enable mid-stream window clamping (by user space or BPF)
- allow data-less, empty-cookie SYN with TFO_SERVER_COOKIE_NOT_REQD
- more accurate DSACK processing for RACK-TLP
- mptcp:
- add full mesh path manager option
- add partial support for MP_FAIL
- improve use of backup subflows
- optimize option processing
- af_unix: add OOB notification support.
- ipv6: add IFLA_INET6_RA_MTU to expose MTU value advertised by the
router.
- mac80211: Target Wake Time support in AP mode.
- can: j1939: extend UAPI to notify about RX status.
Driver APIs:
- Add page frag support in page pool API.
- Many improvements to the DSA (distributed switch) APIs.
- ethtool: extend IRQ coalesce uAPI with timer reset modes.
- devlink: control which auxiliary devices are created.
- Support CAN PHYs via the generic PHY subsystem.
- Proper cross-chip support for tag_8021q.
- Allow TX forwarding for the software bridge data path to be
offloaded to capable devices.
Drivers:
- veth: more flexible channels number configuration.
- openvswitch: introduce per-cpu upcall dispatch.
- Add internet mix (IMIX) mode to pktgen.
- Transparently handle XDP operations in the bonding driver.
- Add LiteETH network driver.
- Renesas (ravb):
- support Gigabit Ethernet IP
- NXP Ethernet switch (sja1105):
- fast aging support
- support for "H" switch topologies
- traffic termination for ports under VLAN-aware bridge
- Intel 1G Ethernet
- support getcrosststamp() with PCIe PTM (Precision Time
Measurement) for better time sync
- support Credit-Based Shaper (CBS) offload, enabling HW traffic
prioritization and bandwidth reservation
- Broadcom Ethernet (bnxt)
- support pulse-per-second output
- support larger Rx rings
- Mellanox Ethernet (mlx5)
- support ethtool RSS contexts and MQPRIO channel mode
- support LAG offload with bridging
- support devlink rate limit API
- support packet sampling on tunnels
- Huawei Ethernet (hns3):
- basic devlink support
- add extended IRQ coalescing support
- report extended link state
- Netronome Ethernet (nfp):
- add conntrack offload support
- Broadcom WiFi (brcmfmac):
- add WPA3 Personal with FT to supported cipher suites
- support 43752 SDIO device
- Intel WiFi (iwlwifi):
- support scanning hidden 6GHz networks
- support for a new hardware family (Bz)
- Xen pv driver:
- harden netfront against malicious backends
- Qualcomm mobile
- ipa: refactor power management and enable automatic suspend
- mhi: move MBIM to WWAN subsystem interfaces
Refactor:
- Ambient BPF run context and cgroup storage cleanup.
- Compat rework for ndo_ioctl.
Old code removal:
- prism54 remove the obsoleted driver, deprecated by the p54 driver.
- wan: remove sbni/granch driver"
* tag 'net-next-5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1715 commits)
net: Add depends on OF_NET for LiteX's LiteETH
ipv6: seg6: remove duplicated include
net: hns3: remove unnecessary spaces
net: hns3: add some required spaces
net: hns3: clean up a type mismatch warning
net: hns3: refine function hns3_set_default_feature()
ipv6: remove duplicated 'net/lwtunnel.h' include
net: w5100: check return value after calling platform_get_resource()
net/mlxbf_gige: Make use of devm_platform_ioremap_resourcexxx()
net: mdio: mscc-miim: Make use of the helper function devm_platform_ioremap_resource()
net: mdio-ipq4019: Make use of devm_platform_ioremap_resource()
fou: remove sparse errors
ipv4: fix endianness issue in inet_rtm_getroute_build_skb()
octeontx2-af: Set proper errorcode for IPv4 checksum errors
octeontx2-af: Fix static code analyzer reported issues
octeontx2-af: Fix mailbox errors in nix_rss_flowkey_cfg
octeontx2-af: Fix loop in free and unmap counter
af_unix: fix potential NULL deref in unix_dgram_connect()
dpaa2-eth: Replace strlcpy with strscpy
octeontx2-af: Use NDC TX for transmit packet data
...
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmEs2NIACgkQxWXV+ddt
WDsJMQ/+PJ/yXfI85mAeAzTJLWQ0zD6YO3iBhf3wOeyychWC4on435pj+zW8zR/U
/bix25ygoWF4MvGF6p0uyv4Z5mnvkZXE5lapUcJu6wXG7se1QRPH0broTh05IBXK
SnT93Eb9RexaiNFk7DVma9XkviqZ/ZISPtkJ9wYrfIba7j/U/wa+PtEFS7wk58hP
rFQXgV64xm/pcP28YYHfOkCjdyUMdJrnBUvfKOlX6d94lmYbP5lyiTL+XJEXExzN
wPakD0UsnXPr4TRvf+YRTPeFHPPUgyORII7otVUOKmGywWtcJrELX8rXFoW+6GwB
dzZIcSYXHUxU5UrtMbZgiztVBJ+bQY5juYMIrj13eYOMYkijxAqPP84iDO15+TSV
zNqyAVjUglHCGUGjhSpAxnAmtp+IJTZfVAWcvIKq3VqvJtb8tssQsk9bqFjH1xlH
qNJLE57CYe3tjw05K9y0keMh2iJWRWkXZYkgI/zjwo5nreemobpN+3fO4yneVLh7
ecdBmSl/JVSzAB1NamLOCZNGZLUqiiuTvZlJtI6ZsekrN1+4A6QzVcU/MGjSYL1v
C7W0hK0LF+e3xIBkxTKVq8noolsgbmlWacxJq8fZq9HwZy5IVJOVm9STDlCuLaIo
gPr0V0itkclcsMU0CHTyCjMsfuHYUwJZXwg93wKfJf5UCzS4OWU=
=ALO9
-----END PGP SIGNATURE-----
Merge tag 'for-5.15-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"The highlights of this round are integrations with fs-verity and
idmapped mounts, the rest is usual mix of minor improvements, speedups
and cleanups.
There are some patches outside of btrfs, namely updating some VFS
interfaces, all straightforward and acked.
Features:
- fs-verity support, using standard ioctls, backward compatible with
read-only limitation on inodes with previously enabled fs-verity
- idmapped mount support
- make mount with rescue=ibadroots more tolerant to partially damaged
trees
- allow raid0 on a single device and raid10 on two devices,
degenerate cases but might be useful as an intermediate step during
conversion to other profiles
- zoned mode block group auto reclaim can be disabled via sysfs knob
Performance improvements:
- continue readahead of node siblings even if target node is in
memory, could speed up full send (on sample test +11%)
- batching of delayed items can speed up creating many files
- fsync/tree-log speedups
- avoid unnecessary work (gains +2% throughput, -2% run time on
sample load)
- reduced lock contention on renames (on dbench +4% throughput,
up to -30% latency)
Fixes:
- various zoned mode fixes
- preemptive flushing threshold tuning, avoid excessive work on
almost full filesystems
Core:
- continued subpage support, preparation for implementing remaining
features like compression and defragmentation; with some
limitations, write is now enabled on 64K page systems with 4K
sectors, still considered experimental
- no readahead on compressed reads
- inline extents disabled
- disabled raid56 profile conversion and mount
- improved flushing logic, fixing early ENOSPC on some workloads
- inode flags have been internally split to read-only and read-write
incompat bit parts, used by fs-verity
- new tree items for fs-verity
- descriptor item
- Merkle tree item
- inode operations extended to be namespace-aware
- cleanups and refactoring
Generic code changes:
- fs: new export filemap_fdatawrite_wbc
- fs: removed sync_inode
- block: bio_trim argument type fixups
- vfs: add namespace-aware lookup"
* tag 'for-5.15-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (114 commits)
btrfs: reset replace target device to allocation state on close
btrfs: zoned: fix ordered extent boundary calculation
btrfs: do not do preemptive flushing if the majority is global rsv
btrfs: reduce the preemptive flushing threshold to 90%
btrfs: tree-log: check btrfs_lookup_data_extent return value
btrfs: avoid unnecessarily logging directories that had no changes
btrfs: allow idmapped mount
btrfs: handle ACLs on idmapped mounts
btrfs: allow idmapped INO_LOOKUP_USER ioctl
btrfs: allow idmapped SUBVOL_SETFLAGS ioctl
btrfs: allow idmapped SET_RECEIVED_SUBVOL ioctls
btrfs: relax restrictions for SNAP_DESTROY_V2 with subvolids
btrfs: allow idmapped SNAP_DESTROY ioctls
btrfs: allow idmapped SNAP_CREATE/SUBVOL_CREATE ioctls
btrfs: check whether fsgid/fsuid are mapped during subvolume creation
btrfs: allow idmapped permission inode op
btrfs: allow idmapped setattr inode op
btrfs: allow idmapped tmpfile inode op
btrfs: allow idmapped symlink inode op
btrfs: allow idmapped mkdir inode op
...
* tip/sched/arm64: (785 commits)
Documentation: arm64: describe asymmetric 32-bit support
arm64: Remove logic to kill 32-bit tasks on 64-bit-only cores
arm64: Hook up cmdline parameter to allow mismatched 32-bit EL0
arm64: Advertise CPUs capable of running 32-bit applications in sysfs
arm64: Prevent offlining first CPU with 32-bit EL0 on mismatched system
arm64: exec: Adjust affinity for compat tasks with mismatched 32-bit EL0
arm64: Implement task_cpu_possible_mask()
sched: Introduce dl_task_check_affinity() to check proposed affinity
sched: Allow task CPU affinity to be restricted on asymmetric systems
sched: Split the guts of sched_setaffinity() into a helper function
sched: Introduce task_struct::user_cpus_ptr to track requested affinity
sched: Reject CPU affinity changes based on task_cpu_possible_mask()
cpuset: Cleanup cpuset_cpus_allowed_fallback() use in select_fallback_rq()
cpuset: Honour task_cpu_possible_mask() in guarantee_online_cpus()
cpuset: Don't use the cpu_possible_mask as a last resort for cgroup v1
sched: Introduce task_cpu_possible_mask() to limit fallback rq selection
sched: Cgroup SCHED_IDLE support
sched/topology: Skip updating masks for non-online nodes
Linux 5.14-rc6
lib: use PFN_PHYS() in devmem_is_allowed()
...
- Replace get/put_online_cpus() in various places. The final removal will
happen shortly before v5.15-rc1 when the rest of the patches have been
merged.
- Add debug code to help the analysis of CPU hotplug failures
- A set of kernel doc updates
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmEsnvcTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoeydD/9b1eDTA/bLabrByKYCxOYOT3VvHPlB
ik7obOtgR3xFrvINqIsLGfGIASCUFqGZuLz801+ZJSwtrhVGzZiBztyU5ZvDwTVT
fwn+Trkis2RxbWh3T0qM+GbXJofsdbSQiO6gd/Nfn5hmSeXY/RZ118TIodCl0My9
IcYAt4u9U0E4LEfIwOMhJCesXKgrTU9mcCpSrnfPt2q6zAMMlNQE6Ty0uzGWDKaA
ejp7i7/K6SDwyPadWICriVNmE3WyiKOc8vCep96CHAIQgSOz5O+OXmeCJistRv+a
cu5yxYeMCy+sYQnixfmC6VcCdWv677/d1CQRrG2ze9kHT7i/8uoJFewp6uZvbR6g
KAufsZfYS7EaEqNWUVLiAT3cxtcjJx0lb5EL1QlaolQWNtEptXYjEd/CNuvGUt+h
YzccIVtlqrBXjsrxkmhubZZNp35QwPhdAeMspF/xJxBztQhZCwUzD4L6DlVkH7j4
hN62ezVzoWpmfbfyDB57DJF3sPOtCaeiyV9ZNKq9975/6BRWRSmrTPI1S2PGhXHU
6NMKvt/Ackn6u1CjeNxq4jDuJbnMpiFXnEGG8md8ePJiF7mdAoG3EeBR6qgmkSR9
MaR8C4hVLoDUSfXB80ef7iaIVYDdabmYCZ9JfEzmyk/j5Yt2Z1x0Mvwy63PeE+0q
Zrf5KUtrGGANyg==
=yqLB
-----END PGP SIGNATURE-----
Merge tag 'smp-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull SMP core updates from Thomas Gleixner:
- Replace get/put_online_cpus() in various places. The final removal
will happen shortly before v5.15-rc1 when the rest of the patches
have been merged.
- Add debug code to help the analysis of CPU hotplug failures
- A set of kernel doc updates
* tag 'smp-core-2021-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
mm: Replace deprecated CPU-hotplug functions.
md/raid5: Replace deprecated CPU-hotplug functions.
Documentation: Replace deprecated CPU-hotplug functions.
smp: Fix all kernel-doc warnings
cpu/hotplug: Add debug printks for hotplug callback failures
cpu/hotplug: Use DEVICE_ATTR_*() macro
cpu/hotplug: Eliminate all kernel-doc warnings
cpu/hotplug: Fix kernel doc warnings for __cpuhp_setup_state_cpuslocked()
cpu/hotplug: Fix comment typo
smpboot: Replace deprecated CPU-hotplug functions.
- Improve ftrace code patching so that stop_machine is not required anymore.
This requires a small common code patch acked by Steven Rostedt:
https://lore.kernel.org/linux-s390/20210730220741.4da6fdf6@oasis.local.home/
- Enable KCSAN for s390. This comes with a small common code change to fix a
compile warning. Acked by Marco Elver:
https://lore.kernel.org/r/20210729142811.1309391-1-hca@linux.ibm.com
- Add KFENCE support for s390. This also comes with a minimal x86 patch from
Marco Elver who said also this can be carried via the s390 tree:
https://lore.kernel.org/linux-s390/YQJdarx6XSUQ1tFZ@elver.google.com/
- More changes to prepare the decompressor for relocation.
- Enable DAT also for CPU restart path.
- Final set of register asm removal patches; leaving only three locations where
needed and sane.
- Add NNPA, Vector-Packed-Decimal-Enhancement Facility 2, PCI MIO support to
hwcaps flags.
- Cleanup hwcaps implementation.
- Add new instructions to in-kernel disassembler.
- Various QDIO cleanups.
- Add SCLP debug feature.
- Various other cleanups and improvements all over the place.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAmEs1GsACgkQIg7DeRsp
bsJ9+A/9FApCECNPgu6jOX4Ee+no+LxpCPUF8rvt56TFTLv7+Dhm7fJl0xQ9utsZ
FyLMDAr1/FKdm2wBW23QZH4vEIt1bd6e/03DwwK+6IjHKZHRIfB8eGJMsLj/TDzm
K6/+FI7qXjvpNXxgkCqXf5yESi/y5Dgr+16kTBhPZj5awRiwe5puPamji3uiQ45V
r4MdGCCC9BnTZvtPpUrr8ImnUqHJ4/TMo1YYdykLbZFuAvvYUyZ5YG5kh0pMa8JZ
DGJpfLQfy7ZNscIzdVhZtfzzESVtS6/AOeBzDMO1pbM1CGXtvpJJP0Wjlr/PGwoW
fvuMHpqTlDi+TfNZiPP5lwsFC89xSd6gtZH7vAuI8kFCXgW3RMjABF6h/mzpH1WO
jXyaSEZROc/83gxPMYyOYiqrKyAFPbpZ/Rnav2bvGQGneqx7RvmpF3GgA9WEo1PW
rMDoEbLstJuHk0E2uEV+OnQd5F7MHNonzpYfp/7pyQ+PL8w2GExV9yngVc/f3TqG
HYLC9rc3K6DkxZappcJm0qTb7lDTMFI7xK3g9RiqPQBJE1v1MYE/rai48nW69ypE
bRNL76AjyXKo+zKR2wlhJVMY1I1+DarMopHhZj6fzQT5te1LLsv8OuTU2gkt6dIq
YoSYOYvModf3HbKnJul2tszQG9yl+vpE9MiCyBQSsxIYXCriq/c=
=WDRh
-----END PGP SIGNATURE-----
Merge tag 's390-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Heiko Carstens:
- Improve ftrace code patching so that stop_machine is not required
anymore. This requires a small common code patch acked by Steven
Rostedt:
https://lore.kernel.org/linux-s390/20210730220741.4da6fdf6@oasis.local.home/
- Enable KCSAN for s390. This comes with a small common code change to
fix a compile warning. Acked by Marco Elver:
https://lore.kernel.org/r/20210729142811.1309391-1-hca@linux.ibm.com
- Add KFENCE support for s390. This also comes with a minimal x86 patch
from Marco Elver who said also this can be carried via the s390 tree:
https://lore.kernel.org/linux-s390/YQJdarx6XSUQ1tFZ@elver.google.com/
- More changes to prepare the decompressor for relocation.
- Enable DAT also for CPU restart path.
- Final set of register asm removal patches; leaving only three
locations where needed and sane.
- Add NNPA, Vector-Packed-Decimal-Enhancement Facility 2, PCI MIO
support to hwcaps flags.
- Cleanup hwcaps implementation.
- Add new instructions to in-kernel disassembler.
- Various QDIO cleanups.
- Add SCLP debug feature.
- Various other cleanups and improvements all over the place.
* tag 's390-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (105 commits)
s390: remove SCHED_CORE from defconfigs
s390/smp: do not use nodat_stack for secondary CPU start
s390/smp: enable DAT before CPU restart callback is called
s390: update defconfigs
s390/ap: fix state machine hang after failure to enable irq
KVM: s390: generate kvm hypercall functions
s390/sclp: add tracing of SCLP interactions
s390/debug: add early tracing support
s390/debug: fix debug area life cycle
s390/debug: keep debug data on resize
s390/diag: make restart_part2 a local label
s390/mm,pageattr: fix walk_pte_level() early exit
s390: fix typo in linker script
s390: remove do_signal() prototype and do_notify_resume() function
s390/crypto: fix all kernel-doc warnings in vfio_ap_ops.c
s390/pci: improve DMA translation init and exit
s390/pci: simplify CLP List PCI handling
s390/pci: handle FH state mismatch only on disable
s390/pci: fix misleading rc in clp_set_pci_fn()
s390/boot: factor out offset_vmlinux_info() function
...
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEES8DXskRxsqGE6vXTAA5oQRlWghUFAmEo3WcTHGpsYXl0b25A
a2VybmVsLm9yZwAKCRAADmhBGVaCFbnuD/9Vj3dnXRgZ9LHFuQHp5Vy5yaGfujwP
kIMN7bJiL0pCZ1zHapyn0cidZuTScDK2qMzGVR9Wrj/uYHEI+T08IEfaq/2CU3pS
PdygHgqQi+WtJj4dks1OphVdlF0+46B/zy2YY0LE39zFUDO38VUIgb/gTiQ8JHIr
BaDGtohlXmK8bDXo2XqMunSp7uQoRX92mv+bPjvIpsM60RnhPzhX0HeO/xhXO1PP
OpswQG3nOTX1alZOXuSKdH7e3zfO+pLnC5NYeo5R4rys5xRiShU19OSAJfiPwf2w
06OX7uLT7aF5LPQPjOP4MA9dHPWWeIuYnKzUpQZZ+D+zjuaxhF62saJG1Um0yQns
qqPZgTWhxzkJ16bj94T2UZW0bufFKM5cEfBPFw9ZyzQshX5jIE2V5ecV+F/AdtTj
EA55m/5N8NZMsNgnEOqn6o8TkWoV1seHKJXqja0+ZY9/Xs8GkmjHLhjNBCwGZIbw
8S64Szp+yex3m1mRS+L6T0Gudic5WdFxOKQAzvvJcmT6u+ZGBzU0FjNTNNdnu3tp
f9mANYxT1vNfYplRuNhMCp9oioSRpO2vfDQLCBRtsCy0QlKCum6qb7E/BcWRlwn1
SOAFmjeXpr+VXGX+Rjp/bzCdHlNOVbR4ODWJ+h5D7QZaQwQ6FafkEq1D2yjPR0OR
i722ECoM++vZ4w==
=CfcC
-----END PGP SIGNATURE-----
Merge tag 'locks-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux
Pull file locking updates from Jeff Layton:
"This starts with a couple of fixes for potential deadlocks in the
fowner/fasync handling.
The next patch removes the old mandatory locking code from the kernel
altogether.
The last patch cleans up rw_verify_area a bit more after the mandatory
locking removal"
* tag 'locks-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
fs: clean up after mandatory file locking support removal
fs: remove mandatory file locking support
fcntl: fix potential deadlock for &fasync_struct.fa_lock
fcntl: fix potential deadlocks for &fown_struct.lock
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmEmTZcACgkQnJ2qBz9k
QNkkmAgArW6XoF1CePds/ZaC9vfg/nk66/zVo0n+J8xXjMWAPxcKbWFfV0uWVixq
yk4lcLV47a2Mu/B/1oLNd3vrSmhwU+srWqNwOFn1nv+lP/6wJqr8oztRHn/0L9Q3
ZSRrukSejbQ6AvTL/WzTNnCjjCc2ne3Kyko6W41aU6uyJuzhSM32wbx7qlV6t54Z
iint9OrB4gM0avLohNafTUq6I+tEGzBMNwpCG/tqCmkcvDcv3rTDVAnPSCTm0Tx2
hdrYDcY/rLxo93pDBaW1rYA/fohR+mIVye6k2TjkPAL6T1x+rxeT5qnc+YijH5yF
sFPDhlD+ZsfOLi8stWXLOJ+8+gLODg==
=pDBR
-----END PGP SIGNATURE-----
Merge tag 'hole_punch_for_v5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull fs hole punching vs cache filling race fixes from Jan Kara:
"Fix races leading to possible data corruption or stale data exposure
in multiple filesystems when hole punching races with operations such
as readahead.
This is the series I was sending for the last merge window but with
your objection fixed - now filemap_fault() has been modified to take
invalidate_lock only when we need to create new page in the page cache
and / or bring it uptodate"
* tag 'hole_punch_for_v5.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
filesystems/locking: fix Malformed table warning
cifs: Fix race between hole punch and page fault
ceph: Fix race between hole punch and page fault
fuse: Convert to using invalidate_lock
f2fs: Convert to using invalidate_lock
zonefs: Convert to using invalidate_lock
xfs: Convert double locking of MMAPLOCK to use VFS helpers
xfs: Convert to use invalidate_lock
xfs: Refactor xfs_isilocked()
ext2: Convert to using invalidate_lock
ext4: Convert to use mapping->invalidate_lock
mm: Add functions to lock invalidate_lock for two mappings
mm: Protect operations adding pages to page cache with invalidate_lock
documentation: Sync file_operations members with reality
mm: Fix comments mentioning i_mutex