IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
arch/mips/pci/fixup-lantiq.c requires nothing from <linux/of_irq.h>, so
remove the include of it.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
After bbd8810d3998 ("PCI: Remove unused includes and superfluous struct
declaration"), <linux/of_pci.h> no longer includes <linux/pci.h>, which
provides the extern declarations for pcibios_plat_dev_init() and
pcibios_map_irq() via <asm/pci.h>.
This results in these new warnings:
arch/mips/pci/fixup-lantiq.c:13:5: warning: no previous prototype for 'pcibios_plat_dev_init' [-Wmissing-prototypes]
arch/mips/pci/fixup-lantiq.c:24:5: warning: no previous prototype for 'pcibios_map_irq' [-Wmissing-prototypes]
Include <linux/pci.h> directly to get these declarations.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202310070445.tzRBNYRC-lkp@intel.com/
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
After commit 411740f5422a ("KVM: MIPS/MMU: Implement KVM_CAP_SYNC_MMU")
old_pte is no longer used in kvm_mips_map_page(). So remove it to fix a
build warning about variable set but not used:
arch/mips/kvm/mmu.c: In function 'kvm_mips_map_page':
>> arch/mips/kvm/mmu.c:701:29: warning: variable 'old_pte' set but not used [-Wunused-but-set-variable]
701 | pte_t *ptep, entry, old_pte;
| ^~~~~~~
Cc: stable@vger.kernel.org
Fixes: 411740f5422a960 ("KVM: MIPS/MMU: Implement KVM_CAP_SYNC_MMU")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202310070530.aARZCSfh-lkp@intel.com/
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Architectures which don't define their own use the one in
asm-generic/bitops/lock.h. Get rid of all the ifdefs around "maybe we
don't have it".
Link: https://lkml.kernel.org/r/20231004165317.1061855-15-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Inspired by the mips test_and_change_bit(), this will surely be more
efficient than the generic one defined in filemap.c
Link: https://lkml.kernel.org/r/20231004165317.1061855-11-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
To prepare for completely separating the VGA console screen_info from
the one used in EFI/sysfb, rename the vgacon instances and make them
local as much as possible.
ia64 and arm both have confurations with vgacon and efi, but the contents
never overlaps because ia64 has no EFI framebuffer, and arm only has
vga console on legacy platforms without EFI. Renaming these is required
before the EFI screen_info can be moved into drivers/firmware.
The ia64 vga console is actually registered in two places from
setup_arch(), but one of them is wrong, so drop the one in pcdp.c and
fix the one in setup.c to use the correct conditional.
x86 has to keep them together, as the boot protocol is used to switch
between VGA text console and framebuffer through the screen_info data.
Acked-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Khalid Aziz <khalid@gonehiking.org>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20231009211845.3136536-7-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The vga console driver is fairly self-contained, and only used by
architectures that explicitly initialize the screen_info settings.
Chance every instance that picks the vga console by setting conswitchp
to call a function instead, and pass a reference to the screen_info
there.
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Khalid Azzi <khalid@gonehiking.org>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20231009211845.3136536-6-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
A number of architectures either kept the screen_info definition for
historical purposes as it used to be required by the generic VT code, or
they copied it from another architecture in order to build the VGA console
driver in an allmodconfig build. The mips definition is used by some
platforms, but the initialization on jazz is not needed.
Now that vgacon no longer builds on these architectures, remove the
stale definitions and initializations.
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Dinh Nguyen <dinguyen@kernel.org>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Acked-by: Guo Ren <guoren@kernel.org>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20231009211845.3136536-5-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On non-x86 architectures, the screen_info variable is generally only
used for the VGA console where supported, and in some cases the EFI
framebuffer or vga16fb.
Now that we have a definite list of which architectures actually use it
for what, use consistent #ifdef checks so the global variable is only
defined when it is actually used on those architectures.
Loongarch and riscv have no support for vgacon or vga16fb, but
they support EFI firmware, so only that needs to be checked, and the
initialization can be removed because that is handled by EFI.
IA64 has both vgacon and EFI, though EFI apparently never uses
a framebuffer here.
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Khalid Aziz <khalid@gonehiking.org>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20231009211845.3136536-3-arnd@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Rename the fbdev mmap helper fb_pgprotect() to pgprot_framebuffer().
The helper sets VMA page-access flags for framebuffers in device I/O
memory.
Also clean up the helper's parameters and return value. Instead of
the VMA instance, pass the individial parameters separately: existing
page-access flags, the VMAs start and end addresses and the offset
in the underlying device memory rsp file. Return the new page-access
flags. These changes align pgprot_framebuffer() with other pgprot_()
functions.
v4:
* fix commit message (Christophe)
v3:
* rename fb_pgprotect() to pgprot_framebuffer() (Arnd)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230922080636.26762-3-tzimmermann@suse.de
Recently, we found that cross-die access to pagetable pages on ARM64
machines can cause performance fluctuations in our business. Currently,
there are no PMU events available to track this situation on our ARM64
machines, so accurate pagetable accounting can help to analyze this issue,
but now the PUD level pagetable accounting is missed.
So introduce pagetable_pud_ctor/dtor() to help to get accurate PUD
pagetable accounting, as well as converting the architectures which use
generic PUD pagetable allocation to add corresponding PUD pagetable
accounting. Moreover this patch will mark the PUD level pagetable with
PG_table flag, which will help to do sanity validation in
unpoison_memory().
On my testing machine, I can see more pagetables statistics after the patch
with page-types tool:
Before patch:
flags page-count MB symbolic-flags long-symbolic-flags
0x0000000004000000 27326 106 __________________________g_________________ pgtable
After patch:
0x0000000004000000 27541 107 __________________________g_________________ pgtable
Link: https://lkml.kernel.org/r/876c71c03a7e69c17722a690e3225a4f7b172fb2.1695017383.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Acked-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
commit c35559f94ebc ("x86/shstk: Introduce map_shadow_stack syscall")
recently added support for map_shadow_stack() but it is limited to x86
only for now. There is a possibility that other architectures (namely,
arm64 and RISC-V), that are implementing equivalent support for shadow
stacks, might need to add support for it.
Independent of that, reserving arch-specific syscall numbers in the
syscall tables of all architectures is good practice and would help
avoid future conflicts. map_shadow_stack() is marked as a conditional
syscall in sys_ni.c. Adding it to the syscall tables of other
architectures is harmless and would return ENOSYS when exercised.
Note, map_shadow_stack() was assigned #453 during the merge process
since #452 was taken by fchmodat2().
For Powerpc, map it to sys_ni_syscall() as is the norm for Powerpc
syscall tables.
For Alpha, map_shadow_stack() takes up #563 as Alpha still diverges from
the common syscall numbering system in the other architectures.
Link: https://lore.kernel.org/lkml/20230515212255.GA562920@debug.ba.rivosinc.com/
Link: https://lore.kernel.org/lkml/b402b80b-a7c6-4ef0-b977-c0f5f582b78a@sirena.org.uk/
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
There are two adjacent "b" instructions, the second one is unreachable,
it is dead code, just remove it.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Rename GB-PC1 to GnuBee GB-PC1, and GB-PC2 to GnuBee GB-PC2 to include
brand and model name.
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Acked-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Each item of the resets property should define a reset. Split the item with
two resets on the ethernet node into two separate items.
Sort the items of the clocks property to the same line as a trivial change.
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Acked-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
The "probe-type" property was only needed when used with the
(long obsolete) "direct-mapped" compatible value.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Since commit 7b3415f581c7 ("MIPS: Loongson32: Remove
unused platform devices"), struct plat_ls1x_dma and plat_ls1x_nand
are unused. Then, dma.h and nand.h are useless.
Therefore, remove these useless header files.
Signed-off-by: Keguang Zhang <keguang.zhang@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Add two parameters 'low_size' and 'high' to function parse_crashkernel(),
later crashkernel=,high|low parsing will be added. Make adjustments in
all call sites of parse_crashkernel() in arch.
Link: https://lkml.kernel.org/r/20230914033142.676708-3-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Zhen Lei <thunder.leizhen@huawei.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Jiahao <chenjiahao16@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
__flush_dcache_pages() is called during hugetlb migration via
migrate_pages() -> migrate_hugetlbs() -> unmap_and_move_huge_page() ->
move_to_new_folio() -> flush_dcache_folio(). And with hugetlb and without
sparsemem vmemmap, struct page is not guaranteed to be contiguous beyond a
section. Use nth_page() instead.
Without the fix, a wrong address might be used for data cache page flush.
No bug is reported. The fix comes from code inspection.
Link: https://lkml.kernel.org/r/20230913201248.452081-6-zi.yan@sent.com
Fixes: 15fa3e8e3269 ("mips: implement the new page table range API")
Signed-off-by: Zi Yan <ziy@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Rewrite local_add_unless() as a static inline function with boolean
return value, similar to the arch_atomic_add_unless() arch fallbacks.
The function is currently unused.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230731084458.28096-1-ubizjak@gmail.com
commit 'be65de6b03aa ("fs: Remove dcookies support")' removed the
syscall definition for lookup_dcookie. However, syscall tables still
point to the old sys_lookup_dcookie() definition. Update syscall tables
of all architectures to directly point to sys_ni_syscall() instead.
Signed-off-by: Sohil Mehta <sohil.mehta@intel.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Namhyung Kim <namhyung@kernel.org> # for perf
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
While commit d4a5c59a955b ("mmc: au1xmmc: force non-modular build and
remove symbol_get usage") to be built in, it can still build a kernel
without MMC support and thuse no mmc_detect_change symbol at all.
Add ifdefs to build the mmc support code in the alchemy arch code
conditional on mmc support.
Fixes: d4a5c59a955b ("mmc: au1xmmc: force non-modular build and remove symbol_get usage")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Finish off the 'simple' futex2 syscall group by adding
sys_futex_requeue(). Unlike sys_futex_{wait,wake}() its arguments are
too numerous to fit into a regular syscall. As such, use struct
futex_waitv to pass the 'source' and 'destination' futexes to the
syscall.
This syscall implements what was previously known as FUTEX_CMP_REQUEUE
and uses {val, uaddr, flags} for source and {uaddr, flags} for
destination.
This design explicitly allows requeueing between different types of
futex by having a different flags word per uaddr.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/r/20230921105248.511860556@noisy.programming.kicks-ass.net
To complement sys_futex_waitv()/wake(), add sys_futex_wait(). This
syscall implements what was previously known as FUTEX_WAIT_BITSET
except it uses 'unsigned long' for the value and bitmask arguments,
takes timespec and clockid_t arguments for the absolute timeout and
uses FUTEX2 flags.
The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/r/20230921105248.164324363@noisy.programming.kicks-ass.net
To complement sys_futex_waitv() add sys_futex_wake(). This syscall
implements what was previously known as FUTEX_WAKE_BITSET except it
uses 'unsigned long' for the bitmask and takes FUTEX2 flags.
The 'unsigned long' allows FUTEX2_SIZE_U64 on 64bit platforms.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/r/20230921105247.936205525@noisy.programming.kicks-ass.net
* Clean up vCPU targets, always returning generic v8 as the preferred target
* Trap forwarding infrastructure for nested virtualization (used for traps
that are taken from an L2 guest and are needed by the L1 hypervisor)
* FEAT_TLBIRANGE support to only invalidate specific ranges of addresses
when collapsing a table PTE to a block PTE. This avoids that the guest
refills the TLBs again for addresses that aren't covered by the table PTE.
* Fix vPMU issues related to handling of PMUver.
* Don't unnecessary align non-stack allocations in the EL2 VA space
* Drop HCR_VIRT_EXCP_MASK, which was never used...
* Don't use smp_processor_id() in kvm_arch_vcpu_load(),
but the cpu parameter instead
* Drop redundant call to kvm_set_pfn_accessed() in user_mem_abort()
* Remove prototypes without implementations
RISC-V:
* Zba, Zbs, Zicntr, Zicsr, Zifencei, and Zihpm support for guest
* Added ONE_REG interface for SATP mode
* Added ONE_REG interface to enable/disable multiple ISA extensions
* Improved error codes returned by ONE_REG interfaces
* Added KVM_GET_REG_LIST ioctl() implementation for KVM RISC-V
* Added get-reg-list selftest for KVM RISC-V
s390:
* PV crypto passthrough enablement (Tony, Steffen, Viktor, Janosch)
Allows a PV guest to use crypto cards. Card access is governed by
the firmware and once a crypto queue is "bound" to a PV VM every
other entity (PV or not) looses access until it is not bound
anymore. Enablement is done via flags when creating the PV VM.
* Guest debug fixes (Ilya)
x86:
* Clean up KVM's handling of Intel architectural events
* Intel bugfixes
* Add support for SEV-ES DebugSwap, allowing SEV-ES guests to use debug
registers and generate/handle #DBs
* Clean up LBR virtualization code
* Fix a bug where KVM fails to set the target pCPU during an IRTE update
* Fix fatal bugs in SEV-ES intrahost migration
* Fix a bug where the recent (architecturally correct) change to reinject
#BP and skip INT3 broke SEV guests (can't decode INT3 to skip it)
* Retry APIC map recalculation if a vCPU is added/enabled
* Overhaul emergency reboot code to bring SVM up to par with VMX, tie the
"emergency disabling" behavior to KVM actually being loaded, and move all of
the logic within KVM
* Fix user triggerable WARNs in SVM where KVM incorrectly assumes the TSC
ratio MSR cannot diverge from the default when TSC scaling is disabled
up related code
* Add a framework to allow "caching" feature flags so that KVM can check if
the guest can use a feature without needing to search guest CPUID
* Rip out the ancient MMU_DEBUG crud and replace the useful bits with
CONFIG_KVM_PROVE_MMU
* Fix KVM's handling of !visible guest roots to avoid premature triple fault
injection
* Overhaul KVM's page-track APIs, and KVMGT's usage, to reduce the API surface
that is needed by external users (currently only KVMGT), and fix a variety
of issues in the process
This last item had a silly one-character bug in the topic branch that
was sent to me. Because it caused pretty bad selftest failures in
some configurations, I decided to squash in the fix. So, while the
exact commit ids haven't been in linux-next, the code has (from the
kvm-x86 tree).
Generic:
* Wrap kvm_{gfn,hva}_range.pte in a union to allow mmu_notifier events to pass
action specific data without needing to constantly update the main handlers.
* Drop unused function declarations
Selftests:
* Add testcases to x86's sync_regs_test for detecting KVM TOCTOU bugs
* Add support for printf() in guest code and covert all guest asserts to use
printf-based reporting
* Clean up the PMU event filter test and add new testcases
* Include x86 selftests in the KVM x86 MAINTAINERS entry
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmT1m0kUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroMNgggAiN7nz6UC423FznuI+yO3TLm8tkx1
CpKh5onqQogVtchH+vrngi97cfOzZb1/AtifY90OWQi31KEWhehkeofcx7G6ERhj
5a9NFADY1xGBsX4exca/VHDxhnzsbDWaWYPXw5vWFWI6erft9Mvy3tp1LwTvOzqM
v8X4aWz+g5bmo/DWJf4Wu32tEU6mnxzkrjKU14JmyqQTBawVmJ3RYvHVJ/Agpw+n
hRtPAy7FU6XTdkmq/uCT+KUHuJEIK0E/l1js47HFAqSzwdW70UDg14GGo1o4ETxu
RjZQmVNvL57yVgi6QU38/A0FWIsWQm5IlaX1Ug6x8pjZPnUKNbo9BY4T1g==
=W+4p
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm updates from Paolo Bonzini:
"ARM:
- Clean up vCPU targets, always returning generic v8 as the preferred
target
- Trap forwarding infrastructure for nested virtualization (used for
traps that are taken from an L2 guest and are needed by the L1
hypervisor)
- FEAT_TLBIRANGE support to only invalidate specific ranges of
addresses when collapsing a table PTE to a block PTE. This avoids
that the guest refills the TLBs again for addresses that aren't
covered by the table PTE.
- Fix vPMU issues related to handling of PMUver.
- Don't unnecessary align non-stack allocations in the EL2 VA space
- Drop HCR_VIRT_EXCP_MASK, which was never used...
- Don't use smp_processor_id() in kvm_arch_vcpu_load(), but the cpu
parameter instead
- Drop redundant call to kvm_set_pfn_accessed() in user_mem_abort()
- Remove prototypes without implementations
RISC-V:
- Zba, Zbs, Zicntr, Zicsr, Zifencei, and Zihpm support for guest
- Added ONE_REG interface for SATP mode
- Added ONE_REG interface to enable/disable multiple ISA extensions
- Improved error codes returned by ONE_REG interfaces
- Added KVM_GET_REG_LIST ioctl() implementation for KVM RISC-V
- Added get-reg-list selftest for KVM RISC-V
s390:
- PV crypto passthrough enablement (Tony, Steffen, Viktor, Janosch)
Allows a PV guest to use crypto cards. Card access is governed by
the firmware and once a crypto queue is "bound" to a PV VM every
other entity (PV or not) looses access until it is not bound
anymore. Enablement is done via flags when creating the PV VM.
- Guest debug fixes (Ilya)
x86:
- Clean up KVM's handling of Intel architectural events
- Intel bugfixes
- Add support for SEV-ES DebugSwap, allowing SEV-ES guests to use
debug registers and generate/handle #DBs
- Clean up LBR virtualization code
- Fix a bug where KVM fails to set the target pCPU during an IRTE
update
- Fix fatal bugs in SEV-ES intrahost migration
- Fix a bug where the recent (architecturally correct) change to
reinject #BP and skip INT3 broke SEV guests (can't decode INT3 to
skip it)
- Retry APIC map recalculation if a vCPU is added/enabled
- Overhaul emergency reboot code to bring SVM up to par with VMX, tie
the "emergency disabling" behavior to KVM actually being loaded,
and move all of the logic within KVM
- Fix user triggerable WARNs in SVM where KVM incorrectly assumes the
TSC ratio MSR cannot diverge from the default when TSC scaling is
disabled up related code
- Add a framework to allow "caching" feature flags so that KVM can
check if the guest can use a feature without needing to search
guest CPUID
- Rip out the ancient MMU_DEBUG crud and replace the useful bits with
CONFIG_KVM_PROVE_MMU
- Fix KVM's handling of !visible guest roots to avoid premature
triple fault injection
- Overhaul KVM's page-track APIs, and KVMGT's usage, to reduce the
API surface that is needed by external users (currently only
KVMGT), and fix a variety of issues in the process
Generic:
- Wrap kvm_{gfn,hva}_range.pte in a union to allow mmu_notifier
events to pass action specific data without needing to constantly
update the main handlers.
- Drop unused function declarations
Selftests:
- Add testcases to x86's sync_regs_test for detecting KVM TOCTOU bugs
- Add support for printf() in guest code and covert all guest asserts
to use printf-based reporting
- Clean up the PMU event filter test and add new testcases
- Include x86 selftests in the KVM x86 MAINTAINERS entry"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (279 commits)
KVM: x86/mmu: Include mmu.h in spte.h
KVM: x86/mmu: Use dummy root, backed by zero page, for !visible guest roots
KVM: x86/mmu: Disallow guest from using !visible slots for page tables
KVM: x86/mmu: Harden TDP MMU iteration against root w/o shadow page
KVM: x86/mmu: Harden new PGD against roots without shadow pages
KVM: x86/mmu: Add helper to convert root hpa to shadow page
drm/i915/gvt: Drop final dependencies on KVM internal details
KVM: x86/mmu: Handle KVM bookkeeping in page-track APIs, not callers
KVM: x86/mmu: Drop @slot param from exported/external page-track APIs
KVM: x86/mmu: Bug the VM if write-tracking is used but not enabled
KVM: x86/mmu: Assert that correct locks are held for page write-tracking
KVM: x86/mmu: Rename page-track APIs to reflect the new reality
KVM: x86/mmu: Drop infrastructure for multiple page-track modes
KVM: x86/mmu: Use page-track notifiers iff there are external users
KVM: x86/mmu: Move KVM-only page-track declarations to internal header
KVM: x86: Remove the unused page-track hook track_flush_slot()
drm/i915/gvt: switch from ->track_flush_slot() to ->track_remove_region()
KVM: x86: Add a new page-track hook to handle memslot deletion
drm/i915/gvt: Don't bother removing write-protection on to-be-deleted slot
KVM: x86: Reject memslot MOVE operations if KVMGT is attached
...
Here is the big set of tty and serial driver changes for 6.6-rc1.
Lots of cleanups in here this cycle, and some driver updates. Short
summary is:
- Jiri's continued work to make the tty code and apis be a bit more
sane with regards to modern kernel coding style and types
- cpm_uart driver updates
- n_gsm updates and fixes
- meson driver updates
- sc16is7xx driver updates
- 8250 driver updates for different hardware types
- qcom-geni driver fixes
- tegra serial driver change
- stm32 driver updates
- synclink_gt driver cleanups
- tty structure size reduction
All of these have been in linux-next this week with no reported issues.
The last bit of cleanups from Jiri and the tty structure size reduction
came in last week, a bit late but as they were just style changes and
size reductions, I figured they should get into this merge cycle so that
others can work on top of them with no merge conflicts.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZPH+jA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykKyACgldt6QeenTN+6dXIHS/eQHtTKZwMAn3arSeXI
QrUUnLFjOWyoX87tbMBQ
=LVw0
-----END PGP SIGNATURE-----
Merge tag 'tty-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial driver updates from Greg KH:
"Here is the big set of tty and serial driver changes for 6.6-rc1.
Lots of cleanups in here this cycle, and some driver updates. Short
summary is:
- Jiri's continued work to make the tty code and apis be a bit more
sane with regards to modern kernel coding style and types
- cpm_uart driver updates
- n_gsm updates and fixes
- meson driver updates
- sc16is7xx driver updates
- 8250 driver updates for different hardware types
- qcom-geni driver fixes
- tegra serial driver change
- stm32 driver updates
- synclink_gt driver cleanups
- tty structure size reduction
All of these have been in linux-next this week with no reported
issues. The last bit of cleanups from Jiri and the tty structure size
reduction came in last week, a bit late but as they were just style
changes and size reductions, I figured they should get into this merge
cycle so that others can work on top of them with no merge conflicts"
* tag 'tty-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (199 commits)
tty: shrink the size of struct tty_struct by 40 bytes
tty: n_tty: deduplicate copy code in n_tty_receive_buf_real_raw()
tty: n_tty: extract ECHO_OP processing to a separate function
tty: n_tty: unify counts to size_t
tty: n_tty: use u8 for chars and flags
tty: n_tty: simplify chars_in_buffer()
tty: n_tty: remove unsigned char casts from character constants
tty: n_tty: move newline handling to a separate function
tty: n_tty: move canon handling to a separate function
tty: n_tty: use MASK() for masking out size bits
tty: n_tty: make n_tty_data::num_overrun unsigned
tty: n_tty: use time_is_before_jiffies() in n_tty_receive_overrun()
tty: n_tty: use 'num' for writes' counts
tty: n_tty: use output character directly
tty: n_tty: make flow of n_tty_receive_buf_common() a bool
Revert "tty: serial: meson: Add a earlycon for the T7 SoC"
Documentation: devices.txt: Fix minors for ttyCPM*
Documentation: devices.txt: Remove ttySIOC*
Documentation: devices.txt: Remove ttyIOC*
serial: 8250_bcm7271: improve bcm7271 8250 port
...
Here is the big set of USB, Thunderbolt, and PHY driver updates for
6.6-rc1. Included in here are:
- PHY driver additions and cleanups
- Thunderbolt minor additions and fixes
- USB MIDI 2 gadget support added
- dwc3 driver updates and additions
- Removal of some old USB wireless code that was missed when that
codebase was originally removed a few years ago, cleaning up some
core USB code paths
- USB core potential use-after-free fixes that syzbot from different
people/groups keeps tripping over
- typec updates and additions
- gadget fixes and cleanups
- loads of smaller USB core and driver cleanups all over the place
Full details are in the shortlog. All of these have been in linux-next
for a while with no reported problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZPIAOQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yn80gCgybzMp0YnSildFetSC8lUJTnzjQcAn3KWzb75
Zt72jxGl4ZOXHEpozG4O
=FLrK
-----END PGP SIGNATURE-----
Merge tag 'usb-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB / Thunderbolt / PHY driver updates from Greg KH:
"Here is the big set of USB, Thunderbolt, and PHY driver updates for
6.6-rc1. Included in here are:
- PHY driver additions and cleanups
- Thunderbolt minor additions and fixes
- USB MIDI 2 gadget support added
- dwc3 driver updates and additions
- Removal of some old USB wireless code that was missed when that
codebase was originally removed a few years ago, cleaning up some
core USB code paths
- USB core potential use-after-free fixes that syzbot from different
people/groups keeps tripping over
- typec updates and additions
- gadget fixes and cleanups
- loads of smaller USB core and driver cleanups all over the place
Full details are in the shortlog. All of these have been in linux-next
for a while with no reported problems"
* tag 'usb-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (154 commits)
platform/chrome: cros_ec_typec: Configure Retimer cable type
tcpm: Avoid soft reset when partner does not support get_status
usb: typec: tcpm: reset counter when enter into unattached state after try role
usb: typec: tcpm: set initial svdm version based on pd revision
USB: serial: option: add FOXCONN T99W368/T99W373 product
USB: serial: option: add Quectel EM05G variant (0x030e)
usb: dwc2: add pci_device_id driver_data parse support
usb: gadget: remove max support speed info in bind operation
usb: gadget: composite: cleanup function config_ep_by_speed_and_alt()
usb: gadget: config: remove max speed check in usb_assign_descriptors()
usb: gadget: unconditionally allocate hs/ss descriptor in bind operation
usb: gadget: f_uvc: change endpoint allocation in uvc_function_bind()
usb: gadget: add a inline function gether_bitrate()
usb: gadget: use working speed to calcaulate network bitrate and qlen
dt-bindings: usb: samsung,exynos-dwc3: Add Exynos850 support
usb: dwc3: exynos: Add support for Exynos850 variant
usb: gadget: udc-xilinx: fix incorrect type in assignment warning
usb: gadget: udc-xilinx: fix cast from restricted __le16 warning
usb: gadget: udc-xilinx: fix restricted __le16 degrades to integer warning
USB: dwc2: hande irq on dead controller correctly
...
Convert IBT selftest to asm to fix objtool warning
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmTv1QQACgkQaDWVMHDJ
krAUwhAAn6TOwHJK8BSkHeiQhON1nrlP3c5cv0AyZ2NP8RYDrZrSZvhpYBJ6wgKC
Cx5CGq5nn9twYsYS3KsktLKDfR3lRdsQ7K9qtyFtYiaeaVKo+7gEKl/K+klwai8/
gninQWHk0zmSCja8Vi77q52WOMkQKapT8+vaON9EVDO8dVEi+CvhAIfPwMafuiwO
Rk4X86SzoZu9FP79LcCg9XyGC/XbM2OG9eNUTSCKT40qTTKm5y4gix687NvAlaHR
ko5MTsdl0Wfp6Qk0ohT74LnoA2c1g/FluvZIM33ci/2rFpkf9Hw7ip3lUXqn6CPx
rKiZ+pVRc0xikVWkraMfIGMJfUd2rhelp8OyoozD7DB7UZw40Q4RW4N5tgq9Fhe9
MQs3p1v9N8xHdRKl365UcOczUxNAmv4u0nV5gY/4FMC6VjldCl2V9fmqYXyzFS4/
Ogg4FSd7c2JyGFKPs+5uXyi+RY2qOX4+nzHOoKD7SY616IYqtgKoz5usxETLwZ6s
VtJOmJL0h//z0A7tBliB0zd+SQ5UQQBDC2XouQH2fNX2isJMn0UDmWJGjaHgK6Hh
8jVp6LNqf+CEQS387UxckOyj7fu438hDky1Ggaw4YqowEOhQeqLVO4++x+HITrbp
AupXfbJw9h9cMN63Yc0gVxXQ9IMZ+M7UxLtZ3Cd8/PVztNy/clA=
=3UUm
-----END PGP SIGNATURE-----
Merge tag 'x86_shstk_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 shadow stack support from Dave Hansen:
"This is the long awaited x86 shadow stack support, part of Intel's
Control-flow Enforcement Technology (CET).
CET consists of two related security features: shadow stacks and
indirect branch tracking. This series implements just the shadow stack
part of this feature, and just for userspace.
The main use case for shadow stack is providing protection against
return oriented programming attacks. It works by maintaining a
secondary (shadow) stack using a special memory type that has
protections against modification. When executing a CALL instruction,
the processor pushes the return address to both the normal stack and
to the special permission shadow stack. Upon RET, the processor pops
the shadow stack copy and compares it to the normal stack copy.
For more information, refer to the links below for the earlier
versions of this patch set"
Link: https://lore.kernel.org/lkml/20220130211838.8382-1-rick.p.edgecombe@intel.com/
Link: https://lore.kernel.org/lkml/20230613001108.3040476-1-rick.p.edgecombe@intel.com/
* tag 'x86_shstk_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits)
x86/shstk: Change order of __user in type
x86/ibt: Convert IBT selftest to asm
x86/shstk: Don't retry vm_munmap() on -EINTR
x86/kbuild: Fix Documentation/ reference
x86/shstk: Move arch detail comment out of core mm
x86/shstk: Add ARCH_SHSTK_STATUS
x86/shstk: Add ARCH_SHSTK_UNLOCK
x86: Add PTRACE interface for shadow stack
selftests/x86: Add shadow stack test
x86/cpufeatures: Enable CET CR4 bit for shadow stack
x86/shstk: Wire in shadow stack interface
x86: Expose thread features in /proc/$PID/status
x86/shstk: Support WRSS for userspace
x86/shstk: Introduce map_shadow_stack syscall
x86/shstk: Check that signal frame is shadow stack mem
x86/shstk: Check that SSP is aligned on sigreturn
x86/shstk: Handle signals for shadow stack
x86/shstk: Introduce routines modifying shstk
x86/shstk: Handle thread shadow stack
x86/shstk: Add user-mode shadow stack support
...
- Wrap kvm_{gfn,hva}_range.pte in a union to allow mmu_notifier events to pass
action specific data without needing to constantly update the main handlers.
- Drop unused function declarations
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmTudpYSHHNlYW5qY0Bn
b29nbGUuY29tAAoJEGCRIgFNDBL5xJUQAKnMVEV+7gRtfV5KCJFRTNAMxo4zSIt/
K6QX+x/SwUriXj4nTAlvAtju1xz4nwTYBABKj3bXEaLpVjIUIbnEzEGuTKKK6XY9
UyJKVgafwLuKLWPYN/5Zv5SCO7DmVC9W3lVMtchgt7gFcRxtZhmEn53boHhrhan0
/2L5XD6N9rd81Zmd/rQkJNRND7XY3HkvDSnfmsRI/rfFUglCUHBDp4c2Wkmz+Dnb
ux7N37si5OTbRVp18VzbLg1jalstDEm36ZQ7tLkvIbNbZV6pV93/ZmcTmsOruTeN
gHVr6/RXmKKwgO3wtZ9DKL6oMcoh20yoT+vqhbaihVssLPGPusk7S2cCQ7529u8/
Oda+w67MMdbE46N9CmB56fkpwNvn9nLCoQFhMhXBWhPJVNmorpiR6drHKqLy5zCq
lrsWGqXU/DXA2PwdsztfIIMVeALawzExHu9ayppcKwb4S8TLJhLma7dT+EvwUxuV
hoswaIT7Tq2ptZ34Fo5/vEz+90u2wi7LynHrNdTs7NLsW+WI/jab7KxKc+mf5WYh
KuMzqmmPXmWRFupFeDa61YY5PCvMddDeac/jCYL/2cr73RA8bUItivwt5mEg5nOW
9NEU+cLbl1s8g2KfxwhvodVkbhiNGf8MkVpE5skHHh9OX8HYzZUa/s6uUZO1O0eh
XOk+fa9KWabt
=n819
-----END PGP SIGNATURE-----
Merge tag 'kvm-x86-generic-6.6' of https://github.com/kvm-x86/linux into HEAD
Common KVM changes for 6.6:
- Wrap kvm_{gfn,hva}_range.pte in a union to allow mmu_notifier events to pass
action specific data without needing to constantly update the main handlers.
- Drop unused function declarations
- Add support for TLB range invalidation of Stage-2 page tables,
avoiding unnecessary invalidations. Systems that do not implement
range invalidation still rely on a full invalidation when dealing
with large ranges.
- Add infrastructure for forwarding traps taken from a L2 guest to
the L1 guest, with L0 acting as the dispatcher, another baby step
towards the full nested support.
- Simplify the way we deal with the (long deprecated) 'CPU target',
resulting in a much needed cleanup.
- Fix another set of PMU bugs, both on the guest and host sides,
as we seem to never have any shortage of those...
- Relax the alignment requirements of EL2 VA allocations for
non-stack allocations, as we were otherwise wasting a lot of that
precious VA space.
- The usual set of non-functional cleanups, although I note the lack
of spelling fixes...
-----BEGIN PGP SIGNATURE-----
iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmTsXrUPHG1hekBrZXJu
ZWwub3JnAAoJECPQ0LrRPXpDZpIQAJUM1rNEOJ8ExYRfoG1LaTfcOm5TD6D1IWlO
uCUx4xLMBudw/55HusmUSdiomQ3Xg5UdRaU7vX5OYwPbdoWebjEUfgdP3jCA/TiW
mZTMv3x9hOvp+EOS/UnS469cERvg1/KfwcdOQsWL0HsCFZnu2XmQHWPD++vovLNp
F1892ij875mC6C6mOR60H2nyjIiCuqWh/8eKBkp65CARCbFDYxWhqBnmcmTvoquh
E87pQDPdtgXc0KlOWCABh5bYOu1WGVEXE5f3ixtdY9cQakkSI3NkFKw27/mIWS4q
TCsagByNnPFDXTglb1dJopNdluLMFi1iXhRJX78R/PYaHxf4uFafWcQk1U7eDdLg
1kPANggwYe4KNAQZUvRhH7lIPWHCH0r4c1qHV+FsiOZVoDOSKHo4RW1ZFtirJSNW
LNJMdk+8xyae0S7z164EpZB/tpFttX4gl3YvUT/T+4gH8+CRFAaoAlK39CoGDPpk
f+P2GE1Z5YupF16YjpZtBnan55KkU1b6eORl5zpnAtoaz5WGXqj1t4qo0Q6e9WB9
X4rdDVhH7vRUmhjmSP6PuEygb84hnITLdGpkH2BmWj/4uYuCN+p+U2B2o/QdMJoo
cPxdflLOU/+1gfAFYPtHVjVKCqzhwbw3iLXQpO12gzRYqE13rUnAr7RuGDf5fBVC
LW7Pv81o
=DKhx
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for Linux 6.6
- Add support for TLB range invalidation of Stage-2 page tables,
avoiding unnecessary invalidations. Systems that do not implement
range invalidation still rely on a full invalidation when dealing
with large ranges.
- Add infrastructure for forwarding traps taken from a L2 guest to
the L1 guest, with L0 acting as the dispatcher, another baby step
towards the full nested support.
- Simplify the way we deal with the (long deprecated) 'CPU target',
resulting in a much needed cleanup.
- Fix another set of PMU bugs, both on the guest and host sides,
as we seem to never have any shortage of those...
- Relax the alignment requirements of EL2 VA allocations for
non-stack allocations, as we were otherwise wasting a lot of that
precious VA space.
- The usual set of non-functional cleanups, although I note the lack
of spelling fixes...
- allow dynamic sizing of the swiotlb buffer, to cater for secure
virtualization workloads that require all I/O to be bounce buffered
(Petr Tesarik)
- move a declaration to a header (Arnd Bergmann)
- check for memory region overlap in dma-contiguous (Binglei Wang)
- remove the somewhat dangerous runtime swiotlb-xen enablement and
unexport is_swiotlb_active (Christoph Hellwig, Juergen Gross)
- per-node CMA improvements (Yajun Deng)
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAmTuDHkLHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYOqvhAApMk2/ceTgVH17sXaKE822+xKvgv377O6TlggMeGG
W4zA0KD69DNz0AfaaCc5U5f7n8Ld/YY1RsvkHW4b3jgw+KRTeQr0jjitBgP5kP2M
A1+qxdyJpCTwiPt9s2+JFVPeyZ0s52V6OJODKRG3s0ore55R+U09VySKtASON+q3
GMKfWqQteKC+thg7NkrQ7JUixuo84oICws+rZn4K9ifsX2O0HYW6aMW0feRfZjJH
r0TgqZc4RdPTSaF22oapR9Ls39+7hp/pBvoLm5sBNA3cl5C3X4VWo9ERMU1jW9h+
VYQv39NycUspgskWJmpbU06/+ooYqQlwHSR/vdNusmFIvxo4tf6/UX72YO5F8Dar
ap0wYGauiEwTjSnhVxPTXk3obWyWEsgFAeRnPdTlH2CNmv38QZU2HLb8eU1pcXxX
j+WI2Ewy9z22uBVYiPOKpdW1jkSfmlmfPp/8SbAdua7I3YQ90rQN6AvU06zAi/cL
NQTgO81E4jPkygqAVgS/LeYziWAQ73yM7m9ExThtTgqFtHortwhJ4Fd8XKtvtvEb
viXAZ/WZtQBv/CIKAW98NhgIDP/SPOT8ym6V35WK+kkNFMS6LMSQUfl9GgbHGyFa
n9icMm7BmbDtT1+AKNafG9En4DtAf9M9QNidAVOyfrsIk6S0gZoZwvIStkA7on8a
cNY=
=kVVr
-----END PGP SIGNATURE-----
Merge tag 'dma-mapping-6.6-2023-08-29' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-maping updates from Christoph Hellwig:
- allow dynamic sizing of the swiotlb buffer, to cater for secure
virtualization workloads that require all I/O to be bounce buffered
(Petr Tesarik)
- move a declaration to a header (Arnd Bergmann)
- check for memory region overlap in dma-contiguous (Binglei Wang)
- remove the somewhat dangerous runtime swiotlb-xen enablement and
unexport is_swiotlb_active (Christoph Hellwig, Juergen Gross)
- per-node CMA improvements (Yajun Deng)
* tag 'dma-mapping-6.6-2023-08-29' of git://git.infradead.org/users/hch/dma-mapping:
swiotlb: optimize get_max_slots()
swiotlb: move slot allocation explanation comment where it belongs
swiotlb: search the software IO TLB only if the device makes use of it
swiotlb: allocate a new memory pool when existing pools are full
swiotlb: determine potential physical address limit
swiotlb: if swiotlb is full, fall back to a transient memory pool
swiotlb: add a flag whether SWIOTLB is allowed to grow
swiotlb: separate memory pool data from other allocator data
swiotlb: add documentation and rename swiotlb_do_find_slots()
swiotlb: make io_tlb_default_mem local to swiotlb.c
swiotlb: bail out of swiotlb_init_late() if swiotlb is already allocated
dma-contiguous: check for memory region overlap
dma-contiguous: support numa CMA for specified node
dma-contiguous: support per-numa CMA for all architectures
dma-mapping: move arch_dma_set_mask() declaration to header
swiotlb: unexport is_swiotlb_active
x86: always initialize xen-swiotlb when xen-pcifront is enabling
xen/pci: add flag for PCI passthrough being possible
Summary of the changes worth highlighting from most interesting to boring below:
* Christoph Hellwig's symbol_get() fix to Nvidia's efforts to circumvent the
protection he put in place in year 2020 to prevent proprietary modules from
using GPL only symbols, and also ensuring proprietary modules which export
symbols grandfather their taint. That was done through year 2020 commit
262e6ae7081d ("modules: inherit TAINT_PROPRIETARY_MODULE"). Christoph's new
fix is done by clarifing __symbol_get() was only ever intended to prevent
module reference loops by Linux kernel modules and so making it only find
symbols exported via EXPORT_SYMBOL_GPL(). The circumvention tactic used
by Nvidia was to use symbol_get() to purposely swift through proprietary
module symbols and completley bypass our traditional EXPORT_SYMBOL*()
annotations and community agreed upon restrictions.
A small set of preamble patches fix up a few symbols which just needed
adjusting for this on two modules, the rtc ds1685 and the networking enetc
module. Two other modules just needed some build fixing and removal of use
of __symbol_get() as they can't ever be modular, as was done by Arnd on
the ARM pxa module and Christoph did on the mmc au1xmmc driver.
This is a good reminder to us that symbol_get() is just a hack to address
things which should be fixed through Kconfig at build time as was done in
the later patches, and so ultimately it should just go.
* Extremely late minor fix for old module layout 055f23b74b20 ("module: check
for exit sections in layout_sections() instead of module_init_section()") by
James Morse for arm64. Note that this layout thing is old, it is *not*
Song Liu's commit ac3b43283923 ("module: replace module_layout with
module_memory"). The issue however is very odd to run into and so there was
no hurry to get this in fast.
* Although the fix did not go through the modules tree I'd like to highlight
the fix by Peter Zijlstra in commit 54097309620e ("x86/static_call: Fix
__static_call_fixup()") now merged in your tree which came out of what
was originally suspected to be a fallout of the the newer module layout
changes by Song Liu commit ac3b43283923 ("module: replace module_layout
with module_memory") instead of module_init_section()"). Thanks to the report
by Christian Bricart and the debugging by Song Liu & Peter that turned to
be noted as a kernel regression in place since v5.19 through commit
ee88d363d156 ("x86,static_call: Use alternative RET encoding").
I highlight this to reflect and clarify that we haven't seen more fallout
from ac3b43283923 ("module: replace module_layout with module_memory").
* RISC-V toolchain got mapping symbol support which prefix symbols with "$"
to help with alignment considerations for disassembly. This is used to
differentiate between incompatible instruction encodings when disassembling.
RISC-V just matches what ARM/AARCH64 did for alignment considerations and
Palmer Dabbelt extended is_mapping_symbol() to accept these symbols for
RISC-V. We already had support for this for all architectures but it also
checked for the second character, the RISC-V check Dabbelt added was just
for the "$". After a bit of testing and fallout on linux-next and based on
feedback from Masahiro Yamada it was decided to simplify the check and treat
the first char "$" as unique for all architectures, and so we no make
is_mapping_symbol() for all archs if the symbol starts with "$".
The most relevant commit for this for RISC-V on binutils was:
https://sourceware.org/pipermail/binutils/2021-July/117350.html
* A late fix by Andrea Righi (today) to make module zstd decompression use
vmalloc() instead of kmalloc() to account for large compressed modules. I
suspect we'll see similar things for other decompression algorithms soon.
* samples/hw_breakpoint minor fixes by Rong Tao, Arnd Bergmann and Chen Jiahao
-----BEGIN PGP SIGNATURE-----
iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmTuShISHG1jZ3JvZkBr
ZXJuZWwub3JnAAoJEM4jHQowkoin7rEQAIt9cGmkHyA6Po/Ex8DejWvSTTOQzIXk
NvtGurODghWnCejZ7Yofo1T48mvgHOenDQB9qNSkVtKDyhmWCbss6wQU/5M8Mc3A
G+9svkQ8H1BRzTwX3WJKF9KNMhI0HA0CXz3ED/I4iX/Q4Ffv3bgbAiitY6r48lJV
PSKPzwH9QMIti6k3j+bFf2SwWCV3X2jz+btdxwY34dVFyggdYgaBNKEdrumCx4nL
g0tQQxI8QgltOnwlfOPLEhdSU1yWyIWZtqtki6xksLziwTreRaw1HotgXQDpnt/S
iJY9xiKN1ChtVSprQlbTb9yhFbCEGvOYGEaKl/ZsGENQjKzRWsQ+dtT8Ww6n2Y1H
aJXwniv6SqCW7dCwdKo4sE7JFYDP56yFYKBLOPSPbMm6DJwTMbzLUf7TGNh6NKyl
3pqjGagJ+LTj3l9w5ur4zTrDGAmLzMpNR03+6niTM7C3TPOI1+wh5zGbvtoA/WdA
ytQeOTiUsi0uyVgk50f67IC6virrxwupeyZQlYFGNuEGBClgXzzzgw/MKwg0VMvc
aWhFPUOLx8/8juJ3A5qiOT+znQJ2DTqWlT+QkQ8R5qFVXEW1g9IOnhaHqDX+KB0A
OPlZ9xwss2U0Zd1XhourtqhUhvcODWNzTj3oPzjdrGiBjdENz8hPKP+7HV1CG6xy
RdxpSwu72kFu
=IQy2
-----END PGP SIGNATURE-----
Merge tag 'modules-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux
Pull modules updates from Luis Chamberlain:
"Summary of the changes worth highlighting from most interesting to
boring below:
- Christoph Hellwig's symbol_get() fix to Nvidia's efforts to
circumvent the protection he put in place in year 2020 to prevent
proprietary modules from using GPL only symbols, and also ensuring
proprietary modules which export symbols grandfather their taint.
That was done through year 2020 commit 262e6ae7081d ("modules:
inherit TAINT_PROPRIETARY_MODULE"). Christoph's new fix is done by
clarifing __symbol_get() was only ever intended to prevent module
reference loops by Linux kernel modules and so making it only find
symbols exported via EXPORT_SYMBOL_GPL(). The circumvention tactic
used by Nvidia was to use symbol_get() to purposely swift through
proprietary module symbols and completely bypass our traditional
EXPORT_SYMBOL*() annotations and community agreed upon
restrictions.
A small set of preamble patches fix up a few symbols which just
needed adjusting for this on two modules, the rtc ds1685 and the
networking enetc module. Two other modules just needed some build
fixing and removal of use of __symbol_get() as they can't ever be
modular, as was done by Arnd on the ARM pxa module and Christoph
did on the mmc au1xmmc driver.
This is a good reminder to us that symbol_get() is just a hack to
address things which should be fixed through Kconfig at build time
as was done in the later patches, and so ultimately it should just
go.
- Extremely late minor fix for old module layout 055f23b74b20
("module: check for exit sections in layout_sections() instead of
module_init_section()") by James Morse for arm64. Note that this
layout thing is old, it is *not* Song Liu's commit ac3b43283923
("module: replace module_layout with module_memory"). The issue
however is very odd to run into and so there was no hurry to get
this in fast.
- Although the fix did not go through the modules tree I'd like to
highlight the fix by Peter Zijlstra in commit 54097309620e
("x86/static_call: Fix __static_call_fixup()") now merged in your
tree which came out of what was originally suspected to be a
fallout of the the newer module layout changes by Song Liu commit
ac3b43283923 ("module: replace module_layout with module_memory")
instead of module_init_section()"). Thanks to the report by
Christian Bricart and the debugging by Song Liu & Peter that turned
to be noted as a kernel regression in place since v5.19 through
commit ee88d363d156 ("x86,static_call: Use alternative RET
encoding").
I highlight this to reflect and clarify that we haven't seen more
fallout from ac3b43283923 ("module: replace module_layout with
module_memory").
- RISC-V toolchain got mapping symbol support which prefix symbols
with "$" to help with alignment considerations for disassembly.
This is used to differentiate between incompatible instruction
encodings when disassembling. RISC-V just matches what ARM/AARCH64
did for alignment considerations and Palmer Dabbelt extended
is_mapping_symbol() to accept these symbols for RISC-V. We already
had support for this for all architectures but it also checked for
the second character, the RISC-V check Dabbelt added was just for
the "$". After a bit of testing and fallout on linux-next and based
on feedback from Masahiro Yamada it was decided to simplify the
check and treat the first char "$" as unique for all architectures,
and so we no make is_mapping_symbol() for all archs if the symbol
starts with "$".
The most relevant commit for this for RISC-V on binutils was:
https://sourceware.org/pipermail/binutils/2021-July/117350.html
- A late fix by Andrea Righi (today) to make module zstd
decompression use vmalloc() instead of kmalloc() to account for
large compressed modules. I suspect we'll see similar things for
other decompression algorithms soon.
- samples/hw_breakpoint minor fixes by Rong Tao, Arnd Bergmann and
Chen Jiahao"
* tag 'modules-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
module/decompress: use vmalloc() for zstd decompression workspace
kallsyms: Add more debug output for selftest
ARM: module: Use module_init_layout_section() to spot init sections
arm64: module: Use module_init_layout_section() to spot init sections
module: Expose module_init_layout_section()
modules: only allow symbol_get of EXPORT_SYMBOL_GPL modules
rtc: ds1685: use EXPORT_SYMBOL_GPL for ds1685_rtc_poweroff
net: enetc: use EXPORT_SYMBOL_GPL for enetc_phc_index
mmc: au1xmmc: force non-modular build and remove symbol_get usage
ARM: pxa: remove use of symbol_get()
samples/hw_breakpoint: mark sample_hbp as static
samples/hw_breakpoint: fix building without module unloading
samples/hw_breakpoint: Fix kernel BUG 'invalid opcode: 0000'
modpost, kallsyms: Treat add '$'-prefixed symbols as mapping symbols
kernel: params: Remove unnecessary ‘0’ values from err
module: Ignore RISC-V mapping symbols too
("refactor Kconfig to consolidate KEXEC and CRASH options").
- kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a
couple of macros to args.h").
- gdb feature work from Kuan-Ying Lee ("Add GDB memory helper
commands").
- vsprintf inclusion rationalization from Andy Shevchenko
("lib/vsprintf: Rework header inclusions").
- Switch the handling of kdump from a udev scheme to in-kernel handling,
by Eric DeVolder ("crash: Kernel handling of CPU and memory hot
un/plug").
- Many singleton patches to various parts of the tree
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZO2GpAAKCRDdBJ7gKXxA
juW3AQD1moHzlSN6x9I3tjm5TWWNYFoFL8af7wXDJspp/DWH/AD/TO0XlWWhhbYy
QHy7lL0Syha38kKLMXTM+bN6YQHi9AU=
=WJQa
-----END PGP SIGNATURE-----
Merge tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- An extensive rework of kexec and crash Kconfig from Eric DeVolder
("refactor Kconfig to consolidate KEXEC and CRASH options")
- kernel.h slimming work from Andy Shevchenko ("kernel.h: Split out a
couple of macros to args.h")
- gdb feature work from Kuan-Ying Lee ("Add GDB memory helper
commands")
- vsprintf inclusion rationalization from Andy Shevchenko
("lib/vsprintf: Rework header inclusions")
- Switch the handling of kdump from a udev scheme to in-kernel
handling, by Eric DeVolder ("crash: Kernel handling of CPU and memory
hot un/plug")
- Many singleton patches to various parts of the tree
* tag 'mm-nonmm-stable-2023-08-28-22-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (81 commits)
document while_each_thread(), change first_tid() to use for_each_thread()
drivers/char/mem.c: shrink character device's devlist[] array
x86/crash: optimize CPU changes
crash: change crash_prepare_elf64_headers() to for_each_possible_cpu()
crash: hotplug support for kexec_load()
x86/crash: add x86 crash hotplug support
crash: memory and CPU hotplug sysfs attributes
kexec: exclude elfcorehdr from the segment digest
crash: add generic infrastructure for crash hotplug support
crash: move a few code bits to setup support of crash hotplug
kstrtox: consistently use _tolower()
kill do_each_thread()
nilfs2: fix WARNING in mark_buffer_dirty due to discarded buffer reuse
scripts/bloat-o-meter: count weak symbol sizes
treewide: drop CONFIG_EMBEDDED
lockdep: fix static memory detection even more
lib/vsprintf: declare no_hash_pointers in sprintf.h
lib/vsprintf: split out sprintf() and friends
kernel/fork: stop playing lockless games for exe_file replacement
adfs: delete unused "union adfs_dirtail" definition
...
- Peter Xu has a series (mm/gup: Unify hugetlb, speed up thp") which
reduces the special-case code for handling hugetlb pages in GUP. It
also speeds up GUP handling of transparent hugepages.
- Peng Zhang provides some maple tree speedups ("Optimize the fast path
of mas_store()").
- Sergey Senozhatsky has improved te performance of zsmalloc during
compaction (zsmalloc: small compaction improvements").
- Domenico Cerasuolo has developed additional selftest code for zswap
("selftests: cgroup: add zswap test program").
- xu xin has doe some work on KSM's handling of zero pages. These
changes are mainly to enable the user to better understand the
effectiveness of KSM's treatment of zero pages ("ksm: support tracking
KSM-placed zero-pages").
- Jeff Xu has fixes the behaviour of memfd's
MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED sysctl ("mm/memfd: fix sysctl
MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED").
- David Howells has fixed an fscache optimization ("mm, netfs, fscache:
Stop read optimisation when folio removed from pagecache").
- Axel Rasmussen has given userfaultfd the ability to simulate memory
poisoning ("add UFFDIO_POISON to simulate memory poisoning with UFFD").
- Miaohe Lin has contributed some routine maintenance work on the
memory-failure code ("mm: memory-failure: remove unneeded PageHuge()
check").
- Peng Zhang has contributed some maintenance work on the maple tree
code ("Improve the validation for maple tree and some cleanup").
- Hugh Dickins has optimized the collapsing of shmem or file pages into
THPs ("mm: free retracted page table by RCU").
- Jiaqi Yan has a patch series which permits us to use the healthy
subpages within a hardware poisoned huge page for general purposes
("Improve hugetlbfs read on HWPOISON hugepages").
- Kemeng Shi has done some maintenance work on the pagetable-check code
("Remove unused parameters in page_table_check").
- More folioification work from Matthew Wilcox ("More filesystem folio
conversions for 6.6"), ("Followup folio conversions for zswap"). And
from ZhangPeng ("Convert several functions in page_io.c to use a
folio").
- page_ext cleanups from Kemeng Shi ("minor cleanups for page_ext").
- Baoquan He has converted some architectures to use the GENERIC_IOREMAP
ioremap()/iounmap() code ("mm: ioremap: Convert architectures to take
GENERIC_IOREMAP way").
- Anshuman Khandual has optimized arm64 tlb shootdown ("arm64: support
batched/deferred tlb shootdown during page reclamation/migration").
- Better maple tree lockdep checking from Liam Howlett ("More strict
maple tree lockdep"). Liam also developed some efficiency improvements
("Reduce preallocations for maple tree").
- Cleanup and optimization to the secondary IOMMU TLB invalidation, from
Alistair Popple ("Invalidate secondary IOMMU TLB on permission
upgrade").
- Ryan Roberts fixes some arm64 MM selftest issues ("selftests/mm fixes
for arm64").
- Kemeng Shi provides some maintenance work on the compaction code ("Two
minor cleanups for compaction").
- Some reduction in mmap_lock pressure from Matthew Wilcox ("Handle most
file-backed faults under the VMA lock").
- Aneesh Kumar contributes code to use the vmemmap optimization for DAX
on ppc64, under some circumstances ("Add support for DAX vmemmap
optimization for ppc64").
- page-ext cleanups from Kemeng Shi ("add page_ext_data to get client
data in page_ext"), ("minor cleanups to page_ext header").
- Some zswap cleanups from Johannes Weiner ("mm: zswap: three
cleanups").
- kmsan cleanups from ZhangPeng ("minor cleanups for kmsan").
- VMA handling cleanups from Kefeng Wang ("mm: convert to
vma_is_initial_heap/stack()").
- DAMON feature work from SeongJae Park ("mm/damon/sysfs-schemes:
implement DAMOS tried total bytes file"), ("Extend DAMOS filters for
address ranges and DAMON monitoring targets").
- Compaction work from Kemeng Shi ("Fixes and cleanups to compaction").
- Liam Howlett has improved the maple tree node replacement code
("maple_tree: Change replacement strategy").
- ZhangPeng has a general code cleanup - use the K() macro more widely
("cleanup with helper macro K()").
- Aneesh Kumar brings memmap-on-memory to ppc64 ("Add support for memmap
on memory feature on ppc64").
- pagealloc cleanups from Kemeng Shi ("Two minor cleanups for pcp list
in page_alloc"), ("Two minor cleanups for get pageblock migratetype").
- Vishal Moola introduces a memory descriptor for page table tracking,
"struct ptdesc" ("Split ptdesc from struct page").
- memfd selftest maintenance work from Aleksa Sarai ("memfd: cleanups
for vm.memfd_noexec").
- MM include file rationalization from Hugh Dickins ("arch: include
asm/cacheflush.h in asm/hugetlb.h").
- THP debug output fixes from Hugh Dickins ("mm,thp: fix sloppy text
output").
- kmemleak improvements from Xiaolei Wang ("mm/kmemleak: use
object_cache instead of kmemleak_initialized").
- More folio-related cleanups from Matthew Wilcox ("Remove _folio_dtor
and _folio_order").
- A VMA locking scalability improvement from Suren Baghdasaryan
("Per-VMA lock support for swap and userfaults").
- pagetable handling cleanups from Matthew Wilcox ("New page table range
API").
- A batch of swap/thp cleanups from David Hildenbrand ("mm/swap: stop
using page->private on tail pages for THP_SWAP + cleanups").
- Cleanups and speedups to the hugetlb fault handling from Matthew
Wilcox ("Change calling convention for ->huge_fault").
- Matthew Wilcox has also done some maintenance work on the MM subsystem
documentation ("Improve mm documentation").
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZO1JUQAKCRDdBJ7gKXxA
jrMwAP47r/fS8vAVT3zp/7fXmxaJYTK27CTAM881Gw1SDhFM/wEAv8o84mDenCg6
Nfio7afS1ncD+hPYT8947UnLxTgn+ww=
=Afws
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Some swap cleanups from Ma Wupeng ("fix WARN_ON in
add_to_avail_list")
- Peter Xu has a series (mm/gup: Unify hugetlb, speed up thp") which
reduces the special-case code for handling hugetlb pages in GUP. It
also speeds up GUP handling of transparent hugepages.
- Peng Zhang provides some maple tree speedups ("Optimize the fast path
of mas_store()").
- Sergey Senozhatsky has improved te performance of zsmalloc during
compaction (zsmalloc: small compaction improvements").
- Domenico Cerasuolo has developed additional selftest code for zswap
("selftests: cgroup: add zswap test program").
- xu xin has doe some work on KSM's handling of zero pages. These
changes are mainly to enable the user to better understand the
effectiveness of KSM's treatment of zero pages ("ksm: support
tracking KSM-placed zero-pages").
- Jeff Xu has fixes the behaviour of memfd's
MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED sysctl ("mm/memfd: fix sysctl
MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED").
- David Howells has fixed an fscache optimization ("mm, netfs, fscache:
Stop read optimisation when folio removed from pagecache").
- Axel Rasmussen has given userfaultfd the ability to simulate memory
poisoning ("add UFFDIO_POISON to simulate memory poisoning with
UFFD").
- Miaohe Lin has contributed some routine maintenance work on the
memory-failure code ("mm: memory-failure: remove unneeded PageHuge()
check").
- Peng Zhang has contributed some maintenance work on the maple tree
code ("Improve the validation for maple tree and some cleanup").
- Hugh Dickins has optimized the collapsing of shmem or file pages into
THPs ("mm: free retracted page table by RCU").
- Jiaqi Yan has a patch series which permits us to use the healthy
subpages within a hardware poisoned huge page for general purposes
("Improve hugetlbfs read on HWPOISON hugepages").
- Kemeng Shi has done some maintenance work on the pagetable-check code
("Remove unused parameters in page_table_check").
- More folioification work from Matthew Wilcox ("More filesystem folio
conversions for 6.6"), ("Followup folio conversions for zswap"). And
from ZhangPeng ("Convert several functions in page_io.c to use a
folio").
- page_ext cleanups from Kemeng Shi ("minor cleanups for page_ext").
- Baoquan He has converted some architectures to use the
GENERIC_IOREMAP ioremap()/iounmap() code ("mm: ioremap: Convert
architectures to take GENERIC_IOREMAP way").
- Anshuman Khandual has optimized arm64 tlb shootdown ("arm64: support
batched/deferred tlb shootdown during page reclamation/migration").
- Better maple tree lockdep checking from Liam Howlett ("More strict
maple tree lockdep"). Liam also developed some efficiency
improvements ("Reduce preallocations for maple tree").
- Cleanup and optimization to the secondary IOMMU TLB invalidation,
from Alistair Popple ("Invalidate secondary IOMMU TLB on permission
upgrade").
- Ryan Roberts fixes some arm64 MM selftest issues ("selftests/mm fixes
for arm64").
- Kemeng Shi provides some maintenance work on the compaction code
("Two minor cleanups for compaction").
- Some reduction in mmap_lock pressure from Matthew Wilcox ("Handle
most file-backed faults under the VMA lock").
- Aneesh Kumar contributes code to use the vmemmap optimization for DAX
on ppc64, under some circumstances ("Add support for DAX vmemmap
optimization for ppc64").
- page-ext cleanups from Kemeng Shi ("add page_ext_data to get client
data in page_ext"), ("minor cleanups to page_ext header").
- Some zswap cleanups from Johannes Weiner ("mm: zswap: three
cleanups").
- kmsan cleanups from ZhangPeng ("minor cleanups for kmsan").
- VMA handling cleanups from Kefeng Wang ("mm: convert to
vma_is_initial_heap/stack()").
- DAMON feature work from SeongJae Park ("mm/damon/sysfs-schemes:
implement DAMOS tried total bytes file"), ("Extend DAMOS filters for
address ranges and DAMON monitoring targets").
- Compaction work from Kemeng Shi ("Fixes and cleanups to compaction").
- Liam Howlett has improved the maple tree node replacement code
("maple_tree: Change replacement strategy").
- ZhangPeng has a general code cleanup - use the K() macro more widely
("cleanup with helper macro K()").
- Aneesh Kumar brings memmap-on-memory to ppc64 ("Add support for
memmap on memory feature on ppc64").
- pagealloc cleanups from Kemeng Shi ("Two minor cleanups for pcp list
in page_alloc"), ("Two minor cleanups for get pageblock
migratetype").
- Vishal Moola introduces a memory descriptor for page table tracking,
"struct ptdesc" ("Split ptdesc from struct page").
- memfd selftest maintenance work from Aleksa Sarai ("memfd: cleanups
for vm.memfd_noexec").
- MM include file rationalization from Hugh Dickins ("arch: include
asm/cacheflush.h in asm/hugetlb.h").
- THP debug output fixes from Hugh Dickins ("mm,thp: fix sloppy text
output").
- kmemleak improvements from Xiaolei Wang ("mm/kmemleak: use
object_cache instead of kmemleak_initialized").
- More folio-related cleanups from Matthew Wilcox ("Remove _folio_dtor
and _folio_order").
- A VMA locking scalability improvement from Suren Baghdasaryan
("Per-VMA lock support for swap and userfaults").
- pagetable handling cleanups from Matthew Wilcox ("New page table
range API").
- A batch of swap/thp cleanups from David Hildenbrand ("mm/swap: stop
using page->private on tail pages for THP_SWAP + cleanups").
- Cleanups and speedups to the hugetlb fault handling from Matthew
Wilcox ("Change calling convention for ->huge_fault").
- Matthew Wilcox has also done some maintenance work on the MM
subsystem documentation ("Improve mm documentation").
* tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (489 commits)
maple_tree: shrink struct maple_tree
maple_tree: clean up mas_wr_append()
secretmem: convert page_is_secretmem() to folio_is_secretmem()
nios2: fix flush_dcache_page() for usage from irq context
hugetlb: add documentation for vma_kernel_pagesize()
mm: add orphaned kernel-doc to the rst files.
mm: fix clean_record_shared_mapping_range kernel-doc
mm: fix get_mctgt_type() kernel-doc
mm: fix kernel-doc warning from tlb_flush_rmaps()
mm: remove enum page_entry_size
mm: allow ->huge_fault() to be called without the mmap_lock held
mm: move PMD_ORDER to pgtable.h
mm: remove checks for pte_index
memcg: remove duplication detection for mem_cgroup_uncharge_swap
mm/huge_memory: work on folio->swap instead of page->private when splitting folio
mm/swap: inline folio_set_swap_entry() and folio_swap_entry()
mm/swap: use dedicated entry for swap in folio
mm/swap: stop using page->private on tail pages for THP_SWAP
selftests/mm: fix WARNING comparing pointer to 0
selftests: cgroup: fix test_kmem_memcg_deletion kernel mem check
...
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZOXT7QAKCRCRxhvAZXjc
ort3AP0VIK/oJk5skgjpinQrCfvtVz0XOtawuBtn0f1weIfb6AD9Hg1rqOKnQD5z
dkvn3xaEr3gPOVzqU5SvFwVoCM0cMwA=
=24Ha
-----END PGP SIGNATURE-----
Merge tag 'v6.6-vfs.fchmodat2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull fchmodat2 system call from Christian Brauner:
"This adds the fchmodat2() system call. It is a revised version of the
fchmodat() system call, adding a missing flag argument. Support for
both AT_SYMLINK_NOFOLLOW and AT_EMPTY_PATH are included.
Adding this system call revision has been a longstanding request but
so far has always fallen through the cracks. While the kernel
implementation of fchmodat() does not have a flag argument the libc
provided POSIX-compliant fchmodat(3) version does. Both glibc and musl
have to implement a workaround in order to support AT_SYMLINK_NOFOLLOW
(see [1] and [2]).
The workaround is brittle because it relies not just on O_PATH and
O_NOFOLLOW semantics and procfs magic links but also on our rather
inconsistent symlink semantics.
This gives userspace a proper fchmodat2() system call that libcs can
use to properly implement fchmodat(3) and allows them to get rid of
their hacks. In this case it will immediately benefit them as the
current workaround is already defunct because of aformentioned
inconsistencies.
In addition to AT_SYMLINK_NOFOLLOW, give userspace the ability to use
AT_EMPTY_PATH with fchmodat2(). This is already possible with
fchownat() so there's no reason to not also support it for
fchmodat2().
The implementation is simple and comes with selftests. Implementation
of the system call and wiring up the system call are done as separate
patches even though they could arguably be one patch. But in case
there are merge conflicts from other system call additions it can be
beneficial to have separate patches"
Link: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35 [1]
Link: https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28 [2]
* tag 'v6.6-vfs.fchmodat2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
selftests: fchmodat2: remove duplicate unneeded defines
fchmodat2: add support for AT_EMPTY_PATH
selftests: Add fchmodat2 selftest
arch: Register fchmodat2, usually as syscall 452
fs: Add fchmodat2()
Non-functional cleanup of a "__user * filename"
Instead of if conditions with line splits, use the usual error handling
pattern with a separate variable to improve readability.
The second check can use reverse logic which reduces indentation level.
No functional changes intended.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Move the default (no-op) implementation of flush_icache_pages() to
<linux/cacheflush.h> from <asm-generic/cacheflush.h>. Remove the
flush_icache_page() wrapper from each architecture into
<linux/cacheflush.h>.
Link: https://lkml.kernel.org/r/20230802151406.3735276-32-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Rename _PFN_SHIFT to PFN_PTE_SHIFT. Convert a few places
to call set_pte() instead of set_pte_at(). Add set_ptes(),
update_mmu_cache_range(), flush_icache_pages() and flush_dcache_folio().
Change the PG_arch_1 (aka PG_dcache_dirty) flag from being per-page
to per-folio.
Link: https://lkml.kernel.org/r/20230802151406.3735276-18-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Drop CONFIG_IP_NF_TARGET_CLUSTERIP from any remaining mips defconfigs as it
was removed in commit 9db5d918e2c0 ("netfilter: ip_tables: remove clusterip
target").
Signed-off-by: Trevor Woerner <twoerner@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
ld.lld 16.0.0 and newer defaults to '--no-undefined-version', which
causes the following error when CONFIG_MIPS_CLOCK_VSYSCALL is not set:
ld.lld: error: version script assignment of 'LINUX_2.6' to symbol '__vdso_gettimeofday' failed: symbol not defined
Only export __vdso_gettimeofday() when it will be present in the final
object file, which clears up the error.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202308170532.zxFFv25c-lkp@intel.com/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Fangrui Song <maskray@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
There is only one Kconfig user of CONFIG_EMBEDDED and it can be switched
to EXPERT or "if !ARCH_MULTIPLATFORM" (suggested by Arnd).
Link: https://lkml.kernel.org/r/20230816055010.31534-1-rdunlap@infradead.org
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com> [RISC-V]
Acked-by: Greg Ungerer <gerg@linux-m68k.org>
Acked-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc]
Cc: Russell King <linux@armlinux.org.uk>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Brian Cain <bcain@quicinc.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
As part of the conversions to replace pgtable constructor/destructors with
ptdesc equivalents, convert various page table functions to use ptdescs.
Some of the functions use the *get*page*() helper functions. Convert
these to use pagetable_alloc() and ptdesc_address() instead to help
standardize page tables further.
Link: https://lkml.kernel.org/r/20230807230513.102486-24-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Guo Ren <guoren@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The APIs that allow backtracing across CPUs have always had a way to
exclude the current CPU. This convenience means callers didn't need to
find a place to allocate a CPU mask just to handle the common case.
Let's extend the API to take a CPU ID to exclude instead of just a
boolean. This isn't any more complex for the API to handle and allows the
hardlockup detector to exclude a different CPU (the one it already did a
trace for) without needing to find space for a CPU mask.
Arguably, this new API also encourages safer behavior. Specifically if
the caller wants to avoid tracing the current CPU (maybe because they
already traced the current CPU) this makes it more obvious to the caller
that they need to make sure that the current CPU ID can't change.
[akpm@linux-foundation.org: fix trigger_allbutcpu_cpu_backtrace() stub]
Link: https://lkml.kernel.org/r/20230804065935.v4.1.Ia35521b91fc781368945161d7b28538f9996c182@changeid
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Lecopzer Chen <lecopzer.chen@mediatek.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Pingfan Liu <kernelfans@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
range.h works with struct range data type. The resource_size_t
is an alien here.
(1) Move cap_resource() implementation into its only user, and
(2) rename and move RESOURCE_SIZE_MAX to limits.h.
Link: https://lkml.kernel.org/r/20230804064636.15368-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>