linux

iv/linux

Author	SHA1	Message	Date
Palmer Dabbelt	2bf8acbf54	Merge patch series "Fix XIP boot and make XIP testable in QEMU" Frederik Haxel <haxel@fzi.de> says: XIP boot seems to be broken for some time now. A likely reason why no one seems to have noticed this is that XIP is more difficult to test, as it is currently not easily testable with QEMU. These patches fix the XIP boot and allow an XIP build without BUILTIN_DTB, which in turn makes it easier to test an image with the QEMU virt machine. * b4-shazam-merge: riscv: Allow disabling of BUILTIN_DTB for XIP riscv: Fixed wrong register in XIP_FIXUP_FLASH_OFFSET macro riscv: Make XIP bootable again Link: https://lore.kernel.org/r/20231212130116.848530-1-haxel@fzi.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-09 20:10:39 -08:00
Song Shuai	a7565f4d06	riscv: Remove SHADOW_OVERFLOW_STACK_SIZE macro The commit be97d0db5f44 ("riscv: VMAP_STACK overflow detection thread-safe") got rid of `shadow_stack`, so SHADOW_OVERFLOW_STACK_SIZE should be removed too. Fixes: be97d0db5f44 ("riscv: VMAP_STACK overflow detection thread-safe") Signed-off-by: Song Shuai <songshuaishuai@tinylab.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20231211110331.359534-1-songshuaishuai@tinylab.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-09 20:10:38 -08:00
Palmer Dabbelt	5c89186a32	Merge remote-tracking branch 'palmer/fixes' into for-next I don't usually merge these in, but I missed sending a PR due to the holidays. * palmer/fixes: riscv: Fix set_direct_map_default_noflush() to reset _PAGE_EXEC riscv: Fix module_alloc() that did not reset the linear mapping permissions riscv: Fix wrong usage of lm_alias() when splitting a huge linear mapping riscv: Check if the code to patch lies in the exit section riscv: errata: andes: Probe for IOCP only once in boot stage riscv: Fix SMP when shadow call stacks are enabled dt-bindings: perf: riscv,pmu: drop unneeded quotes riscv: fix misaligned access handling of C.SWSP and C.SDSP RISC-V: hwprobe: Always use u64 for extension bits Support rv32 ULEB128 test riscv: Correct type casting in module loading riscv: Safely remove entries from relocation list Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-09 20:10:32 -08:00
Ben Dooks	869436dae7	riscv; fix __user annotation in save_v_state() The save_v_state() is technically sending a __user pointer through __put_user() and thus is generating a sparse warning so force the value to be "void " to fix: arch/riscv/kernel/signal.c:94:16: warning: incorrect type in initializer (different address spaces) arch/riscv/kernel/signal.c:94:16: expected void __val arch/riscv/kernel/signal.c:94:16: got void [noderef] __user *[assigned] datap Fixes: 8ee0b41898fa26f66e32 ("riscv: signal: Add sigcontext save/restore for vector") Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Link: https://lore.kernel.org/r/20231123142708.261733-1-ben.dooks@codethink.co.uk Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-09 20:10:18 -08:00
Ben Dooks	ca0e433b41	riscv: fix __user annotation in traps_misaligned.c The instruction reading code can read from either user or kernel addresses and thus the use of __user on pointers to instructions depends on which context. Fix a few sparse warnings by using __user for user-accesses and remove it when not. Fixes: arch/riscv/kernel/traps_misaligned.c:361:21: warning: dereference of noderef expression arch/riscv/kernel/traps_misaligned.c:373:21: warning: dereference of noderef expression arch/riscv/kernel/traps_misaligned.c:381:21: warning: dereference of noderef expression arch/riscv/kernel/traps_misaligned.c:322:24: warning: incorrect type in initializer (different address spaces) arch/riscv/kernel/traps_misaligned.c:322:24: expected unsigned char const [noderef] __user __gu_ptr arch/riscv/kernel/traps_misaligned.c:322:24: got unsigned char const [usertype] addr arch/riscv/kernel/traps_misaligned.c:361:21: warning: dereference of noderef expression arch/riscv/kernel/traps_misaligned.c:373:21: warning: dereference of noderef expression arch/riscv/kernel/traps_misaligned.c:381:21: warning: dereference of noderef expression arch/riscv/kernel/traps_misaligned.c:332:24: warning: incorrect type in initializer (different address spaces) arch/riscv/kernel/traps_misaligned.c:332:24: expected unsigned char [noderef] __user __gu_ptr arch/riscv/kernel/traps_misaligned.c:332:24: got unsigned char [usertype] addr Fixes: 7c83232161f60 ("riscv: add support for misaligned trap handling in S-mode") Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk> Link: https://lore.kernel.org/r/20231123141617.259591-1-ben.dooks@codethink.co.uk Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-09 20:10:17 -08:00
Jisheng Zhang	cfbc4f81c9	riscv: Select ARCH_WANTS_NO_INSTR As said in the help of ARCH_WANTS_NO_INSTR entry in arch/Kconfig: "An architecture should select this if the noinstr macro is being used on functions to denote that the toolchain should avoid instrumenting such functions and is required for correctness." Select ARCH_WANTS_NO_INSTR for correctness. PS: The reason we didn't find any issue so far is that the CC_HAS_NO_PROFILE_FN_ATTR is true. Signed-off-by: Jisheng Zhang <jszhang@kernel.org> Link: https://lore.kernel.org/r/20231123142223.1787-1-jszhang@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-09 20:10:16 -08:00
Palmer Dabbelt	5634d9c280	Merge patch series "riscv: CPU operations cleanup" Samuel Holland <samuel.holland@sifive.com> says: This series cleans up some duplicated and dead code around the RISC-V CPU operations, that was copied from arm64 but is not needed here. The result is a bit of memory savings and removal of a few SBI calls during boot, with no functional change. * b4-shazam-merge: riscv: Use the same CPU operations for all CPUs riscv: Remove unused members from struct cpu_operations riscv: Deduplicate code in setup_smp() Link: https://lore.kernel.org/r/20231121234736.3489608-1-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-09 20:10:15 -08:00
Samuel Holland	b7b4e4d79f	riscv: Remove obsolete rv32_defconfig file This file is not used since commit 72f045d19f25 ("riscv: Fixup difference with defconfig"), where it was replaced by the 32-bit.config fragment. Delete the old file to avoid any confusion. Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20231121225320.3430550-1-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-09 20:10:14 -08:00
Palmer Dabbelt	7a4749739c	Merge patch series "RISC-V: hwprobe: Introduce which-cpus" Andrew Jones <ajones@ventanamicro.com> says: This series introduces a flag for the hwprobe syscall which effectively reverses its behavior from getting the values of keys for a set of cpus to getting the cpus for a set of key-value pairs. * b4-shazam-merge: RISC-V: selftests: Add which-cpus hwprobe test RISC-V: hwprobe: Introduce which-cpus flag RISC-V: Move the hwprobe syscall to its own file RISC-V: hwprobe: Clarify cpus size parameter Link: https://lore.kernel.org/r/20231122164700.127954-6-ajones@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-09 20:10:13 -08:00
Frederik Haxel	6c4a2f6329	riscv: Allow disabling of BUILTIN_DTB for XIP This enables, among other things, testing with the QEMU virt machine. To build an XIP kernel for the QEMU virt machine, configure the the kernel as desired and apply the following configuration ``` CONFIG_NONPORTABLE=y CONFIG_XIP_KERNEL=y CONFIG_XIP_PHYS_ADDR=0x20000000 CONFIG_PHYS_RAM_BASE=0x80200000 CONFIG_BUILTIN_DTB=n ``` Since the QEMU virt flash memory expects a 32 MB file, the built image must be padded. For example, with `truncate -s 32M arch/riscv/boot/xipImage` The kernel can be started using the following command in QEMU (v8+) ``` qemu-system-riscv64 -M virt,pflash0=pflash0 \ -blockdev node-name=pflash0,driver=file,read-only=on,\ filename=arch/riscv/boot/xipImage <optional parameters> ``` Signed-off-by: Frederik Haxel <haxel@fzi.de> Link: https://lore.kernel.org/r/20231212130116.848530-4-haxel@fzi.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-09 19:33:22 -08:00
Frederik Haxel	5daa372641	riscv: Fixed wrong register in XIP_FIXUP_FLASH_OFFSET macro During the refactoring, a bug was introduced in the rarly used XIP_FIXUP_FLASH_OFFSET macro. Fixes: bee7fbc38579 ("RISC-V CPU Idle Support") Fixes: e7681beba992 ("RISC-V: Split out the XIP fixups into their own file") Signed-off-by: Frederik Haxel <haxel@fzi.de> Link: https://lore.kernel.org/r/20231212130116.848530-3-haxel@fzi.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-09 19:33:21 -08:00
Frederik Haxel	66f1e68093	riscv: Make XIP bootable again Currently, the XIP kernel seems to fail to boot due to missing XIP_FIXUP and a wrong page_offset value. A superfluous XIP_FIXUP has also been removed. Signed-off-by: Frederik Haxel <haxel@fzi.de> Link: https://lore.kernel.org/r/20231212130116.848530-2-haxel@fzi.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-09 19:33:20 -08:00
Linus Torvalds	9f2a635235	Quite a lot of kexec work this time around. Many singleton patches in many places. The notable patch series are: - nilfs2 folio conversion from Matthew Wilcox in "nilfs2: Folio conversions for file paths". - Additional nilfs2 folio conversion from Ryusuke Konishi in "nilfs2: Folio conversions for directory paths". - IA64 remnant removal in Heiko Carstens's "Remove unused code after IA-64 removal". - Arnd Bergmann has enabled the -Wmissing-prototypes warning everywhere in "Treewide: enable -Wmissing-prototypes". This had some followup fixes: - Nathan Chancellor has cleaned up the hexagon build in the series "hexagon: Fix up instances of -Wmissing-prototypes". - Nathan also addressed some s390 warnings in "s390: A couple of fixes for -Wmissing-prototypes". - Arnd Bergmann addresses the same warnings for MIPS in his series "mips: address -Wmissing-prototypes warnings". - Baoquan He has made kexec_file operate in a top-down-fitting manner similar to kexec_load in the series "kexec_file: Load kernel at top of system RAM if required" - Baoquan He has also added the self-explanatory "kexec_file: print out debugging message if required". - Some checkstack maintenance work from Tiezhu Yang in the series "Modify some code about checkstack". - Douglas Anderson has disentangled the watchdog code's logging when multiple reports are occurring simultaneously. The series is "watchdog: Better handling of concurrent lockups". - Yuntao Wang has contributed some maintenance work on the crash code in "crash: Some cleanups and fixes". -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZZ2R6AAKCRDdBJ7gKXxA juCVAP4t76qUISDOSKugB/Dn5E4Nt9wvPY9PcufnmD+xoPsgkQD+JVl4+jd9+gAV vl6wkJDiJO5JZ3FVtBtC3DFA/xHtVgk= =kQw+ -----END PGP SIGNATURE----- Merge tag 'mm-nonmm-stable-2024-01-09-10-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: "Quite a lot of kexec work this time around. Many singleton patches in many places. The notable patch series are: - nilfs2 folio conversion from Matthew Wilcox in 'nilfs2: Folio conversions for file paths'. - Additional nilfs2 folio conversion from Ryusuke Konishi in 'nilfs2: Folio conversions for directory paths'. - IA64 remnant removal in Heiko Carstens's 'Remove unused code after IA-64 removal'. - Arnd Bergmann has enabled the -Wmissing-prototypes warning everywhere in 'Treewide: enable -Wmissing-prototypes'. This had some followup fixes: - Nathan Chancellor has cleaned up the hexagon build in the series 'hexagon: Fix up instances of -Wmissing-prototypes'. - Nathan also addressed some s390 warnings in 's390: A couple of fixes for -Wmissing-prototypes'. - Arnd Bergmann addresses the same warnings for MIPS in his series 'mips: address -Wmissing-prototypes warnings'. - Baoquan He has made kexec_file operate in a top-down-fitting manner similar to kexec_load in the series 'kexec_file: Load kernel at top of system RAM if required' - Baoquan He has also added the self-explanatory 'kexec_file: print out debugging message if required'. - Some checkstack maintenance work from Tiezhu Yang in the series 'Modify some code about checkstack'. - Douglas Anderson has disentangled the watchdog code's logging when multiple reports are occurring simultaneously. The series is 'watchdog: Better handling of concurrent lockups'. - Yuntao Wang has contributed some maintenance work on the crash code in 'crash: Some cleanups and fixes'" * tag 'mm-nonmm-stable-2024-01-09-10-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (157 commits) crash_core: fix and simplify the logic of crash_exclude_mem_range() x86/crash: use SZ_1M macro instead of hardcoded value x86/crash: remove the unused image parameter from prepare_elf_headers() kdump: remove redundant DEFAULT_CRASH_KERNEL_LOW_SIZE scripts/decode_stacktrace.sh: strip unexpected CR from lines watchdog: if panicking and we dumped everything, don't re-enable dumping watchdog/hardlockup: use printk_cpu_sync_get_irqsave() to serialize reporting watchdog/softlockup: use printk_cpu_sync_get_irqsave() to serialize reporting watchdog/hardlockup: adopt softlockup logic avoiding double-dumps kexec_core: fix the assignment to kimage->control_page x86/kexec: fix incorrect end address passed to kernel_ident_mapping_init() lib/trace_readwrite.c:: replace asm-generic/io with linux/io nilfs2: cpfile: fix some kernel-doc warnings stacktrace: fix kernel-doc typo scripts/checkstack.pl: fix no space expression between sp and offset x86/kexec: fix incorrect argument passed to kexec_dprintk() x86/kexec: use pr_err() instead of kexec_dprintk() when an error occurs nilfs2: add missing set_freezable() for freezable kthread kernel: relay: remove relay_file_splice_read dead code, doesn't work docs: submit-checklist: remove all of "make namespacecheck" ...	2024-01-09 11:46:20 -08:00
Linus Torvalds	fb46e22a9e	Many singleton patches against the MM code. The patch series which are included in this merge do the following: - Peng Zhang has done some mapletree maintainance work in the series "maple_tree: add mt_free_one() and mt_attr() helpers" "Some cleanups of maple tree" - In the series "mm: use memmap_on_memory semantics for dax/kmem" Vishal Verma has altered the interworking between memory-hotplug and dax/kmem so that newly added 'device memory' can more easily have its memmap placed within that newly added memory. - Matthew Wilcox continues folio-related work (including a few fixes) in the patch series "Add folio_zero_tail() and folio_fill_tail()" "Make folio_start_writeback return void" "Fix fault handler's handling of poisoned tail pages" "Convert aops->error_remove_page to ->error_remove_folio" "Finish two folio conversions" "More swap folio conversions" - Kefeng Wang has also contributed folio-related work in the series "mm: cleanup and use more folio in page fault" - Jim Cromie has improved the kmemleak reporting output in the series "tweak kmemleak report format". - In the series "stackdepot: allow evicting stack traces" Andrey Konovalov to permits clients (in this case KASAN) to cause eviction of no longer needed stack traces. - Charan Teja Kalla has fixed some accounting issues in the page allocator's atomic reserve calculations in the series "mm: page_alloc: fixes for high atomic reserve caluculations". - Dmitry Rokosov has added to the samples/ dorectory some sample code for a userspace memcg event listener application. See the series "samples: introduce cgroup events listeners". - Some mapletree maintanance work from Liam Howlett in the series "maple_tree: iterator state changes". - Nhat Pham has improved zswap's approach to writeback in the series "workload-specific and memory pressure-driven zswap writeback". - DAMON/DAMOS feature and maintenance work from SeongJae Park in the series "mm/damon: let users feed and tame/auto-tune DAMOS" "selftests/damon: add Python-written DAMON functionality tests" "mm/damon: misc updates for 6.8" - Yosry Ahmed has improved memcg's stats flushing in the series "mm: memcg: subtree stats flushing and thresholds". - In the series "Multi-size THP for anonymous memory" Ryan Roberts has added a runtime opt-in feature to transparent hugepages which improves performance by allocating larger chunks of memory during anonymous page faults. - Matthew Wilcox has also contributed some cleanup and maintenance work against eh buffer_head code int he series "More buffer_head cleanups". - Suren Baghdasaryan has done work on Andrea Arcangeli's series "userfaultfd move option". UFFDIO_MOVE permits userspace heap compaction algorithms to move userspace's pages around rather than UFFDIO_COPY'a alloc/copy/free. - Stefan Roesch has developed a "KSM Advisor", in the series "mm/ksm: Add ksm advisor". This is a governor which tunes KSM's scanning aggressiveness in response to userspace's current needs. - Chengming Zhou has optimized zswap's temporary working memory use in the series "mm/zswap: dstmem reuse optimizations and cleanups". - Matthew Wilcox has performed some maintenance work on the writeback code, both code and within filesystems. The series is "Clean up the writeback paths". - Andrey Konovalov has optimized KASAN's handling of alloc and free stack traces for secondary-level allocators, in the series "kasan: save mempool stack traces". - Andrey also performed some KASAN maintenance work in the series "kasan: assorted clean-ups". - David Hildenbrand has gone to town on the rmap code. Cleanups, more pte batching, folio conversions and more. See the series "mm/rmap: interface overhaul". - Kinsey Ho has contributed some maintenance work on the MGLRU code in the series "mm/mglru: Kconfig cleanup". - Matthew Wilcox has contributed lruvec page accounting code cleanups in the series "Remove some lruvec page accounting functions". -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZZyF2wAKCRDdBJ7gKXxA jjWjAP42LHvGSjp5M+Rs2rKFL0daBQsrlvy6/jCHUequSdWjSgEAmOx7bc5fbF27 Oa8+DxGM9C+fwqZ/7YxU2w/WuUmLPgU= =0NHs -----END PGP SIGNATURE----- Merge tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Many singleton patches against the MM code. The patch series which are included in this merge do the following: - Peng Zhang has done some mapletree maintainance work in the series 'maple_tree: add mt_free_one() and mt_attr() helpers' 'Some cleanups of maple tree' - In the series 'mm: use memmap_on_memory semantics for dax/kmem' Vishal Verma has altered the interworking between memory-hotplug and dax/kmem so that newly added 'device memory' can more easily have its memmap placed within that newly added memory. - Matthew Wilcox continues folio-related work (including a few fixes) in the patch series 'Add folio_zero_tail() and folio_fill_tail()' 'Make folio_start_writeback return void' 'Fix fault handler's handling of poisoned tail pages' 'Convert aops->error_remove_page to ->error_remove_folio' 'Finish two folio conversions' 'More swap folio conversions' - Kefeng Wang has also contributed folio-related work in the series 'mm: cleanup and use more folio in page fault' - Jim Cromie has improved the kmemleak reporting output in the series 'tweak kmemleak report format'. - In the series 'stackdepot: allow evicting stack traces' Andrey Konovalov to permits clients (in this case KASAN) to cause eviction of no longer needed stack traces. - Charan Teja Kalla has fixed some accounting issues in the page allocator's atomic reserve calculations in the series 'mm: page_alloc: fixes for high atomic reserve caluculations'. - Dmitry Rokosov has added to the samples/ dorectory some sample code for a userspace memcg event listener application. See the series 'samples: introduce cgroup events listeners'. - Some mapletree maintanance work from Liam Howlett in the series 'maple_tree: iterator state changes'. - Nhat Pham has improved zswap's approach to writeback in the series 'workload-specific and memory pressure-driven zswap writeback'. - DAMON/DAMOS feature and maintenance work from SeongJae Park in the series 'mm/damon: let users feed and tame/auto-tune DAMOS' 'selftests/damon: add Python-written DAMON functionality tests' 'mm/damon: misc updates for 6.8' - Yosry Ahmed has improved memcg's stats flushing in the series 'mm: memcg: subtree stats flushing and thresholds'. - In the series 'Multi-size THP for anonymous memory' Ryan Roberts has added a runtime opt-in feature to transparent hugepages which improves performance by allocating larger chunks of memory during anonymous page faults. - Matthew Wilcox has also contributed some cleanup and maintenance work against eh buffer_head code int he series 'More buffer_head cleanups'. - Suren Baghdasaryan has done work on Andrea Arcangeli's series 'userfaultfd move option'. UFFDIO_MOVE permits userspace heap compaction algorithms to move userspace's pages around rather than UFFDIO_COPY'a alloc/copy/free. - Stefan Roesch has developed a 'KSM Advisor', in the series 'mm/ksm: Add ksm advisor'. This is a governor which tunes KSM's scanning aggressiveness in response to userspace's current needs. - Chengming Zhou has optimized zswap's temporary working memory use in the series 'mm/zswap: dstmem reuse optimizations and cleanups'. - Matthew Wilcox has performed some maintenance work on the writeback code, both code and within filesystems. The series is 'Clean up the writeback paths'. - Andrey Konovalov has optimized KASAN's handling of alloc and free stack traces for secondary-level allocators, in the series 'kasan: save mempool stack traces'. - Andrey also performed some KASAN maintenance work in the series 'kasan: assorted clean-ups'. - David Hildenbrand has gone to town on the rmap code. Cleanups, more pte batching, folio conversions and more. See the series 'mm/rmap: interface overhaul'. - Kinsey Ho has contributed some maintenance work on the MGLRU code in the series 'mm/mglru: Kconfig cleanup'. - Matthew Wilcox has contributed lruvec page accounting code cleanups in the series 'Remove some lruvec page accounting functions'" * tag 'mm-stable-2024-01-08-15-31' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (361 commits) mm, treewide: rename MAX_ORDER to MAX_PAGE_ORDER mm, treewide: introduce NR_PAGE_ORDERS selftests/mm: add separate UFFDIO_MOVE test for PMD splitting selftests/mm: skip test if application doesn't has root privileges selftests/mm: conform test to TAP format output selftests: mm: hugepage-mmap: conform to TAP format output selftests/mm: gup_test: conform test to TAP format output mm/selftests: hugepage-mremap: conform test to TAP format output mm/vmstat: move pgdemote_* out of CONFIG_NUMA_BALANCING mm: zsmalloc: return -ENOSPC rather than -EINVAL in zs_malloc while size is too large mm/memcontrol: remove __mod_lruvec_page_state() mm/khugepaged: use a folio more in collapse_file() slub: use a folio in __kmalloc_large_node slub: use folio APIs in free_large_kmalloc() slub: use alloc_pages_node() in alloc_slab_page() mm: remove inc/dec lruvec page state functions mm: ratelimit stat flush from workingset shrinker kasan: stop leaking stack trace handles mm/mglru: remove CONFIG_TRANSPARENT_HUGEPAGE mm/mglru: add dummy pmd_dirty() ...	2024-01-09 11:18:47 -08:00
Alexandre Ghiti	b8b2711336	riscv: Fix set_direct_map_default_noflush() to reset _PAGE_EXEC When resetting the linear mapping permissions, we must make sure that we clear the X bit so that do not end up with WX mappings (since we set PAGE_KERNEL). Fixes: 395a21ff859c ("riscv: add ARCH_HAS_SET_DIRECT_MAP support") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20231213134027.155327-3-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-09 10:59:08 -08:00
Alexandre Ghiti	749b94b080	riscv: Fix module_alloc() that did not reset the linear mapping permissions After unloading a module, we must reset the linear mapping permissions, see the example below: Before unloading a module: 0xffffaf809d65d000-0xffffaf809d6dc000 0x000000011d65d000 508K PTE . .. .. D A G . . W R V 0xffffaf809d6dc000-0xffffaf809d6dd000 0x000000011d6dc000 4K PTE . .. .. D A G . . . R V 0xffffaf809d6dd000-0xffffaf809d6e1000 0x000000011d6dd000 16K PTE . .. .. D A G . . W R V 0xffffaf809d6e1000-0xffffaf809d6e7000 0x000000011d6e1000 24K PTE . .. .. D A G . X . R V After unloading a module: 0xffffaf809d65d000-0xffffaf809d6e1000 0x000000011d65d000 528K PTE . .. .. D A G . . W R V 0xffffaf809d6e1000-0xffffaf809d6e7000 0x000000011d6e1000 24K PTE . .. .. D A G . X W R V The last mapping is not reset and we end up with WX mappings in the linear mapping. So add VM_FLUSH_RESET_PERMS to our module_alloc() definition. Fixes: 0cff8bff7af8 ("riscv: avoid the PIC offset of static percpu data in module beyond 2G limits") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20231213134027.155327-2-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-09 10:59:07 -08:00
Alexandre Ghiti	c29fc621e1	riscv: Fix wrong usage of lm_alias() when splitting a huge linear mapping lm_alias() can only be used on kernel mappings since it explicitly uses __pa_symbol(), so simply fix this by checking where the address belongs to before. Fixes: 311cd2f6e253 ("riscv: Fix set_memory_XX() and set_direct_map_XX() by splitting huge linear mappings") Reported-by: syzbot+afb726d49f84c8d95ee1@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-riscv/000000000000620dd0060c02c5e1@google.com/ Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20231212195400.128457-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-09 10:59:06 -08:00
Alexandre Ghiti	420370f3ae	riscv: Check if the code to patch lies in the exit section Otherwise we fall through to vmalloc_to_page() which panics since the address does not lie in the vmalloc region. Fixes: 043cb41a85de ("riscv: introduce interfaces to patch kernel code") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20231214091926.203439-1-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-09 10:58:59 -08:00
Linus Torvalds	bfe8eb3b85	Scheduler changes for v6.8: - Energy scheduling: - Consolidate how the max compute capacity is used in the scheduler and how we calculate the frequency for a level of utilization. - Rework interface between the scheduler and the schedutil governor - Simplify the util_est logic - Deadline scheduler: - Work more towards reducing SCHED_DEADLINE starvation of low priority tasks (e.g., SCHED_OTHER) tasks when higher priority tasks monopolize CPU cycles, via the introduction of 'deadline servers' (nested/2-level scheduling). "Fair servers" to make use of this facility are not introduced yet. - EEVDF: - Introduce O(1) fastpath for EEVDF task selection - NUMA balancing: - Tune the NUMA-balancing vma scanning logic some more, to better distribute the probability of a particular vma getting scanned. - Plus misc fixes, cleanups and updates. Signed-off-by: Ingo Molnar <mingo@kernel.org> -----BEGIN PGP SIGNATURE----- iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmWcASMRHG1pbmdvQGtl cm5lbC5vcmcACgkQEnMQ0APhK1jLbg/+NOwF18M6klF1/3jUaV1PU09vRzYnnA7w oF7Tru7JLV+/vZK+rwI1zxzj5Nj3sVBQPIyp1embEHx7Z/QH8MIaIVpcSFsDDCYY Q8n6ZVRB+lKWEo5+Ti6JEJftDAWuLHXwFWDa57oWPuR0Tc736+zYHUfj7jdKk0RI nT/lnOT6hXU8q26O4QFrBrrhvCCxc4byo7buKPQfqie0bDA70ppIWkFQoQME6mvQ US9jvOyUipOiPV06DPwFvPDJUQBGq2VdJNk+5zCEtcqEfLREuo/Xq1Ww1x1BWaZI 761532EuDo73iMK4IFZrvVmj1ioz957qbje11MSSkDdKj692xxjXyvnY0NBvZuho Ueog/jQ4D4I2qu7pPSCF8UfnI/Hw4Q+KJ89j3pcywRm4hmCTf9k3MGpAaVLVxH7G e5REZ5MSsFZi4Cs+zF87Of5KCKLhTr1qSetNtShinKahg06WZ+MZ8tW4jb52qy0j F8PMlvfBI3f7SOtA8s2P26mDGQ21YQehN2d5P+Fbwj/U3fjIlSTOyx6NwLpFwYaS Vf+fctchGFV1Sh7c2JjCh+ecYfXx3ghT/pvyPOImJtxtCKSRUQ8c26ApC1OsWfOE FdHv4f2dPqcyswCZzIv/2fyDXc9eaS2E05EMDNqVuMCGnzidzSs81n7hBioNMrnH ZgHK90TmEbw= =wTVh -----END PGP SIGNATURE----- Merge tag 'sched-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Energy scheduling: - Consolidate how the max compute capacity is used in the scheduler and how we calculate the frequency for a level of utilization. - Rework interface between the scheduler and the schedutil governor - Simplify the util_est logic Deadline scheduler: - Work more towards reducing SCHED_DEADLINE starvation of low priority tasks (e.g., SCHED_OTHER) tasks when higher priority tasks monopolize CPU cycles, via the introduction of 'deadline servers' (nested/2-level scheduling). "Fair servers" to make use of this facility are not introduced yet. EEVDF: - Introduce O(1) fastpath for EEVDF task selection NUMA balancing: - Tune the NUMA-balancing vma scanning logic some more, to better distribute the probability of a particular vma getting scanned. Plus misc fixes, cleanups and updates" * tag 'sched-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits) sched/fair: Fix tg->load when offlining a CPU sched/fair: Remove unused 'next_buddy_marked' local variable in check_preempt_wakeup_fair() sched/fair: Use all little CPUs for CPU-bound workloads sched/fair: Simplify util_est sched/fair: Remove SCHED_FEAT(UTIL_EST_FASTUP, true) arm64/amu: Use capacity_ref_freq() to set AMU ratio cpufreq/cppc: Set the frequency used for computing the capacity cpufreq/cppc: Move and rename cppc_cpufreq_{perf_to_khz\|khz_to_perf}() energy_model: Use a fixed reference frequency cpufreq/schedutil: Use a fixed reference frequency cpufreq: Use the fixed and coherent frequency for scaling capacity sched/topology: Add a new arch_scale_freq_ref() method freezer,sched: Clean saved_state when restoring it during thaw sched/fair: Update min_vruntime for reweight_entity() correctly sched/doc: Update documentation after renames and synchronize Chinese version sched/cpufreq: Rework iowait boost sched/cpufreq: Rework schedutil governor performance estimation sched/pelt: Avoid underestimation of task utilization sched/timers: Explain why idle task schedules out on remote timer enqueue sched/cpuidle: Comment about timers requirements VS idle handler ...	2024-01-08 19:49:17 -08:00
Paolo Bonzini	fb872da8e7	Common KVM changes for 6.8: - Use memdup_array_user() to harden against overflow. - Unconditionally advertise KVM_CAP_DEVICE_CTRL for all architectures. -----BEGIN PGP SIGNATURE----- iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmWW8F4SHHNlYW5qY0Bn b29nbGUuY29tAAoJEGCRIgFNDBL5urcP/Rex6Too26aHJXelUVHlFOGw3hfOnvbq Wr/P3kPqB/1Mncx3aiYTpEvUxFjVTvIkMB5dWba39Eq/G1BbOT2CAHCunlvKJrXy L83YgOl17QtZZJS1KmLTRCj1umfl4Z0c+GEIH+P1FOuOmllNXlLJ1+GWmolP6LLf u4DF2/tyVZf8JXXeJWYITHsU0YQQ0MhHgYL8/aMYJK8epNFpR3wKIqT3428ASxV3 Ru4WH7jpYkFF7PaKbvjKdepr+1wyVt4PXJDDpciCScz45/8eebgfylLJbMglpsR1 JSUTzd6KdCbekgzp51NnRdoIxP+MXgKA3dIuzXKyIDzm2Xq6tna87ve/aWDGw8JC nUMkP/vAuaKT+/QTOwskGAvK2GYDQD1UwVcFNLi12Iis50H0qPwcxsUionQuZgUC ykCmY4N31rSX4DhPg1WLiqsvC/EeDhfXprYrfSd4HQq08NgD45orRJw0Kov+shcS xijIlE1e3aVJMRrbfoSWyc4m79AcooxjYwojQC1Ayqsq0ZTTzzIpd6rqjmY+LbLL aP/wNz8hCfMhFekUV7dDk9rMdZY+bBnTiolyKAN66E6EnPYfl2EdrDEGnZOCPXF4 L/O/kMCXHE90cszzrmiR40yNHLkPelij8sK+ligE4JpqteQ7ia/knh8YAiPBxDw6 XcIfftXMm5XG =wpT4 -----END PGP SIGNATURE----- Merge tag 'kvm-x86-generic-6.8' of https://github.com/kvm-x86/linux into HEAD Common KVM changes for 6.8: - Use memdup_array_user() to harden against overflow. - Unconditionally advertise KVM_CAP_DEVICE_CTRL for all architectures.	2024-01-08 08:09:57 -05:00
Paolo Bonzini	caadf876bb	KVM: introduce CONFIG_KVM_COMMON CONFIG_HAVE_KVM is currently used by some architectures to either enabled the KVM config proper, or to enable host-side code that is not part of the KVM module. However, CONFIG_KVM's "select" statement in virt/kvm/Kconfig corresponds to a third meaning, namely to enable common Kconfigs required by all architectures that support KVM. These three meanings can be replaced respectively by an architecture-specific Kconfig, by IS_ENABLED(CONFIG_KVM), or by a new Kconfig symbol that is in turn selected by the architecture-specific "config KVM". Start by introducing such a new Kconfig symbol, CONFIG_KVM_COMMON. Unlike CONFIG_HAVE_KVM, it is selected by CONFIG_KVM, not by architecture code, and it brings in all dependencies of common KVM code. In particular, INTERVAL_TREE was missing in loongarch and riscv, so that is another thing that is fixed. Fixes: 8132d887a702 ("KVM: remove CONFIG_HAVE_KVM_EVENTFD", 2023-12-08) Reported-by: Randy Dunlap <rdunlap@infradead.org> Closes: https://lore.kernel.org/all/44907c6b-c5bd-4e4a-a921-e4d3825539d8@infradead.org/ Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2024-01-08 08:09:38 -05:00
Ingo Molnar	cdb3033e19	Merge branch 'sched/urgent' into sched/core, to pick up pending v6.7 fixes for the v6.8 merge window This fix didn't make it upstream in time, pick it up for the v6.8 merge window. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2024-01-08 12:57:28 +01:00
Linus Torvalds	95c8a35f1c	12 hotfixes. 2 are cc:stable and the remainder either address post-6.7 issues or aren't considered necessary for earlier kernel versions. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZZhbAwAKCRDdBJ7gKXxA jiFeAQD20H8wGT9hbMUYr/PxE4rOxyDXlhnv/mZ0Av3neve4SQD/YlgCFWYpEu8G F5rEAtq89UYE13qlgS3o9KjYOPzhtgQ= =vZ0P -----END PGP SIGNATURE----- Merge tag 'mm-hotfixes-stable-2024-01-05-11-35' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc mm fixes from Andrew Morton: "12 hotfixes. Two are cc:stable and the remainder either address post-6.7 issues or aren't considered necessary for earlier kernel versions" * tag 'mm-hotfixes-stable-2024-01-05-11-35' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm: shrinker: use kvzalloc_node() from expand_one_shrinker_info() mailmap: add entries for Mathieu Othacehe MAINTAINERS: change vmware.com addresses to broadcom.com arch/mm/fault: fix major fault accounting when retrying under per-VMA lock mm/mglru: skip special VMAs in lru_gen_look_around() MAINTAINERS: hand over hwpoison maintainership to Miaohe Lin MAINTAINERS: remove hugetlb maintainer Mike Kravetz mm: fix unmap_mapping_range high bits shift bug mm: memcg: fix split queue list crash when large folio migration mm: fix arithmetic for max_prop_frac when setting max_ratio mm: fix arithmetic for bdi min_ratio mm: align larger anonymous mappings on THP boundaries	2024-01-05 13:46:18 -08:00
Kinsey Ho	533c67e635	mm/mglru: add dummy pmd_dirty() Add dummy pmd_dirty() for architectures that don't provide it. This is similar to commit 6617da8fb565 ("mm: add dummy pmd_young() for architectures not having it"). Link: https://lkml.kernel.org/r/20231227141205.2200125-5-kinseyho@google.com Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202312210606.1Etqz3M4-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202312210042.xQEiqlEh-lkp@intel.com/ Signed-off-by: Kinsey Ho <kinseyho@google.com> Suggested-by: Yu Zhao <yuzhao@google.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Donet Tom <donettom@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-01-05 10:17:44 -08:00
Jakub Kicinski	e63c1822ac	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/broadcom/bnxt/bnxt.c e009b2efb7a8 ("bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters()") 0f2b21477988 ("bnxt_en: Fix compile error without CONFIG_RFS_ACCEL") https://lore.kernel.org/all/20240105115509.225aa8a2@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-04 18:06:46 -08:00
Samuel Holland	62ff262227	riscv: Use the same CPU operations for all CPUs RISC-V provides no binding (ACPI or DT) to describe per-cpu start/stop operations, so cpu_set_ops() will always detect the same operations for every CPU. Replace the cpu_ops array with a single pointer to save space and reduce boot time. Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20231121234736.3489608-4-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-04 15:03:07 -08:00
Samuel Holland	79093f3ec3	riscv: Remove unused members from struct cpu_operations name is not used anywhere at all. cpu_prepare and cpu_disable do nothing and always return 0 if implemented. Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20231121234736.3489608-3-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-04 15:03:06 -08:00
Samuel Holland	a4166aec11	riscv: Deduplicate code in setup_smp() Both the ACPI and DT implementations contain some of the same code. Move it to the calling function so it is not duplicated. Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20231121234736.3489608-2-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-04 15:03:05 -08:00
Andrew Jones	e178bf146e	RISC-V: hwprobe: Introduce which-cpus flag Introduce the first flag for the hwprobe syscall. The flag basically reverses its behavior, i.e. instead of populating the values of keys for a given set of cpus, the set of cpus after the call is the result of finding a set which supports the values of the keys. In order to do this, we implement a pair compare function which takes the type of value (a single value vs. a bitmask of booleans) into consideration. We also implement vdso support for the new flag. Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Evan Green <evan@rivosinc.com> Link: https://lore.kernel.org/r/20231122164700.127954-9-ajones@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-03 03:36:49 -08:00
Andrew Jones	53b2b22850	RISC-V: Move the hwprobe syscall to its own file As Palmer says, hwprobe is "sort of its own thing now, and it's only going to get bigger..." Suggested-by: Palmer Dabbelt <palmer@dabbelt.com> Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20231122164700.127954-8-ajones@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-03 03:36:48 -08:00
Andrew Jones	36d842d654	RISC-V: hwprobe: Clarify cpus size parameter The "count" parameter associated with the 'cpus' parameter of the hwprobe syscall is the size in bytes of 'cpus'. Naming it 'cpu_count' may mislead users (it did me) to think it's the number of CPUs that are or can be represented by 'cpus' instead. This is particularly easy (IMO) to get wrong since 'cpus' is documented to be defined by CPU_SET(3) and CPU_SET(3) also documents a CPU_COUNT() (the number of CPUs in set) macro. CPU_SET(3) refers to the size of cpu sets with 'setsize'. Adopt 'cpusetsize' for the hwprobe parameter and specifically state it is in bytes in Documentation/riscv/hwprobe.rst to clarify. Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20231122164700.127954-7-ajones@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-03 03:36:47 -08:00
Palmer Dabbelt	cbc911392c	RISC-V: Remove the removed single-letter extensions There were a few single-letter extensions that we had references to floating around in the kernel, but that never ended up as actual ISA specs and have mostly been replaced by multi-letter extensions. This removes the references to those extensions. Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20231110175903.2631-1-palmer@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>	2024-01-03 03:28:49 -08:00
Joerg Roedel	75f74f85a4	Merge branches 'apple/dart', 'arm/rockchip', 'arm/smmu', 'virtio', 'x86/vt-d', 'x86/amd' and 'core' into next	2024-01-03 09:59:32 +01:00
Paolo Bonzini	9cc52627c7	KVM/riscv changes for 6.8 part #1 - KVM_GET_REG_LIST improvement for vector registers - Generate ISA extension reg_list using macros in get-reg-list selftest - Steal time account support along with selftest -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmWQ+cgACgkQrUjsVaLH LAckBA//R4X9L5ugfPdDunp3ntjZXmNtBS5pM2jD+UvaoFn2kOA1o5kOD5mXluuh 0imNjVuzlrX7XoAATQ4BoeoXg0whDbnv/8TE13KqSl1PfNziH2p5YD2DuHXPST3B V2VHrGACZ4wN074Whztl0oYI72obGnmpcsiglifkeCvRPerghHuDu40eUaWvCmgD DPwT+bjBgxkDZ4IheyytUrPql6reALh1Qo1vfj0FsJAgj+MAqQeD8n6rixSPnOdE 9XXa4jdu7ycDp675VX/DVEWsNBQGPrsRK/uCiMksO36td+wLCKpAkvX95mE0w/L8 qFJ+dN1c+1ZkjooHdVLfq2MjxaIRwmIowk7SeJbpvGIf/zG3r7eany7eXJT0+NjO 22j5FY2z1NqcSG6Fazx76Qp2vVBVbxHShP9h7d6VTZYS7XENjmV6IWHpTSuSF8+n puj8Nf5C7WuqbySirSgQndDuKawn9myqfXXEoAuSiZ+kVyYEl8QnXm2gAIcxRDHX x+NDPMv0DpMBRO9qa/tXeqgNue/XOTJwgbmXzAlCNff3U7hPIHJ/5aZiJ/Re5TeE DxiU9AmIsNN2Bh0csS/wQbdScIqkOdOiDYEwT1DXOJWpmhiyCW7vR8ltaIuMJ4vP DtlfuUlSe4aml957nAiqqyjQAY/7gqmpoaGwu+lmrOX1K7fdtF0= =FeiG -----END PGP SIGNATURE----- Merge tag 'kvm-riscv-6.8-1' of https://github.com/kvm-riscv/linux into HEAD KVM/riscv changes for 6.8 part #1 - KVM_GET_REG_LIST improvement for vector registers - Generate ISA extension reg_list using macros in get-reg-list selftest - Steal time account support along with selftest	2024-01-02 13:19:40 -05:00
Paolo Bonzini	136292522e	LoongArch KVM changes for v6.8 1. Optimization for memslot hugepage checking. 2. Cleanup and fix some HW/SW timer issues. 3. Add LSX/LASX (128bit/256bit SIMD) support. -----BEGIN PGP SIGNATURE----- iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmWGu+0WHGNoZW5odWFj YWlAa2VybmVsLm9yZwAKCRAChivD8uImesO7D/wOdYP96R+mRzpLBeuTtFxU8e4A 3n2luxOeP8v1WYtQ9H8M01Wgly+9u6cJ2pgAlv79BQHfmCfC0aWQLmpnCZmk/mYW wtQ75ASA3Qg6zOBWEksCkA0LUdPDHfQuaaUXT7RYZ7QtHKSNkkhsw2nMCq6fgrXU RnZjGctjuxgYSqQtwzfYO2AjSBAfAq1MjSzCTULJ0KkE8o5Bg0KOoGj8ijC1U+ua QWBnqTNzeKmYmqAFfhXoiiFYcuBUq7DEk5RtwDU7SeqqJEV3a8AbbsrWfz+wMemG gri95uRxvnhpPZ+6/PrVjIezqexPJmQ9+tjY6mxh/bPRnS5ICFygjV3lt050JUK8 xIaJEFvl7g88RIz5mnTeM9tU4ibIsCLgA9zj33ps2H7QP5NazUm1dzk1YGAgqPdw m5hjwtTFQEujQM6cz1DLfhoi15VDNcYUonJIvGFZMhl7InitDpB3u9sI+AVGIVUG yKzBkqGB1L1vbJGnuWmspEqSUo7Z9iYzuVGbOnjc9LKQ/8OpLxj0brymYheA+CKG CIdULximQFVEHc2lbE+H+bW4hnrFP4sN9hlTng7KN7ommCIg+FltisM8Nt5NLWID 9ywLj4Qa0Qrc5vB3FJ8+ksuDe2nD83uVLj247R7B0wxQcYw4ocyW/YU+gayF4EjY 6azutwllW5ZB+I3hyw== =phol -----END PGP SIGNATURE----- Merge tag 'loongarch-kvm-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD LoongArch KVM changes for v6.8 1. Optimization for memslot hugepage checking. 2. Cleanup and fix some HW/SW timer issues. 3. Add LSX/LASX (128bit/256bit SIMD) support.	2024-01-02 13:16:29 -05:00
Andrew Jones	e9f12b5fff	RISC-V: KVM: Implement SBI STA extension Add a select SCHED_INFO to the KVM config in order to get run_delay info. Then implement SBI STA's set-steal-time-shmem function and kvm_riscv_vcpu_record_steal_time() to provide the steal-time info to guests. Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Anup Patel <anup@brainfault.org>	2023-12-30 11:26:38 +05:30
Andrew Jones	f61ce890b1	RISC-V: KVM: Add support for SBI STA registers KVM userspace needs to be able to save and restore the steal-time shared memory address. Provide the address through the get/set-one-reg interface with two ulong-sized SBI STA extension registers (lo and hi). 64-bit KVM userspace must not set the hi register to anything other than zero and is allowed to completely neglect saving/restoring it. Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Anup Patel <anup@brainfault.org>	2023-12-30 11:26:35 +05:30
Andrew Jones	5b9e41321b	RISC-V: KVM: Add support for SBI extension registers Some SBI extensions have state that needs to be saved / restored when migrating the VM. Provide a get/set-one-reg register type for SBI extension registers. Each SBI extension that uses this type will have its own subtype. There are currently no subtypes defined. The next patch introduces the first one. Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Anup Patel <anup@brainfault.org>	2023-12-30 11:26:33 +05:30
Andrew Jones	38b3390ee4	RISC-V: KVM: Add SBI STA info to vcpu_arch KVM's implementation of SBI STA needs to track the address of each VCPU's steal-time shared memory region as well as the amount of stolen time. Add a structure to vcpu_arch to contain this state and make sure that the address is always set to INVALID_GPA on vcpu reset. And, of course, ensure KVM won't try to update steal- time when the shared memory address is invalid. Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Anup Patel <anup@brainfault.org>	2023-12-30 11:26:26 +05:30
Andrew Jones	2a1f6bf079	RISC-V: KVM: Add steal-update vcpu request Add a new vcpu request to inform a vcpu that it should record its steal-time information. The request is made each time it has been detected that the vcpu task was not assigned a cpu for some time, which is easy to do by making the request from vcpu-load. The record function is just a stub for now and will be filled in with the rest of the steal-time support functions in following patches. Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Anup Patel <anup@brainfault.org>	2023-12-30 11:26:22 +05:30
Andrew Jones	5fed84a800	RISC-V: KVM: Add SBI STA extension skeleton Add the files and functions needed to support the SBI STA (steal-time accounting) extension. In the next patches we'll complete the functions to fully enable SBI STA support. Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Anup Patel <anup@brainfault.org>	2023-12-30 11:26:20 +05:30
Andrew Jones	fdf68acccf	RISC-V: paravirt: Implement steal-time support When the SBI STA extension exists we can use it to implement paravirt steal-time support. Fill in the empty pv-time functions with an SBI STA implementation and add the Kconfig knobs allowing it to be enabled. Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Anup Patel <anup@brainfault.org>	2023-12-30 11:26:04 +05:30
Andrew Jones	6cfc624576	RISC-V: Add SBI STA extension definitions The SBI STA extension enables steal-time accounting. Add the definitions it specifies. Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Anup Patel <anup@brainfault.org>	2023-12-30 11:25:09 +05:30
Andrew Jones	323925ed6d	RISC-V: paravirt: Add skeleton for pv-time support Add the files and functions needed to support paravirt time on RISC-V. Also include the common code needed for the first application of pv-time, which is steal-time. In the next patches we'll complete the functions to fully enable steal-time support. Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Reviewed-by: Anup Patel <anup@brainfault.org> Reviewed-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Anup Patel <anup@brainfault.org>	2023-12-30 11:25:03 +05:30
Suren Baghdasaryan	46e714c729	arch/mm/fault: fix major fault accounting when retrying under per-VMA lock A test [1] in Android test suite started failing after [2] was merged. It turns out that after handling a major fault under per-VMA lock, the process major fault counter does not register that fault as major. Before [2] read faults would be done under mmap_lock, in which case FAULT_FLAG_TRIED flag is set before retrying. That in turn causes mm_account_fault() to account the fault as major once retry completes. With per-VMA locks we often retry because a fault can't be handled without locking the whole mm using mmap_lock. Therefore such retries do not set FAULT_FLAG_TRIED flag. This logic does not work after [2] because we can now handle read major faults under per-VMA lock and upon retry the fact there was a major fault gets lost. Fix this by setting FAULT_FLAG_TRIED after retrying under per-VMA lock if VM_FAULT_MAJOR was returned. Ideally we would use an additional VM_FAULT bit to indicate the reason for the retry (could not handle under per-VMA lock vs other reason) but this simpler solution seems to work, so keeping it simple. [1] https://cs.android.com/android/platform/superproject/+/master:test/vts-testcase/kernel/api/drop_caches_prop/drop_caches_test.cpp [2] https://lore.kernel.org/all/20231006195318.4087158-6-willy@infradead.org/ Link: https://lkml.kernel.org/r/20231226214610.109282-1-surenb@google.com Fixes: 12214eba1992 ("mm: handle read faults under the VMA lock") Signed-off-by: Suren Baghdasaryan <surenb@google.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-12-29 11:06:49 -08:00
Anup Patel	4c460eb369	RISC-V: KVM: Fix indentation in kvm_riscv_vcpu_set_reg_csr() The indentation of "break" in kvm_riscv_vcpu_set_reg_csr() is inconsistent hence let us fix it. Fixes: c04913f2b54e ("RISCV: KVM: Add sstateen0 to ONE_REG") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202312190719.kBuYl6oJ-lkp@intel.com/ Signed-off-by: Anup Patel <apatel@ventanamicro.com> Signed-off-by: Anup Patel <anup@brainfault.org>	2023-12-29 12:31:58 +05:30
Daniel Henrique Barboza	3975525e55	RISC-V: KVM: add vector registers and CSRs in KVM_GET_REG_LIST Add all vector registers and CSRs (vstart, vl, vtype, vcsr, vlenb) in get-reg-list. Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Anup Patel <anup@brainfault.org>	2023-12-29 12:31:56 +05:30
Daniel Henrique Barboza	2fa290372d	RISC-V: KVM: add 'vlenb' Vector CSR Userspace requires 'vlenb' to be able to encode it in reg ID. Otherwise it is not possible to retrieve any vector reg since we're returning EINVAL if reg_size isn't vlenb (see kvm_riscv_vcpu_vreg_addr()). Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Anup Patel <anup@brainfault.org>	2023-12-29 12:31:54 +05:30
Daniel Henrique Barboza	197bd237b6	RISC-V: KVM: set 'vlenb' in kvm_riscv_vcpu_alloc_vector_context() 'vlenb', added to riscv_v_ext_state by commit c35f3aa34509 ("RISC-V: vector: export VLENB csr in __sc_riscv_v_state"), isn't being initialized in guest_context. If we export 'vlenb' as a KVM CSR, something we want to do in the next patch, it'll always return 0. Set 'vlenb' to riscv_v_size/32. Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Anup Patel <anup@brainfault.org>	2023-12-29 12:31:53 +05:30
Andrew Jones	23e1dc4502	RISC-V: KVM: Make SBI uapi consistent with ISA uapi When an SBI extension cannot be enabled, that's a distinct state vs. enabled and disabled. Modify enum kvm_riscv_sbi_ext_status to accommodate it, which allows KVM userspace to tell the difference in state too, as the SBI extension register will disappear when it cannot be enabled, i.e. accesses to it return ENOENT. get-reg-list is updated as well to only add SBI extension registers to the list which may be enabled. Returning ENOENT for SBI extension registers which cannot be enabled makes them consistent with ISA extension registers. Any SBI extensions which were enabled by default are still enabled by default, if they can be enabled at all. Signed-off-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Anup Patel <anup@brainfault.org> Signed-off-by: Anup Patel <anup@brainfault.org>	2023-12-29 12:31:44 +05:30

1 2 3 4 5 ...

3257 Commits