IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Pull documentation updates from Jonathan Corbet:
"Documentation work keeps chugging along; this includes:
- Work from Carlos Bilbao to integrate rustdoc output into the
generated HTML documentation. This took some work to figure out how
to do it without slowing the docs build and without creating people
who don't have Rust installed, but Carlos got there
- Move the loongarch and mips architecture documentation under
Documentation/arch/
- Some more maintainer documentation from Jakub
... plus the usual assortment of updates, translations, and fixes"
* tag 'docs-6.6' of git://git.lwn.net/linux: (56 commits)
Docu: genericirq.rst: fix irq-example
input: docs: pxrc: remove reference to phoenix-sim
Documentation: serial-console: Fix literal block marker
docs/mm: remove references to hmm_mirror ops and clean typos
docs/zh_CN: correct regi_chg(),regi_add() to region_chg(),region_add()
Documentation: Fix typos
Documentation/ABI: Fix typos
scripts: kernel-doc: fix macro handling in enums
scripts: kernel-doc: parse DEFINE_DMA_UNMAP_[ADDR|LEN]
Documentation: riscv: Update boot image header since EFI stub is supported
Documentation: riscv: Add early boot document
Documentation: arm: Add bootargs to the table of added DT parameters
docs: kernel-parameters: Refer to the correct bitmap function
doc: update params of memhp_default_state=
docs: Add book to process/kernel-docs.rst
docs: sparse: fix invalid link addresses
docs: vfs: clean up after the iterate() removal
docs: Add a section on surveys to the researcher guidelines
docs: move mips under arch
docs: move loongarch under arch
...
On x86, batched and deferred tlb shootdown has lead to 90% performance
increase on tlb shootdown. on arm64, HW can do tlb shootdown without
software IPI. But sync tlbi is still quite expensive.
Even running a simplest program which requires swapout can
prove this is true,
#include <sys/types.h>
#include <unistd.h>
#include <sys/mman.h>
#include <string.h>
int main()
{
#define SIZE (1 * 1024 * 1024)
volatile unsigned char *p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS, -1, 0);
memset(p, 0x88, SIZE);
for (int k = 0; k < 10000; k++) {
/* swap in */
for (int i = 0; i < SIZE; i += 4096) {
(void)p[i];
}
/* swap out */
madvise(p, SIZE, MADV_PAGEOUT);
}
}
Perf result on snapdragon 888 with 8 cores by using zRAM
as the swap block device.
~ # perf record taskset -c 4 ./a.out
[ perf record: Woken up 10 times to write data ]
[ perf record: Captured and wrote 2.297 MB perf.data (60084 samples) ]
~ # perf report
# To display the perf.data header info, please use --header/--header-only options.
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 60K of event 'cycles'
# Event count (approx.): 35706225414
#
# Overhead Command Shared Object Symbol
# ........ ....... ................. ......
#
21.07% a.out [kernel.kallsyms] [k] _raw_spin_unlock_irq
8.23% a.out [kernel.kallsyms] [k] _raw_spin_unlock_irqrestore
6.67% a.out [kernel.kallsyms] [k] filemap_map_pages
6.16% a.out [kernel.kallsyms] [k] __zram_bvec_write
5.36% a.out [kernel.kallsyms] [k] ptep_clear_flush
3.71% a.out [kernel.kallsyms] [k] _raw_spin_lock
3.49% a.out [kernel.kallsyms] [k] memset64
1.63% a.out [kernel.kallsyms] [k] clear_page
1.42% a.out [kernel.kallsyms] [k] _raw_spin_unlock
1.26% a.out [kernel.kallsyms] [k] mod_zone_state.llvm.8525150236079521930
1.23% a.out [kernel.kallsyms] [k] xas_load
1.15% a.out [kernel.kallsyms] [k] zram_slot_lock
ptep_clear_flush() takes 5.36% CPU in the micro-benchmark swapping in/out
a page mapped by only one process. If the page is mapped by multiple
processes, typically, like more than 100 on a phone, the overhead would be
much higher as we have to run tlb flush 100 times for one single page.
Plus, tlb flush overhead will increase with the number of CPU cores due to
the bad scalability of tlb shootdown in HW, so those ARM64 servers should
expect much higher overhead.
Further perf annonate shows 95% cpu time of ptep_clear_flush is actually
used by the final dsb() to wait for the completion of tlb flush. This
provides us a very good chance to leverage the existing batched tlb in
kernel. The minimum modification is that we only send async tlbi in the
first stage and we send dsb while we have to sync in the second stage.
With the above simplest micro benchmark, collapsed time to finish the
program decreases around 5%.
Typical collapsed time w/o patch:
~ # time taskset -c 4 ./a.out
0.21user 14.34system 0:14.69elapsed
w/ patch:
~ # time taskset -c 4 ./a.out
0.22user 13.45system 0:13.80elapsed
Also tested with benchmark in the commit on Kunpeng920 arm64 server
and observed an improvement around 12.5% with command
`time ./swap_bench`.
w/o w/
real 0m13.460s 0m11.771s
user 0m0.248s 0m0.279s
sys 0m12.039s 0m11.458s
Originally it's noticed a 16.99% overhead of ptep_clear_flush()
which has been eliminated by this patch:
[root@localhost yang]# perf record -- ./swap_bench && perf report
[...]
16.99% swap_bench [kernel.kallsyms] [k] ptep_clear_flush
It is tested on 4,8,128 CPU platforms and shows to be beneficial on
large systems but may not have improvement on small systems like on
a 4 CPU platform.
Also this patch improve the performance of page migration. Using pmbench
and tries to migrate the pages of pmbench between node 0 and node 1 for
100 times for 1G memory, this patch decrease the time used around 20%
(prev 18.338318910 sec after 13.981866350 sec) and saved the time used
by ptep_clear_flush().
Link: https://lkml.kernel.org/r/20230717131004.12662-5-yangyicong@huawei.com
Tested-by: Yicong Yang <yangyicong@hisilicon.com>
Tested-by: Xin Hao <xhao@linux.alibaba.com>
Tested-by: Punit Agrawal <punit.agrawal@bytedance.com>
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Xin Hao <xhao@linux.alibaba.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Nadav Amit <namit@vmware.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Barry Song <baohua@kernel.org>
Cc: Darren Hart <darren@os.amperecomputing.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: lipeifeng <lipeifeng@oppo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zeng Tao <prime.zeng@hisilicon.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Pull RISC-V updates from Palmer Dabbelt:
- Support for the T-Head PMU via the perf subsystem
- ftrace support for rv32
- Support for non-volatile memory devices
- Various fixes and cleanups
* tag 'riscv-for-linus-6.2-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (52 commits)
Documentation: RISC-V: patch-acceptance: s/implementor/implementer
Documentation: RISC-V: Mention the UEFI Standards
Documentation: RISC-V: Allow patches for non-standard behavior
Documentation: RISC-V: Fix a typo in patch-acceptance
riscv: Fixup compile error with !MMU
riscv: Fix P4D_SHIFT definition for 3-level page table mode
riscv: Apply a static assert to riscv_isa_ext_id
RISC-V: Add some comments about the shadow and overflow stacks
RISC-V: Align the shadow stack
RISC-V: Ensure Zicbom has a valid block size
RISC-V: Introduce riscv_isa_extension_check
RISC-V: Improve use of isa2hwcap[]
riscv: Don't duplicate _ALTERNATIVE_CFG* macros
riscv: alternatives: Drop the underscores from the assembly macro names
riscv: alternatives: Don't name unused macro parameters
riscv: Don't duplicate __ALTERNATIVE_CFG in __ALTERNATIVE_CFG_2
riscv: mm: call best_map_size many times during linear-mapping
riscv: Move cast inside kernel_mapping_[pv]a_to_[vp]a
riscv: Fix crash during early errata patching
riscv: boot: add zstd support
...
* 'remove-h8300' of git://git.infradead.org/users/hch/misc:
remove the h8300 architecture
This is clearly the least actively maintained architecture we have at
the moment, and probably the least useful. It is now the only one that
does not support MMUs at all, and most of the boards only support 4MB
of RAM, out of which the defconfig kernel needs more than half just
for .text/.data.
Guenter Roeck did the original patch to remove the architecture in 2013
after it had already been obsolete for a while, and Yoshinori Sato brought
it back in a much more modern form in 2015. Looking at the git history
since the reinstantiation, it's clear that almost all commits in the tree
are build fixes or cross-architecture cleanups:
$ git log --no-merges --format=%an v4.5.. arch/h8300/ | sort | uniq
-c | sort -rn | head -n 12
25 Masahiro Yamada
18 Christoph Hellwig
14 Mike Rapoport
9 Arnd Bergmann
8 Mark Rutland
7 Peter Zijlstra
6 Kees Cook
6 Ingo Molnar
6 Al Viro
5 Randy Dunlap
4 Yury Norov
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
The nds32 architecture, also known as AndeStar V3, is a custom 32-bit
RISC target designed by Andes Technologies. Support was added to the
kernel in 2016 as the replacement RISC-V based V5 processors were
already announced, and maintained by (current or former) Andes
employees.
As explained by Alan Kao, new customers are now all using RISC-V,
and all known nds32 users are already on longterm stable kernels
provided by Andes, with no development work going into mainline
support any more.
While the port is still in a reasonably good shape, it only gets
worse over time without active maintainers, so it seems best
to remove it before it becomes unusable. As always, if it turns
out that there are mainline users after all, and they volunteer
to maintain the port in the future, the removal can be reverted.
Link: https://lore.kernel.org/linux-mm/YhdWNLUhk+x9RAzU@yamatobi.andestech.com/
Link: https://lore.kernel.org/lkml/20220302065213.82702-1-alankao@andestech.com/
Link: https://www.andestech.com/en/products-solutions/andestar-architecture/
Signed-off-by: Alan Kao <alankao@andestech.com>
[arnd: rewrite changelog to provide more background]
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Pull more RISC-V updates from Palmer Dabbelt:
- A pair of defconfig additions, for NVMe and the EFI filesystem
localization options.
- A larger address space for stack randomization.
- A cleanup to our install rules.
- A DTS update for the Microchip Icicle board, to fix the serial
console.
- Support for build-time table sorting, which allows us to have
__ex_table read-only.
* tag 'riscv-for-linus-5.15-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
riscv: Move EXCEPTION_TABLE to RO_DATA segment
riscv: Enable BUILDTIME_TABLE_SORT
riscv: dts: microchip: mpfs-icicle: Fix serial console
riscv: move the (z)install rules to arch/riscv/Makefile
riscv: Improve stack randomisation on RV64
riscv: defconfig: enable NLS_CODEPAGE_437, NLS_ISO8859_1
riscv: defconfig: enable BLK_DEV_NVME
This enlarges the bits availiable for stack randomisation on RV64 from
the default of 8MiB to 1GiB, to match arm64 and x86.
Also, update the documentation to reflect our support for stack
randomisation.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
[Palmer: commit text]
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
BATCHED_UNMAP_TLB_FLUSH is used on x86 to do batched tlb shootdown by
sending one IPI to TLB flush all entries after unmapping pages rather
than sending an IPI to flush each individual entry.
On arm64, tlb shootdown is done by hardware. Flush instructions are
innershareable. The local flushes are limited to the boot (1 per CPU)
and when a task is getting a new ASID.
So marking this feature as "TODO" is not proper. ".." isn't good as
well. So this patch adds a "N/A" for this kind of features which are
not needed on some architectures.
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Acked-by: Will Deacon <will@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210223003230.11976-1-song.bao.hua@hisilicon.com
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
With our current support for the new MIO PCI instructions, write
combining/write back MMIO memory can be obtained via the pci_iomap_wc()
and pci_iomap_wc_range() functions.
This is achieved by using the write back address for a specific bar
as provided in clp_store_query_pci_fn()
These functions are however not widely used and instead drivers often
rely on ioremap_wc() and ioremap_prot(), which on other platforms enable
write combining using a PTE flag set through the pgrprot value.
While we do not have a write combining flag in the low order flag bits
of the PTE like x86_64 does, with MIO support, there is a write back bit
in the physical address (bit 1 on z15) and thus also the PTE.
Which bit is used to toggle write back and whether it is available at
all, is however not fixed in the architecture. Instead we get this
information from the CLP Store Logical Processor Characteristics for PCI
command. When the write back bit is not provided we fall back to the
existing behavior.
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
The unicore32 port do not seem maintained for a long time now, there is no
upstream toolchain that can create unicore32 binaries and all the links to
prebuilt toolchains for unicore32 are dead. Even compilers that were
available are not supported by the kernel anymore.
Guenter Roeck says:
I have stopped building unicore32 images since v4.19 since there is no
available compiler that is still supported by the kernel. I am surprised
that support for it has not been removed from the kernel.
Remove unicore32 port.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Risc-V architecture has actually supported pte_special since its merge
upstream, simply add this info to the documentation.
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Allows the users of ptrace to access memory mapped by the ptraced process
using the same cache coherency attributes as the original process.
For example while using gdb with ioremap_prot() incorporated, both gdb and
the process being traced will have same cache coherency attributes.
Signed-off-by: Hassan Naveed <hnaveed@wavecomp.com>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Patchwork: https://patchwork.linux-mips.org/patch/20955/
Cc: <linux-mips@linux-mips.org>
The removal of this file appears to have been premature; it's not a feature
enabled by Kconfig, but it's a arch-level feature regardless. Put it back
for now until some happy future time when we decide how we really want to
document such features.
This reverts commit 2bef69a385.
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
A number of architecture ports are obsolete and getting dropped,
so we no longer want to track the respective features.
We already removed the lines for metag and mn10300, this does
the same edits for all the others.
For the remaining 21 architectures, this shows how many are known
to implement each given feature:
19 time/modern-timekeeping/arch-support.txt
19 time/clockevents/arch-support.txt
15 core/tracehook/arch-support.txt
14 core/generic-idle-thread/arch-support.txt
13 locking/lockdep/arch-support.txt
12 io/dma-api-debug/arch-support.txt
11 debug/kgdb/arch-support.txt
10 time/virt-cpuacct/arch-support.txt
9 debug/kretprobes/arch-support.txt
9 debug/kprobes/arch-support.txt
8 vm/THP/arch-support.txt
8 vm/pte_special/arch-support.txt
8 vm/numa-memblock/arch-support.txt
8 io/sg-chain/arch-support.txt
7 perf/kprobes-event/arch-support.txt
7 locking/rwsem-optimized/arch-support.txt
7 debug/gcov-profile-all/arch-support.txt
7 core/jump-labels/arch-support.txt
7 core/BPF-JIT/arch-support.txt
6 vm/ELF-ASLR/arch-support.txt
6 time/context-tracking/arch-support.txt
6 seccomp/seccomp-filter/arch-support.txt
6 debug/stackprotector/arch-support.txt
5 time/irq-time-acct/arch-support.txt
5 io/dma-contiguous/arch-support.txt
5 debug/uprobes/arch-support.txt
4 vm/ioremap_prot/arch-support.txt
4 time/arch-tick-broadcast/arch-support.txt
4 perf/perf-stackdump/arch-support.txt
4 perf/perf-regs/arch-support.txt
3 debug/KASAN/arch-support.txt
2 vm/PG_uncached/arch-support.txt
2 vm/huge-vmap/arch-support.txt
2 sched/numa-balancing/arch-support.txt
2 sched/membarrier-sync-core/arch-support.txt
2 locking/cmpxchg-local/arch-support.txt
2 debug/optprobes/arch-support.txt
2 debug/kprobes-on-ftrace/arch-support.txt
1 vm/TLB/arch-support.txt
1 locking/queued-spinlocks/arch-support.txt
1 locking/queued-rwlocks/arch-support.txt
1 debug/user-ret-profiler/arch-support.txt
0 lib/strncasecmp/arch-support.txt
Note that the list does not include riscv or nds32 yet, these still
need to be added.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
This wires up the existing generic huge-vmap feature, which allows
ioremap() to use PMD or PUD sized block mappings. It also adds support
to the unmap path for dealing with block mappings, which will allow us
to unmap the __init region using unmap_kernel_range() in a subsequent
patch.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Pull metag updates from James Hogan:
"Metag architecture changes for v4.3.
Just a couple of changes for v4.3-rc1. A preparatory IRQ patch to
prepare for moving irq_data struct members, and a tweak to
Documentation/features since Meta2 could support THP"
* tag 'metag-for-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag:
Documentation/features/vm: Meta2 is capable of THP
metag/irq: Use access helper irq_data_get_affinity_mask()
Change metag Transparent Huge Pages (THP) support from .. to TODO. Meta2
has variable sized pages, between 4KB and 4MB, specified at the 1st
level page table level, and already supports hugetlbfs, so supporting
THP is theoretically possible too.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-metag@vger.kernel.org
Cc: linux-doc@vger.kernel.org