IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
guoren@kernel.org <guoren@kernel.org> says:
From: Guo Ren <guoren@linux.alibaba.com>
When the task is in COMPAT mode, the TASK_SIZE should be 2GB, so
STACK_TOP_MAX and arch_get_mmap_end must be limited to 2 GB. This series
fixes the problem made by commit: add2cc6b6515 ("RISC-V: mm: Restrict
address space for sv39,sv48,sv57") and optimizes the related coding
convention of TASK_SIZE.
* b4-shazam-merge:
riscv: mm: Fixup compat arch_get_mmap_end
riscv: mm: Fixup compat mode boot failure
Link: https://lore.kernel.org/r/20231222115703.2404036-1-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
When the task is in COMPAT mode, the arch_get_mmap_end should be 2GB,
not TASK_SIZE_64. The TASK_SIZE has contained is_compat_mode()
detection, so change the definition of STACK_TOP_MAX to TASK_SIZE
directly.
Cc: stable@vger.kernel.org
Fixes: add2cc6b6515 ("RISC-V: mm: Restrict address space for sv39,sv48,sv57")
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20231222115703.2404036-3-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
In COMPAT mode, the STACK_TOP is DEFAULT_MAP_WINDOW (0x80000000), but
the TASK_SIZE is 0x7fff000. When the user stack is upon 0x7fff000, it
will cause a user segment fault. Sometimes, it would cause boot
failure when the whole rootfs is rv32.
Freeing unused kernel image (initmem) memory: 2236K
Run /sbin/init as init process
Starting init: /sbin/init exists but couldn't execute it (error -14)
Run /etc/init as init process
...
Increase the TASK_SIZE to cover STACK_TOP.
Cc: stable@vger.kernel.org
Fixes: add2cc6b6515 ("RISC-V: mm: Restrict address space for sv39,sv48,sv57")
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20231222115703.2404036-2-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
The ending NULL is not taken into account by strncat(), so switch to
strlcat() to correctly compute the size of the available memory when
appending CONFIG_CMDLINE to 'early_cmdline'.
Fixes: 26e7aacb83df ("riscv: Allow to downgrade paging mode from the command line")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/9f66d2b58c8052d4055e90b8477ee55d9a0914f9.1698564026.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Christoph Muellner <christoph.muellner@vrull.eu> says:
From: Christoph Müllner <christoph.muellner@vrull.eu>
When building the RISC-V selftests with a riscv32 compiler I ran into
a couple of compiler warnings. While riscv32 support for these tests is
questionable, the fixes are so trivial that it is probably best to simply
apply them.
Note that the missing-include patch and some format string warnings
are also relevant for riscv64.
* b4-shazam-merge:
tools: selftests: riscv: Fix compile warnings in mm tests
tools: selftests: riscv: Fix compile warnings in vector tests
tools: selftests: riscv: Add missing include for vector test
tools: selftests: riscv: Fix compile warnings in cbo
tools: selftests: riscv: Fix compile warnings in hwprobe
Link: https://lore.kernel.org/r/20231123185821.2272504-1-christoph.muellner@vrull.eu
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
When building the mm tests with a riscv32 compiler, we see a range
of shift-count-overflow errors from shifting 1UL by more than 32 bits
in do_mmaps(). Since, the relevant code is only called from code that
is gated by `__riscv_xlen == 64`, we can just apply the same gating
to do_mmaps().
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231123185821.2272504-6-christoph.muellner@vrull.eu
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
GCC prints a couple of format string warnings when compiling
the vector tests. Let's follow the recommendation in
Documentation/printk-formats.txt to fix these warnings.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231123185821.2272504-5-christoph.muellner@vrull.eu
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
GCC raises the following warning:
warning: 'status' may be used uninitialized
The warning comes from the fact, that the signature of waitpid() is
unknown and therefore the initialization of GCC cannot be guessed.
Let's add the relevant header to address this warning.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231123185821.2272504-4-christoph.muellner@vrull.eu
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
GCC prints a couple of format string warnings when compiling
the cbo test. Let's follow the recommendation in
Documentation/printk-formats.txt to fix these warnings.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231123185821.2272504-3-christoph.muellner@vrull.eu
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
GCC prints a couple of format string warnings when compiling
the hwprobe test. Let's follow the recommendation in
Documentation/printk-formats.txt to fix these warnings.
Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231123185821.2272504-2-christoph.muellner@vrull.eu
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Allow to defer the flushing of the TLB when unmapping pages, which allows
to reduce the numbers of IPI and the number of sfence.vma.
The ubenchmarch used in commit 43b3dfdd0455 ("arm64: support
batched/deferred tlb shootdown during page reclamation/migration") that
was multithreaded to force the usage of IPI shows good performance
improvement on all platforms:
* Unmatched: ~34%
* TH1520 : ~78%
* Qemu : ~81%
In addition, perf on qemu reports an important decrease in time spent
dealing with IPIs:
Before: 68.17% main [kernel.kallsyms] [k] __sbi_rfence_v02_call
After : 8.64% main [kernel.kallsyms] [k] __sbi_rfence_v02_call
* Benchmark:
int stick_this_thread_to_core(int core_id) {
int num_cores = sysconf(_SC_NPROCESSORS_ONLN);
if (core_id < 0 || core_id >= num_cores)
return EINVAL;
cpu_set_t cpuset;
CPU_ZERO(&cpuset);
CPU_SET(core_id, &cpuset);
pthread_t current_thread = pthread_self();
return pthread_setaffinity_np(current_thread,
sizeof(cpu_set_t), &cpuset);
}
static void *fn_thread (void *p_data)
{
int ret;
pthread_t thread;
stick_this_thread_to_core((int)p_data);
while (1) {
sleep(1);
}
return NULL;
}
int main()
{
volatile unsigned char *p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_ANONYMOUS, -1, 0);
pthread_t threads[4];
int ret;
for (int i = 0; i < 4; ++i) {
ret = pthread_create(&threads[i], NULL, fn_thread, (void *)i);
if (ret)
{
printf("%s", strerror (ret));
}
}
memset(p, 0x88, SIZE);
for (int k = 0; k < 10000; k++) {
/* swap in */
for (int i = 0; i < SIZE; i += 4096) {
(void)p[i];
}
/* swap out */
madvise(p, SIZE, MADV_PAGEOUT);
}
for (int i = 0; i < 4; i++)
{
pthread_cancel(threads[i]);
}
for (int i = 0; i < 4; i++)
{
pthread_join(threads[i], NULL);
}
return 0;
}
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
Tested-by: Jisheng Zhang <jszhang@kernel.org> # Tested on TH1520
Tested-by: Nam Cao <namcao@linutronix.de>
Link: https://lore.kernel.org/r/20240108193640.344929-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
This will allow better TLB utilization and then should be more performant.
Before:
---[ vmemmap start ]---
0xffff8d8002000000-0xffff8d8012000000 0x000000046ec00000 256M PTE . .. .. D A G . . W R V
---[ vmemmap end ]---
After:
---[ vmemmap start ]---
0xffff8d8002000000-0xffff8d8012000000 0x000000046ec00000 256M PMD . .. .. D A G . . W R V
---[ vmemmap end ]---
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20231214132935.212864-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Following the examples of cbom-block-size and cboz-block-size,
cbop-block-size is the cache size of Zicbop (cbo.prefetch) operations.
The most common case is to have all cache block sizes to be the same
size (e.g. profiles such as rva22u64 mandates a 64 bytes size for all
cache operations), but there's no specification requirement for that,
and an implementation can have different cache sizes for each operation.
Cc: Rob Herring <robh@kernel.org>
Cc: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231029123500.739409-1-dbarboza@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Jisheng Zhang <jszhang@kernel.org> says:
Previously, we use alternative mechanism to dynamically patch
the CMO operations for THEAD C906/C910 during boot for performance
reason. But as pointed out by Arnd, "there is already a significant
cost in accessing the invalidated cache lines afterwards, which is
likely going to be much higher than the cost of an indirect branch".
And indeed, there's no performance difference with GMAC and EMMC per
my test on Sipeed Lichee Pi 4A board.
Use riscv_nonstd_cache_ops for THEAD C906/C910 CMO to simplify
the alternative code, and to acchieve Arnd's goal -- "I think
moving the THEAD ops at the same level as all nonstandard operations
makes sense, but I'd still leave CMO as an explicit fast path that
avoids the indirect branch. This seems like the right thing to do both
for readability and for platforms on which the indirect branch has a
noticeable overhead."
To make bisect easy, I use two patches here: patch1 does the conversion
which just mimics current CMO behavior via. riscv_nonstd_cache_ops, I
assume no functionalities changes. patch2 uses T-HEAD PA based CMO
instructions so that we don't need to covert PA to VA.
* b4-shazam-merge:
riscv: errata: thead: use pa based instructions for CMO
riscv: errata: thead: use riscv_nonstd_cache_ops for CMO
Link: https://lore.kernel.org/r/20231114143338.2406-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
There are some extensions that contain numbers, such as Zve32f, which
are enabled by the "max" cpu type in QEMU.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20231208-uncolored-oxidant-5ab37dd3ab84@spud
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
The current description implies that only a single address translation
mode is available to the operating system. However, some implementations
support multiple address translation modes, and the operating system is
free to choose between them.
Per the RISC-V privileged specification, Sv48 implementations must also
implement Sv39, and likewise Sv57 implies support for Sv48. This means
it is possible to describe all supported address translation modes using
a single value, by naming the largest supported mode. This appears to
have been the intended usage of the property, so note it explicitly.
Fixes: 4fd669a8c487 ("dt-bindings: riscv: convert cpu binding to json-schema")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20231227175739.1453782-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Anup Patel <apatel@ventanamicro.com> says:
The SBI v2.0 specification is now frozen. The SBI v2.0 specification defines
SBI debug console (DBCN) extension which replaces the legacy SBI v0.1
functions sbi_console_putchar() and sbi_console_getchar().
(Refer v2.0-rc5 at https://github.com/riscv-non-isa/riscv-sbi-doc/releases)
This series adds support for SBI debug console (DBCN) extension in
Linux RISC-V.
To try these patches with KVM RISC-V, use KVMTOOL from the
riscv_zbx_zicntr_smstateen_condops_v1 branch at:
https://github.com/avpatel/kvmtool.git
* b4-shazam-merge:
RISC-V: Enable SBI based earlycon support
tty: Add SBI debug console support to HVC SBI driver
tty/serial: Add RISC-V SBI debug console based earlycon
RISC-V: Add SBI debug console helper routines
RISC-V: Add stubs for sbi_console_putchar/getchar()
Link: https://lore.kernel.org/r/20231124070905.1043092-1-apatel@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
When the SUSP SBI extension is present it implies that the standard
"suspend to RAM" type is available. Wire it up to the generic
platform suspend support, also applying the already present support
for non-retentive CPU suspend. When the kernel is built with
CONFIG_SUSPEND, one can do 'echo mem > /sys/power/state' to suspend.
Resumption will occur when a platform-specific wake-up event arrives.
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Tested-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20231206110807.35882-4-ajones@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Charlie Jenkins <charlie@rivosinc.com> says:
When modules are loaded while there is not ample allocatable memory,
there was previously not proper error handling. This series fixes a
use-after-free error and a different issue that caused a non graceful
exit after memory was not properly allocated.
* b4-shazam-merge:
riscv: Fix relocation_hashtable size
riscv: Correctly free relocation hashtable on error
riscv: Fix module loading free order
Link: https://lore.kernel.org/r/20240104-module_loading_fix-v3-0-a71f8de6ce0f@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Jisheng Zhang <jszhang@kernel.org> says:
Some riscv implementations such as T-HEAD's C906, C908, C910 and C920
support efficient unaligned access, for performance reason we want
to enable HAVE_EFFICIENT_UNALIGNED_ACCESS on these platforms. To
avoid performance regressions on non efficient unaligned access
platforms, HAVE_EFFICIENT_UNALIGNED_ACCESS can't be globally selected.
To solve this problem, runtime code patching based on the detected
speed is a good solution. But that's not easy, it involves lots of
work to modify vairous subsystems such as net, mm, lib and so on.
This can be done step by step.
So let's take an easier solution: add support to efficient unaligned
access and hide the support under NONPORTABLE.
patch1 introduces RISCV_EFFICIENT_UNALIGNED_ACCESS which depends on
NONPORTABLE, if users know during config time that the kernel will be
only run on those efficient unaligned access hw platforms, they can
enable it. Obviously, generic unified kernel Image shouldn't enable it.
patch2 adds support DCACHE_WORD_ACCESS when MMU and
RISCV_EFFICIENT_UNALIGNED_ACCESS.
Below test program and step shows how much performance can be improved:
$ cat tt.c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#define ITERATIONS 1000000
#define PATH "123456781234567812345678123456781"
int main(void)
{
unsigned long i;
struct stat buf;
for (i = 0; i < ITERATIONS; i++)
stat(PATH, &buf);
return 0;
}
$ gcc -O2 tt.c
$ touch 123456781234567812345678123456781
$ time ./a.out
Per my test on T-HEAD C910 platforms, the above test performance is
improved by about 7.5%.
* b4-shazam-merge:
riscv: select DCACHE_WORD_ACCESS for efficient unaligned access HW
riscv: introduce RISCV_EFFICIENT_UNALIGNED_ACCESS
Link: https://lore.kernel.org/r/20231225044207.3821-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
T-HEAD CPUs such as C906/C910/C920 support phy address based CMO, use
them so that we don't need to convert to virt address.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20231114143338.2406-3-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Previously, we use alternative mechanism to dynamically patch
the CMO operations for THEAD C906/C910 during boot for performance
reason. But as pointed out by Arnd, "there is already a significant
cost in accessing the invalidated cache lines afterwards, which is
likely going to be much higher than the cost of an indirect branch".
And indeed, there's no performance difference with GMAC and EMMC per
my test on Sipeed Lichee Pi 4A board.
Use riscv_nonstd_cache_ops for THEAD C906/C910 CMO to simplify
the alternative code, and to acchieve Arnd's goal -- "I think
moving the THEAD ops at the same level as all nonstandard operations
makes sense, but I'd still leave CMO as an explicit fast path that
avoids the indirect branch. This seems like the right thing to do both
for readability and for platforms on which the indirect branch has a
noticeable overhead."
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Tested-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20231114143338.2406-2-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Let us enable SBI based earlycon support in defconfig for both RV32
and RV64 so that "earlycon=sbi" can be used again.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231124070905.1043092-6-apatel@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
We extend the existing RISC-V SBI earlycon support to use the new
RISC-V SBI debug console extension.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20231124070905.1043092-4-apatel@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Let us provide SBI debug console helper routines which can be
shared by serial/earlycon-riscv-sbi.c and hvc/hvc_riscv_sbi.c.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231124070905.1043092-3-apatel@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
The functions sbi_console_putchar() and sbi_console_getchar() are
not defined when CONFIG_RISCV_SBI_V01 is disabled so let us add
stub of these functions to avoid "#ifdef" on user side.
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20231124070905.1043092-2-apatel@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
When there is not enough allocatable memory for the relocation
hashtable, module loading should exit gracefully. Previously, this was
attempted to be accomplished by checking if an unsigned number is less
than zero which does not work. Instead have the caller check if the
hashtable was correctly allocated and add a comment explaining that
hashtable_bits that is 0 is valid.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Fixes: d8792a5734b0 ("riscv: Safely remove entries from relocation list")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/r/202312132019.iYGTwW0L-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Julia Lawall <julia.lawall@inria.fr>
Closes: https://lore.kernel.org/r/202312120044.wTI1Uyaa-lkp@intel.com/
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Link: https://lore.kernel.org/r/20240104-module_loading_fix-v3-2-a71f8de6ce0f@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
DCACHE_WORD_ACCESS uses the word-at-a-time API for optimised string
comparisons in the vfs layer.
This patch implements support for load_unaligned_zeropad in much the
same way as has been done for arm64.
Here is the test program and step:
$ cat tt.c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>
#define ITERATIONS 1000000
#define PATH "123456781234567812345678123456781"
int main(void)
{
unsigned long i;
struct stat buf;
for (i = 0; i < ITERATIONS; i++)
stat(PATH, &buf);
return 0;
}
$ gcc -O2 tt.c
$ touch 123456781234567812345678123456781
$ time ./a.out
Per my test on T-HEAD C910 platforms, the above test performance is
improved by about 7.5%.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Link: https://lore.kernel.org/r/20231225044207.3821-3-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Some riscv implementations such as T-HEAD's C906, C908, C910 and C920
support efficient unaligned access, for performance reason we want
to enable HAVE_EFFICIENT_UNALIGNED_ACCESS on these platforms. To
avoid performance regressions on other non efficient unaligned access
platforms, HAVE_EFFICIENT_UNALIGNED_ACCESS can't be globally selected.
To solve this problem, runtime code patching based on the detected
speed is a good solution. But that's not easy, it involves lots of
work to modify vairous subsystems such as net, mm, lib and so on.
This can be done step by step.
So let's take an easier solution: add support to efficient unaligned
access and hide the support under NONPORTABLE.
Now let's introduce RISCV_EFFICIENT_UNALIGNED_ACCESS which depends on
NONPORTABLE, if users know during config time that the kernel will be
only run on those efficient unaligned access hw platforms, they can
enable it. Obviously, generic unified kernel Image shouldn't enable it.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20231225044207.3821-2-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Clément Léger <cleger@rivosinc.com> says:
This series add support for a few more extensions that are present in
the RVA22U64/RVA23U64 (either mandatory or optional) and that are useful
for userspace:
- Zicond
- Zacas
- Ztso
Series currently based on riscv/for-next.
* b4-shazam-lts:
riscv: hwprobe: export Zicond extension
riscv: hwprobe: export Zacas ISA extension
riscv: add ISA extension parsing for Zacas
dt-bindings: riscv: add Zacas ISA extension description
riscv: hwprobe: export Ztso ISA extension
riscv: add ISA extension parsing for Ztso
Link: https://lore.kernel.org/r/20231220155723.684081-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Add description for the Zacas ISA extension which was ratified recently.
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20231220155723.684081-4-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Add support to parse the Ztso string in the riscv,isa string. The
bindings already supports it but not the ISA parsing code.
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Link: https://lore.kernel.org/r/20231220155723.684081-2-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
asm-generic/export.h is a wrapper for linux/export.h, with explicit request
to use linux/export.h directly.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Link: https://lore.kernel.org/r/20231214191922.GQ1674809@ZenIV
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Frederik Haxel <haxel@fzi.de> says:
XIP boot seems to be broken for some time now. A likely reason why no one
seems to have noticed this is that XIP is more difficult to test, as it is
currently not easily testable with QEMU.
These patches fix the XIP boot and allow an XIP build without BUILTIN_DTB,
which in turn makes it easier to test an image with the QEMU virt machine.
* b4-shazam-merge:
riscv: Allow disabling of BUILTIN_DTB for XIP
riscv: Fixed wrong register in XIP_FIXUP_FLASH_OFFSET macro
riscv: Make XIP bootable again
Link: https://lore.kernel.org/r/20231212130116.848530-1-haxel@fzi.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
I don't usually merge these in, but I missed sending a PR due to the
holidays.
* palmer/fixes:
riscv: Fix set_direct_map_default_noflush() to reset _PAGE_EXEC
riscv: Fix module_alloc() that did not reset the linear mapping permissions
riscv: Fix wrong usage of lm_alias() when splitting a huge linear mapping
riscv: Check if the code to patch lies in the exit section
riscv: errata: andes: Probe for IOCP only once in boot stage
riscv: Fix SMP when shadow call stacks are enabled
dt-bindings: perf: riscv,pmu: drop unneeded quotes
riscv: fix misaligned access handling of C.SWSP and C.SDSP
RISC-V: hwprobe: Always use u64 for extension bits
Support rv32 ULEB128 test
riscv: Correct type casting in module loading
riscv: Safely remove entries from relocation list
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
The save_v_state() is technically sending a __user pointer through
__put_user() and thus is generating a sparse warning so force the
value to be "void *" to fix:
arch/riscv/kernel/signal.c:94:16: warning: incorrect type in initializer (different address spaces)
arch/riscv/kernel/signal.c:94:16: expected void *__val
arch/riscv/kernel/signal.c:94:16: got void [noderef] __user *[assigned] datap
Fixes: 8ee0b41898fa26f66e32 ("riscv: signal: Add sigcontext save/restore for vector")
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Link: https://lore.kernel.org/r/20231123142708.261733-1-ben.dooks@codethink.co.uk
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
The instruction reading code can read from either user or kernel addresses
and thus the use of __user on pointers to instructions depends on which
context. Fix a few sparse warnings by using __user for user-accesses and
remove it when not.
Fixes:
arch/riscv/kernel/traps_misaligned.c:361:21: warning: dereference of noderef expression
arch/riscv/kernel/traps_misaligned.c:373:21: warning: dereference of noderef expression
arch/riscv/kernel/traps_misaligned.c:381:21: warning: dereference of noderef expression
arch/riscv/kernel/traps_misaligned.c:322:24: warning: incorrect type in initializer (different address spaces)
arch/riscv/kernel/traps_misaligned.c:322:24: expected unsigned char const [noderef] __user *__gu_ptr
arch/riscv/kernel/traps_misaligned.c:322:24: got unsigned char const [usertype] *addr
arch/riscv/kernel/traps_misaligned.c:361:21: warning: dereference of noderef expression
arch/riscv/kernel/traps_misaligned.c:373:21: warning: dereference of noderef expression
arch/riscv/kernel/traps_misaligned.c:381:21: warning: dereference of noderef expression
arch/riscv/kernel/traps_misaligned.c:332:24: warning: incorrect type in initializer (different address spaces)
arch/riscv/kernel/traps_misaligned.c:332:24: expected unsigned char [noderef] __user *__gu_ptr
arch/riscv/kernel/traps_misaligned.c:332:24: got unsigned char [usertype] *addr
Fixes: 7c83232161f60 ("riscv: add support for misaligned trap handling in S-mode")
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Link: https://lore.kernel.org/r/20231123141617.259591-1-ben.dooks@codethink.co.uk
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
As said in the help of ARCH_WANTS_NO_INSTR entry in arch/Kconfig:
"An architecture should select this if the noinstr macro is being used on
functions to denote that the toolchain should avoid instrumenting such
functions and is required for correctness."
Select ARCH_WANTS_NO_INSTR for correctness.
PS: The reason we didn't find any issue so far is that the
CC_HAS_NO_PROFILE_FN_ATTR is true.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Link: https://lore.kernel.org/r/20231123142223.1787-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Samuel Holland <samuel.holland@sifive.com> says:
This series cleans up some duplicated and dead code around the RISC-V
CPU operations, that was copied from arm64 but is not needed here. The
result is a bit of memory savings and removal of a few SBI calls during
boot, with no functional change.
* b4-shazam-merge:
riscv: Use the same CPU operations for all CPUs
riscv: Remove unused members from struct cpu_operations
riscv: Deduplicate code in setup_smp()
Link: https://lore.kernel.org/r/20231121234736.3489608-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
This file is not used since commit 72f045d19f25 ("riscv: Fixup
difference with defconfig"), where it was replaced by the
32-bit.config fragment. Delete the old file to avoid any confusion.
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20231121225320.3430550-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Andrew Jones <ajones@ventanamicro.com> says:
This series introduces a flag for the hwprobe syscall which effectively
reverses its behavior from getting the values of keys for a set of cpus
to getting the cpus for a set of key-value pairs.
* b4-shazam-merge:
RISC-V: selftests: Add which-cpus hwprobe test
RISC-V: hwprobe: Introduce which-cpus flag
RISC-V: Move the hwprobe syscall to its own file
RISC-V: hwprobe: Clarify cpus size parameter
Link: https://lore.kernel.org/r/20231122164700.127954-6-ajones@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
This enables, among other things, testing with the QEMU virt machine.
To build an XIP kernel for the QEMU virt machine, configure the
the kernel as desired and apply the following configuration
```
CONFIG_NONPORTABLE=y
CONFIG_XIP_KERNEL=y
CONFIG_XIP_PHYS_ADDR=0x20000000
CONFIG_PHYS_RAM_BASE=0x80200000
CONFIG_BUILTIN_DTB=n
```
Since the QEMU virt flash memory expects a 32 MB file, the built image
must be padded. For example, with
`truncate -s 32M arch/riscv/boot/xipImage`
The kernel can be started using the following command in QEMU (v8+)
```
qemu-system-riscv64 -M virt,pflash0=pflash0 \
-blockdev node-name=pflash0,driver=file,read-only=on,\
filename=arch/riscv/boot/xipImage <optional parameters>
```
Signed-off-by: Frederik Haxel <haxel@fzi.de>
Link: https://lore.kernel.org/r/20231212130116.848530-4-haxel@fzi.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>