linux

iv/linux

Author	SHA1	Message	Date
Christophe Leroy	e89aa642be	powerpc/ftrace: Use PPC_RAW_xxx() macros instead of opencoding. PPC_RAW_xxx() macros are self explanatory and less error prone than open coding. Use them in ftrace.c Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9292094c9a69cef6d29ee83f435a557b59c45065.1652074503.git.christophe.leroy@csgroup.eu	2022-05-19 23:11:29 +10:00
Christophe Leroy	cf9df92a82	powerpc/ftrace: Use BRANCH_SET_LINK instead of value 1 To make it explicit, use BRANCH_SET_LINK instead of value 1 when calling create_branch(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/d57847063ac93660a5af620d4df1847f10edf61a.1652074503.git.christophe.leroy@csgroup.eu	2022-05-19 23:11:29 +10:00
Christophe Leroy	ccf6607e45	powerpc/ftrace: Remove ftrace_plt_tramps[] ftrace_plt_tramps table is never filled so it is useless. Remove it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/daeeb618a6619e3a7e3f82f1bd83ca7c25af6330.1652074503.git.christophe.leroy@csgroup.eu	2022-05-19 23:11:29 +10:00
Christophe Leroy	c2cba93d1a	powerpc/ftrace: Use CONFIG_FUNCTION_TRACER instead of CONFIG_DYNAMIC_FTRACE Since commit 0c0c52306f47 ("powerpc: Only support DYNAMIC_FTRACE not static"), CONFIG_DYNAMIC_FTRACE is always selected when CONFIG_FUNCTION_TRACER is selected. To avoid confusion and have the reader wonder what's happen when CONFIG_FUNCTION_TRACER is selected and CONFIG_DYNAMIC_FTRACE is not, use CONFIG_FUNCTION_TRACER in ifdefs instead of CONFIG_DYNAMIC_FTRACE. As CONFIG_FUNCTION_GRAPH_TRACER depends on CONFIG_FUNCTION_TRACER, ftrace.o doesn't need to appear for both symbols in Makefile. Then as ftrace.o is built only when CONFIG_FUNCTION_TRACER is selected ifdef CONFIG_FUNCTION_TRACER is not needed in ftrace.c, and since it implies CONFIG_DYNAMIC_FTRACE, CONFIG_DYNAMIC_FTRACE is not needed in ftrace.c Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/628d357503eb90b4a034f99b7df516caaff4d279.1652074503.git.christophe.leroy@csgroup.eu	2022-05-19 23:11:29 +10:00
Christophe Leroy	a3d0f5b4b7	powerpc/ftrace: Don't include ftrace.o for CONFIG_FTRACE_SYSCALLS Since commit 7bea7ac0ca01 ("powerpc/syscalls: Fix syscall tracing") ftrace.o is not needed anymore for CONFIG_FTRACE_SYSCALLS. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/275932a5d61543b825ff9a64f61abed6da5d4a2a.1652074503.git.christophe.leroy@csgroup.eu	2022-05-19 23:11:29 +10:00
Christophe Leroy	23b44fc248	powerpc/ftrace: Make __ftrace_make_{nop/call}() common to PPC32 and PPC64 Since c93d4f6ecf4b ("powerpc/ftrace: Add module_trampoline_target() for PPC32"), __ftrace_make_nop() for PPC32 is very similar to the one for PPC64. Same for __ftrace_make_call(). Make them common. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/96f53c237316dab4b1b8c682685266faa92da816.1652074503.git.christophe.leroy@csgroup.eu	2022-05-19 23:11:29 +10:00
Christophe Leroy	5b89492c03	powerpc: Finalise cleanup around ABI use Now that we have CONFIG_PPC64_ELF_ABI_V1 and CONFIG_PPC64_ELF_ABI_V2, get rid of all indirect detection of ABI version. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/709d9d69523c14c8a9fba4486395dca0f2d675b1.1652074503.git.christophe.leroy@csgroup.eu	2022-05-19 23:11:29 +10:00
Christophe Leroy	7d40aff821	powerpc: Replace PPC64_ELF_ABI_v{1/2} by CONFIG_PPC64_ELF_ABI_V{1/2} Replace all uses of PPC64_ELF_ABI_v1 and PPC64_ELF_ABI_v2 by resp CONFIG_PPC64_ELF_ABI_V1 and CONFIG_PPC64_ELF_ABI_V2. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/ba13d59e8c50bc9aa6328f1c7f0c0d0278e0a3a7.1652074503.git.christophe.leroy@csgroup.eu	2022-05-19 23:11:29 +10:00
Christophe Leroy	661aa88039	powerpc: Add CONFIG_PPC64_ELF_ABI_V1 and CONFIG_PPC64_ELF_ABI_V2 At the time being, we use CONFIG_CPU_LITTLE_ENDIAN and CONFIG_CPU_BIG_ENDIAN to pass -mabi=elfv1 or elfv2 to compiler, then define a PPC64_ELF_ABI_v1 or PPC64_ELF_ABI_v2 macro in asm/types.h based on _CALL_ELF define set by the compiler. Make it more straight forward with a CONFIG option that is directly usable. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1eca1addbc550167da9841c7340a010d0c4b2200.1652074503.git.christophe.leroy@csgroup.eu	2022-05-19 23:11:28 +10:00
Christophe Leroy	bbffdd2fc7	powerpc/ftrace: Use patch_instruction() return directly Instead of returning -EPERM when patch_instruction() fails, just return what patch_instruction returns. That simplifies ftrace_modify_code(): 0: 94 21 ff c0 stwu r1,-64(r1) 4: 93 e1 00 3c stw r31,60(r1) 8: 7c 7f 1b 79 mr. r31,r3 c: 40 80 00 30 bge 3c <ftrace_modify_code+0x3c> 10: 93 c1 00 38 stw r30,56(r1) 14: 7c 9e 23 78 mr r30,r4 18: 7c a4 2b 78 mr r4,r5 1c: 80 bf 00 00 lwz r5,0(r31) 20: 7c 1e 28 40 cmplw r30,r5 24: 40 82 00 34 bne 58 <ftrace_modify_code+0x58> 28: 83 c1 00 38 lwz r30,56(r1) 2c: 7f e3 fb 78 mr r3,r31 30: 83 e1 00 3c lwz r31,60(r1) 34: 38 21 00 40 addi r1,r1,64 38: 48 00 00 00 b 38 <ftrace_modify_code+0x38> 38: R_PPC_REL24 patch_instruction Before: 0: 94 21 ff c0 stwu r1,-64(r1) 4: 93 e1 00 3c stw r31,60(r1) 8: 7c 7f 1b 79 mr. r31,r3 c: 40 80 00 4c bge 58 <ftrace_modify_code+0x58> 10: 93 c1 00 38 stw r30,56(r1) 14: 7c 9e 23 78 mr r30,r4 18: 7c a4 2b 78 mr r4,r5 1c: 80 bf 00 00 lwz r5,0(r31) 20: 7c 08 02 a6 mflr r0 24: 90 01 00 44 stw r0,68(r1) 28: 7c 1e 28 40 cmplw r30,r5 2c: 40 82 00 48 bne 74 <ftrace_modify_code+0x74> 30: 7f e3 fb 78 mr r3,r31 34: 48 00 00 01 bl 34 <ftrace_modify_code+0x34> 34: R_PPC_REL24 patch_instruction 38: 80 01 00 44 lwz r0,68(r1) 3c: 20 63 00 00 subfic r3,r3,0 40: 83 c1 00 38 lwz r30,56(r1) 44: 7c 63 19 10 subfe r3,r3,r3 48: 7c 08 03 a6 mtlr r0 4c: 83 e1 00 3c lwz r31,60(r1) 50: 38 21 00 40 addi r1,r1,64 54: 4e 80 00 20 blr It improves ftrace activation/deactivation duration by about 3%. Modify patch_instruction() return on failure to -EPERM in order to match with ftrace expectations. Other users of patch_instruction() do not care about the exact error value returned. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/49a8597230713e2633e7d9d7b56140787c4a7e20.1652074503.git.christophe.leroy@csgroup.eu	2022-05-19 23:11:28 +10:00
Christophe Leroy	2c920fca8c	powerpc/ftrace: Inline ftrace_modify_code() Inlining ftrace_modify_code(), it increases a bit the size of ftrace code but brings 5% improvment on ftrace activation. Usually in C files we let gcc decide what to do but here it really help to 'help' gcc to decide to inline, thought we don't want to force it with an __always_inline that would be too much for CONFIG_CC_OPTIMIZE_FOR_SIZE. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1597a06d57cfc80e6853838c4066e799bf6c7977.1652074503.git.christophe.leroy@csgroup.eu	2022-05-19 23:11:28 +10:00
Christophe Leroy	d2f47dabf1	powerpc/code-patching: Inline create_branch() create_branch() is a good candidate for inlining because: - Flags can be folded in. - Range tests are likely to be already done. Hence reducing the create_branch() to only a set of instructions. So inline it. It improves ftrace activation by 10%. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/69851cc9a7bf8f03d025e6d29e165f2d0bd3bb6e.1652074503.git.christophe.leroy@csgroup.eu	2022-05-19 23:11:28 +10:00
Christophe Leroy	a1facd2578	powerpc/ftrace: Use is_offset_in_branch_range() Use is_offset_in_branch_range() instead of create_branch() to check if a target is within branch range. This patch together with the previous one improves ftrace activation time by 7% Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/912ae51782f5a53c44e435497c8c3fb5cc632387.1652074503.git.christophe.leroy@csgroup.eu	2022-05-19 23:11:28 +10:00
Christophe Leroy	1acbf27e8a	powerpc/code-patching: Inline is_offset_in_{cond}_branch_range() Test in is_offset_in_branch_range() and is_offset_in_cond_branch_range() are simple tests that are worth inlining. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/a05be0ccb7373e6a9789a1988fcd0c810f5f9269.1652074503.git.christophe.leroy@csgroup.eu	2022-05-19 23:11:28 +10:00
Christophe Leroy	ae3a2a2188	powerpc/ftrace: Remove redundant create_branch() calls Since commit d5937db114e4 ("powerpc/code-patching: Fix patch_branch() return on out-of-range failure") patch_branch() fails with -ERANGE when trying to branch out of range. No need to perform the test twice. Remove redundant create_branch() calls. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/aa45fbad0b4b7493080835d8276c0cb4ce146503.1652074503.git.christophe.leroy@csgroup.eu	2022-05-19 23:11:28 +10:00
Christophe Leroy	d996d5053e	powerpc/ftrace: Refactor prepare_ftrace_return() When we have CONFIG_DYNAMIC_FTRACE_WITH_ARGS, prepare_ftrace_return() is called by ftrace_graph_func() otherwise prepare_ftrace_return() is called from assembly. Refactor prepare_ftrace_return() into a static __prepare_ftrace_return() that will be called by both prepare_ftrace_return() and ftrace_graph_func(). It will allow GCC to fold __prepare_ftrace_return() inside ftrace_graph_func(). Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/0d42deafe353980c66cf19d3132638c05ba9f4a9.1652074503.git.christophe.leroy@csgroup.eu	2022-05-19 23:11:28 +10:00
Nicholas Piggin	804c0a166f	powerpc/rtas: enture rtas_call is called with MMU enabled rtas_call must not be called with the MMU disabled because in case of rtas error, log_error is called which requires MMU enabled. Add a test and warning for this. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220308135047.478297-14-npiggin@gmail.com	2022-05-19 23:11:27 +10:00
Nicholas Piggin	014b2e896c	powerpc/rtas: Leave MSR[RI] enabled over RTAS call PAPR specifies that RTAS may be called with MSR[RI] enabled if the calling context is recoverable, and RTAS will manage RI as necessary. Call the rtas entry point with RI enabled, and add a check to ensure the caller has RI enabled. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220308135047.478297-10-npiggin@gmail.com	2022-05-19 23:11:27 +10:00
Nicholas Piggin	5c86bd02b3	powerpc/rtas: PACA can be restored directly from SPRG On 64-bit, PACA is saved in a SPRG so it does not need to be saved on stack. We also don't need to mask off the top bits for real mode addresses because the architecture does this for us. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220308135047.478297-8-npiggin@gmail.com	2022-05-19 23:11:27 +10:00
Nicholas Piggin	c5a65e0a42	powerpc/rtas: Call enter_rtas with MSR[EE] disabled Disable MSR[EE] in C code rather than asm. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220308135047.478297-5-npiggin@gmail.com	2022-05-19 23:11:27 +10:00
Nicholas Piggin	4e949faae2	powerpc/rtas: Fix whitespace in rtas_entry.S The code was moved verbatim including whitespace cruft. Fix that. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220308135047.478297-4-npiggin@gmail.com	2022-05-19 23:11:27 +10:00
Nicholas Piggin	07940b4b61	powerpc/rtas: Make enter_rtas a nokprobe symbol on 64-bit This symbol is marked nokprobe on 32-bit but not 64-bit, add it. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220308135047.478297-3-npiggin@gmail.com	2022-05-19 23:11:27 +10:00
Nicholas Piggin	838ee286ec	powerpc/rtas: Move rtas entry assembly into its own file This makes working on the code a bit easier. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220308135047.478297-2-npiggin@gmail.com	2022-05-19 23:11:27 +10:00
Nicholas Piggin	2896b2dff4	powerpc/signal: Report minimum signal frame size to userspace via AT_MINSIGSTKSZ Implement the AT_MINSIGSTKSZ AUXV entry, allowing userspace to dynamically size stack allocations in a manner forward-compatible with new processor state saved in the signal frame For now these statically find the maximum signal frame size rather than doing any runtime testing of features to minimise the size. glibc 2.34 will take advantage of this, as will applications that use use _SC_MINSIGSTKSZ and _SC_SIGSTKSZ. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> References: 94b07c1f8c39 ("arm64: signal: Report signal frame size to userspace via auxv") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220307182734.289289-2-npiggin@gmail.com	2022-05-19 23:11:26 +10:00
Nicholas Piggin	2f82ec1975	powerpc/64: Bump SIGSTKSZ and MINSIGSTKSZ The sad tale of SIGSTKSZ and MINSIGSTKSZ is documented in glibc.git commit f7c399cff5bd ("PowerPC SIGSTKSZ"), which explains why glibc does not use the kernel defines for these constants. Since then in fact there has been a further expansion of the signal stack frame size on little-endian with linux commit 573ebfa6601f ("powerpc: Increase stack redzone for 64-bit userspace to 512 bytes"), which has caused it to exceed even the glibc defines. See kernel commit 63dee5df43a3 ("powerpc: Allow 4224 bytes of stack expansion for the signal frame") for more details of the history of the expansion. Increase MINSIGSTKSZ to 8192 which is double the current glibc value and fits the current stack frame with room to grow. SIGSTKSZ is set to 4x the minimum as convention. glibc will have to be updated as well. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220307182734.289289-1-npiggin@gmail.com	2022-05-19 23:11:26 +10:00
Nathan Chancellor	4406b12214	powerpc/vdso: Link with ld.lld when requested The PowerPC vDSO uses $(CC) to link, which differs from the rest of the kernel, which uses $(LD) directly. As a result, the default linker of the compiler is used, which may differ from the linker requested by the builder. For example: $ make ARCH=powerpc LLVM=1 mrproper defconfig arch/powerpc/kernel/vdso/ ... $ llvm-readelf -p .comment arch/powerpc/kernel/vdso/vdso{32,64}.so.dbg File: arch/powerpc/kernel/vdso/vdso32.so.dbg String dump of section '.comment': [ 0] clang version 14.0.0 (Fedora 14.0.0-1.fc37) File: arch/powerpc/kernel/vdso/vdso64.so.dbg String dump of section '.comment': [ 0] clang version 14.0.0 (Fedora 14.0.0-1.fc37) LLVM=1 sets LD=ld.lld but ld.lld is not used to link the vDSO; GNU ld is because "ld" is the default linker for clang on most Linux platforms. This is a problem for Clang's Link Time Optimization as implemented in the kernel because use of GNU ld with LTO requires the LLVMgold plugin, which is not technically supported for ld.bfd per https://llvm.org/docs/GoldPlugin.html. Furthermore, if LLVMgold.so is missing from a user's system, the build will fail, even though LTO as it is implemented in the kernel requires ld.lld to avoid this dependency in the first place. Ultimately, the PowerPC vDSO should be converted to compiling and linking with $(CC) and $(LD) respectively but there were issues last time this was tried, potentially due to older but supported tool versions. To avoid regressing GCC + binutils, use the compiler option '-fuse-ld', which tells the compiler which linker to use when it is invoked as both the compiler and linker. Use '-fuse-ld=lld' when LD=ld.lld has been specified (CONFIG_LD_IS_LLD) so that the vDSO is linked with the same linker as the rest of the kernel. $ llvm-readelf -p .comment arch/powerpc/kernel/vdso/vdso{32,64}.so.dbg File: arch/powerpc/kernel/vdso/vdso32.so.dbg String dump of section '.comment': [ 0] Linker: LLD 14.0.0 [ 14] clang version 14.0.0 (Fedora 14.0.0-1.fc37) File: arch/powerpc/kernel/vdso/vdso64.so.dbg String dump of section '.comment': [ 0] Linker: LLD 14.0.0 [ 14] clang version 14.0.0 (Fedora 14.0.0-1.fc37) LD can be a full path to ld.lld, which will not be handled properly by '-fuse-ld=lld' if the full path to ld.lld is outside of the compiler's search path. '-fuse-ld' can take a path to the linker but it is deprecated in clang 12.0.0; '--ld-path' is preferred for this scenario. Use '--ld-path' if it is supported, as it will handle a full path or just 'ld.lld' properly. See the LLVM commit below for the full details of '--ld-path'. Signed-off-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://github.com/ClangBuiltLinux/linux/issues/774 Link: `1bc5c84710` Link: https://lore.kernel.org/r/20220511185001.3269404-3-nathan@kernel.org	2022-05-19 23:11:26 +10:00
Nathan Chancellor	e247172854	powerpc/vdso: Remove unused ENTRY in linker scripts When linking vdso{32,64}.so.dbg with ld.lld, there is a warning about not finding _start for the starting address: ld.lld: warning: cannot find entry symbol _start; not setting start address ld.lld: warning: cannot find entry symbol _start; not setting start address Looking at GCC + GNU ld, the entry point address is 0x0: $ llvm-readelf -h vdso{32,64}.so.dbg &\| rg "(File\|Entry point address):" File: vdso32.so.dbg Entry point address: 0x0 File: vdso64.so.dbg Entry point address: 0x0 This matches what ld.lld emits: $ powerpc64le-linux-gnu-readelf -p .comment vdso{32,64}.so.dbg File: vdso32.so.dbg String dump of section '.comment': [ 0] Linker: LLD 14.0.0 [ 14] clang version 14.0.0 (Fedora 14.0.0-1.fc37) File: vdso64.so.dbg String dump of section '.comment': [ 0] Linker: LLD 14.0.0 [ 14] clang version 14.0.0 (Fedora 14.0.0-1.fc37) $ llvm-readelf -h vdso{32,64}.so.dbg &\| rg "(File\|Entry point address):" File: vdso32.so.dbg Entry point address: 0x0 File: vdso64.so.dbg Entry point address: 0x0 Remove ENTRY to remove the warning, as it is unnecessary for the vDSO to function correctly. Signed-off-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220511185001.3269404-2-nathan@kernel.org	2022-05-19 23:11:26 +10:00
Kevin Hao	d9e5c3e9e7	powerpc: Export mmu_feature_keys[] as non-GPL When the mmu_feature_keys[] was introduced in the commit c12e6f24d413 ("powerpc: Add option to use jump label for mmu_has_feature()"), it is unlikely that it would be used either directly or indirectly in the out of tree modules. So we exported it as GPL only. But with the evolution of the codes, especially the PPC_KUAP support, it may be indirectly referenced by some primitive macro or inline functions such as get_user() or __copy_from_user_inatomic(), this will make it impossible to build many non GPL modules (such as ZFS) on ppc architecture. Fix this by exposing the mmu_feature_keys[] to the non-GPL modules too. Fixes: 7613f5a66bec ("powerpc/64s/kuap: Use mmu_has_feature()") Reported-by: Nathaniel Filardo <nwfilardo@gmail.com> Signed-off-by: Kevin Hao <haokexin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220329085709.4132729-1-haokexin@gmail.com	2022-05-19 23:11:26 +10:00
Guilherme G. Piccoli	e2aa34ce80	powerpc/setup: Refactor/untangle panic notifiers The panic notifiers infrastructure is a bit limited in the scope of the callbacks - basically every kind of functionality is dropped in a list that runs in the same point during the kernel panic path. This is not really on par with the complexities and particularities of architecture / hypervisors' needs, and a refactor is ongoing. As part of this refactor, it was observed that powerpc has 2 notifiers, with mixed goals: one is just a KASLR offset dumper, whereas the other aims to hard-disable IRQs (necessary on panic path), warn firmware of the panic event (fadump) and run low-level platform-specific machinery that might stop kernel execution and never come back. Clearly, the 2nd notifier has opposed goals: disable IRQs / fadump should run earlier while low-level platform actions should run late since it might not even return. Hence, this patch decouples the notifiers splitting them in three: - First one is responsible for hard-disable IRQs and fadump, should run early; - The kernel KASLR offset dumper is really an informative notifier, harmless and may run at any moment in the panic path; - The last notifier should run last, since it aims to perform low-level actions for specific platforms, and might never return. It is also only registered for 2 platforms, pseries and ps3. The patch better documents the notifiers and clears the code too, also removing a useless header. Currently no functionality change should be observed, but after the planned panic refactor we should expect more panic reliability with this patch. Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Reviewed-by: Hari Bathini <hbathini@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220427224924.592546-9-gpiccoli@igalia.com	2022-05-19 23:11:26 +10:00
Michael Ellerman	b104e41cda	Merge branch 'topic/ppc-kvm' into next Merge our KVM topic branch.	2022-05-19 23:10:42 +10:00
Fabiano Rosas	ad55bae7dc	KVM: PPC: Book3S HV: Fix vcore_blocked tracepoint We removed most of the vcore logic from the P9 path but there's still a tracepoint that tried to dereference vc->runner. Fixes: ecb6a7207f92 ("KVM: PPC: Book3S HV P9: Remove most of the vcore logic") Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220328215831.320409-1-farosas@linux.ibm.com	2022-05-19 00:44:50 +10:00
Alexey Kardashevskiy	b22af90419	KVM: PPC: Book3s: Remove real mode interrupt controller hcalls handlers Currently we have 2 sets of interrupt controller hypercalls handlers for real and virtual modes, this is from POWER8 times when switching MMU on was considered an expensive operation. POWER9 however does not have dependent threads and MMU is enabled for handling hcalls so the XIVE native or XICS-on-XIVE real mode handlers never execute on real P9 and later CPUs. This untemplate the handlers and only keeps the real mode handlers for XICS native (up to POWER8) and remove the rest of dead code. Changes in functions are mechanical except few missing empty lines to make checkpatch.pl happy. The default implemented hcalls list already contains XICS hcalls so no change there. This should not cause any behavioral change. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Acked-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220509071150.181250-1-aik@ozlabs.ru	2022-05-19 00:44:28 +10:00
Alexey Kardashevskiy	29592181c5	KVM: PPC: Book3s: PR: Enable default TCE hypercalls When KVM_CAP_PPC_ENABLE_HCALL was introduced, H_GET_TCE and H_PUT_TCE were already implemented and enabled by default; however H_GET_TCE was missed out on PR KVM (probably because the handler was in the real mode code at the time). This enables H_GET_TCE by default. While at this, this wraps the checks in ifdef CONFIG_SPAPR_TCE_IOMMU just like HV KVM. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220506073737.3823347-1-aik@ozlabs.ru	2022-05-19 00:44:15 +10:00
Alexey Kardashevskiy	cad32d9d42	KVM: PPC: Book3s: Retire H_PUT_TCE/etc real mode handlers LoPAPR defines guest visible IOMMU with hypercalls to use it - H_PUT_TCE/etc. Implemented first on POWER7 where hypercalls would trap in the KVM in the real mode (with MMU off). The problem with the real mode is some memory is not available and some API usage crashed the host but enabling MMU was an expensive operation. The problems with the real mode handlers are: 1. Occasionally these cannot complete the request so the code is copied+modified to work in the virtual mode, very little is shared; 2. The real mode handlers have to be linked into vmlinux to work; 3. An exception in real mode immediately reboots the machine. If the small DMA window is used, the real mode handlers bring better performance. However since POWER8, there has always been a bigger DMA window which VMs use to map the entire VM memory to avoid calling H_PUT_TCE. Such 1:1 mapping happens once and uses H_PUT_TCE_INDIRECT (a bulk version of H_PUT_TCE) which virtual mode handler is even closer to its real mode version. On POWER9 hypercalls trap straight to the virtual mode so the real mode handlers never execute on POWER9 and later CPUs. So with the current use of the DMA windows and MMU improvements in POWER9 and later, there is no point in duplicating the code. The 32bit passed through devices may slow down but we do not have many of these in practice. For example, with this applied, a 1Gbit ethernet adapter still demostrates above 800Mbit/s of actual throughput. This removes the real mode handlers from KVM and related code from the powernv platform. This updates the list of implemented hcalls in KVM-HV as the realmode handlers are removed. This changes ABI - kvmppc_h_get_tce() moves to the KVM module and kvmppc_find_table() is static now. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220506053755.3820702-1-aik@ozlabs.ru	2022-05-19 00:44:01 +10:00
Michael Ellerman	750137ec6c	Merge branch 'fixes' into topic/ppc-kvm Merge our fixes branch. In parciular this brings in the KVM TCE handling fix, which is a prerequisite for a subsequent patch.	2022-05-19 00:43:04 +10:00
Fabiano Rosas	1d1cd0f12a	KVM: PPC: Book3S HV: Initialize AMOR in nested entry The hypervisor always sets AMOR to ~0, but let's ensure we're not passing stale values around. Signed-off-by: Fabiano Rosas <farosas@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220425142151.1495142-1-farosas@linux.ibm.com	2022-05-19 00:28:49 +10:00
Michael Ellerman	a5fc286f69	Merge branch 'fixes' into next Merge our fixes branch from this cycle. In particular this brings in a papr_scm.c change which a subsequent patch has a dependency on.	2022-05-19 00:11:51 +10:00
Bo Liu	15eb1b6afc	KVM: PPC: Book3S HV: Use consistent type for return value of kvm_age_rmapp() The return value type defined in the function kvm_age_rmapp() is "bool", but the return value type defined in the implementation of the function kvm_age_rmapp() is "int". Change the return value type to "bool". Signed-off-by: Bo Liu <liubo03@inspur.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220401065252.36472-1-liubo03@inspur.com	2022-05-18 23:34:39 +10:00
Xiaomeng Tong	300981abdd	KVM: PPC: Book3S HV: fix incorrect NULL check on list iterator The bug is here: if (!p) return ret; The list iterator value 'p' will always be set and non-NULL by list_for_each_entry(), so it is incorrect to assume that the iterator value will be NULL if the list is empty or no element is found. To fix the bug, Use a new value 'iter' as the list iterator, while use the old value 'p' as a dedicated variable to point to the found element. Fixes: dfaa973ae960 ("KVM: PPC: Book3S HV: In H_SVM_INIT_DONE, migrate remaining normal-GFNs to secure-GFNs") Cc: stable@vger.kernel.org # v5.9+ Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220414062103.8153-1-xiam0nd.tong@gmail.com	2022-05-18 23:31:35 +10:00
Bagas Sanjaya	d53c36e6c8	KVM: PPC: Book3S HV: remove extraneous asterisk from rm_host_ipi_action() comment kernel test robot reported kernel-doc warning for rm_host_ipi_action(): arch/powerpc/kvm/book3s_hv_rm_xics.c:887: warning: This comment starts with '/*', but isn't a kernel-doc comment. Host Operations poked by RM KVM Since the function is static, remove the extraneous (second) asterisk at the head of function comment. Fixes: 0c2a66062470cd ("KVM: PPC: Book3S HV: Host side kick VCPU when poked by real-mode KVM") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/linux-doc/202204252334.Cd2IsiII-lkp@intel.com/ Link: https://lore.kernel.org/r/20220506070747.16309-1-bagasdotme@gmail.com	2022-05-18 23:27:24 +10:00
Greg Kroah-Hartman	d6da35e0c6	Merge 5.18-rc7 into usb-next We need the tty fixes in here as well, as we need to revert one of them :( Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2022-05-16 15:39:23 +02:00
Linus Torvalds	bc403203d6	powerpc fixes for 5.18 #5 - Fix KVM PR on 32-bit, which was broken by some MMU code refactoring. Thanks to: Alexander Graf, Matt Evans. -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmKA5psTHG1wZUBlbGxl cm1hbi5pZC5hdQAKCRBR6+o8yOGlgHnjD/4/YU1y7XU/h0RjYwiKfVlVbIFXjFkB xMgijdI87SCSU94O8h/7WqhVEwfS8dM+jBa9F4Ky6nOqLMnAE1B/zPzE1P9KAvLl kilLQ9Wl/VhR+PqD62UofevhwsPA+neLkKKAfWMjMvRp4wHvAk18ILNAxhq4D3hP 7KyK4sQntBVwnObWuopJoEeKuoDFa/0VRr/VC29CKowWRzIgwo/OVBBzAxrYUT0g HjhNO4zPyJ6mxanO24cWokXekxWKU07NvIeU4dP6nOreUJvfxfpmEbhsxrC8YLo1 Fy7MWGQLwvq8D7UHeWPh+OOhIuPQDIDseXaW3m4mkLmMqxy5Y+D2aCyto/Rl8B97 yPqWmcYjdLoac0CS3l+CQUcCTc5YPZ7zDkPkrKdvutyrPqgTkTgRFWrOIlLJtV7f H++hVj0RMiFyjPPAUWNtByKwdwq+lAN1OqrtrksWhX6aSZJIgcR0pUQM1YFqDQU1 IOErcs2JjrwxYOGDIgB6zE/UNClDaoS5J0p4utgtnRyQB1VHnUqv8+3wP3+7ATYG I7smX7ce/RRtLvJTXIgNc406ibj/TOW9icneBZ9m1+2WWcBb4JpUMiBBwyhCsn7x xHzJLof4LLKxCdjalBb8w7mQ4yO5i9IHrOEgb/TD3P5MXjBhWZOljky+j7NUXzgb 9IFdAGOPnXwcFw== =HVUs -----END PGP SIGNATURE----- Merge tag 'powerpc-5.18-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fix from Michael Ellerman: - Fix KVM PR on 32-bit, which was broken by some MMU code refactoring. Thanks to: Alexander Graf, and Matt Evans. * tag 'powerpc-5.18-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: KVM: PPC: Book3S PR: Enable MSR_DR for switch_mmu_context()	2022-05-15 06:46:03 -07:00
Baolin Wang	ae07562909	mm: change huge_ptep_clear_flush() to return the original pte Patch series "Fix CONT-PTE/PMD size hugetlb issue when unmapping or migrating", v4. presently, migrating a hugetlb page or unmapping a poisoned hugetlb page, we'll use ptep_clear_flush() and set_pte_at() to nuke the page table entry and remap it, and this is incorrect for CONT-PTE or CONT-PMD size hugetlb page, which will cause potential data consistent issue. This patch set will change to use hugetlb related APIs to fix this issue. Note: Mike pointed out the huge_ptep_get() will only return the one specific value, and it would not take into account the dirty or young bits of CONT-PTE/PMDs like the huge_ptep_get_and_clear() [1]. This inconsistent issue is not introduced by this patch set, and this issue will be addressed in another thread [2]. Meanwhile the uffd for hugetlb case [3] pointed out by Gerald also needs another patch to address. [1] https://lore.kernel.org/linux-mm/85bd80b4-b4fd-0d3f-a2e5-149559f2f387@oracle.com/ [2] https://lore.kernel.org/all/cover.1651998586.git.baolin.wang@linux.alibaba.com/ [3] https://lore.kernel.org/linux-mm/20220503120343.6264e126@thinkpad/ This patch (of 3): It is incorrect to use ptep_clear_flush() to nuke a hugetlb page table when unmapping or migrating a hugetlb page, and will change to use huge_ptep_clear_flush() instead in the following patches. So this is a preparation patch, which changes the huge_ptep_clear_flush() to return the original pte to help to nuke a hugetlb page table. [baolin.wang@linux.alibaba.com: fix build in several more architectures] Link: https://lkml.kernel.org/r/0009a4cd-2826-e8be-e671-f050d4f18d5d@linux.alibaba.com [sfr@canb.auug.org.au: fixup] Link: https://lkml.kernel.org/r/20220511181531.7f27a5c1@canb.auug.org.au Link: https://lkml.kernel.org/r/cover.1652270205.git.baolin.wang@linux.alibaba.com Link: https://lkml.kernel.org/r/20f77ddab90baa249bd24504c413189b82acde69.1652270205.git.baolin.wang@linux.alibaba.com Link: https://lkml.kernel.org/r/cover.1652147571.git.baolin.wang@linux.alibaba.com Link: https://lkml.kernel.org/r/dcf065868cce35bceaf138613ad27f17bb7c0c19.1652147571.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Acked-by: Mike Kravetz <mike.kravetz@oracle.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Helge Deller <deller@gmx.de> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Yoshinori Sato <ysato@users.osdn.me> Cc: Rich Felker <dalias@libc.org> Cc: David S. Miller <davem@davemloft.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2022-05-13 16:48:55 -07:00
Jason A. Donenfeld	4088358321	powerpc: define get_cycles macro for arch-override PowerPC defines a get_cycles() function, but it does not do the usual `#define get_cycles get_cycles` dance, making it impossible for generic code to see if an arch-specific function was defined. While the get_cycles() ifdef is not currently used, the following timekeeping patch in this series will depend on the macro existing (or not existing) when defining random_get_entropy(). Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@ozlabs.org> Cc: Paul Mackerras <paulus@samba.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>	2022-05-13 23:59:23 +02:00
Nicholas Piggin	2852ebfa10	KVM: PPC: Book3S HV Nested: L2 LPCR should inherit L1 LPES setting The L1 should not be able to adjust LPES mode for the L2. Setting LPES if the L0 needs it clear would cause external interrupts to be sent to L2 and missed by the L0. Clearing LPES when it may be set, as typically happens with XIVE enabled could cause a performance issue despite having no native XIVE support in the guest, because it will cause mediated interrupts for the L2 to be taken in HV mode, which then have to be injected. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220303053315.1056880-7-npiggin@gmail.com	2022-05-13 21:34:33 +10:00
Nicholas Piggin	11681b79b1	KVM: PPC: Book3S HV Nested: L2 must not run with L1 xive context The PowerNV L0 currently pushes the OS xive context when running a vCPU, regardless of whether it is running a nested guest. The problem is that xive OS ring interrupts will be delivered while the L2 is running. At the moment, by default, the L2 guest runs with LPCR[LPES]=0, which actually makes external interrupts go to the L0. That causes the L2 to exit and the interrupt taken or injected into the L1, so in some respects this behaves like an escalation. It's not clear if this was deliberate or not, there's no comment about it and the L1 is actually allowed to clear LPES in the L2, so it's confusing at best. When the L2 is running, the L1 is essentially in a ceded state with respect to external interrupts (it can't respond to them directly and won't get scheduled again absent some additional event). So the natural way to solve this is when the L0 handles a H_ENTER_NESTED hypercall to run the L2, have it arm the escalation interrupt and don't push the L1 context while running the L2. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220303053315.1056880-6-npiggin@gmail.com	2022-05-13 21:34:33 +10:00
Nicholas Piggin	42b4a2b347	KVM: PPC: Book3S HV P9: Split !nested case out from guest entry The differences between nested and !nested will become larger in later changes so split them out for readability. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220303053315.1056880-5-npiggin@gmail.com	2022-05-13 21:34:33 +10:00
Nicholas Piggin	ad5ace91c5	KVM: PPC: Book3S HV P9: Move cede logic out of XIVE escalation rearming Move the cede abort logic out of xive escalation rearming and into the caller to prepare for handling a similar case with nested guest entry. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220303053315.1056880-4-npiggin@gmail.com	2022-05-13 21:34:33 +10:00
Nicholas Piggin	026728dc5d	KVM: PPC: Book3S HV P9: Inject pending xive interrupts at guest entry If there is a pending xive interrupt, inject it at guest entry (if MSR[EE] is enabled) rather than take another interrupt when the guest is entered. If xive is enabled then LPCR[LPES] is set so this behaviour should be expected. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220303053315.1056880-3-npiggin@gmail.com	2022-05-13 21:34:33 +10:00
Nicholas Piggin	f104df7d51	KVM: PPC: Book3S HV: Remove KVMPPC_NR_LPIDS KVMPPC_NR_LPIDS no longer represents any size restriction on the LPID space and can be removed. A CPU with more than 12 LPID bits implemented will now be able to create more than 4095 guests. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220123120043.3586018-7-npiggin@gmail.com	2022-05-13 21:33:34 +10:00

1 2 3 4 5 ...

25146 Commits