linux

iv/linux

Author	SHA1	Message	Date
Vitaly Kuznetsov	4fc096a99e	KVM: Raise the maximum number of user memslots Current KVM_USER_MEM_SLOTS limits are arch specific (512 on Power, 509 on x86, 32 on s390, 16 on MIPS) but they don't really need to be. Memory slots are allocated dynamically in KVM when added so the only real limitation is 'id_to_index' array which is 'short'. We don't have any other KVM_MEM_SLOTS_NUM/KVM_USER_MEM_SLOTS-sized statically defined structures. Low KVM_USER_MEM_SLOTS can be a limiting factor for some configurations. In particular, when QEMU tries to start a Windows guest with Hyper-V SynIC enabled and e.g. 256 vCPUs the limit is hit as SynIC requires two pages per vCPU and the guest is free to pick any GFN for each of them, this fragments memslots as QEMU wants to have a separate memslot for each of these pages (which are supposed to act as 'overlay' pages). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210127175731.2020089-3-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-02-09 08:17:08 -05:00
Michael Ellerman	e7eb919057	powerpc/64s: Handle program checks in wrong endian during early boot There's a short window during boot where although the kernel is running little endian, any exceptions will cause the CPU to switch back to big endian. This situation persists until we call configure_exceptions(), which calls either the hypervisor or OPAL to configure the CPU so that exceptions will be taken in little endian (via HID0[HILE]). We don't intend to take exceptions during early boot, but one way we sometimes do is via a WARN/BUG etc. Those all boil down to a trap instruction, which will cause a program check exception. The first instruction of the program check handler is an mtsprg, which when executed in the wrong endian is an lhzu with a ~3GB displacement from r3. The content of r3 is random, so that becomes a load from some random location, and depending on the system (installed RAM etc.) can easily lead to a checkstop, or an infinitely recursive page fault. That prevents whatever the WARN/BUG was complaining about being printed to the console, and the user just sees a dead system. We can fix it by having a trampoline at the beginning of the program check handler that detects we are in the wrong endian, and flips us back to the correct endian. We can't flip MSR[LE] using mtmsr (alas), so we have to use rfid. That requires backing up SRR0/1 as well as a GPR. To do that we use SPRG0/2/3 (SPRG1 is already used for the paca). SPRG3 is user readable, but this trampoline is only active very early in boot, and SPRG3 will be reinitialised in vdso_getcpu_init() before userspace starts. With this trampoline in place we can survive a WARN early in boot and print a stack trace, which is eventually printed to the console once the console is up, eg: [83565.758545] kexec_core: Starting new kernel [ 0.000000] ------------[ cut here ]------------ [ 0.000000] static_key_enable_cpuslocked(): static key '0xc000000000ea6160' used before call to jump_label_init() [ 0.000000] WARNING: CPU: 0 PID: 0 at kernel/jump_label.c:166 static_key_enable_cpuslocked+0xfc/0x120 [ 0.000000] Modules linked in: [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0-gcc-8.2.0-dirty #618 [ 0.000000] NIP: c0000000002fd46c LR: c0000000002fd468 CTR: c000000000170660 [ 0.000000] REGS: c000000001227940 TRAP: 0700 Not tainted (5.10.0-gcc-8.2.0-dirty) [ 0.000000] MSR: 9000000002823003 <SF,HV,VEC,VSX,FP,ME,RI,LE> CR: 24882422 XER: 20040000 [ 0.000000] CFAR: 0000000000000730 IRQMASK: 1 [ 0.000000] GPR00: c0000000002fd468 c000000001227bd0 c000000001228300 0000000000000065 [ 0.000000] GPR04: 0000000000000001 0000000000000065 c0000000010cf970 000000000000000d [ 0.000000] GPR08: 0000000000000000 0000000000000000 0000000000000000 c00000000122763f [ 0.000000] GPR12: 0000000000002000 c000000000f8a980 0000000000000000 0000000000000000 [ 0.000000] GPR16: 0000000000000000 0000000000000000 c000000000f88c8e c000000000f88c9a [ 0.000000] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 0.000000] GPR24: 0000000000000000 c000000000dea3a8 0000000000000000 c000000000f35114 [ 0.000000] GPR28: 0000002800000000 c000000000f88c9a c000000000f88c8e c000000000ea6160 [ 0.000000] NIP [c0000000002fd46c] static_key_enable_cpuslocked+0xfc/0x120 [ 0.000000] LR [c0000000002fd468] static_key_enable_cpuslocked+0xf8/0x120 [ 0.000000] Call Trace: [ 0.000000] [c000000001227bd0] [c0000000002fd468] static_key_enable_cpuslocked+0xf8/0x120 (unreliable) [ 0.000000] [c000000001227c40] [c0000000002fd4c0] static_key_enable+0x30/0x50 [ 0.000000] [c000000001227c70] [c000000000f6629c] early_page_poison_param+0x58/0x9c [ 0.000000] [c000000001227cb0] [c000000000f351b8] do_early_param+0xa4/0x10c [ 0.000000] [c000000001227d30] [c00000000011e020] parse_args+0x270/0x5e0 [ 0.000000] [c000000001227e20] [c000000000f35864] parse_early_options+0x48/0x5c [ 0.000000] [c000000001227e40] [c000000000f358d0] parse_early_param+0x58/0x84 [ 0.000000] [c000000001227e70] [c000000000f3a368] early_init_devtree+0xc4/0x490 [ 0.000000] [c000000001227f10] [c000000000f3bca0] early_setup+0xc8/0x1c8 [ 0.000000] [c000000001227f90] [000000000000c320] 0xc320 [ 0.000000] Instruction dump: [ 0.000000] 4bfffddd 7c2004ac 39200001 913f0000 4bffffb8 7c651b78 3c82ffac 3c62ffc0 [ 0.000000] 38841b00 3863f310 4bdf03a5 60000000 <0fe00000> 4bffff38 60000000 60000000 [ 0.000000] random: get_random_bytes called from print_oops_end_marker+0x40/0x80 with crng_init=0 [ 0.000000] ---[ end trace 0000000000000000 ]--- [ 0.000000] dt-cpu-ftrs: setup for ISA 3000 Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210202130207.1303975-2-mpe@ellerman.id.au	2021-02-09 01:10:16 +11:00
Michael Ellerman	0ecf6a9e47	powerpc/64: Make stack tracing work during very early boot If we try to stack trace very early during boot, either due to a WARN/BUG or manual dump_stack(), we will oops in valid_emergency_stack() when we try to dereference the paca_ptrs array. The fix is simple, we just return false if paca_ptrs isn't allocated yet. The stack pointer definitely isn't part of any emergency stack because we haven't allocated any yet. Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210202130207.1303975-1-mpe@ellerman.id.au	2021-02-09 01:10:16 +11:00
Christopher M. Riedl	73287caa92	powerpc64/idle: Fix SP offsets when saving GPRs The idle entry/exit code saves/restores GPRs in the stack "red zone" (Protected Zone according to PowerPC64 ELF ABI v2). However, the offset used for the first GPR is incorrect and overwrites the back chain - the Protected Zone actually starts below the current SP. In practice this is probably not an issue, but it's still incorrect so fix it. Also expand the comments to explain why using the stack "red zone" instead of creating a new stackframe is appropriate here. Signed-off-by: Christopher M. Riedl <cmr@codefail.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210206072342.5067-1-cmr@codefail.de	2021-02-09 01:10:16 +11:00
Christophe Leroy	b842d131c7	powerpc/32s: Allow constant folding in mtsr()/mfsr() On the same way as we did in wrtee(), add an alternative using mtsr/mfsr instructions instead of mtsrin/mfsrin when the segment register can be determined at compile time. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/9baed0ff9d76723ec90f1b567ddd4ac1ecc7a190.1612612022.git.christophe.leroy@csgroup.eu	2021-02-09 01:10:15 +11:00
Christophe Leroy	179ae57dba	powerpc/32s: mfsrin()/mtsrin() become mfsr()/mtsr() Function names should tell what the function does, not how. mfsrin() and mtsrin() are read/writing segment registers. They are called that way because they are using mfsrin and mtsrin instructions, but it doesn't matter for the caller. In preparation of following patch, change their name to mfsr() and mtsr() in order to make it obvious they manipulate segment registers without messing up with how they do it. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/f92d99f4349391b77766745900231aa880a0efb5.1612612022.git.christophe.leroy@csgroup.eu	2021-02-09 01:10:15 +11:00
Christophe Leroy	fd659e8f2c	powerpc/32s: Change mfsrin() into a static inline function mfsrin() is a macro. Change in into an inline function to avoid conflicts in KVM and make it more evolutive. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/72c7b9879e2e2e6f5c27dadda6486386c2b50f23.1612612022.git.christophe.leroy@csgroup.eu	2021-02-09 01:10:15 +11:00
Christophe Leroy	8524e2e764	powerpc/uaccess: Perform barrier_nospec() in KUAP allowance helpers barrier_nospec() in uaccess helpers is there to protect against speculative accesses around access_ok(). When using user_access_begin() sequences together with unsafe_get_user() like macros, barrier_nospec() is called for every single read although we know the access_ok() is done onece. Since all user accesses must be granted by a call to either allow_read_from_user() or allow_read_write_user() which will always happen after the access_ok() check, move the barrier_nospec() there. Reported-by: Christopher M. Riedl <cmr@codefail.de> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/c72f014730823b413528e90ab6c4d3bcb79f8497.1612692067.git.christophe.leroy@csgroup.eu	2021-02-09 01:10:15 +11:00
Sandipan Das	22b89ba178	powerpc/sstep: Fix darn emulation Commit 8813ff49607e ("powerpc/sstep: Check instruction validity against ISA version before emulation") introduced a proper way to skip unknown instructions. This makes sure that the same is used for the darn instruction when the range selection bits have a reserved value. Fixes: a23987ef267a ("powerpc: sstep: Add support for darn instruction") Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210204080744.135785-2-sandipan@linux.ibm.com	2021-02-09 01:10:08 +11:00
Sandipan Das	bbda4b6c7d	powerpc/sstep: Fix load-store and update emulation The Power ISA says that the fixed-point load and update instructions must neither use R0 for the base address (RA) nor have the destination (RT) and the base address (RA) as the same register. Similarly, for fixed-point stores and floating-point loads and stores, the instruction is invalid when R0 is used as the base address (RA). This is applicable to the following instructions. * Load Byte and Zero with Update (lbzu) * Load Byte and Zero with Update Indexed (lbzux) * Load Halfword and Zero with Update (lhzu) * Load Halfword and Zero with Update Indexed (lhzux) * Load Halfword Algebraic with Update (lhau) * Load Halfword Algebraic with Update Indexed (lhaux) * Load Word and Zero with Update (lwzu) * Load Word and Zero with Update Indexed (lwzux) * Load Word Algebraic with Update Indexed (lwaux) * Load Doubleword with Update (ldu) * Load Doubleword with Update Indexed (ldux) * Load Floating Single with Update (lfsu) * Load Floating Single with Update Indexed (lfsux) * Load Floating Double with Update (lfdu) * Load Floating Double with Update Indexed (lfdux) * Store Byte with Update (stbu) * Store Byte with Update Indexed (stbux) * Store Halfword with Update (sthu) * Store Halfword with Update Indexed (sthux) * Store Word with Update (stwu) * Store Word with Update Indexed (stwux) * Store Doubleword with Update (stdu) * Store Doubleword with Update Indexed (stdux) * Store Floating Single with Update (stfsu) * Store Floating Single with Update Indexed (stfsux) * Store Floating Double with Update (stfdu) * Store Floating Double with Update Indexed (stfdux) E.g. the following behaviour is observed for an invalid load and update instruction having RA = RT. While a userspace program having an instruction word like 0xe9ce0001, i.e. ldu r14, 0(r14), runs without getting receiving a SIGILL on a Power system (observed on P8 and P9), the outcome of executing that instruction word varies and its behaviour can be considered to be undefined. Attaching an uprobe at that instruction's address results in emulation which currently performs the load as well as writes the effective address back to the base register. This might not match the outcome from hardware. To remove any inconsistencies, this adds additional checks for the aforementioned instructions to make sure that the emulation infrastructure treats them as unknown. The kernel can then fallback to executing such instructions on hardware. Fixes: 0016a4cf5582 ("powerpc: Emulate most Book I instructions in emulate_step()") Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210204080744.135785-1-sandipan@linux.ibm.com	2021-02-09 01:09:48 +11:00
Christophe Leroy	903178d0ce	powerpc/8xx: Fix software emulation interrupt For unimplemented instructions or unimplemented SPRs, the 8xx triggers a "Software Emulation Exception" (0x1000). That interrupt doesn't set reason bits in SRR1 as the "Program Check Exception" does. Go through emulation_assist_interrupt() to set REASON_ILLEGAL. Fixes: fbbcc3bb139e ("powerpc/8xx: Remove SoftwareEmulation()") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/ad782af87a222efc79cfb06079b0fd23d4224eaf.1612515180.git.christophe.leroy@csgroup.eu	2021-02-09 01:09:46 +11:00
Athira Rajeev	d137845c97	powerpc/perf: Record counter overflow always if SAMPLE_IP is unset While sampling for marked events, currently we record the sample only if the SIAR valid bit of Sampled Instruction Event Register (SIER) is set. SIAR_VALID bit is used for fetching the instruction address from Sampled Instruction Address Register(SIAR). But there are some usecases, where the user is interested only in the PMU stats at each counter overflow and the exact IP of the overflow event is not required. Dropping SIAR invalid samples will fail to record some of the counter overflows in such cases. Example of such usecase is dumping the PMU stats (event counts) after some regular amount of instructions/events from the userspace (ex: via ptrace). Here counter overflow is indicated to userspace via signal handler, and captured by monitoring and enabling I/O signaling on the event file descriptor. In these cases, we expect to get sample/overflow indication after each specified sample_period. Perf event attribute will not have PERF_SAMPLE_IP set in the sample_type if exact IP of the overflow event is not requested. So while profiling if SAMPLE_IP is not set, just record the counter overflow irrespective of SIAR_VALID check. Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> [mpe: Reflow comment and if formatting] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1612516492-1428-1-git-send-email-atrajeev@linux.vnet.ibm.com	2021-02-09 01:09:46 +11:00
Nathan Lynch	768d70e19b	powerpc/pseries/dlpar: handle ibm, configure-connector delay status dlpar_configure_connector() has two problems in its handling of ibm,configure-connector's return status: 1. When the status is -2 (busy, call again), we call ibm,configure-connector again immediately without checking whether to schedule, which can result in monopolizing the CPU. 2. Extended delay status (9900..9905) goes completely unhandled, causing the configuration to unnecessarily terminate. Fix both of these issues by using rtas_busy_delay(). Fixes: ab519a011caa ("powerpc/pseries: Kernel DLPAR Infrastructure") Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210107025900.410369-1-nathanl@linux.ibm.com	2021-02-09 01:09:46 +11:00
Nicholas Piggin	3cb1aa7aa3	powerpc/64s: Implement ptep_clear_flush_young that does not flush TLBs Similarly to the x86 commit b13b1d2d8692 ("x86/mm: In the PTE swapout page reclaim case clear the accessed bit instead of flushing the TLB"), implement ptep_clear_flush_young that does not actually flush the TLB in the case the referenced bit is cleared. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201217134731.488135-8-npiggin@gmail.com	2021-02-09 01:09:45 +11:00
Nicholas Piggin	032b7f0893	powerpc/64s/radix: serialize_against_pte_lookup IPIs trim mm_cpumask serialize_against_pte_lookup() performs IPIs to all CPUs in mm_cpumask. Take this opportunity to try trim the CPU out of mm_cpumask. This can reduce the cost of future serialize_against_pte_lookup() and/or the cost of future TLB flushes. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201217134731.488135-7-npiggin@gmail.com	2021-02-09 01:09:45 +11:00
Nicholas Piggin	9393544842	powerpc/64s/radix: occasionally attempt to trim mm_cpumask A single-threaded process that is flushing its own address space is so far the only case where the mm_cpumask is attempted to be trimmed. This patch expands that to flush in other situations, multi-threaded processes and external sources. For now it's a relatively simple occasional trim attempt. The main aim is to add the mechanism, tweaking and tuning can come with more data. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201217134731.488135-6-npiggin@gmail.com	2021-02-09 01:09:45 +11:00
Nicholas Piggin	780de40601	powerpc/64s/radix: Allow mm_cpumask trimming from external sources mm_cpumask trimming is currently restricted to be issued by the current thread of a single-threaded mm. This patch relaxes that and allows the mask to be trimmed from any context. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201217134731.488135-5-npiggin@gmail.com	2021-02-09 01:09:44 +11:00
Nicholas Piggin	54bb503345	powerpc/64s/radix: Check for no TLB flush required If there are no CPUs in mm_cpumask, no TLB flush is required at all. This patch adds a check for this case. Currently it's not tested for, in fact mm_is_thread_local() returns false if the current CPU is not in mm_cpumask, so it's treated as a global flush. This can come up in some cases like exec failure before the new mm has ever been switched to. This patch reduces TLBIE instructions required to build a kernel from about 120,000 to 45,000. Another situation it could help is page reclaim, KSM, THP, etc., (i.e., asynch operations external to the process) where the process is sleeping and has all TLBs flushed out of all CPUs. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201217134731.488135-4-npiggin@gmail.com	2021-02-09 01:09:44 +11:00
Nicholas Piggin	26418b36a1	powerpc/64s/radix: refactor TLB flush type selection The logic to decide what kind of TLB flush is required (local, global, or IPI) is spread multiple times over the several kinds of TLB flushes. Move it all into a single function which may issue IPIs if necessary, and also returns a flush type that is to be used. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201217134731.488135-3-npiggin@gmail.com	2021-02-09 01:09:44 +11:00
Nicholas Piggin	a2496049f1	powerpc/64s/radix: add warning and comments in mm_cpumask trim Add a comment explaining part of the logic for mm_cpumask trimming, and add a (hopefully graceful) check and warning in case something gets it wrong. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20201217134731.488135-2-npiggin@gmail.com	2021-02-09 01:09:44 +11:00
Athira Rajeev	e79b76e03b	powerpc/perf: Expose Performance Monitor Counter SPR's as part of extended regs Currently Monitor Mode Control Registers and Sampling registers are part of extended regs. Patch adds support to include Performance Monitor Counter Registers (PMC1 to PMC6 ) as part of extended registers. PMCs are saved in the perf interrupt handler as part of per-cpu array 'pmcs' in struct cpu_hw_events. While capturing the register values for extended regs, fetch these saved PMC values. Simplified the PERF_REG_PMU_MASK_300/31 definition to include PMU SPRs MMCR0 to PMC6. Exclude the unsupported SPRs (MMCR3, SIER2, SIER3) from extended mask value for CPU_FTR_ARCH_300 in the new definition. PERF_REG_EXTENDED_MAX is used to check if any index beyond the extended registers is requested in the sample. Have one PERF_REG_EXTENDED_MAX for CPU_FTR_ARCH_300/CPU_FTR_ARCH_31 since perf_reg_validate function already checks the extended mask for the presence of any unsupported register. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1612335337-1888-3-git-send-email-atrajeev@linux.vnet.ibm.com	2021-02-09 01:09:44 +11:00
Athira Rajeev	91f3469a43	powerpc/perf: Include PMCs as part of per-cpu cpuhw_events struct To support capturing of PMC's as part of extended registers, the value of SPR's PMC1 to PMC6 has to be saved in the starting of PMI interrupt handler. This is needed since we are resetting the overflown PMC before creating sample and hence directly reading SPRN_PMCx in 'perf_reg_value' will be capturing the modified value. To solve this, add a per-cpu array as part of structure cpu_hw_events and use this array to capture PMC values in the perf interrupt handler. Patch also re-factor's the interrupt handler code to use this per-cpu array instead of current local array. Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1612335337-1888-2-git-send-email-atrajeev@linux.vnet.ibm.com	2021-02-09 01:09:44 +11:00
Sandipan Das	266d8f7586	powerpc/pkeys: Remove unused code This removes arch_supports_pkeys(), arch_usable_pkeys() and thread_pkey_regs_() which are remnants from the following: commit 06bb53b33804 ("powerpc: store and restore the pkey state across context switches") commit 2cd4bd192ee9 ("powerpc/pkeys: Fix handling of pkey state across fork()") commit cf43d3b26452 ("powerpc: Enable pkey subsystem") arch_supports_pkeys() and arch_usable_pkeys() were unused since their introduction while thread_pkey_regs_() became unused after the introduction of the following: commit d5fa30e6993f ("powerpc/book3s64/pkeys: Reset userspace AMR correctly on exec") commit 48a8ab4eeb82 ("powerpc/book3s64/pkeys: Don't update SPRN_AMR when in kernel mode") Signed-off-by: Sandipan Das <sandipan@linux.ibm.com> Reviewed-by: Ram Pai <linuxram@us.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210202150050.75335-1-sandipan@linux.ibm.com	2021-02-09 01:09:44 +11:00
Bhaskar Chowdhury	ea7826583f	powerpc/44x: Fix a spelling mismach to mismatch in head_44x.S s/mismach/mismatch/ Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210202093746.5198-1-unixbhaskar@gmail.com	2021-02-09 00:10:51 +11:00
Chengyang Fan	6c6fdbb2b7	powerpc: remove unneeded semicolons Remove superfluous semicolons after function definitions. Signed-off-by: Chengyang Fan <cy.fan@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210125095338.1719405-1-cy.fan@huawei.com	2021-02-09 00:10:50 +11:00
Michael Ellerman	665d8d5876	powerpc/akebono: Fix unmet dependency errors The AKEBONO config has various selects under it, including some with user-selectable dependencies, which means those dependencies can be disabled. This leads to warnings from Kconfig. This can be seen with eg: $ make allnoconfig $ ./scripts/config --file build~/.config -k -e CONFIG_44x -k -e CONFIG_PPC_47x -e CONFIG_AKEBONO $ make olddefconfig WARNING: unmet direct dependencies detected for ATA Depends on [n]: HAS_IOMEM [=y] && BLOCK [=n] Selected by [y]: - AKEBONO [=y] && PPC_47x [=y] WARNING: unmet direct dependencies detected for NETDEVICES Depends on [n]: NET [=n] Selected by [y]: - AKEBONO [=y] && PPC_47x [=y] WARNING: unmet direct dependencies detected for ETHERNET Depends on [n]: NETDEVICES [=y] && NET [=n] Selected by [y]: - AKEBONO [=y] && PPC_47x [=y] WARNING: unmet direct dependencies detected for MMC_SDHCI Depends on [n]: MMC [=n] && HAS_DMA [=y] Selected by [y]: - AKEBONO [=y] && PPC_47x [=y] WARNING: unmet direct dependencies detected for MMC_SDHCI_PLTFM Depends on [n]: MMC [=n] && MMC_SDHCI [=y] Selected by [y]: - AKEBONO [=y] && PPC_47x [=y] The problem is that AKEBONO is using select to enable things that are not true dependencies, but rather things you probably want enabled in an AKEBONO kernel. That is what a defconfig is for. So drop those selects and instead move those symbols into the defconfig. This fixes all the kconfig warnings, and the result of make 44x/akebono_defconfig is the same before and after the patch. Reported-by: Yury Norov <yury.norov@gmail.com> Reported-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Tested-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20210201012503.940145-1-mpe@ellerman.id.au	2021-02-09 00:10:50 +11:00
Nicholas Piggin	86dbb39416	powerpc/64s: runlatch interrupt handling in C There is no need for this to be in asm, use the new intrrupt entry wrapper. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-42-npiggin@gmail.com	2021-02-09 00:10:50 +11:00
Nicholas Piggin	6ecbb582b6	powerpc/64s: move NMI soft-mask handling to C Saving and restoring soft-mask state can now be done in C using the interrupt handler wrapper functions. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-41-npiggin@gmail.com	2021-02-09 00:10:50 +11:00
Nicholas Piggin	118178e62e	powerpc: move NMI entry/exit code into wrapper This moves the common NMI entry and exit code into the interrupt handler wrappers. This changes the behaviour of soft-NMI (watchdog) and HMI interrupts, and also MCE interrupts on 64e, by adding missing parts of the NMI entry to them. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-40-npiggin@gmail.com	2021-02-09 00:10:50 +11:00
Nicholas Piggin	74c3354bc1	powerpc/pseries/mce: restore msr before returning from handler The pseries real-mode machine check handler can enable the MMU, and return from the handler with the MMU still enabled. This works, but real-mode handler wrapper exit handlers want to rely on the MMU being in real-mode. So change the pseries handler to restore the MSR after it has finished virtual mode tasks. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1612702361.lm7fqo56re.astroid@bobo.none	2021-02-09 00:10:50 +11:00
Nicholas Piggin	56acfdd8bf	powerpc/64: entry cpu time accounting in C There is no need for this to be in asm, use the new interrupt entry wrapper. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-39-npiggin@gmail.com	2021-02-09 00:10:49 +11:00
Nicholas Piggin	2994e1babf	powerpc/64: move account_stolen_time into its own function This will be used by interrupt entry as well. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-38-npiggin@gmail.com	2021-02-09 00:10:49 +11:00
Nicholas Piggin	75b96950fd	powerpc/64s: reconcile interrupts in C There is no need for this to be in asm, use the new intrrupt entry wrapper. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-37-npiggin@gmail.com	2021-02-09 00:10:49 +11:00
Nicholas Piggin	f821bc97de	powerpc/64s: move context tracking exit to interrupt exit path The interrupt handler wrapper functions are not the ideal place to maintain context tracking because after they return, the low level exit code must then determine if there are interrupts to replay, or if the task should be preempted, etc. Those paths (e.g., schedule_user) include their own exception_enter/exit pairs to fix this up but it's a bit hacky (see schedule_user() comments). Ideally context tracking will go to user mode only when there are no more interrupts or context switches or other exit processing work to handle. 64e can not do this because it does not use the C interrupt exit code. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-36-npiggin@gmail.com	2021-02-09 00:10:49 +11:00
Nicholas Piggin	1b1b6a6f4c	powerpc: handle irq_enter/irq_exit in interrupt handler wrappers Move irq_enter/irq_exit into asynchronous interrupt handler wrappers. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-35-npiggin@gmail.com	2021-02-09 00:10:49 +11:00
Nicholas Piggin	6fdb0f410b	powerpc/64: add context tracking to asynchronous interrupts Previously context tracking was not done for asynchronous interrupts, (those that run in interrupt context), and if those would cause a reschedule when they exit, then scheduling functions (schedule_user, preempt_schedule_irq) call exception_enter/exit to fix this up and exit user context. This is a hack we would like to get away from, so do context tracking for asynchronous interrupts too. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-34-npiggin@gmail.com	2021-02-09 00:10:48 +11:00
Nicholas Piggin	540d4d34be	powerpc/64: context tracking move to interrupt wrappers This moves exception_enter/exit calls to wrapper functions for synchronous interrupts. More interrupt handlers are covered by this than previously. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-33-npiggin@gmail.com	2021-02-09 00:10:46 +11:00
Nicholas Piggin	a008f8f9fd	powerpc/64s/hash: improve context tracking of hash faults This moves the 64s/hash context tracking from hash_page_mm() to __do_hash_fault(), so it's no longer called by OCXL / SPU accelerators, which was certainly the wrong thing to be doing, because those callers are not low level interrupt handlers, so should have entered a kernel context tracking already. Then remain in kernel context for the duration of the fault, rather than enter/exit for the hash fault then enter/exit for the page fault, which is pointless. Even still, calling exception_enter/exit in __do_hash_fault seems questionable because that's touching per-cpu variables, tracing, etc., which might have been interrupted by this hash fault or themselves cause hash faults. But maybe I miss something because hash_page_mm very deliberately calls trace_hash_fault too, for example. So for now go with it, it's no worse than before, in this regard. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-32-npiggin@gmail.com	2021-02-09 00:02:12 +11:00
Nicholas Piggin	2a06bf3e95	powerpc/64: context tracking remove _TIF_NOHZ Add context tracking to the system call handler explicitly, and remove _TIF_NOHZ. This improves system call performance when nohz_full is enabled. On a POWER9, gettid scv system call cost on a nohz_full CPU improves from 1129 cycles to 1004 cycles and on a housekeeping CPU from 550 cycles to 430 cycles. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-31-npiggin@gmail.com	2021-02-09 00:02:12 +11:00
Nicholas Piggin	e6f8a6c86c	powerpc: add interrupt_cond_local_irq_enable helper Simple helper for synchronous interrupt handlers (i.e., process-context) to enable interrupts if it was taken in an interrupts-enabled context. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-30-npiggin@gmail.com	2021-02-09 00:02:12 +11:00
Nicholas Piggin	3a96570ffc	powerpc: convert interrupt handlers to use wrappers Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-29-npiggin@gmail.com	2021-02-09 00:02:12 +11:00
Nicholas Piggin	fd3f1e0f13	powerpc/traps: factor common code from program check and emulation assist Move the program check handling into a function called by both, rather than have the emulation assist handler call the program check handler. This allows each of these handlers to be implemented with "interrupt wrappers" in a later change. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1612702475.d6qyt6qtfy.astroid@bobo.none	2021-02-09 00:02:12 +11:00
Nicholas Piggin	25b7e6bb74	powerpc: add interrupt wrapper entry / exit stub functions These will be used by subsequent patches. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-28-npiggin@gmail.com	2021-02-09 00:02:12 +11:00
Nicholas Piggin	8d41fc618a	powerpc: interrupt handler wrapper functions Add wrapper functions (derived from x86 macros) for interrupt handler functions. This allows interrupt entry code to be written in C. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-27-npiggin@gmail.com	2021-02-09 00:02:11 +11:00
Nicholas Piggin	11cb0a25f7	powerpc: improve handling of unrecoverable system reset If an unrecoverable system reset hits in process context, the system does not have to panic. Similar to machine check, call nmi_exit() before die(). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-26-npiggin@gmail.com	2021-02-09 00:02:11 +11:00
Nicholas Piggin	c538938fa2	powerpc/mce: ensure machine check handler always tests RI A machine check that is handled must still check MSR[RI] for recoverability of the interrupted context. Without this patch it's possible for a handled machine check to return to a context where it has clobbered live registers. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-25-npiggin@gmail.com	2021-02-09 00:02:11 +11:00
Nicholas Piggin	209e9d500e	powerpc: introduce die_mce As explained by commit daf00ae71dad ("powerpc/traps: restore recoverability of machine_check interrupts"), die() can't be called from within nmi_enter to nicely kill a process context that was interrupted. nmi_exit must be called first. This adds a function die_mce which takes care of this for machine check handlers. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-24-npiggin@gmail.com	2021-02-09 00:02:11 +11:00
Nicholas Piggin	dcdb4f1296	powerpc/cell: tidy up pervasive declarations These are declared in ras.h and defined in ras.c so remove them from pervasive.h Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-23-npiggin@gmail.com	2021-02-09 00:02:11 +11:00
Nicholas Piggin	6c6aee009e	powerpc: add and use unknown_async_exception This is currently the same as unknown_exception, but it will diverge after interrupt wrappers are added and code moved out of asm into the wrappers (e.g., async handlers will check FINISH_NAP). Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-22-npiggin@gmail.com	2021-02-09 00:02:11 +11:00
Nicholas Piggin	0440b8a22c	powerpc/time: move timer_broadcast_interrupt prototype to asm/time.h Interrupt handler prototypes are going to be rearranged in a future patch, so tidy this out of the way first. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210130130852.2952424-21-npiggin@gmail.com	2021-02-09 00:02:10 +11:00

... 2 3 4 5 6 ...

23239 Commits