linux

iv/linux

Author	SHA1	Message	Date
Nathan Lynch	22887f319a	powerpc/pseries: delete scanlog Remove the pseries scanlog driver. This code supports functions from Power4-era servers that are not present on targets currently supported by arch/powerpc. System manuals from this time have this description: Scan Dump data is a set of chip data that the service processor gathers after a system malfunction. It consists of chip scan rings, chip trace arrays, and Scan COM (SCOM) registers. This data is stored in the scan-log partition of the system’s Nonvolatile Random Access Memory (NVRAM). PowerVM partition firmware development doesn't recognize the associated function call or property, and they don't see any references to them in their codebase. It seems to have been specific to non-virtualized pseries. References: Historical Linux commit from February 2003 (interesting to note this seems to be the source of non-GPL exports for rtas_call etc): https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/?id=f92e361842d5251e50562b09664082dcbd0548bb IntelliStation and pSeries docs which refer to the feature: http://ps-2.retropc.se/basil.holloway/ALL%20PDF/380635.pdf http://ps-2.kev009.com/rs6000/manuals/p/p615-6C3-6E3/6C3_and_6E3_Users_Guide_SA38-0629.pdf Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210920173203.1800475-1-nathanl@linux.ibm.com	2021-11-25 11:25:33 +11:00
Nathan Lynch	53cadf7dee	powerpc/rtas: kernel-doc fixes Fix the following issues reported by kernel-doc: $ scripts/kernel-doc -v -none arch/powerpc/kernel/rtas.c arch/powerpc/kernel/rtas.c:810: info: Scanning doc for function rtas_activate_firmware arch/powerpc/kernel/rtas.c:818: warning: contents before sections arch/powerpc/kernel/rtas.c:841: info: Scanning doc for function rtas_call_reentrant arch/powerpc/kernel/rtas.c:893: warning: This comment starts with '/*', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst Find a specific pseries error log in an RTAS extended event log. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211116215806.928235-1-nathanl@linux.ibm.com	2021-11-25 11:25:32 +11:00
Christophe Leroy	8b8a8f0ab3	powerpc/code-patching: Improve verification of patchability Today, patch_instruction() assumes that it is called exclusively on valid addresses, and only checks that it is not called on an init address after init section has been freed. Improve verification by calling kernel_text_address() instead. kernel_text_address() already includes a verification of initmem release. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/bc683d499a411730504b132a924de0ccc2ef1f79.1636971137.git.christophe.leroy@csgroup.eu	2021-11-25 11:25:32 +11:00
Jason Wang	a3bcfc182b	powerpc/tsi108: make EXPORT_SYMBOL follow its function immediately EXPORT_SYMBOL(foo); should immediately follow its function/variable. Signed-off-by: Jason Wang <wangborong@cdjrlc.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211114115616.493815-1-wangborong@cdjrlc.com	2021-11-25 11:25:32 +11:00
Hari Bathini	e919c0b232	bpf ppc32: Access only if addr is kernel address With KUAP enabled, any kernel code which wants to access userspace needs to be surrounded by disable-enable KUAP. But that is not happening for BPF_PROBE_MEM load instruction. Though PPC32 does not support read protection, considering the fact that PTR_TO_BTF_ID (which uses BPF_PROBE_MEM mode) could either be a valid kernel pointer or NULL but should never be a pointer to userspace address, execute BPF_PROBE_MEM load only if addr is kernel address, otherwise set dst_reg=0 and move on. This will catch NULL, valid or invalid userspace pointers. Only bad kernel pointer will be handled by BPF exception table. [Alexei suggested for x86] Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211012123056.485795-9-hbathini@linux.ibm.com	2021-11-25 11:25:32 +11:00
Hari Bathini	23b51916ee	bpf ppc32: Add BPF_PROBE_MEM support for JIT BPF load instruction with BPF_PROBE_MEM mode can cause a fault inside kernel. Append exception table for such instructions within BPF program. Unlike other archs which uses extable 'fixup' field to pass dest_reg and nip, BPF exception table on PowerPC follows the generic PowerPC exception table design, where it populates both fixup and extable sections within BPF program. fixup section contains 3 instructions, first 2 instructions clear dest_reg (lower & higher 32-bit registers) and last instruction jumps to next instruction in the BPF code. extable 'insn' field contains relative offset of the instruction and 'fixup' field contains relative offset of the fixup entry. Example layout of BPF program with extable present: +------------------+ \| \| \| \| 0x4020 -->\| lwz r28,4(r4) \| \| \| \| \| 0x40ac -->\| lwz r3,0(r24) \| \| lwz r4,4(r24) \| \| \| \| \| \|------------------\| 0x4278 -->\| li r28,0 \| \ \| li r27,0 \| \| fixup entry \| b 0x4024 \| / 0x4284 -->\| li r4,0 \| \| li r3,0 \| \| b 0x40b4 \| \|------------------\| 0x4290 -->\| insn=0xfffffd90 \| \ extable entry \| fixup=0xffffffe4 \| / 0x4298 -->\| insn=0xfffffe14 \| \| fixup=0xffffffe8 \| +------------------+ (Addresses shown here are chosen random, not real) Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211012123056.485795-8-hbathini@linux.ibm.com	2021-11-25 11:25:32 +11:00
Ravi Bangoria	9c70c7147f	bpf ppc64: Access only if addr is kernel address On PPC64 with KUAP enabled, any kernel code which wants to access userspace needs to be surrounded by disable-enable KUAP. But that is not happening for BPF_PROBE_MEM load instruction. So, when BPF program tries to access invalid userspace address, page-fault handler considers it as bad KUAP fault: Kernel attempted to read user page (d0000000) - exploit attempt? (uid: 0) Considering the fact that PTR_TO_BTF_ID (which uses BPF_PROBE_MEM mode) could either be a valid kernel pointer or NULL but should never be a pointer to userspace address, execute BPF_PROBE_MEM load only if addr is kernel address, otherwise set dst_reg=0 and move on. This will catch NULL, valid or invalid userspace pointers. Only bad kernel pointer will be handled by BPF exception table. [Alexei suggested for x86] Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211012123056.485795-7-hbathini@linux.ibm.com	2021-11-25 11:25:32 +11:00
Ravi Bangoria	983bdc0245	bpf ppc64: Add BPF_PROBE_MEM support for JIT BPF load instruction with BPF_PROBE_MEM mode can cause a fault inside kernel. Append exception table for such instructions within BPF program. Unlike other archs which uses extable 'fixup' field to pass dest_reg and nip, BPF exception table on PowerPC follows the generic PowerPC exception table design, where it populates both fixup and extable sections within BPF program. fixup section contains two instructions, first instruction clears dest_reg and 2nd jumps to next instruction in the BPF code. extable 'insn' field contains relative offset of the instruction and 'fixup' field contains relative offset of the fixup entry. Example layout of BPF program with extable present: +------------------+ \| \| \| \| 0x4020 -->\| ld r27,4(r3) \| \| \| \| \| 0x40ac -->\| lwz r3,0(r4) \| \| \| \| \| \|------------------\| 0x4280 -->\| li r27,0 \| \ fixup entry \| b 0x4024 \| / 0x4288 -->\| li r3,0 \| \| b 0x40b0 \| \|------------------\| 0x4290 -->\| insn=0xfffffd90 \| \ extable entry \| fixup=0xffffffec \| / 0x4298 -->\| insn=0xfffffe14 \| \| fixup=0xffffffec \| +------------------+ (Addresses shown here are chosen random, not real) Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211012123056.485795-6-hbathini@linux.ibm.com	2021-11-25 11:25:32 +11:00
Hari Bathini	f15a71b388	powerpc/ppc-opcode: introduce PPC_RAW_BRANCH() macro Define and use PPC_RAW_BRANCH() macro instead of open coding it. This macro is used while adding BPF_PROBE_MEM support. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211012123056.485795-5-hbathini@linux.ibm.com	2021-11-25 11:25:31 +11:00
Hari Bathini	efa95f031b	bpf powerpc: refactor JIT compiler code Refactor powerpc LDX JITing code to simplify adding BPF_PROBE_MEM support. Signed-off-by: Hari Bathini <hbathini@linux.ibm.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211012123056.485795-4-hbathini@linux.ibm.com	2021-11-25 11:25:31 +11:00
Ravi Bangoria	04c04205bc	bpf powerpc: Remove extra_pass from bpf_jit_build_body() In case of extra_pass, usual JIT passes are always skipped. So, extra_pass is always false while calling bpf_jit_build_body() and can be removed. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211012123056.485795-3-hbathini@linux.ibm.com	2021-11-25 11:25:31 +11:00
Ravi Bangoria	c9ce7c36e4	bpf powerpc: Remove unused SEEN_STACK SEEN_STACK is unused on PowerPC. Remove it. Also, have SEEN_TAILCALL use 0x40000000. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211012123056.485795-2-hbathini@linux.ibm.com	2021-11-25 11:25:31 +11:00
Oliver O'Halloran	157616f3c2	powerpc/eeh: Use a goto for recovery failures The EEH recovery logic in eeh_handle_normal_event() has some pretty strange flow control. If we remove all the actual recovery logic we're left with the following skeleton: if (result != PCI_ERS_RESULT_DISCONNECT) { ... } if (result != PCI_ERS_RESULT_DISCONNECT) { ... } if (result == PCI_ERS_RESULT_NONE) { ... } if (result == PCI_ERS_RESULT_CAN_RECOVER) { ... } if (result == PCI_ERS_RESULT_CAN_RECOVER) { ... } if (result == PCI_ERS_RESULT_NEED_RESET) { ... } if ((result == PCI_ERS_RESULT_RECOVERED) \|\| (result == PCI_ERS_RESULT_NONE)) { ... goto out; } /* * unsuccessful recovery / PCI_ERS_RESULT_DISCONECTED * handling is here. */ ... out: ... Most of the "if () { ... }" blocks above change "result" to PCI_ERS_RESULT_DISCONNECTED if an error occurs in that recovery step. This makes the control flow a bit confusing since it breaks the early-exit pattern that is generally used in Linux. In any case we end up handling the error in the final else block so why not just jump there directly? Doing so also allows us to de-indent a bunch of code. No functional changes. [dja: rebase on top of linux-next + my preceeding refactor, move clearing the EEH_DEV_NO_HANDLER bit above the first goto so that it is always clear in the error handler code as it was before.] Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211015070628.1331635-2-dja@axtens.net	2021-11-25 11:25:31 +11:00
Daniel Axtens	10b34ece13	powerpc/eeh: Small refactor of eeh_handle_normal_event() The control flow of eeh_handle_normal_event() is a bit tricky. Break out one of the error handling paths - rather than be in an else block, we'll make it part of the regular body of the function and put a 'goto out;' in the true limb of the if. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211015070628.1331635-1-dja@axtens.net	2021-11-25 11:25:31 +11:00
Cédric Le Goater	1e7684dc4f	powerpc/xive: Add a debugfs toggle for save-restore On POWER10, the automatic "save & restore" of interrupt context is always available. Provide a way to deactivate it for tests or performance. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211105102636.1016378-11-clg@kaod.org	2021-11-25 11:25:31 +11:00
Cédric Le Goater	c21ee04f11	powerpc/xive: Add a kernel parameter for StoreEOI StoreEOI is activated by default on platforms supporting the feature (POWER10) and will be used as soon as firmware advertises its availability. The kernel parameter provides a way to deactivate its use. It can be still be reactivated through debugfs. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211105102636.1016378-10-clg@kaod.org	2021-11-25 11:25:30 +11:00
Cédric Le Goater	d7bc1e376c	powerpc/xive: Add a debugfs toggle for StoreEOI It can be used to deactivate temporarily StoreEOI for tests or performance on platforms supporting the feature (POWER10) Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211105102636.1016378-9-clg@kaod.org	2021-11-25 11:25:30 +11:00
Cédric Le Goater	08f3f61021	powerpc/xive: Add a debugfs file to dump EQs The XIVE driver under Linux uses a single interrupt priority and only one event queue is configured per CPU. Expose the contents under a 'xive/eqs/cpuX' debugfs file. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211105102636.1016378-8-clg@kaod.org	2021-11-25 11:25:30 +11:00
Cédric Le Goater	33e1d4a152	powerpc/xive: Rename the 'cpus' debugfs file to 'ipis' and remove the EQ entries output which is not very useful since only the next two events of the queue are taken into account. We will improve the dump of the EQ in the next patches. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211105102636.1016378-7-clg@kaod.org	2021-11-25 11:25:30 +11:00
Cédric Le Goater	baed14de78	powerpc/xive: Change the debugfs file 'xive' into a directory Use a 'cpus' file to dump CPU states and 'interrupts' to dump IRQ states. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211105102636.1016378-6-clg@kaod.org	2021-11-25 11:25:30 +11:00
Cédric Le Goater	412877dfae	powerpc/xive: Introduce xive_core_debugfs_create() and fix some compile issues when !CONFIG_DEBUG_FS. Signed-off-by: Cédric Le Goater <clg@kaod.org> [mpe: Add empty stub to fix !CONFIG_DEBUG_FS build] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211105102636.1016378-5-clg@kaod.org	2021-11-25 11:25:30 +11:00
Cédric Le Goater	756c52c632	powerpc/xive: Activate StoreEOI on P10 StoreEOI (the capability to EOI with a store) requires load-after-store ordering in some cases to be reliable. P10 introduced a new offset for load operations to enforce correct ordering and the XIVE driver has the required support since kernel 5.8, commit b1f9be9392f0 ("powerpc/xive: Enforce load-after-store ordering when StoreEOI is active") Since skiboot v7, StoreEOI support is advertised on P10 with a new flag on the PowerNV platform. See skiboot commit 4bd7d84afe46 ("xive/p10: Introduce a new OPAL_XIVE_IRQ_STORE_EOI2 flag"). When detected, activate the feature. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211105102636.1016378-4-clg@kaod.org	2021-11-25 11:25:30 +11:00
Cédric Le Goater	bd5b00c6cf	powerpc/xive: Introduce an helper to print out interrupt characteristics and extend output of debugfs and xmon with addresses of the ESB management and trigger pages. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211105102636.1016378-3-clg@kaod.org	2021-11-25 11:25:29 +11:00
Cédric Le Goater	44b9c8ddcb	powerpc/xive: Replace pr_devel() by pr_debug() to ease debug These routines are not on hot code paths and pr_debug() is easier to activate. Also add a '0x' prefix to hex printed values (HW IRQ number). Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211105102636.1016378-2-clg@kaod.org	2021-11-25 11:25:29 +11:00
Nicholas Piggin	d02fa40d75	powerpc/powernv: Remove POWER9 PVR version check for entry and uaccess flushes These aren't necessarily POWER9 only, and it's not to say some new vulnerability may not get discovered on other processors for which we would like the flexibility of having the workaround enabled by firmware. Remove the restriction that the workarounds only apply to POWER9. However POWER7 and POWER8 are not affected, and they may not have older firmware that does not advertise this, so clear these workarounds manually. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Joel Stanley <joel@jms.id.au> [mpe: Incorporate changes from Nick, reword comment slightly.] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210503130243.891868-5-npiggin@gmail.com	2021-11-25 11:25:29 +11:00
Julia Lawall	a1d2b210ff	powerpc/btext: add missing of_node_put for_each_node_by_type performs an of_node_get on each iteration, so a break out of the loop requires an of_node_put. A simplified version of the semantic patch that fixes this problem is as follows (http://coccinelle.lip6.fr): // <smpl> @@ local idexpression n; expression e; @@ for_each_node_by_type(n,...) { ... ( of_node_put(n); \| e = n \| + of_node_put(n); ? break; ) ... } ... when != n // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1448051604-25256-6-git-send-email-Julia.Lawall@lip6.fr	2021-11-25 11:25:29 +11:00
Julia Lawall	a841fd009e	powerpc/cell: add missing of_node_put for_each_node_by_name performs an of_node_get on each iteration, so a break out of the loop requires an of_node_put. A simplified version of the semantic patch that fixes this problem is as follows (http://coccinelle.lip6.fr): // <smpl> @@ expression e,e1; local idexpression n; @@ for_each_node_by_name(n, e1) { ... when != of_node_put(n) when != e = n ( return n; \| + of_node_put(n); ? return ...; ) ... } // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1448051604-25256-7-git-send-email-Julia.Lawall@lip6.fr	2021-11-25 11:25:29 +11:00
Julia Lawall	7d405a939c	powerpc/powernv: add missing of_node_put for_each_compatible_node performs an of_node_get on each iteration, so a break out of the loop requires an of_node_put. A simplified version of the semantic patch that fixes this problem is as follows (http://coccinelle.lip6.fr): // <smpl> @@ local idexpression n; expression e; @@ for_each_compatible_node(n,...) { ... ( of_node_put(n); \| e = n \| + of_node_put(n); ? break; ) ... } ... when != n // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1448051604-25256-4-git-send-email-Julia.Lawall@lip6.fr	2021-11-25 11:25:29 +11:00
Julia Lawall	f6e82647ff	powerpc/6xx: add missing of_node_put for_each_compatible_node performs an of_node_get on each iteration, so a break out of the loop requires an of_node_put. A simplified version of the semantic patch that fixes this problem is as follows (http://coccinelle.lip6.fr): // <smpl> @@ expression e; local idexpression n; @@ @@ local idexpression n; expression e; @@ for_each_compatible_node(n,...) { ... ( of_node_put(n); \| e = n \| + of_node_put(n); ? break; ) ... } ... when != n // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/1448051604-25256-2-git-send-email-Julia.Lawall@lip6.fr	2021-11-25 11:25:29 +11:00
Nicholas Piggin	9c5a432a55	KVM: PPC: Book3S HV P9: Remove subcore HMI handling On POWER9 and newer, rather than the complex HMI synchronisation and subcore state, have each thread un-apply the guest TB offset before calling into the early HMI handler. This allows the subcore state to be avoided, including subcore enter / exit guest, which includes an expensive divide that shows up slightly in profiles. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-54-npiggin@gmail.com	2021-11-24 21:09:03 +11:00
Nicholas Piggin	6398326b9b	KVM: PPC: Book3S HV P9: Stop using vc->dpdes The P9 path uses vc->dpdes only for msgsndp / SMT emulation. This adds an ordering requirement between vcpu->doorbell_request and vc->dpdes for no real benefit. Use vcpu->doorbell_request directly. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-53-npiggin@gmail.com	2021-11-24 21:09:03 +11:00
Nicholas Piggin	617326ff01	KVM: PPC: Book3S HV P9: Tidy kvmppc_create_dtl_entry This goes further to removing vcores from the P9 path. Also avoid the memset in favour of explicitly initialising all fields. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-52-npiggin@gmail.com	2021-11-24 21:09:03 +11:00
Nicholas Piggin	ecb6a7207f	KVM: PPC: Book3S HV P9: Remove most of the vcore logic The P9 path always uses one vcpu per vcore, so none of the vcore, locks, stolen time, blocking logic, shared waitq, etc., is required. Remove most of it. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-51-npiggin@gmail.com	2021-11-24 21:09:02 +11:00
Nicholas Piggin	434398ab5e	KVM: PPC: Book3S HV P9: Avoid cpu_in_guest atomics on entry and exit cpu_in_guest is set to determine if a CPU needs to be IPI'ed to exit the guest and notice the need_tlb_flush bit. This can be implemented as a global per-CPU pointer to the currently running guest instead of per-guest cpumasks, saving 2 atomics per entry/exit. P7/8 doesn't require cpu_in_guest, nor does a nested HV (only the L0 does), so move it to the P9 HV path. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-50-npiggin@gmail.com	2021-11-24 21:09:02 +11:00
Nicholas Piggin	4c9a68914e	KVM: PPC: Book3S HV P9: Add unlikely annotation for !mmu_ready The mmu will almost always be ready. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-49-npiggin@gmail.com	2021-11-24 21:09:02 +11:00
Nicholas Piggin	f08cbf5c7d	KVM: PPC: Book3S HV P9: Avoid changing MSR[RI] in entry and exit kvm_hstate.in_guest provides the equivalent of MSR[RI]=0 protection, and it covers the existing MSR[RI]=0 section in late entry and early exit, so clearing and setting MSR[RI] in those cases does not actually do anything useful. Remove the RI manipulation and replace it with comments. Make the in_guest memory accesses a bit closer to a proper critical section pattern. This speeds up guest entry/exit performance. This also removes the MSR[RI] warnings which aren't very interesting and would cause crashes if they hit due to causing an interrupt in non-recoverable code. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-48-npiggin@gmail.com	2021-11-24 21:09:02 +11:00
Nicholas Piggin	241d1f19f0	KVM: PPC: Book3S HV P9: Optimise hash guest SLB saving slbmfee/slbmfev instructions are very expensive, moreso than a regular mfspr instruction, so minimising them significantly improves hash guest exit performance. The slbmfev is only required if slbmfee found a valid SLB entry. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-47-npiggin@gmail.com	2021-11-24 21:09:02 +11:00
Nicholas Piggin	b49c65c5f9	KVM: PPC: Book3S HV P9: Improve mfmsr performance on entry Rearrange the MSR saving on entry so it does not follow the mtmsrd to disable interrupts, avoiding a possible RAW scoreboard stall. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-46-npiggin@gmail.com	2021-11-24 21:09:02 +11:00
Nicholas Piggin	46dea77f79	KVM: PPC: Book3S HV Nested: Avoid extra mftb() in nested entry mftb() is expensive and one can be avoided on nested guest dispatch. If the time checking code distinguishes between the L0 timer and the nested HV timer, then both can be tested in the same place with the same mftb() value. This also nicely illustrates the relationship between the L0 and nested HV timers. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-45-npiggin@gmail.com	2021-11-24 21:09:02 +11:00
Nicholas Piggin	d5c0e8332d	KVM: PPC: Book3S HV P9: Avoid tlbsync sequence on radix guest exit Use the existing TLB flushing logic to IPI the previous CPU and run the necessary barriers before running a guest vCPU on a new physical CPU, to do the necessary radix GTSE barriers for handling the case of an interrupted guest tlbie sequence. This requires the vCPU TLB flush sequence that is currently just done on one thread, to be expanded to ensure the other threads execute a ptesync, because causing them to exit the guest will no longer cause a ptesync by itself. This results in more IPIs than the TLB flush logic requires, but it's a significant win for common case scheduling when the vCPU remains on the same physical CPU. This saves about 520 cycles (nearly 10%) on a guest entry+exit micro benchmark on a POWER9. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-44-npiggin@gmail.com	2021-11-24 21:09:01 +11:00
Nicholas Piggin	0ba0e5d5a6	KVM: PPC: Book3S HV: Split P8 from P9 path guest vCPU TLB flushing This creates separate functions for old and new paths for vCPU TLB flushing, which will reduce complexity of the next change. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-43-npiggin@gmail.com	2021-11-24 21:09:01 +11:00
Nicholas Piggin	a089a6869e	KVM: PPC: Book3S HV P9: Don't restore PSSCR if not needed This also moves the PSSCR update in nested entry to avoid a SPR scoreboard stall. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-42-npiggin@gmail.com	2021-11-24 21:09:01 +11:00
Nicholas Piggin	9c75f65f35	KVM: PPC: Book3S HV P9: Test dawr_enabled() before saving host DAWR SPRs Some of the DAWR SPR access is already predicated on dawr_enabled(), apply this to the remainder of the accesses. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-41-npiggin@gmail.com	2021-11-24 21:09:01 +11:00
Nicholas Piggin	cf3b16cfa6	KVM: PPC: Book3S HV P9: Comment and fix MMU context switching code Tighten up partition switching code synchronisation and comments. In particular, hwsync ; isync is required after the last access that is performed in the context of a partition, before the partition is switched away from. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-40-npiggin@gmail.com	2021-11-24 21:09:01 +11:00
Nicholas Piggin	5236756d04	KVM: PPC: Book3S HV P9: Use Linux SPR save/restore to manage some host SPRs Linux implements SPR save/restore including storage space for registers in the task struct for process context switching. Make use of this similarly to the way we make use of the context switching fp/vec save restore. This improves code reuse, allows some stack space to be saved, and helps with avoiding VRSAVE updates if they are not required. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-39-npiggin@gmail.com	2021-11-24 21:09:01 +11:00
Nicholas Piggin	022ecb960c	KVM: PPC: Book3S HV P9: Demand fault TM facility registers Use HFSCR facility disabling to implement demand faulting for TM, with a hysteresis counter similar to the load_fp etc counters in context switching that implement the equivalent demand faulting for userspace facilities. This speeds up guest entry/exit by avoiding the register save/restore when a guest is not frequently using them. When a guest does use them often, there will be some additional demand fault overhead, but these are not commonly used facilities. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-38-npiggin@gmail.com	2021-11-24 21:09:01 +11:00
Nicholas Piggin	a3e18ca8ab	KVM: PPC: Book3S HV P9: Demand fault EBB facility registers Use HFSCR facility disabling to implement demand faulting for EBB, with a hysteresis counter similar to the load_fp etc counters in context switching that implement the equivalent demand faulting for userspace facilities. This speeds up guest entry/exit by avoiding the register save/restore when a guest is not frequently using them. When a guest does use them often, there will be some additional demand fault overhead, but these are not commonly used facilities. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-37-npiggin@gmail.com	2021-11-24 21:09:01 +11:00
Nicholas Piggin	34e02d555d	KVM: PPC: Book3S HV P9: More SPR speed improvements This avoids more scoreboard stalls and reduces mtSPRs. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-36-npiggin@gmail.com	2021-11-24 21:09:00 +11:00
Nicholas Piggin	d55b1eccc7	KVM: PPC: Book3S HV P9: Restrict DSISR canary workaround to processors that require it Use CPU_FTR_P9_RADIX_PREFETCH_BUG to apply the workaround, to test for DD2.1 and below processors. This saves a mtSPR in guest entry. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-35-npiggin@gmail.com	2021-11-24 21:09:00 +11:00
Nicholas Piggin	3e7b337902	KVM: PPC: Book3S HV P9: Switch PMU to guest as late as possible This moves PMU switch to guest as late as possible in entry, and switch back to host as early as possible at exit. This helps the host get the most perf coverage of KVM entry/exit code as possible. This is slightly suboptimal for SPR scheduling point of view when the PMU is enabled, but when perf is disabled there is no real difference. Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20211123095231.1036501-34-npiggin@gmail.com	2021-11-24 21:09:00 +11:00

1 2 3 4 5 ...

24412 Commits