linux/arch/arm/mm/copypage-feroceon.c

109 lines
3.1 KiB
C
Raw Normal View History

// SPDX-License-Identifier: GPL-2.0-only
/*
* linux/arch/arm/mm/copypage-feroceon.S
*
* Copyright (C) 2008 Marvell Semiconductors
*
* This handles copy_user_highpage and clear_user_page on Feroceon
* more optimally than the generic implementations.
*/
#include <linux/init.h>
#include <linux/highmem.h>
ARM: 8805/2: remove unneeded naked function usage The naked attribute is known to confuse some old gcc versions when function arguments aren't explicitly listed as inline assembly operands despite the gcc documentation. That resulted in commit 9a40ac86152c ("ARM: 6164/1: Add kto and kfrom to input operands list."). Yet that commit has problems of its own by having assembly operand constraints completely wrong. If the generated code has been OK since then, it is due to luck rather than correctness. So this patch also provides proper assembly operand constraints, and removes two instances of redundant register usages in the implementation while at it. Inspection of the generated code with this patch doesn't show any obvious quality degradation either, so not relying on __naked at all will make the code less fragile, and avoid some issues with clang. The only remaining __naked instances (excluding the kprobes test cases) are exynos_pm_power_up_setup(), tc2_pm_power_up_setup() and cci_enable_port_for_self(. But in the first two cases, only the function address is used by the compiler with no chance of inlining it by mistake, and the third case is called from assembly code only. And the fact that no stack is available when the corresponding code is executed does warrant the __naked usage in those cases. Signed-off-by: Nicolas Pitre <nico@linaro.org> Reviewed-by: Stefan Agner <stefan@agner.ch> Tested-by: Stefan Agner <stefan@agner.ch> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-11-07 19:49:00 +03:00
static void feroceon_copy_user_page(void *kto, const void *kfrom)
{
ARM: 8805/2: remove unneeded naked function usage The naked attribute is known to confuse some old gcc versions when function arguments aren't explicitly listed as inline assembly operands despite the gcc documentation. That resulted in commit 9a40ac86152c ("ARM: 6164/1: Add kto and kfrom to input operands list."). Yet that commit has problems of its own by having assembly operand constraints completely wrong. If the generated code has been OK since then, it is due to luck rather than correctness. So this patch also provides proper assembly operand constraints, and removes two instances of redundant register usages in the implementation while at it. Inspection of the generated code with this patch doesn't show any obvious quality degradation either, so not relying on __naked at all will make the code less fragile, and avoid some issues with clang. The only remaining __naked instances (excluding the kprobes test cases) are exynos_pm_power_up_setup(), tc2_pm_power_up_setup() and cci_enable_port_for_self(. But in the first two cases, only the function address is used by the compiler with no chance of inlining it by mistake, and the third case is called from assembly code only. And the fact that no stack is available when the corresponding code is executed does warrant the __naked usage in those cases. Signed-off-by: Nicolas Pitre <nico@linaro.org> Reviewed-by: Stefan Agner <stefan@agner.ch> Tested-by: Stefan Agner <stefan@agner.ch> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-11-07 19:49:00 +03:00
int tmp;
asm volatile ("\
ARM: 9263/1: use .arch directives instead of assembler command line flags Similar to commit a6c30873ee4a ("ARM: 8989/1: use .fpu assembler directives instead of assembler arguments"). GCC and GNU binutils support setting the "sub arch" via -march=, -Wa,-march, target function attribute, and .arch assembler directive. Clang was missing support for -Wa,-march=, but this was implemented in clang-13. The behavior of both GCC and Clang is to prefer -Wa,-march= over -march= for assembler and assembler-with-cpp sources, but Clang will warn about the -march= being unused. clang: warning: argument unused during compilation: '-march=armv6k' [-Wunused-command-line-argument] Since most assembler is non-conditionally assembled with one sub arch (modulo arch/arm/delay-loop.S which conditionally is assembled as armv4 based on CONFIG_ARCH_RPC, and arch/arm/mach-at91/pm-suspend.S which is conditionally assembled as armv7-a based on CONFIG_CPU_V7), prefer the .arch assembler directive. Add a few more instances found in compile testing as found by Arnd and Nathan. Link: https://github.com/llvm/llvm-project/commit/1d51c699b9e2ebc5bcfdbe85c74cc871426333d4 Link: https://bugs.llvm.org/show_bug.cgi?id=48894 Link: https://github.com/ClangBuiltLinux/linux/issues/1195 Link: https://github.com/ClangBuiltLinux/linux/issues/1315 Suggested-by: Arnd Bergmann <arnd@arndb.de> Suggested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-10-24 22:44:41 +03:00
.arch armv5te \n\
ARM: 8805/2: remove unneeded naked function usage The naked attribute is known to confuse some old gcc versions when function arguments aren't explicitly listed as inline assembly operands despite the gcc documentation. That resulted in commit 9a40ac86152c ("ARM: 6164/1: Add kto and kfrom to input operands list."). Yet that commit has problems of its own by having assembly operand constraints completely wrong. If the generated code has been OK since then, it is due to luck rather than correctness. So this patch also provides proper assembly operand constraints, and removes two instances of redundant register usages in the implementation while at it. Inspection of the generated code with this patch doesn't show any obvious quality degradation either, so not relying on __naked at all will make the code less fragile, and avoid some issues with clang. The only remaining __naked instances (excluding the kprobes test cases) are exynos_pm_power_up_setup(), tc2_pm_power_up_setup() and cci_enable_port_for_self(. But in the first two cases, only the function address is used by the compiler with no chance of inlining it by mistake, and the third case is called from assembly code only. And the fact that no stack is available when the corresponding code is executed does warrant the __naked usage in those cases. Signed-off-by: Nicolas Pitre <nico@linaro.org> Reviewed-by: Stefan Agner <stefan@agner.ch> Tested-by: Stefan Agner <stefan@agner.ch> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-11-07 19:49:00 +03:00
1: ldmia %1!, {r2 - r7, ip, lr} \n\
pld [%1, #0] \n\
pld [%1, #32] \n\
pld [%1, #64] \n\
pld [%1, #96] \n\
pld [%1, #128] \n\
pld [%1, #160] \n\
pld [%1, #192] \n\
stmia %0, {r2 - r7, ip, lr} \n\
ldmia %1!, {r2 - r7, ip, lr} \n\
mcr p15, 0, %0, c7, c14, 1 @ clean and invalidate D line\n\
add %0, %0, #32 \n\
stmia %0, {r2 - r7, ip, lr} \n\
ldmia %1!, {r2 - r7, ip, lr} \n\
mcr p15, 0, %0, c7, c14, 1 @ clean and invalidate D line\n\
add %0, %0, #32 \n\
stmia %0, {r2 - r7, ip, lr} \n\
ldmia %1!, {r2 - r7, ip, lr} \n\
mcr p15, 0, %0, c7, c14, 1 @ clean and invalidate D line\n\
add %0, %0, #32 \n\
stmia %0, {r2 - r7, ip, lr} \n\
ldmia %1!, {r2 - r7, ip, lr} \n\
mcr p15, 0, %0, c7, c14, 1 @ clean and invalidate D line\n\
add %0, %0, #32 \n\
stmia %0, {r2 - r7, ip, lr} \n\
ldmia %1!, {r2 - r7, ip, lr} \n\
mcr p15, 0, %0, c7, c14, 1 @ clean and invalidate D line\n\
add %0, %0, #32 \n\
stmia %0, {r2 - r7, ip, lr} \n\
ldmia %1!, {r2 - r7, ip, lr} \n\
mcr p15, 0, %0, c7, c14, 1 @ clean and invalidate D line\n\
add %0, %0, #32 \n\
stmia %0, {r2 - r7, ip, lr} \n\
ldmia %1!, {r2 - r7, ip, lr} \n\
mcr p15, 0, %0, c7, c14, 1 @ clean and invalidate D line\n\
add %0, %0, #32 \n\
stmia %0, {r2 - r7, ip, lr} \n\
subs %2, %2, #(32 * 8) \n\
mcr p15, 0, %0, c7, c14, 1 @ clean and invalidate D line\n\
add %0, %0, #32 \n\
bne 1b \n\
ARM: 8805/2: remove unneeded naked function usage The naked attribute is known to confuse some old gcc versions when function arguments aren't explicitly listed as inline assembly operands despite the gcc documentation. That resulted in commit 9a40ac86152c ("ARM: 6164/1: Add kto and kfrom to input operands list."). Yet that commit has problems of its own by having assembly operand constraints completely wrong. If the generated code has been OK since then, it is due to luck rather than correctness. So this patch also provides proper assembly operand constraints, and removes two instances of redundant register usages in the implementation while at it. Inspection of the generated code with this patch doesn't show any obvious quality degradation either, so not relying on __naked at all will make the code less fragile, and avoid some issues with clang. The only remaining __naked instances (excluding the kprobes test cases) are exynos_pm_power_up_setup(), tc2_pm_power_up_setup() and cci_enable_port_for_self(. But in the first two cases, only the function address is used by the compiler with no chance of inlining it by mistake, and the third case is called from assembly code only. And the fact that no stack is available when the corresponding code is executed does warrant the __naked usage in those cases. Signed-off-by: Nicolas Pitre <nico@linaro.org> Reviewed-by: Stefan Agner <stefan@agner.ch> Tested-by: Stefan Agner <stefan@agner.ch> Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2018-11-07 19:49:00 +03:00
mcr p15, 0, %2, c7, c10, 4 @ drain WB"
: "+&r" (kto), "+&r" (kfrom), "=&r" (tmp)
: "2" (PAGE_SIZE)
: "r2", "r3", "r4", "r5", "r6", "r7", "ip", "lr");
}
void feroceon_copy_user_highpage(struct page *to, struct page *from,
unsigned long vaddr, struct vm_area_struct *vma)
{
void *kto, *kfrom;
kto = kmap_atomic(to);
kfrom = kmap_atomic(from);
ARM: Flush user mapping on VIVT processors when copying a page Steven Walter <stevenrwalter@gmail.com> writes: > I've been tracking down an instance of userspace data corruption, > and I believe I have found a window during fork where data can be > lost. The corruption is occurring on an ARMv5 system with VIVT > caches. Here's the scenario in question. Thread A is forking, > Thread B is running in userspace: > > Thread A: flush_cache_mm() (dup_mmap) > Thread B: writes to a page in the above mm > Thread A: pte_wrprotect() the above page (copy_one_pte) > Thread B: writes to the same page again > > During thread B's second write, he'll take a fault and enter the > do_wp_page() case. We'll end up calling copy_page(), which notably > uses the kernel virtual addresses for the old and new pages. This > means that the new page does not necessarily have the data from the > first write. Now there are two conflicting copies of the same > cache-line in dcache. If the userspace cache-line flushes before > the kernel cache-line, we lose the changes made during the first > write. do_wp_page does call flush_dcache_page on the newly-copied > page, but there's still a window where the CPU could flush the > userspace cache-line before then. Resolve this by flushing the user mapping before copying the page on processors with a writeback VIVT cache. Note: this does have a performance impact, and so needs further consideration before being merged - can we optimize out some of the cache flushes if, eg, we know that the page isn't yet mapped? Thread: <e06498070903061426o5875ad13hc6328aa0d3f08ed7@mail.gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
2009-10-05 18:34:22 +04:00
flush_cache_page(vma, vaddr, page_to_pfn(from));
feroceon_copy_user_page(kto, kfrom);
kunmap_atomic(kfrom);
kunmap_atomic(kto);
}
void feroceon_clear_user_highpage(struct page *page, unsigned long vaddr)
{
void *ptr, *kaddr = kmap_atomic(page);
asm volatile ("\
mov r1, %2 \n\
mov r2, #0 \n\
mov r3, #0 \n\
mov r4, #0 \n\
mov r5, #0 \n\
mov r6, #0 \n\
mov r7, #0 \n\
mov ip, #0 \n\
mov lr, #0 \n\
1: stmia %0, {r2-r7, ip, lr} \n\
subs r1, r1, #1 \n\
mcr p15, 0, %0, c7, c14, 1 @ clean and invalidate D line\n\
add %0, %0, #32 \n\
bne 1b \n\
mcr p15, 0, r1, c7, c10, 4 @ drain WB"
: "=r" (ptr)
: "0" (kaddr), "I" (PAGE_SIZE / 32)
: "r1", "r2", "r3", "r4", "r5", "r6", "r7", "ip", "lr");
kunmap_atomic(kaddr);
}
struct cpu_user_fns feroceon_user_fns __initdata = {
.cpu_clear_user_highpage = feroceon_clear_user_highpage,
.cpu_copy_user_highpage = feroceon_copy_user_highpage,
};