IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
commit 0e25498f8cd43c1b5aa327f373dd094e9a006da7 upstream.
There are two big uses of do_exit. The first is it's design use to be
the guts of the exit(2) system call. The second use is to terminate
a task after something catastrophic has happened like a NULL pointer
in kernel code.
Add a function make_task_dead that is initialy exactly the same as
do_exit to cover the cases where do_exit is called to handle
catastrophic failure. In time this can probably be reduced to just a
light wrapper around do_task_dead. For now keep it exactly the same so
that there will be no behavioral differences introducing this new
concept.
Replace all of the uses of do_exit that use it for catastraphic
task cleanup with make_task_dead to make it clear what the code
is doing.
As part of this rename rewind_stack_do_exit
rewind_stack_and_make_dead.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e112b032a72c78f15d0c803c5dc6be444c2e6c66 upstream.
Currently in arm64, FDT is mapped to RO before it's passed to
early_init_dt_scan(). However, there might be some codes
(eg. commit "fdt: add support for rng-seed") that need to modify FDT
during init. Map FDT to RO after early fixups are done.
Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Will Deacon <will@kernel.org>
[mkbestas: fixed trivial conflicts for 4.19 backport]
Signed-off-by: Michael Bestas <mkbestas@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a9c406e6462ff14956d690de7bbe5131a5677dc9 upstream.
Adding a second set of vectors to .entry.tramp.text will make it
larger than a single 4K page.
Allow the trampoline text to occupy up to three pages by adding two
more fixmap slots. Previous changes to tramp_valias allowed it to reach
beyond a single page.
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit a194c5f2d2b3a05428805146afcabe5140b5d378 ]
The @node passed to cpumask_of_node() can be NUMA_NO_NODE, in that
case it will trigger the following WARN_ON(node >= nr_node_ids) due to
mismatched data types of @node and @nr_node_ids. Actually we should
return cpu_all_mask just like most other architectures do if passed
NUMA_NO_NODE.
Also add a similar check to the inline cpumask_of_node() in numa.h.
Signed-off-by: Zhengyuan Liu <liuzhengyuan@tj.kylinos.cn>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Link: https://lore.kernel.org/r/20200921023936.21846-1-liuzhengyuan@tj.kylinos.cn
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 027d0c7101f50cf03aeea9eebf484afd4920c8d3 upstream.
The static analyzer in GCC 10 spotted that in huge_pte_alloc() we may
pass a NULL pmdp into pte_alloc_map() when pmd_alloc() returns NULL:
| CC arch/arm64/mm/pageattr.o
| CC arch/arm64/mm/hugetlbpage.o
| from arch/arm64/mm/hugetlbpage.c:10:
| arch/arm64/mm/hugetlbpage.c: In function ‘huge_pte_alloc’:
| ./arch/arm64/include/asm/pgtable-types.h:28:24: warning: dereference of NULL ‘pmdp’ [CWE-690] [-Wanalyzer-null-dereference]
| ./arch/arm64/include/asm/pgtable.h:436:26: note: in expansion of macro ‘pmd_val’
| arch/arm64/mm/hugetlbpage.c:242:10: note: in expansion of macro ‘pte_alloc_map’
| |arch/arm64/mm/hugetlbpage.c:232:10:
| |./arch/arm64/include/asm/pgtable-types.h:28:24:
| ./arch/arm64/include/asm/pgtable.h:436:26: note: in expansion of macro ‘pmd_val’
| arch/arm64/mm/hugetlbpage.c:242:10: note: in expansion of macro ‘pte_alloc_map’
This can only occur when the kernel cannot allocate a page, and so is
unlikely to happen in practice before other systems start failing.
We can avoid this by bailing out if pmd_alloc() fails, as we do earlier
in the function if pud_alloc() fails.
Fixes: 66b3923a1a0f ("arm64: hugetlb: add support for PTE contiguous bit")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Kyrill Tkachov <kyrylo.tkachov@arm.com>
Cc: <stable@vger.kernel.org> # 4.5.x-
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 24cecc37746393432d994c0dbc251fb9ac7c5d72 upstream.
The ARMv8 64-bit architecture supports execute-only user permissions by
clearing the PTE_USER and PTE_UXN bits, practically making it a mostly
privileged mapping but from which user running at EL0 can still execute.
The downside, however, is that the kernel at EL1 inadvertently reading
such mapping would not trip over the PAN (privileged access never)
protection.
Revert the relevant bits from commit cab15ce604e5 ("arm64: Introduce
execute-only page access permissions") so that PROT_EXEC implies
PROT_READ (and therefore PTE_USER) until the architecture gains proper
support for execute-only user mappings.
Fixes: cab15ce604e5 ("arm64: Introduce execute-only page access permissions")
Cc: <stable@vger.kernel.org> # 4.9.x-
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 77cfe950901e5c13aca2df6437a05f39dd9a929b ]
The dummy node ID is marked into all memory ranges on the system. So the
dummy node really extends the entire memblock.memory. Hence report correct
extent information for the dummy node using memblock range helper functions
instead of the range [0LLU, PFN_PHYS(max_pfn) - 1)].
Fixes: 1a2db30034 ("arm64, numa: Add NUMA support for arm64 platforms")
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e8d54b62c55ab6201de6d195fc2c276294c1f6ae ]
Do not offset mmap base address because of stack randomization if current
task does not want randomization. Note that x86 already implements this
behaviour.
Link: http://lkml.kernel.org/r/20190730055113.23635-4-alex@ghiti.fr
Signed-off-by: Alexandre Ghiti <alex@ghiti.fr>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f32c7a8e45105bd0af76872bf6eef0438ff12fb2 ]
While the MMUs is disabled, I-cache speculation can result in
instructions being fetched from the PoC. During boot we may patch
instructions (e.g. for alternatives and jump labels), and these may be
dirty at the PoU (and stale at the PoC).
Thus, while the MMU is disabled in the KPTI pagetable fixup code we may
load stale instructions into the I-cache, potentially leading to
subsequent crashes when executing regions of code which have been
modified at runtime.
Similarly to commit:
8ec41987436d566f ("arm64: mm: ensure patched kernel text is fetched from PoU")
... we can invalidate the I-cache after enabling the MMU to prevent such
issues.
The KPTI pagetable fixup code itself should be clean to the PoC per the
boot protocol, so no maintenance is required for this code.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 0c1f14ed12262f45a3af1d588e4d7bd12438b8f5 ]
This change makes CONFIG_ZONE_DMA32 defuly y and allows users
to overwrite it only when CONFIG_EXPERT=y.
For the SoCs that do not need CONFIG_ZONE_DMA32, this is the
first step to manage all available memory by a single
zone(normal zone) to reduce the overhead of multiple zones.
The change also fixes a build error when CONFIG_NUMA=y and
CONFIG_ZONE_DMA32=n.
arch/arm64/mm/init.c:195:17: error: use of undeclared identifier 'ZONE_DMA32'
max_zone_pfns[ZONE_DMA32] = PFN_DOWN(max_zone_dma_phys());
Change since v1:
1. only expose CONFIG_ZONE_DMA32 when CONFIG_EXPERT=y
2. remove redundant IS_ENABLED(CONFIG_ZONE_DMA32)
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Miles Chen <miles.chen@mediatek.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 7ba36eccb3f83983a651efd570b4f933ecad1b5c ]
The arm64 ptdump code can race with concurrent modification of the
kernel page tables. At the time this was added, this was sound as:
* Modifications to leaf entries could result in stale information being
logged, but would not result in a functional problem.
* Boot time modifications to non-leaf entries (e.g. freeing of initmem)
were performed when the ptdump code cannot be invoked.
* At runtime, modifications to non-leaf entries only occurred in the
vmalloc region, and these were strictly additive, as intermediate
entries were never freed.
However, since commit:
commit 324420bf91f6 ("arm64: add support for ioremap() block mappings")
... it has been possible to create huge mappings in the vmalloc area at
runtime, and as part of this existing intermediate levels of table my be
removed and freed.
It's possible for the ptdump code to race with this, and continue to
walk tables which have been freed (and potentially poisoned or
reallocated). As a result of this, the ptdump code may dereference bogus
addresses, which could be fatal.
Since huge-vmap is a TLB and memory optimization, we can disable it when
the runtime ptdump code is in use to avoid this problem.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Fixes: 324420bf91f60582 ("arm64: add support for ioremap() block mappings")
Acked-by: Ard Biesheuvel <ard.biesheuvel@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 969f5ea627570e91c9d54403287ee3ed657f58fe upstream.
Revisions of the Cortex-A76 CPU prior to r4p0 are affected by an erratum
that can prevent interrupts from being taken when single-stepping.
This patch implements a software workaround to prevent userspace from
effectively being able to disable interrupts.
Cc: <stable@vger.kernel.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a98d9ae937d256ed679a935fc82d9deaa710d98e upstream.
DMA allocations that can't sleep may return non-remapped addresses, but
we do not properly handle them in the mmap and get_sgtable methods.
Resolve non-vmalloc addresses using virt_to_page to handle this corner
case.
Cc: <stable@vger.kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 827a108e354db633698f0b4a10c1ffd2b1f8d1d0 upstream.
When the CPU comes out of suspend, the firmware may have modified the OS
Double Lock Register. Save it in an unused slot of cpu_suspend_ctx, and
restore it on resume.
Cc: <stable@vger.kernel.org>
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit c8a43c18a97845e7f94ed7d181c11f41964976a2 ]
When KASLR is enabled (CONFIG_RANDOMIZE_BASE=y), the top 4K of kernel
virtual address space may be mapped to physical addresses despite being
reserved for ERR_PTR values.
Fix the randomization of the linear region so that we avoid mapping the
last page of the virtual address space.
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: liyueyi <liyueyi@live.com>
[will: rewrote commit message; merged in suggestion from Ard]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin (Microsoft) <sashal@kernel.org>
commit b9a4b9d084d978f80eb9210727c81804588b42ff upstream.
FAR_EL1 is UNKNOWN for all debug exceptions other than those caused by
taking a hardware watchpoint. Unfortunately, if a debug handler returns
a non-zero value, then we will propagate the UNKNOWN FAR value to
userspace via the si_addr field of the SIGTRAP siginfo_t.
Instead, let's set si_addr to take on the PC of the faulting instruction,
which we have available in the current pt_regs.
Cc: <stable@vger.kernel.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 132fdc379eb143932d209a20fd581e1ce7630960 upstream.
Commit 3b8c9f1cdfc5 ("arm64: IPI each CPU after invalidating the I-cache
for kernel mappings") was aimed at fixing the I-cache invalidation for
kernel mappings. However, it inadvertently caused all cache maintenance
for user mappings via set_pte_at() -> __sync_icache_dcache() ->
sync_icache_aliases() to call kick_all_cpus_sync().
Reported-by: Shijith Thotton <sthotton@marvell.com>
Tested-by: Shijith Thotton <sthotton@marvell.com>
Reported-by: Wandun Chen <chenwandun@huawei.com>
Fixes: 3b8c9f1cdfc5 ("arm64: IPI each CPU after invalidating the I-cache for kernel mappings")
Cc: <stable@vger.kernel.org> # 4.19.x-
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 33309ecda0070506c49182530abe7728850ebe78 ]
The dcache_by_line_op macro suffers from a couple of small problems:
First, the GAS directives that are currently being used rely on
assembler behavior that is not documented, and probably not guaranteed
to produce the correct behavior going forward. As a result, we end up
with some undefined symbols in cache.o:
$ nm arch/arm64/mm/cache.o
...
U civac
...
U cvac
U cvap
U cvau
This is due to the fact that the comparisons used to select the
operation type in the dcache_by_line_op macro are comparing symbols
not strings, and even though it seems that GAS is doing the right
thing here (undefined symbols by the same name are equal to each
other), it seems unwise to rely on this.
Second, when patching in a DC CVAP instruction on CPUs that support it,
the fallback path consists of a DC CVAU instruction which may be
affected by CPU errata that require ARM64_WORKAROUND_CLEAN_CACHE.
Solve these issues by unrolling the various maintenance routines and
using the conditional directives that are documented as operating on
strings. To avoid the complexity of nested alternatives, we move the
DC CVAP patching to __clean_dcache_area_pop, falling back to a branch
to __clean_dcache_area_poc if DCPOP is not supported by the CPU.
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Suggested-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 3238c359acee4ab57f15abb5a82b8ab38a661ee7 upstream.
We need to invalidate the caches *before* clearing the buffer via the
non-cacheable alias, else in the worst case __dma_flush_area() may
write back dirty lines over the top of our nice new zeros.
Fixes: dd65a941f6ba ("arm64: dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag")
Cc: <stable@vger.kernel.org> # 4.18.x-
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
For contiguous hugetlb, huge_ptep_set_access_flags performs a
get_clear_flush (which then flushes the TLBs) even when no change of ptes
is necessary.
Unfortunately, this behaviour can lead to back-to-back page faults being
generated when running with multiple threads that access the same
contiguous huge page.
Thread 1 | Thread 2
-----------------------------+------------------------------
hugetlb_fault |
huge_ptep_set_access_flags |
-> invalidate pte range | hugetlb_fault
continue processing | wait for hugetlb_fault_mutex
release mutex and return | huge_ptep_set_access_flags
| -> invalidate pte range
hugetlb_fault
...
This patch changes huge_ptep_set_access_flags s.t. we first read the
contiguous range of ptes (whilst preserving dirty information); the pte
range is only then invalidated where necessary and this prevents further
spurious page faults.
Fixes: d8bdcff28764 ("arm64: hugetlb: Add break-before-make logic for contiguous entries")
Reported-by: Lei Zhang <zhang.lei@jp.fujitsu.com>
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In the contiguous bit hugetlb break-before-make code we assume that all
hugetlb pages are young.
In fact, remove_migration_pte is able to place an old hugetlb pte so
this assumption is not valid.
This patch fixes the contiguous hugetlb scanning code to preserve young
ptes.
Fixes: d8bdcff28764 ("arm64: hugetlb: Add break-before-make logic for contiguous entries")
Signed-off-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In pmd_free_pte_page() and pud_free_pmd_page() we try to warn if they
hit a present non-table entry. In both cases we'll warn for non-present
entries, as the VM_WARN_ON() only checks the entry is not a table entry.
This has been observed to result in warnings when booting a v4.19-rc2
kernel under qemu.
Fix this by bailing out earlier for non-present entries.
Fixes: ec28bb9c9b0826d7 ("arm64: Implement page table free interfaces")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Merge updates from Andrew Morton:
- a few misc things
- a few Y2038 fixes
- ntfs fixes
- arch/sh tweaks
- ocfs2 updates
- most of MM
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (111 commits)
mm/hmm.c: remove unused variables align_start and align_end
fs/userfaultfd.c: remove redundant pointer uwq
mm, vmacache: hash addresses based on pmd
mm/list_lru: introduce list_lru_shrink_walk_irq()
mm/list_lru.c: pass struct list_lru_node* as an argument to __list_lru_walk_one()
mm/list_lru.c: move locking from __list_lru_walk_one() to its caller
mm/list_lru.c: use list_lru_walk_one() in list_lru_walk_node()
mm, swap: make CONFIG_THP_SWAP depend on CONFIG_SWAP
mm/sparse: delete old sparse_init and enable new one
mm/sparse: add new sparse_init_nid() and sparse_init()
mm/sparse: move buffer init/fini to the common place
mm/sparse: use the new sparse buffer functions in non-vmemmap
mm/sparse: abstract sparse buffer allocations
mm/hugetlb.c: don't zero 1GiB bootmem pages
mm, page_alloc: double zone's batchsize
mm/oom_kill.c: document oom_lock
mm/hugetlb: remove gigantic page support for HIGHMEM
mm, oom: remove sleep from under oom_lock
kernel/dma: remove unsupported gfp_mask parameter from dma_alloc_from_contiguous()
mm/cma: remove unsupported gfp_mask parameter from cma_alloc()
...
The CMA memory allocator doesn't support standard gfp flags for memory
allocation, so there is no point having it as a parameter for
dma_alloc_from_contiguous() function. Replace it by a boolean no_warn
argument, which covers all the underlaying cma_alloc() function
supports.
This will help to avoid giving false feeling that this function supports
standard gfp flags and callers can pass __GFP_ZERO to get zeroed buffer,
what has already been an issue: see commit dd65a941f6ba ("arm64:
dma-mapping: clear buffers allocated with FORCE_CONTIGUOUS flag").
Link: http://lkml.kernel.org/r/20180709122020eucas1p21a71b092975cb4a3b9954ffc63f699d1~-sqUFoa-h2939329393eucas1p2Y@eucas1p2.samsung.com
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Acked-by: Michał Nazarewicz <mina86@mina86.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use new return type vm_fault_t for fault handler. For now, this is just
documenting that the function returns a VM_FAULT value rather than an
errno. Once all instances are converted, vm_fault_t will become a
distinct type.
Ref-> commit 1c8f422059ae ("mm: change return type to vm_fault_t")
In this patch all the caller of handle_mm_fault() are changed to return
vm_fault_t type.
Link: http://lkml.kernel.org/r/20180617084810.GA6730@jordon-HP-15-Notebook-PC
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: James Hogan <jhogan@kernel.org>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Palmer Dabbelt <palmer@sifive.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: David S. Miller <davem@davemloft.net>
Cc: Richard Weinberger <richard@nod.at>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Levin, Alexander (Sasha Levin)" <alexander.levin@verizon.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Fix boot on Hikey-960 by avoiding an IPI with interrupts disabled
- Fix address truncation in pfn_valid() implementation
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCgAGBQJbdp+EAAoJELescNyEwWM0Ld8H/iJqjvPwNLRC0KGL/rCQJH70
D80qlNBnwlrs2eUJTeNeRVZC+t2l9vJIoT17W938WkjxV+DSGDsfFDy3/BQ7VTji
7e33mwFBNoH+feAfMYmzht3sRlvyZ0oqXSIq/GrdZ8a4Gg/6iNVz7K1kpboBVFXp
LFnFIN4I7mNwdl1nAyNmnU081MMWfyvgRB82Xd9eS00KCAm3ueHfkwBNcwkfulDg
RT2ZXPzwd3Yxsdy3Z+r1vyXMHAw2GjcYpL5pjvHf34zMdvqkk03sMsx2yReuSR1U
M6MpNCdZfWHgMlFWbsEoEOd0g0CF5s6TQK3hBqoUEE3AUVNrQ8ixZMip326axoQ=
=C2YW
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"A couple of arm64 fixes
- Fix boot on Hikey-960 by avoiding an IPI with interrupts disabled
- Fix address truncation in pfn_valid() implementation"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mm: check for upper PAGE_SHIFT bits in pfn_valid()
arm64: Avoid calling stop_machine() when patching jump labels
ARM64's pfn_valid() shifts away the upper PAGE_SHIFT bits of the input
before seeing if the PFN is valid. This leads to false positives when
some of the upper bits are set, but the lower bits match a valid PFN.
For example, the following userspace code looks up a bogus entry in
/proc/kpageflags:
int pagemap = open("/proc/self/pagemap", O_RDONLY);
int pageflags = open("/proc/kpageflags", O_RDONLY);
uint64_t pfn, val;
lseek64(pagemap, [...], SEEK_SET);
read(pagemap, &pfn, sizeof(pfn));
if (pfn & (1UL << 63)) { /* valid PFN */
pfn &= ((1UL << 55) - 1); /* clear flag bits */
pfn |= (1UL << 55);
lseek64(pageflags, pfn * sizeof(uint64_t), SEEK_SET);
read(pageflags, &val, sizeof(val));
}
On ARM64 this causes the userspace process to crash with SIGSEGV rather
than reading (1 << KPF_NOPAGE). kpageflags_read() treats the offset as
valid, and stable_page_flags() will try to access an address between the
user and kernel address ranges.
Fixes: c1cc1552616d ("arm64: MMU initialisation")
Cc: stable@vger.kernel.org
Signed-off-by: Greg Hackmann <ghackmann@google.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
A bunch of good stuff in here:
- Wire up support for qspinlock, replacing our trusty ticket lock code
- Add an IPI to flush_icache_range() to ensure that stale instructions
fetched into the pipeline are discarded along with the I-cache lines
- Support for the GCC "stackleak" plugin
- Support for restartable sequences, plus an arm64 port for the selftest
- Kexec/kdump support on systems booting with ACPI
- Rewrite of our syscall entry code in C, which allows us to zero the
GPRs on entry from userspace
- Support for chained PMU counters, allowing 64-bit event counters to be
constructed on current CPUs
- Ensure scheduler topology information is kept up-to-date with CPU
hotplug events
- Re-enable support for huge vmalloc/IO mappings now that the core code
has the correct hooks to use break-before-make sequences
- Miscellaneous, non-critical fixes and cleanups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCgAGBQJbbV41AAoJELescNyEwWM0WoEIALhrKtsIn6vqFlSs/w6aDuJL
cMWmFxjTaKLmIq2+cJIdFLOJ3CH80Pu9gB+nEv/k+cZdCTfUVKfRf28HTpmYWsht
bb4AhdHMC7yFW752BHk+mzJspeC8h/2Rm8wMuNVplZ3MkPrwo3vsiuJTofLhVL/y
BihlU3+5sfBvCYIsWnuEZIev+/I/s/qm1ASiqIcKSrFRZP6VTt5f9TC75vFI8seW
7yc3odKb0CArexB8yBjiPNziehctQF42doxQyL45hezLfWw4qdgHOSiwyiOMxEz9
Fwwpp8Tx33SKLNJgqoqYznGW9PhYJ7n2Kslv19uchJrEV+mds82vdDNaWRULld4=
=kQn6
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"A bunch of good stuff in here. Worth noting is that we've pulled in
the x86/mm branch from -tip so that we can make use of the core
ioremap changes which allow us to put down huge mappings in the
vmalloc area without screwing up the TLB. Much of the positive
diffstat is because of the rseq selftest for arm64.
Summary:
- Wire up support for qspinlock, replacing our trusty ticket lock
code
- Add an IPI to flush_icache_range() to ensure that stale
instructions fetched into the pipeline are discarded along with the
I-cache lines
- Support for the GCC "stackleak" plugin
- Support for restartable sequences, plus an arm64 port for the
selftest
- Kexec/kdump support on systems booting with ACPI
- Rewrite of our syscall entry code in C, which allows us to zero the
GPRs on entry from userspace
- Support for chained PMU counters, allowing 64-bit event counters to
be constructed on current CPUs
- Ensure scheduler topology information is kept up-to-date with CPU
hotplug events
- Re-enable support for huge vmalloc/IO mappings now that the core
code has the correct hooks to use break-before-make sequences
- Miscellaneous, non-critical fixes and cleanups"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (90 commits)
arm64: alternative: Use true and false for boolean values
arm64: kexec: Add comment to explain use of __flush_icache_range()
arm64: sdei: Mark sdei stack helper functions as static
arm64, kaslr: export offset in VMCOREINFO ELF notes
arm64: perf: Add cap_user_time aarch64
efi/libstub: Only disable stackleak plugin for arm64
arm64: drop unused kernel_neon_begin_partial() macro
arm64: kexec: machine_kexec should call __flush_icache_range
arm64: svc: Ensure hardirq tracing is updated before return
arm64: mm: Export __sync_icache_dcache() for xen-privcmd
drivers/perf: arm-ccn: Use devm_ioremap_resource() to map memory
arm64: Add support for STACKLEAK gcc plugin
arm64: Add stack information to on_accessible_stack
drivers/perf: hisi: update the sccl_id/ccl_id when MT is supported
arm64: fix ACPI dependencies
rseq/selftests: Add support for arm64
arm64: acpi: fix alignment fault in accessing ACPI
efi/arm: map UEFI memory map even w/o runtime services enabled
efi/arm: preserve early mapping of UEFI memory map longer for BGRT
drivers: acpi: add dependency of EFI for arm64
...
- Revert two ACPICA commits that are not needed any more (Erik
Schmauss).
- Rework property graph support in the ACPI device properties
framework to make it behave more like the analogous DT code
and update the documentation of it (Sakari Ailus).
- Change the default ACPI device status after initialization
to ACPI_STA_DEFAULT instead of 0 (Hans de Goede).
- Add a special platform driver for enumerating multiple I2C devices
hooked up to the same object in the ACPI tables (Hans de Goede).
- Fix the ACPI battery driver to avoid reporting full capacity on
systems without support for that and clean it up (Hans de Goede,
Dmitry Rozhkov, Lucas Rangit Magasweran).
- Add two system wakeup quirks to the ACPI EC driver (Aaron Ma,
Mika Westerberg).
- Add the touchscreen on Dell Venue Pro 7139 to the list of "always
present" devices to make it work (Tristian Celestin).
- Revert a special tables handling quirk for Dell XPS 9570 and
Precision M5530 which is not needed any more (Kai Heng Feng).
- Add support for a new OEM _OSI string to allow system vendors to
work around issues with NVidia HDMI audio (Alex Hung).
- Prevent the ACPI button driver from reporting excessive system
wakeup events and clean it up (Ravi Chandra Sadineni, Randy Dunlap).
- Clean up two minor code style issues in the ACPI core and GHES
handling on ARM64 (Dongjiu Geng, John Garry, Tom Todd).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJbcqQyAAoJEILEb/54YlRxpeIP/i+eB9sq+Pwwb6rpvrcx8JkQ
cWYbjBQPZLsqqHBQDlwVC4I5o0gn6OhRsQlgSQdxheXA+G6kRtbh942oPptIH7NY
vF0g0OYFZGhahjgkLyBtXlacMi+A+aSFkZ6lc4Ie4x/nFAELumjpYtef4v09Ecme
p1G5wz5KYv+47t0DHl+Lb1NglIHsLKtiNMiy9QHyXl3oxrcbW7VIG5o8INw6TgIg
/yaYOAzU8UMjnhcn6gDgrD+OKdux8jt3yAaHW/90/zBp4qbAJ9fp8zhb5LP7M0sE
r63pC2qUq7NDKAl99sMuX5I2jSub9d2yJcjcOSyt+FnB/ErsBCbJGm5rEkK4sRHv
wkV8ybHdq+PkOblR1ob5LMvDXcTlsVTMPonS/oPbEH4J7+jbJjU4vxZxblJGyan6
6GeiOoI6tMyZB7DBiTJH4EgwcNu6GESfPURnyUqRGCUswS7ocQBmRdSbsf6hRgn9
loOZnYXb+PPIxYHH9deFsej71d6En9bMUyUZmOGeLWZ+NSW69XISgnke0tmtgvlZ
pj3zJ7Egnw4SBms2gL/VtPEulxJ2MxScju1RbKMpjP86oIDrO4u3pD8tYSTouSSG
P3AjFXVqpKMTmfVUnbqu/Avg7GsuDHcv8lgUbY8BQGLXaZVni2oJBtcwc8Ca0lJg
8q0xTvX+YEnuOwgvNoNz
=5mW8
-----END PGP SIGNATURE-----
Merge tag 'acpi-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI updates from Rafael Wysocki:
"These revert two ACPICA commits that are not needed any more, rework
the property graphs support in ACPI to be more aligned with the
analogous DT code, add some new quirks and remove one that isn't
needed any more, add a special platform driver to enumerate multiple
I2C devices hooked up to the same device object in the ACPI tables and
update the battery and button drivers.
Specifics:
- Revert two ACPICA commits that are not needed any more (Erik
Schmauss).
- Rework property graph support in the ACPI device properties
framework to make it behave more like the analogous DT code and
update the documentation of it (Sakari Ailus).
- Change the default ACPI device status after initialization to
ACPI_STA_DEFAULT instead of 0 (Hans de Goede).
- Add a special platform driver for enumerating multiple I2C devices
hooked up to the same object in the ACPI tables (Hans de Goede).
- Fix the ACPI battery driver to avoid reporting full capacity on
systems without support for that and clean it up (Hans de Goede,
Dmitry Rozhkov, Lucas Rangit Magasweran).
- Add two system wakeup quirks to the ACPI EC driver (Aaron Ma, Mika
Westerberg).
- Add the touchscreen on Dell Venue Pro 7139 to the list of "always
present" devices to make it work (Tristian Celestin).
- Revert a special tables handling quirk for Dell XPS 9570 and
Precision M5530 which is not needed any more (Kai Heng Feng).
- Add support for a new OEM _OSI string to allow system vendors to
work around issues with NVidia HDMI audio (Alex Hung).
- Prevent the ACPI button driver from reporting excessive system
wakeup events and clean it up (Ravi Chandra Sadineni, Randy
Dunlap).
- Clean up two minor code style issues in the ACPI core and GHES
handling on ARM64 (Dongjiu Geng, John Garry, Tom Todd)"
* tag 'acpi-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (34 commits)
platform/x86: Add ACPI i2c-multi-instantiate pseudo driver
ACPI / x86: utils: Remove status workaround from acpi_device_always_present()
ACPI / scan: Create platform device for fwnodes with multiple i2c devices
ACPI / scan: Initialize status to ACPI_STA_DEFAULT
ACPI / EC: Add another entry for Thinkpad X1 Carbon 6th
ACPI: bus: Fix a pointer coding style issue
arm64 / ACPI: clean the additional checks before calling ghes_notify_sea()
ACPI / scan: Add static attribute to indirect_io_hosts[]
ACPI / battery: Do not export energy_full[_design] on devices without full_charge_capacity
ACPI / EC: Use ec_no_wakeup on ThinkPad X1 Yoga 3rd
ACPI / battery: get rid of negations in conditions
ACPI / battery: use specialized print macros
ACPI / battery: reorder headers alphabetically
ACPI / battery: drop inclusion of init.h
ACPI: battery: remove redundant old_present check on insertion
ACPI: property: graph: Update graph documentation to use generic references
ACPI: property: graph: Improve graph documentation for port/ep numbering
ACPI: property: graph: Fix graph documentation
ACPI: property: Update documentation for hierarchical data extension 1.1
ACPI: property: Document key numbering for hierarchical data extension refs
...
Pull x86 mm updates from Thomas Gleixner:
- Make lazy TLB mode even lazier to avoid pointless switch_mm()
operations, which reduces CPU load by 1-2% for memcache workloads
- Small cleanups and improvements all over the place
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Remove redundant check for kmem_cache_create()
arm/asm/tlb.h: Fix build error implicit func declaration
x86/mm/tlb: Make clear_asid_other() static
x86/mm/tlb: Skip atomic operations for 'init_mm' in switch_mm_irqs_off()
x86/mm/tlb: Always use lazy TLB mode
x86/mm/tlb: Only send page table free TLB flush to lazy TLB CPUs
x86/mm/tlb: Make lazy TLB mode lazier
x86/mm/tlb: Restructure switch_mm_irqs_off()
x86/mm/tlb: Leave lazy TLB mode at page table free time
mm: Allocate the mm_cpumask (mm->cpu_bitmap[]) dynamically based on nr_cpu_ids
x86/mm: Add TLB purge to free pmd/pte page interfaces
ioremap: Update pgtable free interfaces with addr
x86/mm: Disable ioremap free page handling on x86-PAE
In order to remove the additional check before calling the
ghes_notify_sea(), make stub definition when !CONFIG_ACPI_APEI_SEA.
After this cleanup, we can simply call the ghes_notify_sea() to let
APEI driver handle the SEA notification.
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 2c4541e24c55 ("mm: use vma_init() to initialize VMAs on stack and
data segments") tried to initialize various left-over ad-hoc vma's
"properly", but actually made things worse for the temporary vma's used
for TLB flushing.
vma_init() doesn't actually initialize all of the vma, just a few
fields, so doing something like
- struct vm_area_struct vma = { .vm_mm = tlb->mm, };
+ struct vm_area_struct vma;
+
+ vma_init(&vma, tlb->mm);
was actually very bad: instead of having a nicely initialized vma with
every field but "vm_mm" zeroed, you'd have an entirely uninitialized vma
with only a couple of fields initialized. And they weren't even fields
that the code in question mostly cared about.
The flush_tlb_range() function takes a "struct vma" rather than a
"struct mm_struct", because a few architectures actually care about what
kind of range it is - being able to only do an ITLB flush if it's a
range that doesn't have data accesses enabled, for example. And all the
normal users already have the vma for doing the range invalidation.
But a few people want to call flush_tlb_range() with a range they just
made up, so they also end up using a made-up vma. x86 just has a
special "flush_tlb_mm_range()" function for this, but other
architectures (arm and ia64) do the "use fake vma" thing instead, and
thus got caught up in the vma_init() changes.
At the same time, the TLB flushing code really doesn't care about most
other fields in the vma, so vma_init() is just unnecessary and
pointless.
This fixes things by having an explicit "this is just an initializer for
the TLB flush" initializer macro, which is used by the arm/arm64/ia64
people who mis-use this interface with just a dummy vma.
Fixes: 2c4541e24c55 ("mm: use vma_init() to initialize VMAs on stack and data segments")
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge misc fixes from Andrew Morton:
"11 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
kvm, mm: account shadow page tables to kmemcg
zswap: re-check zswap_is_full() after do zswap_shrink()
include/linux/eventfd.h: include linux/errno.h
mm: fix vma_is_anonymous() false-positives
mm: use vma_init() to initialize VMAs on stack and data segments
mm: introduce vma_init()
mm: fix exports that inadvertently make put_page() EXPORT_SYMBOL_GPL
ipc/sem.c: prevent queue.status tearing in semop
mm: disallow mappings that conflict for devm_memremap_pages()
kasan: only select SLUB_DEBUG with SYSFS=y
delayacct: fix crash in delayacct_blkio_end() after delayacct init failure
The xen-privcmd driver, which can be modular, calls set_pte_at()
which in turn may call __sync_icache_dcache().
The call to __sync_icache_dcache() may be optimised out because it is
conditional on !pte_special(), and xen-privcmd calls pte_mkspecial().
But it seems unwise to rely on this optimisation.
Fixes: 3ad0876554ca ("xen/privcmd: add IOCTL_PRIVCMD_MMAP_RESOURCE")
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Make sure to initialize all VMAs properly, not only those which come
from vm_area_cachep.
Link: http://lkml.kernel.org/r/20180724121139.62570-3-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd reports the following arm64 randconfig build error with the PSI
patches that add another page flag:
/git/arm-soc/arch/arm64/mm/init.c: In function 'mem_init':
/git/arm-soc/include/linux/compiler.h:357:38: error: call to
'__compiletime_assert_618' declared with attribute error: BUILD_BUG_ON
failed: sizeof(struct page) > (1 << STRUCT_PAGE_MAX_SHIFT)
The additional page flag causes other information stored in
page->flags to get bumped into their own struct page member:
#if SECTIONS_WIDTH+ZONES_WIDTH+NODES_SHIFT+LAST_CPUPID_SHIFT <=
BITS_PER_LONG - NR_PAGEFLAGS
#define LAST_CPUPID_WIDTH LAST_CPUPID_SHIFT
#else
#define LAST_CPUPID_WIDTH 0
#endif
#if defined(CONFIG_NUMA_BALANCING) && LAST_CPUPID_WIDTH == 0
#define LAST_CPUPID_NOT_IN_PAGE_FLAGS
#endif
which in turn causes the struct page size to exceed the size set in
STRUCT_PAGE_MAX_SHIFT. This value is an an estimate used to size the
VMEMMAP page array according to address space and struct page size.
However, the check is performed - and triggers here - on a !VMEMMAP
config, which consumes an additional 22 page bits for the sparse
section id. When VMEMMAP is enabled, those bits are returned, cpupid
doesn't need its own member, and the page passes the VMEMMAP check.
Restrict that check to the situation it was meant to check: that we
are sizing the VMEMMAP page array correctly.
Says Arnd:
Further experiments show that the build error already existed before,
but was only triggered with larger values of CONFIG_NR_CPU and/or
CONFIG_NODES_SHIFT that might be used in actual configurations but
not in randconfig builds.
With longer CPU and node masks, I could recreate the problem with
kernels as old as linux-4.7 when arm64 NUMA support got added.
Reported-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Arnd Bergmann <arnd@arndb.de>
Cc: stable@vger.kernel.org
Fixes: 1a2db300348b ("arm64, numa: Add NUMA support for arm64 platforms.")
Fixes: 3e1907d5bf5a ("arm64: mm: move vmemmap region right below the linear region")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Now that we have sysreg_clear_set(), we can consistently use this
instead of config_sctlr_el1().
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Dave Martin <dave.martin@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
lkdtm calls flush_icache_range(), which results in an out-of-line call
to __flush_icache_range(), which is not exported to modules.
Export the symbol to modules to fix this build breakage.
Fixes: 3b8c9f1cdfc5 ("arm64: IPI each CPU after invalidating the I-cache for kernel mappings")
Signed-off-by: Will Deacon <will.deacon@arm.com>
Currently numa_clear_node removes both cpu information from the NUMA
node cpumap as well as the NUMA node id from the cpu. Similarly
numa_store_cpu_info updates both percpu nodeid and NUMA cpumap.
However we need to retain the numa node id for the cpu and only remove
the cpu information from the numa node cpumap during CPU hotplug out.
The same can be extended for hotplugging in the CPU.
This patch separates out numa_{add,remove}_cpu from numa_clear_node and
numa_store_cpu_info.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Reviewed-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Tested-by: Ganapatrao Kulkarni <ganapatrao.kulkarni@cavium.com>
Tested-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
arm64 requires break-before-make. Originally, before
setting up new pmd/pud entry for huge mapping, in few
cases, the modifying pmd/pud entry was still valid
and pointing to next level page table as we only
clear off leaf PTE in unmap leg.
a) This was resulting into stale entry in TLBs (as few
TLBs also cache intermediate mapping for performance
reasons)
b) Also, modifying pmd/pud was the only reference to
next level page table and it was getting lost without
freeing it. So, page leaks were happening.
Implement pud_free_pmd_page() and pmd_free_pte_page() to
enforce BBM and also free the leaking page tables.
Implementation requires,
1) Clearing off the current pud/pmd entry
2) Invalidation of TLB
3) Freeing of the un-used next level page tables
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Chintan Pandya <cpandya@codeaurora.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Pull in core ioremap changes from -tip, since we depend on these for
re-enabling huge I/O mappings on arm64.
Signed-off-by: Will Deacon <will.deacon@arm.com>
When invalidating the instruction cache for a kernel mapping via
flush_icache_range(), it is also necessary to flush the pipeline for
other CPUs so that instructions fetched into the pipeline before the
I-cache invalidation are discarded. For example, if module 'foo' is
unloaded and then module 'bar' is loaded into the same area of memory,
a CPU could end up executing instructions from 'foo' when branching into
'bar' if these instructions were fetched into the pipeline before 'foo'
was unloaded.
Whilst this is highly unlikely to occur in practice, particularly as
any exception acts as a context-synchronizing operation, following the
letter of the architecture requires us to execute an ISB on each CPU
in order for the new instruction stream to be visible.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The following kernel panic was observed on ARM64 platform due to a stale
TLB entry.
1. ioremap with 4K size, a valid pte page table is set.
2. iounmap it, its pte entry is set to 0.
3. ioremap the same address with 2M size, update its pmd entry with
a new value.
4. CPU may hit an exception because the old pmd entry is still in TLB,
which leads to a kernel panic.
Commit b6bdb7517c3d ("mm/vmalloc: add interfaces to free unmapped page
table") has addressed this panic by falling to pte mappings in the above
case on ARM64.
To support pmd mappings in all cases, TLB purge needs to be performed
in this case on ARM64.
Add a new arg, 'addr', to pud_free_pmd_page() and pmd_free_pte_page()
so that TLB purge can be added later in seprate patches.
[toshi.kani@hpe.com: merge changes, rewrite patch description]
Fixes: 28ee90fe6048 ("x86/mm: implement free pmd/pte page interfaces")
Signed-off-by: Chintan Pandya <cpandya@codeaurora.org>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: mhocko@suse.com
Cc: akpm@linux-foundation.org
Cc: hpa@zytor.com
Cc: linux-mm@kvack.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: Will Deacon <will.deacon@arm.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: stable@vger.kernel.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/20180627141348.21777-3-toshi.kani@hpe.com
Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code.
Signed-off-by: Peng Donglin <dolinux.peng@gmail.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
When rewriting swapper using nG mappings, we must performance cache
maintenance around each page table access in order to avoid coherency
problems with the host's cacheable alias under KVM. To ensure correct
ordering of the maintenance with respect to Device memory accesses made
with the Stage-1 MMU disabled, DMBs need to be added between the
maintenance and the corresponding memory access.
This patch adds a missing DMB between writing a new page table entry and
performing a clean+invalidate on the same line.
Fixes: f992b4dfd58b ("arm64: kpti: Add ->enable callback to remap swapper using nG mappings")
Cc: <stable@vger.kernel.org> # 4.16.x-
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
dma_alloc_*() buffers might be exposed to userspace via mmap() call, so
they should be cleared on allocation. In case of IOMMU-based dma-mapping
implementation such buffer clearing was missing in the code path for
DMA_ATTR_FORCE_CONTIGUOUS flag handling, because dma_alloc_from_contiguous()
doesn't honor __GFP_ZERO flag. This patch fixes this issue. For more
information on clearing buffers allocated by dma_alloc_* functions,
see commit 6829e274a623 ("arm64: dma-mapping: always clear allocated
buffers").
Fixes: 44176bb38fa4 ("arm64: Add support for DMA_ATTR_FORCE_CONTIGUOUS to IOMMU")
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
With PHYS_ADDR_MAX there is now a type safe variant for all bits set.
Make use of it.
Patch created using a semantic patch as follows:
// <smpl>
@@
typedef phys_addr_t;
@@
-(phys_addr_t)ULLONG_MAX
+PHYS_ADDR_MAX
// </smpl>
Link: http://lkml.kernel.org/r/20180419214204.19322-1-stefan@agner.ch
Signed-off-by: Stefan Agner <stefan@agner.ch>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com> [arm64]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Spectre v4 mitigation (Speculative Store Bypass Disable) support for
arm64 using SMC firmware call to set a hardware chicken bit
- ACPI PPTT (Processor Properties Topology Table) parsing support and
enable the feature for arm64
- Report signal frame size to user via auxv (AT_MINSIGSTKSZ). The
primary motivation is Scalable Vector Extensions which requires more
space on the signal frame than the currently defined MINSIGSTKSZ
- ARM perf patches: allow building arm-cci as module, demote dev_warn()
to dev_dbg() in arm-ccn event_init(), miscellaneous cleanups
- cmpwait() WFE optimisation to avoid some spurious wakeups
- L1_CACHE_BYTES reverted back to 64 (for performance reasons that have
to do with some network allocations) while keeping ARCH_DMA_MINALIGN
to 128. cache_line_size() returns the actual hardware Cache Writeback
Granule
- Turn LSE atomics on by default in Kconfig
- Kernel fault reporting tidying
- Some #include and miscellaneous cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAlsaoqsACgkQa9axLQDI
XvH+8RAAqRCrEtkNPS7zxHyMK/D2cxSy9EVtlJ1sxhmsONEe5t5MDTWX9byobQ5A
PAKMSQBQgUvecqHLOtD7SJWef1il30zgWmc/yPcgNv3OsA1Au7j2g3ht/Drw+N5I
Vy0aOUEtw+Jzs7y/CJyl6lufSkkOzszOujt2Nybiz6omztOrwkW9isKnURzQBNj5
gquZI35h604YJ9F0TqS6ZqU7tNcuB9q02FxvVBpLmb83jP4jSEjYACUJwVVxvEAB
UXjdD4N130rRXDS5OMRWo5+4SAj+kPYhdVYEvaDx7xTOIRHhXK05GlJbsUAc5E6l
xy810fH5Dm0diYpVvYWTA5J+BU1jNOvCys5zKWl7gs2P8YB59PdqY4M2YBPNGb5H
PaVgq73TZAsww6ZInbZlK+wZOIxZZIOf//Z+QKn6EPtu3RmzIFWwyttTj01w1E3i
LhjcUoGnvxJFcMoCr59ihDwfP9nkCVrNc4REOGaWDk6L/t/bOfaZfDz+OCGbwQdL
akCFKZI6q5O/no+YfhtdtNFpCQb/Bo1J88KuotICRXq8z4vO41zIG53bi97W8QeG
rCBiX0NxUxYJ3ybus7kZHTmMGieMyEHP28n12QffwvJj4vJBsUXQBrV8hclx0djZ
HMt7iPi/0BW6nVV7ngIgN3cdCpaDCEGRsfO4Ch0rFZrC9UbYQnE=
=uums
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"Apart from the core arm64 and perf changes, the Spectre v4 mitigation
touches the arm KVM code and the ACPI PPTT support touches drivers/
(acpi and cacheinfo). I should have the maintainers' acks in place.
Summary:
- Spectre v4 mitigation (Speculative Store Bypass Disable) support
for arm64 using SMC firmware call to set a hardware chicken bit
- ACPI PPTT (Processor Properties Topology Table) parsing support and
enable the feature for arm64
- Report signal frame size to user via auxv (AT_MINSIGSTKSZ). The
primary motivation is Scalable Vector Extensions which requires
more space on the signal frame than the currently defined
MINSIGSTKSZ
- ARM perf patches: allow building arm-cci as module, demote
dev_warn() to dev_dbg() in arm-ccn event_init(), miscellaneous
cleanups
- cmpwait() WFE optimisation to avoid some spurious wakeups
- L1_CACHE_BYTES reverted back to 64 (for performance reasons that
have to do with some network allocations) while keeping
ARCH_DMA_MINALIGN to 128. cache_line_size() returns the actual
hardware Cache Writeback Granule
- Turn LSE atomics on by default in Kconfig
- Kernel fault reporting tidying
- Some #include and miscellaneous cleanups"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (53 commits)
arm64: Fix syscall restarting around signal suppressed by tracer
arm64: topology: Avoid checking numa mask for scheduler MC selection
ACPI / PPTT: fix build when CONFIG_ACPI_PPTT is not enabled
arm64: cpu_errata: include required headers
arm64: KVM: Move VCPU_WORKAROUND_2_FLAG macros to the top of the file
arm64: signal: Report signal frame size to userspace via auxv
arm64/sve: Thin out initialisation sanity-checks for sve_max_vl
arm64: KVM: Add ARCH_WORKAROUND_2 discovery through ARCH_FEATURES_FUNC_ID
arm64: KVM: Handle guest's ARCH_WORKAROUND_2 requests
arm64: KVM: Add ARCH_WORKAROUND_2 support for guests
arm64: KVM: Add HYP per-cpu accessors
arm64: ssbd: Add prctl interface for per-thread mitigation
arm64: ssbd: Introduce thread flag to control userspace mitigation
arm64: ssbd: Restore mitigation status on CPU resume
arm64: ssbd: Skip apply_ssbd if not using dynamic mitigation
arm64: ssbd: Add global mitigation state accessor
arm64: Add 'ssbd' command-line option
arm64: Add ARCH_WORKAROUND_2 probing
arm64: Add per-cpu infrastructure to call ARCH_WORKAROUND_2
arm64: Call ARCH_WORKAROUND_2 on transitions between EL0 and EL1
...