Commit Graph

428 Commits

Author SHA1 Message Date
Mark Rutland
623b476fc8 arm64: move sp_el0 and tpidr_el1 into cpu_suspend_ctx
When returning from idle, we rely on the fact that thread_info lives at
the end of the kernel stack, and restore this by masking the saved stack
pointer. Subsequent patches will sever the relationship between the
stack and thread_info, and to cater for this we must save/restore sp_el0
explicitly, storing it in cpu_suspend_ctx.

As cpu_suspend_ctx must be doubleword aligned, this leaves us with an
extra slot in cpu_suspend_ctx. We can use this to save/restore tpidr_el1
in the same way, which simplifies the code, avoiding pointer chasing on
the restore path (as we no longer need to load thread_info::cpu followed
by the relevant slot in __per_cpu_offset based on this).

This patch stashes both registers in cpu_suspend_ctx.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: James Morse <james.morse@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-11 18:25:44 +00:00
Huang Shijie
0c2f0afe35 arm64: hugetlb: fix the wrong address for several functions
The libhugetlbfs meets several failures since the following functions
do not use the correct address:
   huge_ptep_get_and_clear()
   huge_ptep_set_access_flags()
   huge_ptep_set_wrprotect()
   huge_ptep_clear_flush()

This patch fixes the wrong address for them.

Signed-off-by: Huang Shijie <shijie.huang@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-09 16:55:13 +00:00
Huang Shijie
20156ce236 arm64: hugetlb: remove the wrong pmd check in find_num_contig()
The find_num_contig() will return 1 when the pmd is not present.
It will cause a kernel dead loop in the following scenaro:

   1.) pmd entry is not present.

   2.) the page fault occurs:
       ... hugetlb_fault() --> hugetlb_no_page() --> set_huge_pte_at()

   3.) set_huge_pte_at() will only set the first PMD entry, since the
       find_num_contig just return 1 in this case. So the PMD entries
       are all empty except the first one.

   4.) when kernel accesses the address mapped by the second PMD entry,
       a new page fault occurs:
       ... hugetlb_fault() --> huge_ptep_set_access_flags()

       The second PMD entry is still empty now.

   5.) When the kernel returns, the access will cause a page fault again.
       The kernel will run like the "4)" above.
       We will see a dead loop since here.

The dead loop is caught in the 32M hugetlb page (2M PMD + Contiguous bit).

This patch removes wrong pmd check, and fixes this dead loop.

This patch also removes the redundant checks for PGD/PUD in
the find_num_contig().

Acked-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Huang Shijie <shijie.huang@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-09 16:54:55 +00:00
Catalin Marinas
6ed0038d5e arm64: Fix typo in add_default_hugepagesz() for 64K pages
The default hugepage size when 64K pages are enabled is set to 2MB using
the contiguous PTE bit. The add_default_hugepagesz(), however, uses
CONT_PMD_SHIFT instead of CONT_PTE_SHIFT. There is no functional change
since the values are the same.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-09 16:52:21 +00:00
Pratyush Anand
9842ceae9f arm64: Add uprobe support
This patch adds support for uprobe on ARM64 architecture.

Unit tests for following have been done so far and they have been found
working
    1. Step-able instructions, like sub, ldr, add etc.
    2. Simulation-able like ret, cbnz, cbz etc.
    3. uretprobe
    4. Reject-able instructions like sev, wfe etc.
    5. trapped and abort xol path
    6. probe at unaligned user address.
    7. longjump test cases

Currently it does not support aarch32 instruction probing.

Signed-off-by: Pratyush Anand <panand@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-07 18:15:21 +00:00
Laura Abbott
1404d6f13e arm64: dump: Add checking for writable and exectuable pages
Page mappings with full RWX permissions are a security risk. x86
has an option to walk the page tables and dump any bad pages.
(See e1a58320a3 ("x86/mm: Warn on W^X mappings")). Add a similar
implementation for arm64.

Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[catalin.marinas@arm.com: folded fix for KASan out of bounds from Mark Rutland]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-07 18:15:04 +00:00
Laura Abbott
ae5d1cf358 arm64: dump: Make the page table dumping seq_file optional
The page table dumping code always assumes it will be dumping to a
seq_file to userspace. Future code will be taking advantage of
the page table dumping code but will not need the seq_file. Make
the seq_file optional for these cases.

Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-07 18:15:04 +00:00
Laura Abbott
4ddb9bf833 arm64: dump: Make ptdump debugfs a separate option
ptdump_register currently initializes a set of page table information and
registers debugfs. There are uses for the ptdump option without wanting the
debugfs options. Split this out to make it a separate option.

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-07 18:15:04 +00:00
Ard Biesheuvel
0bfc445dec arm64: mm: set the contiguous bit for kernel mappings where appropriate
Now that we no longer allow live kernel PMDs to be split, it is safe to
start using the contiguous bit for kernel mappings. So set the contiguous
bit in the kernel page mappings for regions whose size and alignment are
suitable for this.

This enables the following contiguous range sizes for the virtual mapping
of the kernel image, and for the linear mapping:

          granule size |  cont PTE  |  cont PMD  |
          -------------+------------+------------+
               4 KB    |    64 KB   |   32 MB    |
              16 KB    |     2 MB   |    1 GB*   |
              64 KB    |     2 MB   |   16 GB*   |

* Only when built for 3 or more levels of translation. This is due to the
  fact that a 2 level configuration only consists of PGDs and PTEs, and the
  added complexity of dealing with folded PMDs is not justified considering
  that 16 GB contiguous ranges are likely to be ignored by the hardware (and
  16k/2 levels is a niche configuration)

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-07 18:15:04 +00:00
Ard Biesheuvel
f14c66ce81 arm64: mm: replace 'block_mappings_allowed' with 'page_mappings_only'
In preparation of adding support for contiguous PTE and PMD mappings,
let's replace 'block_mappings_allowed' with 'page_mappings_only', which
will be a more accurate description of the nature of the setting once we
add such contiguous mappings into the mix.

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-07 18:15:04 +00:00
Ard Biesheuvel
e98216b521 arm64: mm: BUG on unsupported manipulations of live kernel mappings
Now that we take care not manipulate the live kernel page tables in a
way that may lead to TLB conflicts, the case where a table mapping is
replaced by a block mapping can no longer occur. So remove the handling
of this at the PUD and PMD levels, and instead, BUG() on any occurrence
of live kernel page table manipulations that modify anything other than
the permission bits.

Since mark_rodata_ro() is the only caller where the kernel mappings that
are being manipulated are actually live, drop the various conditional
flush_tlb_all() invocations, and add a single call to mark_rodata_ro()
instead.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-07 18:15:03 +00:00
Robin Murphy
b7b941afe5 arm64: Remove pointless WARN_ON in DMA teardown
We expect arch_teardown_dma_ops() to be called very late in a device's
life, after it has been removed from its bus, and thus after the IOMMU
bus notifier has run. As such, even if this funny little check did make
sense, it's unlikely to achieve what it thinks it's trying to do anyway.
It's a residual trace of an earlier implementation which didn't belong
here from the start; belatedly snuff it out.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-11-07 18:15:03 +00:00
Hanjun Guo
3f7a09f44e arm64/numa: fix incorrect log for memory-less node
When booting on NUMA system with memory-less node (no
memory dimm on this memory controller), the print
for setup_node_data() is incorrect:

NUMA: Initmem setup node 2 [mem 0x00000000-0xffffffffffffffff]

It can be fixed by printing [mem 0x00000000-0x00000000] when
end_pfn is 0, but print <memory-less node> will be more useful.

Fixes: 1a2db30034 ("arm64, numa: Add NUMA support for arm64 platforms.")
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ganapatrao Kulkarni <gkulkarni@caviumnetworks.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yisheng Xie <xieyisheng1@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-10-26 18:21:51 +01:00
Yisheng Xie
26984c3bc2 arm64/numa: fix pcpu_cpu_distance() to get correct CPU proximity
The pcpu_build_alloc_info() function group CPUs according to their
proximity, by call callback function @cpu_distance_fn from different
ARCHs.

For arm64 the callback of @cpu_distance_fn is
    pcpu_cpu_distance(from, to)
        -> node_distance(from, to)
The @from and @to for function node_distance() should be nid.

However, pcpu_cpu_distance() in arch/arm64/mm/numa.c just past the
cpu id for @from and @to, and didn't convert to numa node id.

For this incorrect cpu proximity get from ARCH, it may cause each CPU
in one group and make group_cnt out of bound:

	setup_per_cpu_areas()
		pcpu_embed_first_chunk()
			pcpu_build_alloc_info()
in pcpu_build_alloc_info, since cpu_distance_fn will return
REMOTE_DISTANCE if we pass cpu ids (0,1,2...), so
cpu_distance_fn(cpu, tcpu) > LOCAL_DISTANCE will wrongly be ture.

This may results in triggering the BUG_ON(unit != nr_units) later:

[    0.000000] kernel BUG at mm/percpu.c:1916!
[    0.000000] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.9.0-rc1-00003-g14155ca-dirty #26
[    0.000000] Hardware name: Hisilicon Hi1616 Evaluation Board (DT)
[    0.000000] task: ffff000008d6e900 task.stack: ffff000008d60000
[    0.000000] PC is at pcpu_embed_first_chunk+0x420/0x704
[    0.000000] LR is at pcpu_embed_first_chunk+0x3bc/0x704
[    0.000000] pc : [<ffff000008c754f4>] lr : [<ffff000008c75490>] pstate: 800000c5
[    0.000000] sp : ffff000008d63eb0
[    0.000000] x29: ffff000008d63eb0 [    0.000000] x28: 0000000000000000
[    0.000000] x27: 0000000000000040 [    0.000000] x26: ffff8413fbfcef00
[    0.000000] x25: 0000000000000042 [    0.000000] x24: 0000000000000042
[    0.000000] x23: 0000000000001000 [    0.000000] x22: 0000000000000046
[    0.000000] x21: 0000000000000001 [    0.000000] x20: ffff000008cb3bc8
[    0.000000] x19: ffff8413fbfcf570 [    0.000000] x18: 0000000000000000
[    0.000000] x17: ffff000008e49ae0 [    0.000000] x16: 0000000000000003
[    0.000000] x15: 000000000000001e [    0.000000] x14: 0000000000000004
[    0.000000] x13: 0000000000000000 [    0.000000] x12: 000000000000006f
[    0.000000] x11: 00000413fbffff00 [    0.000000] x10: 0000000000000004
[    0.000000] x9 : 0000000000000000 [    0.000000] x8 : 0000000000000001
[    0.000000] x7 : ffff8413fbfcf63c [    0.000000] x6 : ffff000008d65d28
[    0.000000] x5 : ffff000008d65e50 [    0.000000] x4 : 0000000000000000
[    0.000000] x3 : ffff000008cb3cc8 [    0.000000] x2 : 0000000000000040
[    0.000000] x1 : 0000000000000040 [    0.000000] x0 : 0000000000000000
[...]
[    0.000000] Call trace:
[    0.000000] Exception stack(0xffff000008d63ce0 to 0xffff000008d63e10)
[    0.000000] 3ce0: ffff8413fbfcf570 0001000000000000 ffff000008d63eb0 ffff000008c754f4
[    0.000000] 3d00: ffff000008d63d50 ffff0000081af210 00000413fbfff010 0000000000001000
[    0.000000] 3d20: ffff000008d63d50 ffff0000081af220 00000413fbfff010 0000000000001000
[    0.000000] 3d40: 00000413fbfcef00 0000000000000004 ffff000008d63db0 ffff0000081af390
[    0.000000] 3d60: 00000413fbfcef00 0000000000001000 0000000000000000 0000000000001000
[    0.000000] 3d80: 0000000000000000 0000000000000040 0000000000000040 ffff000008cb3cc8
[    0.000000] 3da0: 0000000000000000 ffff000008d65e50 ffff000008d65d28 ffff8413fbfcf63c
[    0.000000] 3dc0: 0000000000000001 0000000000000000 0000000000000004 00000413fbffff00
[    0.000000] 3de0: 000000000000006f 0000000000000000 0000000000000004 000000000000001e
[    0.000000] 3e00: 0000000000000003 ffff000008e49ae0
[    0.000000] [<ffff000008c754f4>] pcpu_embed_first_chunk+0x420/0x704
[    0.000000] [<ffff000008c6658c>] setup_per_cpu_areas+0x38/0xc8
[    0.000000] [<ffff000008c608d8>] start_kernel+0x10c/0x390
[    0.000000] [<ffff000008c601d8>] __primary_switched+0x5c/0x64
[    0.000000] Code: b8018660 17ffffd7 6b16037f 54000080 (d4210000)
[    0.000000] ---[ end trace 0000000000000000 ]---
[    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!

Fix by getting cpu's node id with early_cpu_to_node() then pass it
to node_distance() as the original intention.

Fixes: 7af3a0a992 ("arm64/numa: support HAVE_SETUP_PER_CPU_AREA")
Signed-off-by: Yisheng Xie <xieyisheng1@huawei.com>
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-10-26 18:21:51 +01:00
Mark Rutland
f7881bd644 arm64: remove pr_cont abuse from mem_init
All the lines printed by mem_init are independent, with each ending with
a newline. While they logically form a large block, none are actually
continuations of previous lines.

The kernel-side printk code and the userspace demsg tool differ in their
handling of KERN_CONT following a newline, and while this isn't always a
problem kernel-side, it does cause difficulty for userspace. Using
pr_cont causes the userspace tool to not print line prefix (e.g.
timestamps) even when following a newline, mis-aligning the output and
making it harder to read, e.g.

[    0.000000] Virtual kernel memory layout:
[    0.000000]     modules : 0xffff000000000000 - 0xffff000008000000   (   128 MB)
    vmalloc : 0xffff000008000000 - 0xffff7dffbfff0000   (129022 GB)
      .text : 0xffff000008080000 - 0xffff0000088b0000   (  8384 KB)
    .rodata : 0xffff0000088b0000 - 0xffff000008c50000   (  3712 KB)
      .init : 0xffff000008c50000 - 0xffff000008d50000   (  1024 KB)
      .data : 0xffff000008d50000 - 0xffff000008e25200   (   853 KB)
       .bss : 0xffff000008e25200 - 0xffff000008e6bec0   (   284 KB)
    fixed   : 0xffff7dfffe7fd000 - 0xffff7dfffec00000   (  4108 KB)
    PCI I/O : 0xffff7dfffee00000 - 0xffff7dffffe00000   (    16 MB)
    vmemmap : 0xffff7e0000000000 - 0xffff800000000000   (  2048 GB maximum)
              0xffff7e0000000000 - 0xffff7e0026000000   (   608 MB actual)
    memory  : 0xffff800000000000 - 0xffff800980000000   ( 38912 MB)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=6, Nodes=1

Fix this by using pr_notice consistently for all lines, which both the
kernel and userspace are happy with.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-10-20 15:27:56 +01:00
James Morse
7209c86860 arm64: mm: Set PSTATE.PAN from the cpu_enable_pan() call
Commit 338d4f49d6 ("arm64: kernel: Add support for Privileged Access
Never") enabled PAN by enabling the 'SPAN' feature-bit in SCTLR_EL1.
This means the PSTATE.PAN bit won't be set until the next return to the
kernel from userspace. On a preemptible kernel we may schedule work that
accesses userspace on a CPU before it has done this.

Now that cpufeature enable() calls are scheduled via stop_machine(), we
can set PSTATE.PAN from the cpu_enable_pan() call.

Add WARN_ON_ONCE(in_interrupt()) to check the PSTATE value we updated
is not immediately discarded.

Reported-by: Tony Thompson <anthony.thompson@arm.com>
Reported-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
[will: fixed typo in comment]
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-10-20 09:50:53 +01:00
James Morse
2a6dcb2b5f arm64: cpufeature: Schedule enable() calls instead of calling them via IPI
The enable() call for a cpufeature/errata is called using on_each_cpu().
This issues a cross-call IPI to get the work done. Implicitly, this
stashes the running PSTATE in SPSR when the CPU receives the IPI, and
restores it when we return. This means an enable() call can never modify
PSTATE.

To allow PAN to do this, change the on_each_cpu() call to use
stop_machine(). This schedules the work on each CPU which allows
us to modify PSTATE.

This involves changing the protype of all the enable() functions.

enable_cpu_capabilities() is called during boot and enables the feature
on all online CPUs. This path now uses stop_machine(). CPU features for
hotplug'd CPUs are enabled by verify_local_cpu_features() which only
acts on the local CPU, and can already modify the running PSTATE as it
is called from secondary_start_kernel().

Reported-by: Tony Thompson <anthony.thompson@arm.com>
Reported-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-10-20 09:50:53 +01:00
Linus Torvalds
56e520c7a0 IOMMU Updates for Linux v4.9
Including:
 
 	* Support for interrupt virtualization in the AMD IOMMU driver.
 	  These patches were shared with the KVM tree and are already
 	  merged through that tree.
 
 	* Generic DT-binding support for the ARM-SMMU driver. With this
 	  the driver now makes use of the generic DMA-API code. This
 	  also required some changes outside of the IOMMU code, but
 	  these are acked by the respective maintainers.
 
 	* More cleanups and fixes all over the place.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJX/PmUAAoJECvwRC2XARrjyloP/1hymxXC2yXZ4EIBTHSO5X+c
 jSJaGTIbQAdQDpllscSNJ0Au43L3vGtJcHo4JqwEERNlwLsU82LH7QJhq+q1La/b
 5cPaY5gI3E++qxQt8umuZJAIUQthFYrfGoS5lJc5t5r/d8iVsLWbW4VkR19/1o7A
 4/Uz7ETmi9VVy8Hkvumx+PQ0VHJet381KB7ud9LU5Spim0En2AAGwZXLMkmxXd2W
 uDQ+O1rlDVc2/ka3+GmfZEml5EASWRqS/MTNoU/ZbQGYWKCWygXbuiqt6gLudWjx
 dCR1Knh68b0gN6k/QAj8XY/1gkfmZ3YkfS0AHIMLYTFRT51BuxOrkXrBdkYnWEBv
 UirmaiV87SlR1j83yb3ZmjpBPvd2sGWYFDqY1P0riLutjGUS6zycWWs13olvbfbz
 SFrH7PT7JPQGYprI1oVn4ihszjN1NZ4+Gj7QBhyFW6FtvqTzmaFVsMOlDIeg1FwR
 k8cOzov4NG33Bp4IpsHK8e0/qV6K3oJOiOQgCyQp9kPKK+UWv9v9+HaEA7npJuRV
 c+lTE6j3G4LjEoVybkqm8TiPKxTMVNjUjgA3kwB2yNkCQT7hTCNYIAFrtfCYjYdo
 B1dnFE7feVqtoimnu2qvkVs59hWlF7Hc3RRHoBxMmO8DLwl9n2OcmoQIeCTsviss
 i9aNwC9bzBs+Hd3X/psB
 =1hFE
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull IOMMU updates from Joerg Roedel:

 - support for interrupt virtualization in the AMD IOMMU driver. These
   patches were shared with the KVM tree and are already merged through
   that tree.

 - generic DT-binding support for the ARM-SMMU driver. With this the
   driver now makes use of the generic DMA-API code. This also required
   some changes outside of the IOMMU code, but these are acked by the
   respective maintainers.

 - more cleanups and fixes all over the place.

* tag 'iommu-updates-v4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (40 commits)
  iommu/amd: No need to wait iommu completion if no dte irq entry change
  iommu/amd: Free domain id when free a domain of struct dma_ops_domain
  iommu/amd: Use standard bitmap operation to set bitmap
  iommu/amd: Clean up the cmpxchg64 invocation
  iommu/io-pgtable-arm: Check for v7s-incapable systems
  iommu/dma: Avoid PCI host bridge windows
  iommu/dma: Add support for mapping MSIs
  iommu/arm-smmu: Set domain geometry
  iommu/arm-smmu: Wire up generic configuration support
  Docs: dt: document ARM SMMU generic binding usage
  iommu/arm-smmu: Convert to iommu_fwspec
  iommu/arm-smmu: Intelligent SMR allocation
  iommu/arm-smmu: Add a stream map entry iterator
  iommu/arm-smmu: Streamline SMMU data lookups
  iommu/arm-smmu: Refactor mmu-masters handling
  iommu/arm-smmu: Keep track of S2CR state
  iommu/arm-smmu: Consolidate stream map entry state
  iommu/arm-smmu: Handle stream IDs more dynamically
  iommu/arm-smmu: Set PRIVCFG in stage 1 STEs
  iommu/arm-smmu: Support non-PCI devices with SMMUv3
  ...
2016-10-11 12:52:41 -07:00
Linus Torvalds
7af8a0f808 arm64 updates for 4.9:
- Support for execute-only page permissions
 - Support for hibernate and DEBUG_PAGEALLOC
 - Support for heterogeneous systems with mismatches cache line sizes
 - Errata workarounds (A53 843419 update and QorIQ A-008585 timer bug)
 - arm64 PMU perf updates, including cpumasks for heterogeneous systems
 - Set UTS_MACHINE for building rpm packages
 - Yet another head.S tidy-up
 - Some cleanups and refactoring, particularly in the NUMA code
 - Lots of random, non-critical fixes across the board
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCgAGBQJX7k31AAoJELescNyEwWM0XX0H/iOaWCfKlWOhvBsStGUCsLrK
 XryTzQT2KjdnLKf3jwP+1ateCuBR5ROurYxoDCX5/7mD63c5KiI338Vbv61a1lE1
 AAwjt1stmQVUg/j+kqnuQwB/0DYg+2C8se3D3q5Iyn7zc19cDZJEGcBHNrvLMufc
 XgHrgHgl/rzBDDlHJXleknDFge/MfhU5/Q1vJMRRb4JYrpAtmIokzCO75CYMRcCT
 ND2QbmppKtsyuFPGUTVbAFzJlP6dGKb3eruYta7/ct5d0pJQxav3u98D2yWGfjdM
 YaYq1EmX5Pol7rWumqLtk0+mA9yCFcKLLc+PrJu20Vx0UkvOq8G8Xt70sHNvZU8=
 =gdPM
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:
 "It's a bit all over the place this time with no "killer feature" to
  speak of.  Support for mismatched cache line sizes should help people
  seeing whacky JIT failures on some SoCs, and the big.LITTLE perf
  updates have been a long time coming, but a lot of the changes here
  are cleanups.

  We stray outside arch/arm64 in a few areas: the arch/arm/ arch_timer
  workaround is acked by Russell, the DT/OF bits are acked by Rob, the
  arch_timer clocksource changes acked by Marc, CPU hotplug by tglx and
  jump_label by Peter (all CC'd).

  Summary:

   - Support for execute-only page permissions
   - Support for hibernate and DEBUG_PAGEALLOC
   - Support for heterogeneous systems with mismatches cache line sizes
   - Errata workarounds (A53 843419 update and QorIQ A-008585 timer bug)
   - arm64 PMU perf updates, including cpumasks for heterogeneous systems
   - Set UTS_MACHINE for building rpm packages
   - Yet another head.S tidy-up
   - Some cleanups and refactoring, particularly in the NUMA code
   - Lots of random, non-critical fixes across the board"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (100 commits)
  arm64: tlbflush.h: add __tlbi() macro
  arm64: Kconfig: remove SMP dependence for NUMA
  arm64: Kconfig: select OF/ACPI_NUMA under NUMA config
  arm64: fix dump_backtrace/unwind_frame with NULL tsk
  arm/arm64: arch_timer: Use archdata to indicate vdso suitability
  arm64: arch_timer: Work around QorIQ Erratum A-008585
  arm64: arch_timer: Add device tree binding for A-008585 erratum
  arm64: Correctly bounds check virt_addr_valid
  arm64: migrate exception table users off module.h and onto extable.h
  arm64: pmu: Hoist pmu platform device name
  arm64: pmu: Probe default hw/cache counters
  arm64: pmu: add fallback probe table
  MAINTAINERS: Update ARM PMU PROFILING AND DEBUGGING entry
  arm64: Improve kprobes test for atomic sequence
  arm64/kvm: use alternative auto-nop
  arm64: use alternative auto-nop
  arm64: alternative: add auto-nop infrastructure
  arm64: lse: convert lse alternatives NOP padding to use __nops
  arm64: barriers: introduce nops and __nops macros for NOP sequences
  arm64: sysreg: replace open-coded mrs_s/msr_s with {read,write}_sysreg_s
  ...
2016-10-03 08:58:35 -07:00
Joerg Roedel
6e0a16673c Merge branch 'for-joerg/arm-smmu/updates' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux into arm/smmu 2016-09-20 13:24:14 +02:00
Paul Gortmaker
0edfa83916 arm64: migrate exception table users off module.h and onto extable.h
These files were only including module.h for exception table
related functions.  We've now separated that content out into its
own file "extable.h" so now move over to that and avoid all the
extra header content in module.h that we don't really need to compile
these files.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-20 09:36:21 +01:00
Robin Murphy
fade1ec055 iommu/dma: Avoid PCI host bridge windows
With our DMA ops enabled for PCI devices, we should avoid allocating
IOVAs which a host bridge might misinterpret as peer-to-peer DMA and
lead to faults, corruption or other badness. To be safe, punch out holes
for all of the relevant host bridge's windows when initialising a DMA
domain for a PCI device.

CC: Marek Szyprowski <m.szyprowski@samsung.com>
CC: Inki Dae <inki.dae@samsung.com>
Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-16 09:34:22 +01:00
Mark Rutland
6ba3b554f5 arm64: use alternative auto-nop
Make use of the new alternative_if and alternative_else_nop_endif and
get rid of our homebew NOP sleds, making the code simpler to read.

Note that for cpu_do_switch_mm the ret has been moved out of the
alternative sequence, and in the default case there will be three
additional NOPs executed.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-12 10:46:07 +01:00
Zhen Lei
7ba5f605f3 arm64/numa: remove the limitation that cpu0 must bind to node0
1. Remove the old binding code.
2. Read the nid of cpu0 from dts.
3. Fallback the nid of cpu0 to 0 when numa=off is set in bootargs.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:59:09 +01:00
Zhen Lei
df7ffa34cc arm64/numa: remove some useless code
When the deleted code is executed, only the bit of cpu0 was set on
cpu_possible_mask. So that, only set_cpu_numa_node(0, NUMA_NO_NODE); will
be executed. And map_cpu_to_node(0, 0) will soon be called. So these code
can be safely removed.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:59:09 +01:00
Zhen Lei
7af3a0a992 arm64/numa: support HAVE_SETUP_PER_CPU_AREA
To make each percpu area allocated from its local numa node. Without this
patch, all percpu areas will be allocated from the node which cpu0 belongs
to.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:59:09 +01:00
Kefeng Wang
f11c7bacd5 arm64: numa: Use pr_fmt()
Use pr_fmt to prefix kernel output, and remove duplicated msg
of NUMA turned off.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:59:09 +01:00
Zhen Lei
794224ea56 arm64/numa: avoid inconsistent information to be printed
numa_init may return error because of numa configuration error. So "No
NUMA configuration found" is inaccurate. In fact, specific configuration
error information should be immediately printed by the testing branch.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-09 14:59:08 +01:00
Kefeng Wang
dae8c235d9 arm64: mm: drop fixup_init() and mm.h
There is only fixup_init() in mm.h , and it is only called
in free_initmem(), so move the codes from fixup_init() into
free_initmem(), then drop fixup_init() and mm.h.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-09-06 19:09:38 +01:00
James Morse
744c6c37cc arm64: kernel: Fix unmasked debug exceptions when restoring mdscr_el1
Changes to make the resume from cpu_suspend() code behave more like
secondary boot caused debug exceptions to be unmasked early by
__cpu_setup(). We then go on to restore mdscr_el1 in cpu_do_resume(),
potentially taking break or watch points based on uninitialised registers.

Mask debug exceptions in cpu_do_resume(), which is specific to resume
from cpu_suspend(). Debug exceptions will be restored to their original
state by local_dbg_restore() in cpu_suspend(), which runs after
hw_breakpoint_restore() has re-initialised the other registers.

Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Fixes: cabe1c81ea ("arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va")
Cc: <stable@vger.kernel.org> # 4.7+
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-09-02 17:19:55 +01:00
James Morse
5ebe3a44cc arm64: hibernate: Support DEBUG_PAGEALLOC
DEBUG_PAGEALLOC removes the valid bit of page table entries to prevent
any access to unallocated memory. Hibernate uses this as a hint that those
pages don't need to be saved/restored. This patch adds the
kernel_page_present() function it uses.

hibernate.c copies the resume kernel's linear map for use during restore.
Add _copy_pte() to fill-in the holes made by DEBUG_PAGEALLOC in the resume
kernel, so we can restore data the original kernel had at these addresses.

Finally, DEBUG_PAGEALLOC means the linear-map alias of KERNEL_START to
KERNEL_END may have holes in it, so we can't lazily clean this whole
area to the PoC. Only clean the new mmuoff region, and the kernel/kvm
idmaps.

This reverts commit da24eb1f3f.

Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-25 18:00:30 +01:00
James Morse
b611303811 arm64: vmlinux.ld: Add mmuoff data sections and move mmuoff text into idmap
Resume from hibernate needs to clean any text executed by the kernel with
the MMU off to the PoC. Collect these functions together into the
.idmap.text section as all this code is tightly coupled and also needs
the same cleaning after resume.

Data is more complicated, secondary_holding_pen_release is written with
the MMU on, clean and invalidated, then read with the MMU off. In contrast
__boot_cpu_mode is written with the MMU off, the corresponding cache line
is invalidated, so when we read it with the MMU on we don't get stale data.
These cache maintenance operations conflict with each other if the values
are within a Cache Writeback Granule (CWG) of each other.
Collect the data into two sections .mmuoff.data.read and .mmuoff.data.write,
the linker script ensures mmuoff.data.write section is aligned to the
architectural maximum CWG of 2KB.

Signed-off-by: James Morse <james.morse@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-25 18:00:30 +01:00
Catalin Marinas
cab15ce604 arm64: Introduce execute-only page access permissions
The ARMv8 architecture allows execute-only user permissions by clearing
the PTE_UXN and PTE_USER bits. However, the kernel running on a CPU
implementation without User Access Override (ARMv8.2 onwards) can still
access such page, so execute-only page permission does not protect
against read(2)/write(2) etc. accesses. Systems requiring such
protection must enable features like SECCOMP.

This patch changes the arm64 __P100 and __S100 protection_map[] macros
to the new __PAGE_EXECONLY attributes. A side effect is that
pte_user() no longer triggers for __PAGE_EXECONLY since PTE_USER isn't
set. To work around this, the check is done on the PTE_NG bit via the
pte_ng() macro. VM_READ is also checked now for page faults.

Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-25 18:00:29 +01:00
Jisheng Zhang
5a9e3e156e arm64: apply __ro_after_init to some objects
These objects are set during initialization, thereafter are read only.

Previously I only want to mark vdso_pages, vdso_spec, vectors_page and
cpu_ops as __read_mostly from performance point of view. Then inspired
by Kees's patch[1] to apply more __ro_after_init for arm, I think it's
better to mark them as __ro_after_init. What's more, I find some more
objects are also read only after init. So apply __ro_after_init to all
of them.

This patch also removes global vdso_pagelist and tries to clean up
vdso_spec[] assignment code.

[1] http://www.spinics.net/lists/arm-kernel/msg523188.html

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Jisheng Zhang <jszhang@marvell.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-22 12:32:29 +01:00
Kwangwoo Lee
d34fdb7081 arm64: mm: convert __dma_* routines to use start, size
__dma_* routines have been converted to use start and size instread of
start and end addresses. The patch was origianlly for adding
__clean_dcache_area_poc() which will be used in pmem driver to clean
dcache to the PoC(Point of Coherency) in arch_wb_cache_pmem().

The functionality of __clean_dcache_area_poc()  was equivalent to
__dma_clean_range(). The difference was __dma_clean_range() uses the end
address, but __clean_dcache_area_poc() uses the size to clean.

Thus, __clean_dcache_area_poc() has been revised with a fallthrough
function of __dma_clean_range() after the change that __dma_* routines
use start and size instead of using start and end.

As a consequence of using start and size, the name of __dma_* routines
has also been altered following the terminology below:
    area: takes a start and size
    range: takes a start and end

Reviewed-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Kwangwoo Lee <kwangwoo.lee@sk.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-22 10:00:48 +01:00
Catalin Marinas
a93a4d6232 arm64: Fix shift warning in arch/arm64/mm/dump.c
When building with 48-bit VAs and 16K page configuration, it's possible
to get the following warning when building the arm64 page table dumping
code:

arch/arm64/mm/dump.c: In function ‘walk_pud’:
arch/arm64/mm/dump.c:274:102: warning: right shift count >= width of type [-Wshift-count-overflow]

This is because pud_offset(pgd, 0) performs a shift to the right by 36
while the value 0 has the type 'int' by default, therefore 32-bit.

This patch modifies all the p*_offset() uses in arch/arm64/mm/dump.c to
use 0UL for the address argument.

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-08-18 12:38:11 +01:00
Catalin Marinas
bfe6c8a89e arm64: Fix NUMA build error when !CONFIG_ACPI
Since asm/acpi.h is only included by linux/acpi.h when CONFIG_ACPI is
enabled, disabling the latter leads to the following build error on
arm64:

arch/arm64/mm/numa.c: In function ‘arm64_numa_init’:
arch/arm64/mm/numa.c:395:24: error: ‘arm64_acpi_numa_init’ undeclared (first use in this function)
   if (!acpi_disabled && !numa_init(arm64_acpi_numa_init))

This patch include the asm/acpi.h explicitly in arch/arm64/mm/numa.c for
the arm64_acpi_numa_init() definition.

Fixes: d8b47fca8c ("arm64, ACPI, NUMA: NUMA support based on SRAT and SLIT")
Reviewed-by: Hanjun Guo <hanjun.guo@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-08-17 17:16:58 +01:00
Laura Abbott
9adeb8e72d arm64: Handle el1 synchronous instruction aborts cleanly
Executing from a non-executable area gives an ugly message:

lkdtm: Performing direct entry EXEC_RODATA
lkdtm: attempting ok execution at ffff0000084c0e08
lkdtm: attempting bad execution at ffff000008880700
Bad mode in Synchronous Abort handler detected on CPU2, code 0x8400000e -- IABT (current EL)
CPU: 2 PID: 998 Comm: sh Not tainted 4.7.0-rc2+ #13
Hardware name: linux,dummy-virt (DT)
task: ffff800077e35780 ti: ffff800077970000 task.ti: ffff800077970000
PC is at lkdtm_rodata_do_nothing+0x0/0x8
LR is at execute_location+0x74/0x88

The 'IABT (current EL)' indicates the error but it's a bit cryptic
without knowledge of the ARM ARM. There is also no indication of the
specific address which triggered the fault. The increase in kernel
page permissions makes hitting this case more likely as well.
Handling the case in the vectors gives a much more familiar looking
error message:

lkdtm: Performing direct entry EXEC_RODATA
lkdtm: attempting ok execution at ffff0000084c0840
lkdtm: attempting bad execution at ffff000008880680
Unable to handle kernel paging request at virtual address ffff000008880680
pgd = ffff8000089b2000
[ffff000008880680] *pgd=00000000489b4003, *pud=0000000048904003, *pmd=0000000000000000
Internal error: Oops: 8400000e [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 997 Comm: sh Not tainted 4.7.0-rc1+ #24
Hardware name: linux,dummy-virt (DT)
task: ffff800077f9f080 ti: ffff800008a1c000 task.ti: ffff800008a1c000
PC is at lkdtm_rodata_do_nothing+0x0/0x8
LR is at execute_location+0x74/0x88

Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-08-12 17:58:48 +01:00
Linus Torvalds
194d6ad32e arm64 fixes:
- Fix HugeTLB leak due to CoW and PTE_RDONLY mismatch
 - Avoid accessing unmapped FDT fields when checking validity
 - Correctly account for vDSO AUX entry in ARCH_DLINFO
 - Fix kallsyms with absolute expressions in linker script
 - Kill unnecessary symbol-based relocs in vmlinux
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCgAGBQJXpFZ5AAoJELescNyEwWM0PI4IALsTuHRzClOSMDLiqMUj8t+5
 WNAcqybxAjCOVxAHckhweju++TeJBxcRH1nvBoNwiHIdHTv4fq1TZ3PeEq9kWMg5
 JbKjYjvd9dW8k6LXMya8iXCYtG3kzbNejkNpOTVebC86yvas1IiEjNb/ztPdhJeM
 HBSOkhfk8RcskfNxhuscZzGXbbdH9/R+XSTNRHN/RwCZH8PlInmduD9BbMvDhZyP
 NLFonD2IgQ4as1kYG/HdIcw0BamHiURjd043+gyoqMvm7JjPksRzlQnr91SMkX17
 LykXjHYPi2Me3aTrZ1NtkUNd5FHLHZ6/b9Wg6nA19d5KWkd3ER9uSJqGxkkbnt0=
 =dtGK
 -----END PGP SIGNATURE-----

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:

 - fix HugeTLB leak due to CoW and PTE_RDONLY mismatch

 - avoid accessing unmapped FDT fields when checking validity

 - correctly account for vDSO AUX entry in ARCH_DLINFO

 - fix kallsyms with absolute expressions in linker script

 - kill unnecessary symbol-based relocs in vmlinux

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Fix copy-on-write referencing in HugeTLB
  arm64: mm: avoid fdt_check_header() before the FDT is fully mapped
  arm64: Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFO
  arm64: relocatable: suppress R_AARCH64_ABS64 relocations in vmlinux
  arm64: vmlinux.lds: make __rela_offset and __dynsym_offset ABSOLUTE
2016-08-06 08:58:59 -04:00
Krzysztof Kozlowski
00085f1efa dma-mapping: use unsigned long for dma_attrs
The dma-mapping core and the implementations do not change the DMA
attributes passed by pointer.  Thus the pointer can point to const data.
However the attributes do not have to be a bitfield.  Instead unsigned
long will do fine:

1. This is just simpler.  Both in terms of reading the code and setting
   attributes.  Instead of initializing local attributes on the stack
   and passing pointer to it to dma_set_attr(), just set the bits.

2. It brings safeness and checking for const correctness because the
   attributes are passed by value.

Semantic patches for this change (at least most of them):

    virtual patch
    virtual context

    @r@
    identifier f, attrs;

    @@
    f(...,
    - struct dma_attrs *attrs
    + unsigned long attrs
    , ...)
    {
    ...
    }

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
     )

and

    // Options: --all-includes
    virtual patch
    virtual context

    @r@
    identifier f, attrs;
    type t;

    @@
    t f(..., struct dma_attrs *attrs);

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
     )

Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Acked-by: Vineet Gupta <vgupta@synopsys.com>
Acked-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Acked-by: Mark Salter <msalter@redhat.com> [c6x]
Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> [cris]
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> [drm]
Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Fabien Dessenne <fabien.dessenne@st.com> [bdisp]
Reviewed-by: Marek Szyprowski <m.szyprowski@samsung.com> [vb2-core]
Acked-by: David Vrabel <david.vrabel@citrix.com> [xen]
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [xen swiotlb]
Acked-by: Joerg Roedel <jroedel@suse.de> [iommu]
Acked-by: Richard Kuo <rkuo@codeaurora.org> [hexagon]
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Acked-by: Gerald Schaefer <gerald.schaefer@de.ibm.com> [s390]
Acked-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Acked-by: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no> [avr32]
Acked-by: Vineet Gupta <vgupta@synopsys.com> [arc]
Acked-by: Robin Murphy <robin.murphy@arm.com> [arm64 and dma-iommu]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-08-04 08:50:07 -04:00
Ard Biesheuvel
04a8481061 arm64: mm: avoid fdt_check_header() before the FDT is fully mapped
As reported by Zijun, the fdt_check_header() call in __fixmap_remap_fdt()
is not safe since it is not guaranteed that the FDT header is mapped
completely. Due to the minimum alignment of 8 bytes, the only fields we
can assume to be mapped are 'magic' and 'totalsize'.

Since the OF layer is in charge of validating the FDT image, and we are
only interested in making reasonably sure that the size field contains
a meaningful value, replace the fdt_check_header() call with an explicit
comparison of the magic field's value against the expected value.

Cc: <stable@vger.kernel.org>
Reported-by: Zijun Hu <zijun_hu@htc.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2016-08-01 14:17:01 +01:00
Dennis Chen
cb0a650213 arm64:acpi: fix the acpi alignment exception when 'mem=' specified
When booting an ACPI enabled kernel with 'mem=x', there is the
possibility that ACPI data regions from the firmware will lie above the
memory limit.  Ordinarily these will be removed by
memblock_enforce_memory_limit(.).

Unfortunately, this means that these regions will then be mapped by
acpi_os_ioremap(.) as device memory (instead of normal) thus unaligned
accessess will then provoke alignment faults.

In this patch we adopt memblock_mem_limit_remove_map instead, and this
preserves these ACPI data regions (marked NOMAP) thus ensuring that
these regions are not mapped as device memory.

For example, below is an alignment exception observed on ARM platform
when booting the kernel with 'acpi=on mem=8G':

  ...
  Unable to handle kernel paging request at virtual address ffff0000080521e7
  pgd = ffff000008aa0000
  [ffff0000080521e7] *pgd=000000801fffe003, *pud=000000801fffd003, *pmd=000000801fffc003, *pte=00e80083ff1c1707
  Internal error: Oops: 96000021 [#1] PREEMPT SMP
  Modules linked in:
  CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc3-next-20160616+ #172
  Hardware name: AMD Overdrive/Supercharger/Default string, BIOS ROD1001A 02/09/2016
  task: ffff800001ef0000 ti: ffff800001ef8000 task.ti: ffff800001ef8000
  PC is at acpi_ns_lookup+0x520/0x734
  LR is at acpi_ns_lookup+0x4a4/0x734
  pc : [<ffff0000083b8b10>] lr : [<ffff0000083b8a94>] pstate: 60000045
  sp : ffff800001efb8b0
  x29: ffff800001efb8c0 x28: 000000000000001b
  x27: 0000000000000001 x26: 0000000000000000
  x25: ffff800001efb9e8 x24: ffff000008a10000
  x23: 0000000000000001 x22: 0000000000000001
  x21: ffff000008724000 x20: 000000000000001b
  x19: ffff0000080521e7 x18: 000000000000000d
  x17: 00000000000038ff x16: 0000000000000002
  x15: 0000000000000007 x14: 0000000000007fff
  x13: ffffff0000000000 x12: 0000000000000018
  x11: 000000001fffd200 x10: 00000000ffffff76
  x9 : 000000000000005f x8 : ffff000008725fa8
  x7 : ffff000008a8df70 x6 : ffff000008a8df70
  x5 : ffff000008a8d000 x4 : 0000000000000010
  x3 : 0000000000000010 x2 : 000000000000000c
  x1 : 0000000000000006 x0 : 0000000000000000
  ...
    acpi_ns_lookup+0x520/0x734
    acpi_ds_load1_begin_op+0x174/0x4fc
    acpi_ps_build_named_op+0xf8/0x220
    acpi_ps_create_op+0x208/0x33c
    acpi_ps_parse_loop+0x204/0x838
    acpi_ps_parse_aml+0x1bc/0x42c
    acpi_ns_one_complete_parse+0x1e8/0x22c
    acpi_ns_parse_table+0x8c/0x128
    acpi_ns_load_table+0xc0/0x1e8
    acpi_tb_load_namespace+0xf8/0x2e8
    acpi_load_tables+0x7c/0x110
    acpi_init+0x90/0x2c0
    do_one_initcall+0x38/0x12c
    kernel_init_freeable+0x148/0x1ec
    kernel_init+0x10/0xec
    ret_from_fork+0x10/0x40
  Code: b9009fbc 2a00037b 36380057 3219037b (b9400260)
  ---[ end trace 03381e5eb0a24de4 ]---
  Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

With 'efi=debug', we can see those ACPI regions loaded by firmware on
that board as:

  efi:   0x0083ff185000-0x0083ff1b4fff [Reserved           |   |  |  |  |  |  |  |   |WB|WT|WC|UC]*
  efi:   0x0083ff1b5000-0x0083ff1c2fff [ACPI Reclaim Memory|   |  |  |  |  |  |  |   |WB|WT|WC|UC]*
  efi:   0x0083ff223000-0x0083ff224fff [ACPI Memory NVS    |   |  |  |  |  |  |  |   |WB|WT|WC|UC]*

Link: http://lkml.kernel.org/r/1468475036-5852-3-git-send-email-dennis.chen@arm.com
Acked-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Dennis Chen <dennis.chen@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Kaly Xin <kaly.xin@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-28 16:07:41 -07:00
Linus Torvalds
e831101a73 arm64 updates for 4.8:
- Kexec support for arm64
 - Kprobes support
 - Expose MIDR_EL1 and REVIDR_EL1 CPU identification registers to sysfs
 - Trapping of user space cache maintenance operations and emulation in
   the kernel (CPU errata workaround)
 - Clean-up of the early page tables creation (kernel linear mapping, EFI
   run-time maps) to avoid splitting larger blocks (e.g. pmds) into
   smaller ones (e.g. ptes)
 - VDSO support for CLOCK_MONOTONIC_RAW in clock_gettime()
 - ARCH_HAS_KCOV enabled for arm64
 - Optimise IP checksum helpers
 - SWIOTLB optimisation to only allocate/initialise the buffer if the
   available RAM is beyond the 32-bit mask
 - Properly handle the "nosmp" command line argument
 - Fix for the initialisation of the CPU debug state during early boot
 - vdso-offsets.h build dependency workaround
 - Build fix when RANDOMIZE_BASE is enabled with MODULES off
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXmF/UAAoJEGvWsS0AyF7x+jwP/2fErtX6FTXmdG0c3HBkTpuy
 gEuzN2ByWbP6Io+unLC6NvbQQb1q6c73PTqjsoeMHUx2o8YK3jgWEBcC+7AuepoZ
 YGl3r08e75a/fGrgNwEQQC1lNlgjpog4kzVDh5ji6oRXNq+OkjJGUtRPe3gBoqxv
 NAjviciID/MegQaq4SaMd26AmnjuUGKogo5vlIaXK0SemX9it+ytW7eLAXuVY+gW
 EvO3Nxk0Y5oZKJF8qRw6oLSmw1bwn2dD26OgfXfCiI30QBookRyWIoXRedUOZmJq
 D0+Tipd7muO4PbjlxS8aY/wd/alfnM5+TJ6HpGDo+Y1BDauXfiXMf3ktDFE5QvJB
 KgtICmC0stWwbDT35dHvz8sETsrCMA2Q/IMrnyxG+nj9BxVQU7rbNrxfCXesJy7Q
 4EsQbcTyJwu+ECildBezfoei99XbFZyWk2vKSkTCFKzgwXpftGFaffgZ3DIzBAHH
 IjecDqIFENC8ymrjyAgrGjeFG+2WB/DBgoSS3Baiz6xwQqC4wFMnI3jPECtJjb/U
 6e13f+onXu5lF1YFKAiRjGmqa/G1ZMr+uKZFsembuGqsZdAPkzzUHyAE9g4JVO8p
 t3gc3/M3T7oLSHuw4xi1/Ow5VGb2UvbslFrp7OpuFZ7CJAvhKlHL5rPe385utsFE
 7++5WHXHAegeJCDNAKY2
 =iJOY
 -----END PGP SIGNATURE-----

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Catalin Marinas:

 - Kexec support for arm64

 - Kprobes support

 - Expose MIDR_EL1 and REVIDR_EL1 CPU identification registers to sysfs

 - Trapping of user space cache maintenance operations and emulation in
   the kernel (CPU errata workaround)

 - Clean-up of the early page tables creation (kernel linear mapping,
   EFI run-time maps) to avoid splitting larger blocks (e.g.  pmds) into
   smaller ones (e.g.  ptes)

 - VDSO support for CLOCK_MONOTONIC_RAW in clock_gettime()

 - ARCH_HAS_KCOV enabled for arm64

 - Optimise IP checksum helpers

 - SWIOTLB optimisation to only allocate/initialise the buffer if the
   available RAM is beyond the 32-bit mask

 - Properly handle the "nosmp" command line argument

 - Fix for the initialisation of the CPU debug state during early boot

 - vdso-offsets.h build dependency workaround

 - Build fix when RANDOMIZE_BASE is enabled with MODULES off

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (64 commits)
  arm64: arm: Fix-up the removal of the arm64 regs_query_register_name() prototype
  arm64: Only select ARM64_MODULE_PLTS if MODULES=y
  arm64: mm: run pgtable_page_ctor() on non-swapper translation table pages
  arm64: mm: make create_mapping_late() non-allocating
  arm64: Honor nosmp kernel command line option
  arm64: Fix incorrect per-cpu usage for boot CPU
  arm64: kprobes: Add KASAN instrumentation around stack accesses
  arm64: kprobes: Cleanup jprobe_return
  arm64: kprobes: Fix overflow when saving stack
  arm64: kprobes: WARN if attempting to step with PSTATE.D=1
  arm64: debug: remove unused local_dbg_{enable, disable} macros
  arm64: debug: remove redundant spsr manipulation
  arm64: debug: unmask PSTATE.D earlier
  arm64: localise Image objcopy flags
  arm64: ptrace: remove extra define for CPSR's E bit
  kprobes: Add arm64 case in kprobe example module
  arm64: Add kernel return probes support (kretprobes)
  arm64: Add trampoline code for kretprobes
  arm64: kprobes instruction simulation support
  arm64: Treat all entry code as non-kprobe-able
  ...
2016-07-27 11:16:05 -07:00
Linus Torvalds
0e06f5c0de Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:

 - a few misc bits

 - ocfs2

 - most(?) of MM

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (125 commits)
  thp: fix comments of __pmd_trans_huge_lock()
  cgroup: remove unnecessary 0 check from css_from_id()
  cgroup: fix idr leak for the first cgroup root
  mm: memcontrol: fix documentation for compound parameter
  mm: memcontrol: remove BUG_ON in uncharge_list
  mm: fix build warnings in <linux/compaction.h>
  mm, thp: convert from optimistic swapin collapsing to conservative
  mm, thp: fix comment inconsistency for swapin readahead functions
  thp: update Documentation/{vm/transhuge,filesystems/proc}.txt
  shmem: split huge pages beyond i_size under memory pressure
  thp: introduce CONFIG_TRANSPARENT_HUGE_PAGECACHE
  khugepaged: add support of collapse for tmpfs/shmem pages
  shmem: make shmem_inode_info::lock irq-safe
  khugepaged: move up_read(mmap_sem) out of khugepaged_alloc_page()
  thp: extract khugepaged from mm/huge_memory.c
  shmem, thp: respect MADV_{NO,}HUGEPAGE for file mappings
  shmem: add huge pages support
  shmem: get_unmapped_area align huge page
  shmem: prepare huge= mount option and sysfs knob
  mm, rmap: account shmem thp pages
  ...
2016-07-26 19:55:54 -07:00
Kirill A. Shutemov
dcddffd41d mm: do not pass mm_struct into handle_mm_fault
We always have vma->vm_mm around.

Link: http://lkml.kernel.org/r/1466021202-61880-8-git-send-email-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Ard Biesheuvel
1378dc3d4b arm64: mm: run pgtable_page_ctor() on non-swapper translation table pages
The kernel page table creation routines are accessible to other subsystems
(e.g., EFI) via the create_pgd_mapping() entry point, which allows mappings
to be created that are not covered by init_mm.

Since generic code such as apply_to_page_range() may expect translation
table pages that are not associated with init_mm to be covered by fully
constructed struct pages, add a call to pgtable_page_ctor() in the alloc
function used by create_pgd_mapping. Since it is no longer used by
create_mapping_late(), also update the name of this function to better
reflect its purpose.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Laura Abbott <labbott@redhat.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-07-25 17:43:36 +01:00
Ard Biesheuvel
258a1605c7 arm64: mm: make create_mapping_late() non-allocating
The only purpose served by create_mapping_late() is to remap the already
mapped .text and .rodata kernel segments with read-only permissions. Since
we no longer allow block mappings to be split or merged,
create_mapping_late() should not pass an allocation function pointer into
__create_pgd_mapping(). So pass NULL instead.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Laura Abbott <labbott@redhat.com>
Tested-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-07-25 17:42:53 +01:00
Rafael J. Wysocki
d85f4eb699 Merge branch 'acpi-numa'
* acpi-numa:
  ACPI / NUMA: Enable ACPI based NUMA on ARM64
  arm64, ACPI, NUMA: NUMA support based on SRAT and SLIT
  ACPI / processor: Add acpi_map_madt_entry()
  ACPI / NUMA: Improve SRAT error detection and add messages
  ACPI / NUMA: Move acpi_numa_memory_affinity_init() to drivers/acpi/numa.c
  ACPI / NUMA: remove unneeded acpi_numa=1
  ACPI / NUMA: move bad_srat() and srat_disabled() to drivers/acpi/numa.c
  x86 / ACPI / NUMA: cleanup acpi_numa_processor_affinity_init()
  arm64, NUMA: Cleanup NUMA disabled messages
  arm64, NUMA: rework numa_add_memblk()
  ACPI / NUMA: move acpi_numa_slit_init() to drivers/acpi/numa.c
  ACPI / NUMA: Move acpi_numa_arch_fixup() to ia64 only
  ACPI / NUMA: remove duplicate NULL check
  ACPI / NUMA: Replace ACPI_DEBUG_PRINT() with pr_debug()
  ACPI / NUMA: Use pr_fmt() instead of printk
2016-07-25 13:40:39 +02:00
Catalin Marinas
a95b0644b3 Merge branch 'for-next/kprobes' into for-next/core
* kprobes:
  arm64: kprobes: Add KASAN instrumentation around stack accesses
  arm64: kprobes: Cleanup jprobe_return
  arm64: kprobes: Fix overflow when saving stack
  arm64: kprobes: WARN if attempting to step with PSTATE.D=1
  kprobes: Add arm64 case in kprobe example module
  arm64: Add kernel return probes support (kretprobes)
  arm64: Add trampoline code for kretprobes
  arm64: kprobes instruction simulation support
  arm64: Treat all entry code as non-kprobe-able
  arm64: Blacklist non-kprobe-able symbol
  arm64: Kprobes with single stepping support
  arm64: add conditional instruction simulation support
  arm64: Add more test functions to insn.c
  arm64: Add HAVE_REGS_AND_STACK_ACCESS_API feature
2016-07-21 18:20:41 +01:00
Will Deacon
2ce39ad151 arm64: debug: unmask PSTATE.D earlier
Clearing PSTATE.D is one of the requirements for generating a debug
exception. The arm64 booting protocol requires that PSTATE.D is set,
since many of the debug registers (for example, the hw_breakpoint
registers) are UNKNOWN out of reset and could potentially generate
spurious, fatal debug exceptions in early boot code if PSTATE.D was
clear. Once the debug registers have been safely initialised, PSTATE.D
is cleared, however this is currently broken for two reasons:

(1) The boot CPU clears PSTATE.D in a postcore_initcall and secondary
    CPUs clear PSTATE.D in secondary_start_kernel. Since the initcall
    runs after SMP (and the scheduler) have been initialised, there is
    no guarantee that it is actually running on the boot CPU. In this
    case, the boot CPU is left with PSTATE.D set and is not capable of
    generating debug exceptions.

(2) In a preemptible kernel, we may explicitly schedule on the IRQ
    return path to EL1. If an IRQ occurs with PSTATE.D set in the idle
    thread, then we may schedule the kthread_init thread, run the
    postcore_initcall to clear PSTATE.D and then context switch back
    to the idle thread before returning from the IRQ. The exception
    return path will then restore PSTATE.D from the stack, and set it
    again.

This patch fixes the problem by moving the clearing of PSTATE.D earlier
to proc.S. This has the desirable effect of clearing it in one place for
all CPUs, long before we have to worry about the scheduler or any
exception handling. We ensure that the previous reset of MDSCR_EL1 has
completed before unmasking the exception, so that any spurious
exceptions resulting from UNKNOWN debug registers are not generated.

Without this patch applied, the kprobes selftests have been seen to fail
under KVM, where we end up attempting to step the OOL instruction buffer
with PSTATE.D set and therefore fail to complete the step.

Cc: <stable@vger.kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2016-07-19 16:56:46 +01:00