146010 Commits

Author SHA1 Message Date
Linus Torvalds
45df60cd2c ARM: SoC fixes for 4.17
Here is a very small set of fixes for inclusion in linux-4.17-rc1:
 Two changes for the maintainer file, and one more fix for the newly
 added npcm platform, to enable the level 2 cache controller.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJazib3AAoJEGCrR//JCVIndE8P/1vRE7HcR+DPVwjDWGdfDhQ+
 LuF86+eZRwxHewykA+RQpsdQ8c2N2g+5oL20EvtJpInS/2iE36evZicFIMHQKfnL
 cbij5fi2/hlCRGWfscr2g5Zq1KNPSLqyfRms5z27TRddYoZMYqMg67TwwkQ0tdmf
 WeQ+2Dg17SaZecGFLWmr8cJlo+bx6U32KYHOMo7X8mhW91GPEHLqh/u+LhVQPnnt
 LdZ4IcMH4lC/uXl39qojT8aWmzm2fcKkYBJAMbm3cL3tu76Jyy8rEy5AAdqjHsin
 Xb+wcnCfk6q+dO3OjFhA3bvgDw4LUIyntMiFQvANWWMiGrmIynw9VYIMiQ2Y0S15
 Kv9e2uwkUHiuWJZCDbFCa9pa0kDmxMpoC45wDBR2ktuAa4HB/BP6PKTFrLC0Fxag
 KJvRRDklxHDWbsA+mZTser1mEKG9kslnGNfF3LCALWnFkMgWKiV1NuLr8HucaO7d
 7b7ppofmjva+qcHjDd8I9r4qFnLcWDJHZg3cZus+4WjTHt5ws+ueb0pYPNIKvnE1
 dhvjY9dSmeEZrM8rqLcUD9Npnm4GA+ySFpFty1jPP4hb0Q+Ho1p8BssP1Ktxw8mX
 FQ59x/RYj/5+yPj7st0ERZsxir4D39H3PlYsTWeqjZCCG8VMgBX+UND0ozh7ytYM
 1+1r1us28shsL5Xp6HRo
 =kiZ6
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Arnd Bergmann:
 "Here is a very small set of fixes for inclusion in linux-4.17-rc1: Two
  changes for the maintainer file, and one more fix for the newly added
  npcm platform, to enable the level 2 cache controller"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  MAINTAINERS: Update ASPEED entry with details
  MAINTAINERS: Migrate oxnas list to groups.io
  arm: npcm: enable L2 cache in NPCM7xx architecture
2018-04-11 16:12:21 -07:00
Linus Torvalds
b82b6813ff nios2 update for v4.17-rc1
nios2: Use read_persistent_clock64() instead of read_persistent_clock()
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJawyvYAAoJEFWoEK+e3syCrUoP/j/0Tdqhsh4WULaFh7InUJ/x
 Ww0C83PuOvzfmqgv8e9a7tciWjdd+lhs7jAROd2FLxa+WYwr1wdnvWZfie2XEscN
 p4clq28tTcNql3pRGFnjd//qDFziZf0XuKkdi2J6AnAu1nP3tY+y9KZD8mpgF0/9
 K4v7Xb51Grj4ltk10t1wZhsuTbkEw+fysZQFEiEwBrir11vVZi+gWpYdRetN8GrQ
 xw7MDxj1rK6tGDL+0vIsmmOKp0CLZKNrkUItBhb2aQUEadhJzZn7CMIhmkRGzLLF
 6eouYZ0rdS8rGWjn1otolXsnKaqPlHoLKR8oG/b4zpQzxNd55Aa11kEKzSNXySXH
 stFJj5lG4DGho2UJmEWk54NHRtbtNEWo0VXBMHgGtg9qoaNQBVVN+4wvV2EKfD3u
 jrGCH1V9ZEHsOsAGbip8mKSELZfxnWF9UBinxw252ZnZoWpLXISaxY/Z45ubK6jv
 t5hNWOwHQO2zvcSHEMn6Q+19CYumFpD7IAYN4sMuIYPrxOJcZfMhdkcJyQwXQ/sR
 OWjCbLneDuSpZ3G/MxG76ilS+K9JlRF0/l3fr0MJW186RloVwlD/AN1Zk4d+asVR
 +341gNiDsSxAXdEnpcBItXFHj7hLUE1mOUXCMdN+y1spPK8YGf68vCeinRMf8lXy
 DRVlYdx5Wdlrd7r6tKQD
 =l1lE
 -----END PGP SIGNATURE-----

Merge tag 'nios2-v4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2

Pull nios2 update from Ley Foon Tan:
 "Use read_persistent_clock64() instead of read_persistent_clock()"

* tag 'nios2-v4.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
  nios2: Use read_persistent_clock64() instead of read_persistent_clock()
2018-04-11 16:02:18 -07:00
Helge Deller
6769828703 parisc: Prevent panic at system halt
When issuing a "shutdown -h now", the reboot syscall calls kernel_halt()
which shouldn't return, otherwise one gets this panic:

reboot: System halted
Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000000
CPU: 0 PID: 1 Comm: systemd-shutdow Not tainted 4.16.0-32bit+ #560
Backtrace:
 [<1018a694>] show_stack+0x18/0x28
 [<106e68a8>] dump_stack+0x80/0x10c
 [<101a4df8>] panic+0xfc/0x290
 [<101a90b8>] do_exit+0x73c/0x914
 [<101c7e38>] SyS_reboot+0x190/0x1d4
 [<1017e444>] syscall_exit+0x0/0x14

Fix it by letting machine_halt() call machine_power_off() which doesn't
return.

Signed-off-by: Helge Deller <deller@gmx.de>
2018-04-11 22:28:41 +02:00
Ard Biesheuvel
24534b3511 arm64: assembler: add macros to conditionally yield the NEON under PREEMPT
Add support macros to conditionally yield the NEON (and thus the CPU)
that may be called from the assembler code.

In some cases, yielding the NEON involves saving and restoring a non
trivial amount of context (especially in the CRC folding algorithms),
and so the macro is split into three, and the code in between is only
executed when the yield path is taken, allowing the context to be preserved.
The third macro takes an optional label argument that marks the resume
path after a yield has been performed.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-04-11 18:50:34 +01:00
Ard Biesheuvel
0f468e221c arm64: assembler: add utility macros to push/pop stack frames
We are going to add code to all the NEON crypto routines that will
turn them into non-leaf functions, so we need to manage the stack
frames. To make this less tedious and error prone, add some macros
that take the number of callee saved registers to preserve and the
extra size to allocate in the stack frame (for locals) and emit
the ldp/stp sequences.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-04-11 18:50:34 +01:00
Marc Zyngier
e8b22d0f45 arm64: Move the content of bpi.S to hyp-entry.S
bpi.S was introduced as we were starting to build the Spectre v2
mitigation framework, and it was rather unclear that it would
become strictly KVM specific.

Now that the picture is a lot clearer, let's move the content
of that file to hyp-entry.S, where it actually belong.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-04-11 18:49:30 +01:00
Marc Zyngier
22765f30db arm64: Get rid of __smccc_workaround_1_hvc_*
The very existence of __smccc_workaround_1_hvc_* is a thinko, as
KVM will never use a HVC call to perform the branch prediction
invalidation. Even as a nested hypervisor, it would use an SMC
instruction.

Let's get rid of it.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-04-11 18:49:30 +01:00
Marc Zyngier
8892b71885 arm64: capabilities: Rework EL2 vector hardening entry
Since 5e7951ce19ab ("arm64: capabilities: Clean up midr range helpers"),
capabilities must be represented with a single entry. If multiple
CPU types can use the same capability, then they need to be enumerated
in a list.

The EL2 hardening stuff (which affects both A57 and A72) managed to
escape the conversion in the above patch thanks to the 4.17 merge
window. Let's fix it now.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-04-11 18:49:30 +01:00
Shanker Donthineni
4bc352ffb3 arm64: KVM: Use SMCCC_ARCH_WORKAROUND_1 for Falkor BP hardening
The function SMCCC_ARCH_WORKAROUND_1 was introduced as part of SMC
V1.1 Calling Convention to mitigate CVE-2017-5715. This patch uses
the standard call SMCCC_ARCH_WORKAROUND_1 for Falkor chips instead
of Silicon provider service ID 0xC2001700.

Cc: <stable@vger.kernel.org> # 4.14+
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
[maz: reworked errata framework integration]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-04-11 18:49:30 +01:00
Matthew Wilcox
b93b016313 page cache: use xa_lock
Remove the address_space ->tree_lock and use the xa_lock newly added to
the radix_tree_root.  Rename the address_space ->page_tree to ->i_pages,
since we don't really care that it's a tree.

[willy@infradead.org: fix nds32, fs/dax.c]
  Link: http://lkml.kernel.org/r/20180406145415.GB20605@bombadil.infradead.orgLink: http://lkml.kernel.org/r/20180313132639.17387-9-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Acked-by: Jeff Layton <jlayton@redhat.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-11 10:28:39 -07:00
Matthew Wilcox
d339d705f7 unicore32: turn flush_dcache_mmap_lock into a no-op
Unicore doesn't walk the VMA tree in its flush_dcache_page()
implementation, so has no need to take the tree_lock.

Link: http://lkml.kernel.org/r/20180313132639.17387-5-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-11 10:28:39 -07:00
Matthew Wilcox
427c896f26 arm64: turn flush_dcache_mmap_lock into a no-op
ARM64 doesn't walk the VMA tree in its flush_dcache_page()
implementation, so has no need to take the tree_lock.

Link: http://lkml.kernel.org/r/20180313132639.17387-4-willy@infradead.org
Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-11 10:28:39 -07:00
Masahiro Yamada
2dd8a62c64 linux/const.h: move UL() macro to include/linux/const.h
ARM, ARM64 and UniCore32 duplicate the definition of UL():

  #define UL(x) _AC(x, UL)

This is not actually arch-specific, so it will be useful to move it to a
common header.  Currently, we only have the uapi variant for
linux/const.h, so I am creating include/linux/const.h.

I also added _UL(), _ULL() and ULL() because _AC() is mostly used in
the form either _AC(..., UL) or _AC(..., ULL).  I expect they will be
replaced in follow-up cleanups.  The underscore-prefixed ones should
be used for exported headers.

Link: http://lkml.kernel.org/r/1519301715-31798-4-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-11 10:28:38 -07:00
Pavel Tatashin
6f84f8d158 xen, mm: allow deferred page initialization for xen pv domains
Juergen Gross noticed that commit f7f99100d8d ("mm: stop zeroing memory
during allocation in vmemmap") broke XEN PV domains when deferred struct
page initialization is enabled.

This is because the xen's PagePinned() flag is getting erased from
struct pages when they are initialized later in boot.

Juergen fixed this problem by disabling deferred pages on xen pv
domains.  It is desirable, however, to have this feature available as it
reduces boot time.  This fix re-enables the feature for pv-dmains, and
fixes the problem the following way:

The fix is to delay setting PagePinned flag until struct pages for all
allocated memory are initialized, i.e.  until after free_all_bootmem().

A new x86_init.hyper op init_after_bootmem() is called to let xen know
that boot allocator is done, and hence struct pages for all the
allocated memory are now initialized.  If deferred page initialization
is enabled, the rest of struct pages are going to be initialized later
in boot once page_alloc_init_late() is called.

xen_after_bootmem() walks page table's pages and marks them pinned.

Link: http://lkml.kernel.org/r/20180226160112.24724-2-pasha.tatashin@oracle.com
Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Tested-by: Juergen Gross <jgross@suse.com>
Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Alok Kataria <akataria@vmware.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Mathias Krause <minipli@googlemail.com>
Cc: Jinbum Park <jinb.park7@gmail.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Jia Zhang <zhang.jia@linux.alibaba.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-11 10:28:38 -07:00
Michal Hocko
a4ff8e8620 mm: introduce MAP_FIXED_NOREPLACE
Patch series "mm: introduce MAP_FIXED_NOREPLACE", v2.

This has started as a follow up discussion [3][4] resulting in the
runtime failure caused by hardening patch [5] which removes MAP_FIXED
from the elf loader because MAP_FIXED is inherently dangerous as it
might silently clobber an existing underlying mapping (e.g.  stack).
The reason for the failure is that some architectures enforce an
alignment for the given address hint without MAP_FIXED used (e.g.  for
shared or file backed mappings).

One way around this would be excluding those archs which do alignment
tricks from the hardening [6].  The patch is really trivial but it has
been objected, rightfully so, that this screams for a more generic
solution.  We basically want a non-destructive MAP_FIXED.

The first patch introduced MAP_FIXED_NOREPLACE which enforces the given
address but unlike MAP_FIXED it fails with EEXIST if the given range
conflicts with an existing one.  The flag is introduced as a completely
new one rather than a MAP_FIXED extension because of the backward
compatibility.  We really want a never-clobber semantic even on older
kernels which do not recognize the flag.  Unfortunately mmap sucks
wrt flags evaluation because we do not EINVAL on unknown flags.  On
those kernels we would simply use the traditional hint based semantic so
the caller can still get a different address (which sucks) but at least
not silently corrupt an existing mapping.  I do not see a good way
around that.  Except we won't export expose the new semantic to the
userspace at all.

It seems there are users who would like to have something like that.
Jemalloc has been mentioned by Michael Ellerman [7]

Florian Weimer has mentioned the following:
: glibc ld.so currently maps DSOs without hints.  This means that the kernel
: will map right next to each other, and the offsets between them a completely
: predictable.  We would like to change that and supply a random address in a
: window of the address space.  If there is a conflict, we do not want the
: kernel to pick a non-random address. Instead, we would try again with a
: random address.

John Hubbard has mentioned CUDA example
: a) Searches /proc/<pid>/maps for a "suitable" region of available
: VA space.  "Suitable" generally means it has to have a base address
: within a certain limited range (a particular device model might
: have odd limitations, for example), it has to be large enough, and
: alignment has to be large enough (again, various devices may have
: constraints that lead us to do this).
:
: This is of course subject to races with other threads in the process.
:
: Let's say it finds a region starting at va.
:
: b) Next it does:
:     p = mmap(va, ...)
:
: *without* setting MAP_FIXED, of course (so va is just a hint), to
: attempt to safely reserve that region. If p != va, then in most cases,
: this is a failure (almost certainly due to another thread getting a
: mapping from that region before we did), and so this layer now has to
: call munmap(), before returning a "failure: retry" to upper layers.
:
:     IMPROVEMENT: --> if instead, we could call this:
:
:             p = mmap(va, ... MAP_FIXED_NOREPLACE ...)
:
:         , then we could skip the munmap() call upon failure. This
:         is a small thing, but it is useful here. (Thanks to Piotr
:         Jaroszynski and Mark Hairgrove for helping me get that detail
:         exactly right, btw.)
:
: c) After that, CUDA suballocates from p, via:
:
:      q = mmap(sub_region_start, ... MAP_FIXED ...)
:
: Interestingly enough, "freeing" is also done via MAP_FIXED, and
: setting PROT_NONE to the subregion. Anyway, I just included (c) for
: general interest.

Atomic address range probing in the multithreaded programs in general
sounds like an interesting thing to me.

The second patch simply replaces MAP_FIXED use in elf loader by
MAP_FIXED_NOREPLACE.  I believe other places which rely on MAP_FIXED
should follow.  Actually real MAP_FIXED usages should be docummented
properly and they should be more of an exception.

[1] http://lkml.kernel.org/r/20171116101900.13621-1-mhocko@kernel.org
[2] http://lkml.kernel.org/r/20171129144219.22867-1-mhocko@kernel.org
[3] http://lkml.kernel.org/r/20171107162217.382cd754@canb.auug.org.au
[4] http://lkml.kernel.org/r/1510048229.12079.7.camel@abdul.in.ibm.com
[5] http://lkml.kernel.org/r/20171023082608.6167-1-mhocko@kernel.org
[6] http://lkml.kernel.org/r/20171113094203.aofz2e7kueitk55y@dhcp22.suse.cz
[7] http://lkml.kernel.org/r/87efp1w7vy.fsf@concordia.ellerman.id.au

This patch (of 2):

MAP_FIXED is used quite often to enforce mapping at the particular range.
The main problem of this flag is, however, that it is inherently dangerous
because it unmaps existing mappings covered by the requested range.  This
can cause silent memory corruptions.  Some of them even with serious
security implications.  While the current semantic might be really
desiderable in many cases there are others which would want to enforce the
given range but rather see a failure than a silent memory corruption on a
clashing range.  Please note that there is no guarantee that a given range
is obeyed by the mmap even when it is free - e.g.  arch specific code is
allowed to apply an alignment.

Introduce a new MAP_FIXED_NOREPLACE flag for mmap to achieve this
behavior.  It has the same semantic as MAP_FIXED wrt.  the given address
request with a single exception that it fails with EEXIST if the requested
address is already covered by an existing mapping.  We still do rely on
get_unmaped_area to handle all the arch specific MAP_FIXED treatment and
check for a conflicting vma after it returns.

The flag is introduced as a completely new one rather than a MAP_FIXED
extension because of the backward compatibility.  We really want a
never-clobber semantic even on older kernels which do not recognize the
flag.  Unfortunately mmap sucks wrt.  flags evaluation because we do not
EINVAL on unknown flags.  On those kernels we would simply use the
traditional hint based semantic so the caller can still get a different
address (which sucks) but at least not silently corrupt an existing
mapping.  I do not see a good way around that.

[mpe@ellerman.id.au: fix whitespace]
[fail on clashing range with EEXIST as per Florian Weimer]
[set MAP_FIXED before round_hint_to_min as per Khalid Aziz]
Link: http://lkml.kernel.org/r/20171213092550.2774-2-mhocko@kernel.org
Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Russell King - ARM Linux <linux@armlinux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Jason Evans <jasone@google.com>
Cc: David Goldblatt <davidtgoldblatt@gmail.com>
Cc: Edward Tomasz Napierała <trasz@FreeBSD.org>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-11 10:28:38 -07:00
Kees Cook
8f2af155b5 exec: pass stack rlimit into mm layout functions
Patch series "exec: Pin stack limit during exec".

Attempts to solve problems with the stack limit changing during exec
continue to be frustrated[1][2].  In addition to the specific issues
around the Stack Clash family of flaws, Andy Lutomirski pointed out[3]
other places during exec where the stack limit is used and is assumed to
be unchanging.  Given the many places it gets used and the fact that it
can be manipulated/raced via setrlimit() and prlimit(), I think the only
way to handle this is to move away from the "current" view of the stack
limit and instead attach it to the bprm, and plumb this down into the
functions that need to know the stack limits.  This series implements
the approach.

[1] 04e35f4495dd ("exec: avoid RLIMIT_STACK races with prlimit()")
[2] 779f4e1c6c7c ("Revert "exec: avoid RLIMIT_STACK races with prlimit()"")
[3] to security@kernel.org, "Subject: existing rlimit races?"

This patch (of 3):

Since it is possible that the stack rlimit can change externally during
exec (either via another thread calling setrlimit() or another process
calling prlimit()), provide a way to pass the rlimit down into the
per-architecture mm layout functions so that the rlimit can stay in the
bprm structure instead of sitting in the signal structure until exec is
finalized.

Link: http://lkml.kernel.org/r/1518638796-20819-2-git-send-email-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Hugh Dickins <hughd@google.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Greg KH <greg@kroah.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
Cc: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-11 10:28:37 -07:00
Joonsoo Kim
3d2054ad8c ARM: CMA: avoid double mapping to the CMA area if CONFIG_HIGHMEM=y
CMA area is now managed by the separate zone, ZONE_MOVABLE, to fix many
MM related problems.  In this implementation, if CONFIG_HIGHMEM = y,
then ZONE_MOVABLE is considered as HIGHMEM and the memory of the CMA
area is also considered as HIGHMEM.  That means that they are considered
as the page without direct mapping.  However, CMA area could be in a
lowmem and the memory could have direct mapping.

In ARM, when establishing a new mapping for DMA, direct mapping should
be cleared since two mapping with different cache policy could cause
unknown problem.  With this patch, PageHighmem() for the CMA memory
located in lowmem returns true so that the function for DMA mapping
cannot notice whether it needs to clear direct mapping or not,
correctly.  To handle this situation, this patch always clears direct
mapping for such CMA memory.

Link: http://lkml.kernel.org/r/1512114786-5085-4-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Tested-by: Tony Lindgren <tony@atomide.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Laura Abbott <lauraa@codeaurora.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-11 10:28:32 -07:00
Michal Hocko
666feb21a0 mm, migrate: remove reason argument from new_page_t
No allocation callback is using this argument anymore.  new_page_node
used to use this parameter to convey node_id resp.  migration error up
to move_pages code (do_move_page_to_node_array).  The error status never
made it into the final status field and we have a better way to
communicate node id to the status field now.  All other allocation
callbacks simply ignored the argument so we can drop it finally.

[mhocko@suse.com: fix migration callback]
  Link: http://lkml.kernel.org/r/20180105085259.GH2801@dhcp22.suse.cz
[akpm@linux-foundation.org: fix alloc_misplaced_dst_page()]
[mhocko@kernel.org: fix build]
  Link: http://lkml.kernel.org/r/20180103091134.GB11319@dhcp22.suse.cz
Link: http://lkml.kernel.org/r/20180103082555.14592-3-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: Andrea Reale <ar@linux.vnet.ibm.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-11 10:28:32 -07:00
Martin Schwidefsky
6a3d1e81a4 s390: correct nospec auto detection init order
With CONFIG_EXPOLINE_AUTO=y the call of spectre_v2_auto_early() via
early_initcall is done *after* the early_param functions. This
overwrites any settings done with the nobp/no_spectre_v2/spectre_v2
parameters. The code patching for the kernel is done after the
evaluation of the early parameters but before the early_initcall
is done. The end result is a kernel image that is patched correctly
but the kernel modules are not.

Make sure that the nospec auto detection function is called before the
early parameters are evaluated and before the code patching is done.

Fixes: 6e179d64126b ("s390: add automatic detection of the spectre defense")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-04-11 17:46:00 +02:00
Rafael J. Wysocki
51798deaff Merge branches 'pm-cpuidle' and 'pm-qos'
* pm-cpuidle:
  tick-sched: avoid a maybe-uninitialized warning
  cpuidle: Add definition of residency to sysfs documentation
  time: hrtimer: Use timerqueue_iterate_next() to get to the next timer
  nohz: Avoid duplication of code related to got_idle_tick
  nohz: Gather tick_sched booleans under a common flag field
  cpuidle: menu: Avoid selecting shallow states with stopped tick
  cpuidle: menu: Refine idle state selection for running tick
  sched: idle: Select idle state before stopping the tick
  time: hrtimer: Introduce hrtimer_next_event_without()
  time: tick-sched: Split tick_nohz_stop_sched_tick()
  cpuidle: Return nohz hint from cpuidle_select()
  jiffies: Introduce USER_TICK_USEC and redefine TICK_USEC
  sched: idle: Do not stop the tick before cpuidle_idle_call()
  sched: idle: Do not stop the tick upfront in the idle loop
  time: tick-sched: Reorganize idle tick management code

* pm-qos:
  PM / QoS: mark expected switch fall-throughs
2018-04-11 13:22:46 +02:00
Helge Deller
71d577db01 parisc: Switch to generic COMPAT_BINFMT_ELF
Drop our own compat binfmt implementation in
arch/parisc/kernel/binfmt_elf32.c in favour of the generic
implementation with CONFIG_COMPAT_BINFMT_ELF.

While cleaning up the dependencies, I noticed that ELF_PLATFORM was strangely
defined: On a 32-bit kernel, it was defined to "PARISC", while when running in
compat mode on a 64-bit kernel it was defined to "PARISC32". Since it doesn't
seem to be used in glibc yet, it's now defined in both cases to "PARISC". In
any case, it can be distinguished because it's either a 32-bit or a 64-bit ELF
file.

Signed-off-by: Helge Deller <deller@gmx.de>
2018-04-11 11:40:35 +02:00
Helge Deller
2a03bb9e7a parisc: Move cache flush functions into .text.hot section
and move the disable_sr_hashing() C and assembly functions into the
.init section.

Signed-off-by: Helge Deller <deller@gmx.de>
2018-04-11 11:40:35 +02:00
Helge Deller
75abf64287 parisc/signal: Add FPE_CONDTRAP for conditional trap handling
Posix and common sense requires that SI_USER not be a signal specific
si_code. Thus add a new FPE_CONDTRAP si_code for conditional traps.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
2018-04-11 11:40:35 +02:00
Harald Freudenberger
af4a72276d s390/zcrypt: Support up to 256 crypto adapters.
There was an artificial restriction on the card/adapter id
to only 6 bits but all the AP commands do support adapter
ids with 8 bit. This patch removes this restriction to 64
adapters and now up to 256 adapter can get addressed.

Some of the ioctl calls work on the max number of cards
possible (which was 64). These ioctls are now deprecated
but still supported. All the defines, structs and ioctl
interface declarations have been kept for compabibility.
There are now new ioctls (and defines for these) with an
additional '2' appended which provide the extended versions
with 256 cards supported.

Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-04-11 10:36:27 +02:00
Nicholas Piggin
19ce7909ed KVM: PPC: Book3S HV: trace_tlbie must not be called in realmode
This crashes with a "Bad real address for load" attempting to load
from the vmalloc region in realmode (faulting address is in DAR).

  Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1]
  LE SMP NR_CPUS=2048 NUMA PowerNV
  CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted 4.16.0-01530-g43d1859f0994
  NIP:  c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580
  REGS: c000000fff76dd80 TRAP: 0200   Not tainted  (4.16.0-01530-g43d1859f0994)
  MSR:  9000000000201003 <SF,HV,ME,RI,LE>  CR: 48082222  XER: 00000000
  CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3
  NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0
  LR [c0000000000c2430] do_tlbies+0x230/0x2f0

I suspect the reason is the per-cpu data is not in the linear chunk.
This could be restored if that was able to be fixed, but for now,
just remove the tracepoints.

Fixes: 0428491cba92 ("powerpc/mm: Trace tlbie(l) instructions")
Cc: stable@vger.kernel.org # v4.13+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-11 12:35:33 +10:00
Aneesh Kumar K.V
032900e62c powerpc/8xx: Fix build with hugetlbfs enabled
8xx uses the slice code when hugetlbfs is enabled. We missed a header
include on 8xx which resulted in the below build failure:

  config: mpc885_ads_defconfig + CONFIG_HUGETLBFS

  arch/powerpc/mm/slice.c: In function 'slice_get_unmapped_area':
  arch/powerpc/mm/slice.c:655:2: error: implicit declaration of function 'need_extra_context'
  arch/powerpc/mm/slice.c:656:3: error: implicit declaration of function 'alloc_extended_context'

on PPC64 the mmu_context.h was included via linux/pkeys.h

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-11 12:00:23 +10:00
Nicholas Piggin
3b8070335f powerpc/powernv: Fix OPAL NVRAM driver OPAL_BUSY loops
The OPAL NVRAM driver does not sleep in case it gets OPAL_BUSY or
OPAL_BUSY_EVENT from firmware, which causes large scheduling
latencies, and various lockup errors to trigger (again, BMC reboot
can cause it).

Fix this by converting it to the standard form OPAL_BUSY loop that
sleeps.

Fixes: 628daa8d5abf ("powerpc/powernv: Add RTC and NVRAM support plus RTAS fallbacks")
Depends-on: 34dd25de9fe3 ("powerpc/powernv: define a standard delay for OPAL_BUSY type retry loops")
Cc: stable@vger.kernel.org # v3.2+
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-11 11:58:43 +10:00
Linus Torvalds
f77cfbe645 c6x changes 4.17
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJazMUIAAoJEOiN4VijXeFPVtQP/3JlMhI6l1GVRBFBSQRCf6+g
 bM11aPrm1Df8wPagI1FzF823PBeHREnOLsgqo9694bFaK8UVx1Fic8DrMdPZulPx
 7NkAV9nUf51HJzM5IgqZIGFa9ene3QafsKSBwC/HPkYLeXqusjtMYDzdUkzZIXij
 BuhiAWFADV/geScibaJ4fn4DmSzXlcG0grUI5whTHdO8heXv0XxPHKRbwMPCx1of
 U8vvvVCQdCbxTEbsm151qMLAfpa4FSOFDAr5WO8q6aXiKM4Zw/WCUONYAjiQLPY+
 nr+azs7+DPWMdeLbr4BqKXirch0FhHelqHbGOYrxQJCg9t+EHZ3sGoOTlPYR56Id
 Hl6giyzwyEDizyHbXMFUT5Y39s0/aCru73K9a51mVgCKYDIGw2rT0onc+Rv3vkXR
 sGyTJWXQJQbt3wsEnBnpn5Zk24fYLP0wBXU0M6J8n21sYQyCD5GWzN6W2FT2iusr
 qhfhsaL6/zPvD1leHFMHY8Q89w4a32WqopKsQRslbFSea/bH5M4b3bq4XmFcauj0
 1Z1D2q8imFAFHhhHFBSZFLzORnfwgm66ntzPIC/P9yI4Zw7YINrSb17u+INlgrWy
 CVHQKku83diZ7lYWmYnTpxX5oNb8GuFw5GUitCxEimbqXnlT2DRIKCGSc24mcmo8
 XNrMNyhKJ6c3ELxJn4W3
 =BKqi
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming

Pull c6x updates from Mark Salter.

* tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming:
  c6x: pass endianness info to sparse
  c6x: fix platforms/plldata.c get_coreid build error
  c6x: remove unused KTHREAD_SIZE definition
2018-04-10 11:50:14 -07:00
Linus Torvalds
948869fa9f MIPS changes for 4.17
These are the main MIPS changes for 4.17. Rough overview:
 
  (1) generic platform: Add support for Microsemi Ocelot SoCs
 
  (2) crypto: Add CRC32 and CRC32C HW acceleration module
 
  (3) Various cleanups and misc improvements
 
 Miscellaneous:
 
  - Hang more efficiently on halt/powerdown/restart
 
  - pm-cps: Block system suspend when a JTAG probe is present
 
  - Expand make help text for generic defconfigs
    - Refactor handling of legacy defconfigs
 
  - Determine the entry point from the ELF file header to fix microMIPS
    for certain toolchains
 
  - Introduce isa-rev.h for MIPS_ISA_REV and use to simplify other code
 
 Minor cleanups:
 
  - DTS: boston/ci20: Unit name cleanups and correction
 
  - kdump: Make the default for PHYSICAL_START always 64-bit
 
  - Constify gpio_led in Alchemy, AR7, and TXX9
 
  - Silence a couple of W=1 warnings
 
  - Remove duplicate includes
 
 Platform support:
 
 ath79:
 
  - Fix AR724X_PLL_REG_PCIE_CONFIG offset
 
 BCM47xx:
 
  - FIRMWARE: Use mac_pton() for MAC address parsing
 
  - Add Luxul XAP1500/XWR1750 WiFi LEDs
 
  - Use standard reset button for Luxul XWR-1750
 
 BMIPS:
 
  - Enable CONFIG_BRCMSTB_PM in bmips_stb_defconfig for build coverage
 
  - Add STB PM, wake-up timer, watchdog DT nodes
 
 Generic platform:
 
  - Add support for Microsemi Ocelot
    - dt-bindings: Add vendor prefix for Microsemi Corporation
    - dt-bindings: Add bindings for Microsemi SoCs
    - Add ocelot SoC & PCB123 board DTS files
    - MAINTAINERS: Add entry for Microsemi MIPS SoCs
 
  - Enable crc32-mips on r6 configs
 
 Octeon:
 
  - Drop '.' after newlines in printk calls
 
 ralink:
 
  - pci-mt7621: Enable PCIe on MT7688
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEd80NauSabkiESfLYbAtpk944dnoFAlrL1tMACgkQbAtpk944
 dnpkhg//UnKOmj4xiQf5ik4XjSDHo2dcUzQKPLq/grzpc9yC2I6bgJ32gcQnD/rw
 x2f64DAjmbIJEY/hCGLEOH82/SqU5vF5gI+BtSKE6Ti28dyQH68pvyHRYIZBFx62
 WRYAf50Spa/q/bOJYv4/eTqjEc7FPMaUll3kH5NMiBB3X5bIeq3w+x78YBKjAqPB
 IJgHF/JnlpD8UvQ0RCl1fIXhkRYeNuTyiAseVI3ygAqbMFMGkTevePWl82D+pWuo
 W9BFaliNt/9TNgq5/L91a/uhSuo6INRls1tFQT9cPa6n4GJxLpUtO7a3v4Xgaukj
 TeXTkXZl4PEUfeTA3cVaXrENTN4sCwThkxvrKwefbUBiIa+/TDi/Wgxi3n+rKwi2
 Vu4V36uxnO/vNU7Sgp/jHdRJyxw06GfRy55MvToFlk1S25Gwt6zW2O0o8nAHwCDW
 zvoRfSkasElKwnM3zKR/mrdDn7qPulFfiyWyKuuY3bIXsaTnItO1JJ/rbgKuIXr7
 lo0eVZ7GhuWfuLCEHZ99rWsGz0SjuQQIb4c6faNELp6gTwIpz9bbsOHRuajvfoSF
 8jmPjUocbaRR0odyFBqsejwBmeebDoeIZljzD50ImSoYftEMgRoQ448mIm0EuWVU
 KY+85MIeK5zwWXKd4VBexgLvPggGURaOEEXKisgvoE3NjCuO2nw=
 =4vUc
 -----END PGP SIGNATURE-----

Merge tag 'mips_4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips

Pull MIPS updates from James Hogan:
 "These are the main MIPS changes for 4.17. Rough overview:

   (1) generic platform: Add support for Microsemi Ocelot SoCs

   (2) crypto: Add CRC32 and CRC32C HW acceleration module

   (3) Various cleanups and misc improvements

  More detailed summary:

  Miscellaneous:
   - hang more efficiently on halt/powerdown/restart
   - pm-cps: Block system suspend when a JTAG probe is present
   - expand make help text for generic defconfigs
   - refactor handling of legacy defconfigs
   - determine the entry point from the ELF file header to fix microMIPS
     for certain toolchains
   - introduce isa-rev.h for MIPS_ISA_REV and use to simplify other code

  Minor cleanups:
   - DTS: boston/ci20: Unit name cleanups and correction
   - kdump: Make the default for PHYSICAL_START always 64-bit
   - constify gpio_led in Alchemy, AR7, and TXX9
   - silence a couple of W=1 warnings
   - remove duplicate includes

  Platform support:
  Generic platform:
   - add support for Microsemi Ocelot
   - dt-bindings: Add vendor prefix for Microsemi Corporation
   - dt-bindings: Add bindings for Microsemi SoCs
   - add ocelot SoC & PCB123 board DTS files
   - MAINTAINERS: Add entry for Microsemi MIPS SoCs
   - enable crc32-mips on r6 configs

  ath79:
   - fix AR724X_PLL_REG_PCIE_CONFIG offset

  BCM47xx:
   - firmware: Use mac_pton() for MAC address parsing
   - add Luxul XAP1500/XWR1750 WiFi LEDs
   - use standard reset button for Luxul XWR-1750

  BMIPS:
   - enable CONFIG_BRCMSTB_PM in bmips_stb_defconfig for build coverage
   - add STB PM, wake-up timer, watchdog DT nodes

  Octeon:
   - drop '.' after newlines in printk calls

  ralink:
   - pci-mt7621: Enable PCIe on MT7688"

* tag 'mips_4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips: (37 commits)
  MIPS: BCM47XX: Use standard reset button for Luxul XWR-1750
  MIPS: BCM47XX: Add Luxul XAP1500/XWR1750 WiFi LEDs
  MIPS: Make the default for PHYSICAL_START always 64-bit
  MIPS: Use the entry point from the ELF file header
  MAINTAINERS: Add entry for Microsemi MIPS SoCs
  MIPS: generic: Add support for Microsemi Ocelot
  MIPS: mscc: Add ocelot PCB123 device tree
  MIPS: mscc: Add ocelot dtsi
  dt-bindings: mips: Add bindings for Microsemi SoCs
  dt-bindings: Add vendor prefix for Microsemi Corporation
  MIPS: ath79: Fix AR724X_PLL_REG_PCIE_CONFIG offset
  MIPS: pci-mt7620: Enable PCIe on MT7688
  MIPS: pm-cps: Block system suspend when a JTAG probe is present
  MIPS: VDSO: Replace __mips_isa_rev with MIPS_ISA_REV
  MIPS: BPF: Replace __mips_isa_rev with MIPS_ISA_REV
  MIPS: cpu-features.h: Replace __mips_isa_rev with MIPS_ISA_REV
  MIPS: Introduce isa-rev.h to define MIPS_ISA_REV
  MIPS: Hang more efficiently on halt/powerdown/restart
  FIRMWARE: bcm47xx_nvram: Replace mac address parsing
  MIPS: BMIPS: Add Broadcom STB watchdog nodes
  ...
2018-04-10 11:39:22 -07:00
Linus Torvalds
9f3a0941fb libnvdimm for 4.17
* A rework of the filesytem-dax implementation provides for detection of
   unmap operations (truncate / hole punch) colliding with in-progress
   device-DMA. A fix for these collisions remains a work-in-progress
   pending resolution of truncate latency and starvation regressions.
 
 * The of_pmem driver expands the users of libnvdimm outside of x86 and
   ACPI to describe an implementation of persistent memory on PowerPC with
   Open Firmware / Device tree.
 
 * Address Range Scrub (ARS) handling is completely rewritten to account for
   the fact that ARS may run for 100s of seconds and there is no platform
   defined way to cancel it. ARS will now no longer block namespace
   initialization.
 
 * The NVDIMM Namespace Label implementation is updated to handle label
   areas as small as 1K, down from 128K.
 
 * Miscellaneous cleanups and updates to unit test infrastructure.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJazDt5AAoJEB7SkWpmfYgCqGMQALLwdPeY87cUK7AvQ2IXj46B
 lJgeVuHPzyQDbC03AS5uUYnnU3I5lFd7i4y7ZrywNpFs4lsb/bNmbUpQE5xp+Yvc
 1MJ/JYDIP5X4misWYm3VJo85N49+VqSRgAQk52PBigwnZ7M6/u4cSptXM9//c9JL
 /NYbat6IjjY6Tx49Tec6+F3GMZjsFLcuTVkQcREoOyOqVJE4YpP0vhNjEe0vq6vr
 EsSWiqEI5VFH4PfJwKdKj/64IKB4FGKj2A5cEgjQBxW2vw7tTJnkRkdE3jDUjqtg
 xYAqGp/Dqs4+bgdYlT817YhiOVrcr5mOHj7TKWQrBPgzKCbcG5eKDmfT8t+3NEga
 9kBlgisqIcG72lwZNA7QkEHxq1Omy9yc1hUv9qz2YA0G+J1WE8l1T15k1DOFwV57
 qIrLLUypklNZLxvrzNjclempboKc4JCUlj+TdN5E5Y6pRs55UWTXaP7Xf5O7z0vf
 l/uiiHkc3MPH73YD2PSEGFJ8m8EU0N8xhrcz3M9E2sHgYCnbty1Lw3FH0/GhThVA
 ya1mMeDdb8A2P7gWCBk1Lqeig+rJKXSey4hKM6D0njOEtMQO1H4tFqGjyfDX1xlJ
 3plUR9WBVEYzN5+9xWbwGag/ezGZ+NfcVO2gmy6yXiEph796BxRAZx/18zKRJr0m
 9eGJG1H+JspcbtLF9iHn
 =acZQ
 -----END PGP SIGNATURE-----

Merge tag 'libnvdimm-for-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Dan Williams:
 "This cycle was was not something I ever want to repeat as there were
  several late changes that have only now just settled.

  Half of the branch up to commit d2c997c0f145 ("fs, dax: use
  page->mapping to warn...") have been in -next for several releases.
  The of_pmem driver and the address range scrub rework were late
  arrivals, and the dax work was scaled back at the last moment.

  The of_pmem driver missed a previous merge window due to an oversight.
  A sense of obligation to rectify that miss is why it is included for
  4.17. It has acks from PowerPC folks. Stephen reported a build failure
  that only occurs when merging it with your latest tree, for now I have
  fixed that up by disabling modular builds of of_pmem. A test merge
  with your tree has received a build success report from the 0day robot
  over 156 configs.

  An initial version of the ARS rework was submitted before the merge
  window. It is self contained to libnvdimm, a net code reduction, and
  passing all unit tests.

  The filesystem-dax changes are based on the wait_var_event()
  functionality from tip/sched/core. However, late review feedback
  showed that those changes regressed truncate performance to a large
  degree. The branch was rewound to drop the truncate behavior change
  and now only includes preparation patches and cleanups (with full acks
  and reviews). The finalization of this dax-dma-vs-trnucate work will
  need to wait for 4.18.

  Summary:

   - A rework of the filesytem-dax implementation provides for detection
     of unmap operations (truncate / hole punch) colliding with
     in-progress device-DMA. A fix for these collisions remains a
     work-in-progress pending resolution of truncate latency and
     starvation regressions.

   - The of_pmem driver expands the users of libnvdimm outside of x86
     and ACPI to describe an implementation of persistent memory on
     PowerPC with Open Firmware / Device tree.

   - Address Range Scrub (ARS) handling is completely rewritten to
     account for the fact that ARS may run for 100s of seconds and there
     is no platform defined way to cancel it. ARS will now no longer
     block namespace initialization.

   - The NVDIMM Namespace Label implementation is updated to handle
     label areas as small as 1K, down from 128K.

   - Miscellaneous cleanups and updates to unit test infrastructure"

* tag 'libnvdimm-for-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (39 commits)
  libnvdimm, of_pmem: workaround OF_NUMA=n build error
  nfit, address-range-scrub: add module option to skip initial ars
  nfit, address-range-scrub: rework and simplify ARS state machine
  nfit, address-range-scrub: determine one platform max_ars value
  powerpc/powernv: Create platform devs for nvdimm buses
  doc/devicetree: Persistent memory region bindings
  libnvdimm: Add device-tree based driver
  libnvdimm: Add of_node to region and bus descriptors
  libnvdimm, region: quiet region probe
  libnvdimm, namespace: use a safe lookup for dimm device name
  libnvdimm, dimm: fix dpa reservation vs uninitialized label area
  libnvdimm, testing: update the default smart ctrl_temperature
  libnvdimm, testing: Add emulation for smart injection commands
  nfit, address-range-scrub: introduce nfit_spa->ars_state
  libnvdimm: add an api to cast a 'struct nd_region' to its 'struct device'
  nfit, address-range-scrub: fix scrub in-progress reporting
  dax, dm: allow device-mapper to operate without dax support
  dax: introduce CONFIG_DAX_DRIVER
  fs, dax: use page->mapping to warn if truncate collides with a busy page
  ext2, dax: introduce ext2_dax_aops
  ...
2018-04-10 10:25:57 -07:00
Linus Torvalds
fbe173e3ff RTC for 4.17
Subsystem:
  - Add tracepoints
  - Rework of the RTC/nvmem API to allow drivers to discard struct nvmem_config
    after registration
  - New range API, drivers can now expose the useful range of the RTC
  - New offset API the core is now able to add an offset to the RTC time,
    modifying the supported range.
  - Multiple rtc_time64_to_tm fixes
  - Handle time_t overflow on 32 bit platforms in the core instead of letting
    drivers do crazy things.
  - remove rtc_control API
 
 New driver:
  - Intersil ISL12026
 
 Drivers:
  - Drivers exposing the RTC non volatile memory have been converted to use nvmem
  - Removed useless time and date validation
  - Removed an indirection pattern that was a cargo cult from ancient drivers
  - Removed VLA usage
  - Fixed a possible race condition in probe functions
  - AB8540 support is dropped from ab8500
  - pcf85363 now has alarm support
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEXx9Viay1+e7J/aM4AyWl4gNJNJIFAlrL3s4ACgkQAyWl4gNJ
 NJI+MxAAgc56UEXi5chqKpHE6GHL2aPWan9duHB6FmGrKWt/FJmpJ8UGylLFlvaW
 dpRGvW0Dynz45UztegcrbHMU6+B2O0hboL4/GVZpYAkcNAcu7Lf0ULho2rsQSDmW
 WJpemmdRxyQY1IkWmw7z7KAkMzhAfYZiVmWmVwMRZfMcKJ3DLEldfgRtkN+g0UdB
 tayWQY3mS02ki16e2figsgwZRmUUhQslDfpKlesInXOzUMmLgVWhf1QxJSEUcfs0
 AMp75vD2YvVJ/RHy/6BilQbqP9EVnaG4NHqJGFSOddazA7u+3HGubEFboI8NuPXb
 2fCvfrNux7pgtQsBF9dnpCqWlukE7sF5aDyIUvjYnr0vUm2D/CwdXglGvQSQVWea
 5GxPTWBdaCL0V7GD5OSZcfUGyz1TN/7NSUItdLSr9YK13dL+tqhcYYm5ytXJLzvO
 Z4GyUEoCOMprMJ9j5KU/TXSjauDmPDl8YZ5B93lPcNOh7y+b/2r3umBaInyZrFzX
 1WJ6FWtuhbRfEwuQtgQHBsobt9eTwZfo8C2y22HBo/fdWfBv45feiiHNXt3OVrA+
 aws6pwfVf1H0UsvpQkXtlPthAx3sbOTndDKAUhntb/U/zA9c7fTrfUyVOQ1bArJy
 p6tl/cmlpq68AjZB+0d3zUXQkQH4Syu+EqZrWItkE6XxZm+khDE=
 =l9i8
 -----END PGP SIGNATURE-----

Merge tag 'rtc-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux

Pull RTC updates from Alexandre Belloni:
 "This contains a few series that have been in preparation for a while
  and that will help systems with RTCs that will fail in 2038, 2069 or
  2100.

  Subsystem:
   - Add tracepoints
   - Rework of the RTC/nvmem API to allow drivers to discard struct
     nvmem_config after registration
   - New range API, drivers can now expose the useful range of the RTC
   - New offset API the core is now able to add an offset to the RTC
     time, modifying the supported range.
   - Multiple rtc_time64_to_tm fixes
   - Handle time_t overflow on 32 bit platforms in the core instead of
     letting drivers do crazy things.
   - remove rtc_control API

  New driver:
   - Intersil ISL12026

  Drivers:
   - Drivers exposing the RTC non volatile memory have been converted to
     use nvmem
   - Removed useless time and date validation
   - Removed an indirection pattern that was a cargo cult from ancient
     drivers
   - Removed VLA usage
   - Fixed a possible race condition in probe functions
   - AB8540 support is dropped from ab8500
   - pcf85363 now has alarm support"

* tag 'rtc-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (128 commits)
  rtc: snvs: Fix usage of snvs_rtc_enable
  rtc: mt7622: fix module autoloading for OF platform drivers
  rtc: isl12022: use true and false for boolean values
  rtc: ab8500: Drop AB8540 support
  rtc: remove a warning during scripts/kernel-doc step
  rtc: 88pm860x: remove artificial limitation
  rtc: 88pm80x: remove artificial limitation
  rtc: st-lpc: remove artificial limitation
  rtc: mrst: remove artificial limitation
  rtc: mv: remove artificial limitation
  rtc: hctosys: Ensure system time doesn't overflow time_t
  parisc: time: stop validating rtc_time in .read_time
  rtc: pcf85063: fix clearing bits in pcf85063_start_clock
  rtc: at91sam9: Set name of regmap_config
  rtc: s5m: Remove VLA usage
  rtc: s5m: Move enum from rtc.h to rtc-s5m.c
  rtc: remove VLA usage
  rtc: Add useful timestamp definitions
  rtc: Add one offset seconds to expand RTC range
  rtc: Factor out the RTC range validation into rtc_valid_range()
  ...
2018-04-10 10:22:27 -07:00
Li RongQing
a774635db5 x86/apic: Fix signedness bug in APIC ID validity checks
The APIC ID as parsed from ACPI MADT is validity checked with the
apic->apic_id_valid() callback, which depends on the selected APIC type.

For non X2APIC types APIC IDs >= 0xFF are invalid, but values > 0x7FFFFFFF
are detected as valid. This happens because the 'apicid' argument of the
apic_id_valid() callback is type 'int'. So the resulting comparison

   apicid < 0xFF

evaluates to true for all unsigned int values > 0x7FFFFFFF which are handed
to default_apic_id_valid(). As a consequence, invalid APIC IDs in !X2APIC
mode are considered valid and accounted as possible CPUs.

Change the apicid argument type of the apic_id_valid() callback to u32 so
the evaluation is unsigned and returns the correct result.

[ tglx: Massaged changelog ]

Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Cc: jgross@suse.com
Cc: Dou Liyang <douly.fnst@cn.fujitsu.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: hpa@zytor.com
Link: https://lkml.kernel.org/r/1523322966-10296-1-git-send-email-lirongqing@baidu.com
2018-04-10 16:46:39 +02:00
Tomer Maimon
d893c4de01 arm: npcm: enable L2 cache in NPCM7xx architecture
This patch Enable ARM L2 cache module in Nuvoton NPCM7xx BMC
by adding L2 cache parameters into NPCM7xx DT machine start structure.

At patch V7 arm: npcm: add basic support for Nuvoton BMCs we got comments
regarding the flags use in L2 cache module.
- https://www.spinics.net/lists/arm-kernel/msg613212.html

After checking again the L2 cache use in the NPCM7xx,
the only L2 cache flag we need to set is L2C_AUX_CTRL_SHARED_OVERRIDE
and it is done in the device tree:
https://patchwork.kernel.org/patch/10063497/

L2 cache flag mask allowed all the flag option.

Signed-off-by: Tomer Maimon <tmaimon77@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-04-10 16:40:03 +02:00
Kirill A. Shutemov
d94a155c59 x86/cpu: Prevent cpuinfo_x86::x86_phys_bits adjustment corruption
Some features (Intel MKTME, AMD SME) reduce the number of effectively
available physical address bits. cpuinfo_x86::x86_phys_bits is adjusted
accordingly during the early cpu feature detection.

Though if get_cpu_cap() is called later again then this adjustement is
overwritten. That happens in setup_pku(), which is called after
detect_tme().

To address this, extract the address sizes enumeration into a separate
function, which is only called only from early_identify_cpu() and from
generic_identify().

This makes get_cpu_cap() safe to be called later during boot proccess
without overwriting cpuinfo_x86::x86_phys_bits.

[ tglx: Massaged changelog ]

Fixes: cb06d8e3d020 ("x86/tme: Detect if TME and MKTME is activated by BIOS")
Reported-by: Kai Huang <kai.huang@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: linux-mm@kvack.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lkml.kernel.org/r/20180410092704.41106-1-kirill.shutemov@linux.intel.com
2018-04-10 16:33:21 +02:00
Nicholas Piggin
34dd25de9f powerpc/powernv: define a standard delay for OPAL_BUSY type retry loops
This is the start of an effort to tidy up and standardise all the
delays. Existing loops have a range of delay/sleep periods from 1ms
to 20ms, and some have no delay. They all loop forever except rtc,
which times out after 10 retries, and that uses 10ms delays. So use
10ms as our standard delay. The OPAL maintainer agrees 10ms is a
reasonable starting point.

The idea is to use the same recipe everywhere, once this is proven to
work then it will be documented as an OPAL API standard. Then both
firmware and OS can agree, and if a particular call needs something
else, then that can be documented with reasoning.

This is not the end-all of this effort, it's just a relatively easy
change that fixes some existing high latency delays. There should be
provision for standardising timeouts and/or interruptible loops where
possible, so non-fatal firmware errors don't cause hangs.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-11 00:05:23 +10:00
Luc Van Oostenryck
85fa2cc511 c6x: pass endianness info to sparse
c6x depends on the macro '_BIG_ENDIAN' being defined or not
to correctly select or define endian-specific macros, structures
or pieces of code.

This macro is predefined by the compiler but sparse knows nothing
about it and thus may pre-process files differently from what
gcc would.

Fix this by passing '-D_BIG_ENDIAN' when compiling a big-endian
kernel, like GCC would have done.

To: Mark Salter <msalter@redhat.com>
To: Aurelien Jacquiot <a-jacquiot@ti.com>
CC: linux-c6x-dev@linux-c6x.org
Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Signed-off-by: Mark Salter <msalter@redhat.com>
2018-04-10 09:58:58 -04:00
Randy Dunlap
319938bd6f c6x: fix platforms/plldata.c get_coreid build error
Fix build error reported by the 0day bot by including the header
file for that macro.

Fixes this build error: (should fix; not tested)
arch/c6x/platforms/plldata.c: In function 'c6472_setup_clocks':
arch/c6x/platforms/plldata.c:279:33: error: implicit declaration of function 'get_coreid'; did you mean 'get_order'? [-Werror=implicit-function-declaration]
      c6x_core_clk.parent = &sysclks[get_coreid() + 1];

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Aurelien Jacquiot <jacquiot.aurelien@gmail.com>
Cc: linux-c6x-dev@linux-c6x.org
Cc: Ingo Molnar <mingo@kernel.org>

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Mark Salter <msalter@redhat.com>
2018-04-10 09:58:38 -04:00
Jérémy Lefaure
f5ad907e31 c6x: remove unused KTHREAD_SIZE definition
KTHREAD_SIZE has never been used since it has been defined for c6x arch.
Let's remove this useless definition.

Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Signed-off-by: Mark Salter <msalter@redhat.com>
2018-04-10 09:58:38 -04:00
Boris Ostrovsky
a5a18ae73b xen/pvh: Indicate XENFEAT_linux_rsdp_unrestricted to Xen
Pre-4.17 kernels ignored start_info's rsdp_paddr pointer and instead
relied on finding RSDP in standard location in BIOS RO memory. This
has worked since that's where Xen used to place it.

However, with recent Xen change (commit 4a5733771e6f ("libxl: put RSDP
for PVH guest near 4GB")) it prefers to keep RSDP at a "non-standard"
address. Even though as of commit b17d9d1df3c3 ("x86/xen: Add pvh
specific rsdp address retrieval function") Linux is able to find RSDP,
for back-compatibility reasons we need to indicate to Xen that we can
handle this, an we do so by setting XENFEAT_linux_rsdp_unrestricted
flag in ELF notes.

(Also take this opportunity and sync features.h header file with Xen)

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
2018-04-10 09:22:22 -04:00
Harald Freudenberger
2a80786d47 s390/zcrypt: Remove deprecated ioctls.
This patch removes the old status calls which have been marked
as deprecated since at least 2 years now. There is no known
application or library relying on these ioctls any more.

Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-04-10 07:39:01 +02:00
Vasily Gorbik
96c0cdbc7c s390/ipl: remove reipl_method and dump_method
reipl_method and dump_method have been used in addition to reipl_type
and dump_type, because a single reipl_type could be achieved with
multiple reipl_method (same for dump_type/method). After dropping
non-diag308_set based reipl methods, there is a single method per
reipl_type/dump_type and reipl_method and dump_method could be simply
removed.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-04-10 07:39:00 +02:00
Vasily Gorbik
3b9678472b s390/ipl: correct kdump reipl block checksum calculation
s390 kdump reipl implementation relies on os_info kernel structure
residing in old memory being dumped. os_info contains reipl block,
which is used (if valid) by the kdump kernel for reipl parameters.

The problem is that the reipl block and its checksum inside
os_info is updated only when /sys/firmware/reipl/reipl_type is
written. This sets an offset of a reipl block for "reipl_type" and
re-calculates reipl block checksum. Any further alteration of values
under /sys/firmware/reipl/{reipl_type}/ without subsequent write to
/sys/firmware/reipl/reipl_type lead to incorrect os_info reipl block
checksum. In such a case kdump kernel ignores it and reboots using
default logic.

To fix this, os_info reipl block update is moved right before kdump
execution.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-04-10 07:39:00 +02:00
Vasily Gorbik
0649685073 s390/ipl: remove non-existing functions declaration
do_reipl, do_halt and do_poff are not defined anywhere. Cleaning up
functions declaration.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-04-10 07:39:00 +02:00
Vasily Gorbik
d485235b00 s390: assume diag308 set always works
diag308 set has been available for many machine generations, and
alternative reipl code paths has not been exercised and seems to be
broken without noticing for a while now. So, cleaning up all obsolete
reipl methods except currently used ones, assuming that diag308 set
always works.

Also removing not longer needed reset callbacks.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-04-10 07:38:59 +02:00
Vasily Gorbik
ecc0df0f23 s390/ipl: avoid adding scpdata to cmdline during ftp/dvd boot
Add missing ipl parmblock validity check to append_ipl_scpdata.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-04-10 07:38:59 +02:00
Vasily Gorbik
a0832b3aef s390/ipl: correct ipl parmblock valid checks
In some cases diag308_set_works used to be misused as "we have valid ipl
parmblock", which is not the case when diag308 set works, but there is
no ipl parmblock (diag308 store returns DIAG308_RC_NOCONFIG). Such checks
are adjusted to reuse ipl_block_valid instead of diag308_set_works.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-04-10 07:38:59 +02:00
Vasily Gorbik
d08091ac96 s390/ipl: rely on diag308 store to get ipl info
For both ccw and fcp boot retrieve ipl info from ipl block received via
diag308 store. Old scsi ipl parm block handling and cio_get_iplinfo are
removed. Ipl type is deducted from ipl block (if valid).

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-04-10 07:38:59 +02:00
Vasily Gorbik
283abedb1b s390/ipl: move ipl_flags to ipl.c
ipl_flags and corresponding enum are not used outside of ipl.c and will
be reworked in later commits.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-04-10 07:38:59 +02:00
Vasily Gorbik
e9627da113 s390/ipl: get rid of ipl_ssid and ipl_devno
ipl_ssid and ipl_devno used to be used during ccw boot when diag308 store
was not available. Reuse ipl_block to store those values.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-04-10 07:38:58 +02:00
Vasily Gorbik
bdbfe18595 s390/ipl: unite diag308 and scsi boot ipl blocks
Ipl parm blocks received via "diag308 store" and during scsi boot at
IPL_PARMBLOCK_ORIGIN are merged into the "ipl_block".

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2018-04-10 07:38:58 +02:00