565 Commits

Author SHA1 Message Date
Dave Hansen
4fe2d8b11a x86/entry: Rename SYSENTER_stack to CPU_ENTRY_AREA_entry_stack
If the kernel oopses while on the trampoline stack, it will print
"<SYSENTER>" even if SYSENTER is not involved.  That is rather confusing.

The "SYSENTER" stack is used for a lot more than SYSENTER now.  Give it a
better string to display in stack dumps, and rename the kernel code to
match.

Also move the 32-bit code over to the new naming even though it still uses
the entry stack only for SYSENTER.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bp@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-22 20:13:02 +01:00
Andy Lutomirski
4831b77940 x86/vsyscall/64: Warn and fail vsyscall emulation in NATIVE mode
If something goes wrong with pagetable setup, vsyscall=native will
accidentally fall back to emulation.  Make it warn and fail so that we
notice.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-22 20:13:01 +01:00
Andy Lutomirski
49275fef98 x86/vsyscall/64: Explicitly set _PAGE_USER in the pagetable hierarchy
The kernel is very erratic as to which pagetables have _PAGE_USER set.  The
vsyscall page gets lucky: it seems that all of the relevant pagetables are
among the apparently arbitrary ones that set _PAGE_USER.  Rather than
relying on chance, just explicitly set _PAGE_USER.

This will let us clean up pagetable setup to stop setting _PAGE_USER.  The
added code can also be reused by pagetable isolation to manage the
_PAGE_USER bit in the usermode tables.

[ tglx: Folded paravirt fix from Juergen Gross ]

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-22 20:13:01 +01:00
Linus Torvalds
64a48099b3 Merge branch 'WIP.x86-pti.entry-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 syscall entry code changes for PTI from Ingo Molnar:
 "The main changes here are Andy Lutomirski's changes to switch the
  x86-64 entry code to use the 'per CPU entry trampoline stack'. This,
  besides helping fix KASLR leaks (the pending Page Table Isolation
  (PTI) work), also robustifies the x86 entry code"

* 'WIP.x86-pti.entry-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
  x86/cpufeatures: Make CPU bugs sticky
  x86/paravirt: Provide a way to check for hypervisors
  x86/paravirt: Dont patch flush_tlb_single
  x86/entry/64: Make cpu_entry_area.tss read-only
  x86/entry: Clean up the SYSENTER_stack code
  x86/entry/64: Remove the SYSENTER stack canary
  x86/entry/64: Move the IST stacks into struct cpu_entry_area
  x86/entry/64: Create a per-CPU SYSCALL entry trampoline
  x86/entry/64: Return to userspace from the trampoline stack
  x86/entry/64: Use a per-CPU trampoline stack for IDT entries
  x86/espfix/64: Stop assuming that pt_regs is on the entry stack
  x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0
  x86/entry: Remap the TSS into the CPU entry area
  x86/entry: Move SYSENTER_stack to the beginning of struct tss_struct
  x86/dumpstack: Handle stack overflow on all stacks
  x86/entry: Fix assumptions that the HW TSS is at the beginning of cpu_tss
  x86/kasan/64: Teach KASAN about the cpu_entry_area
  x86/mm/fixmap: Generalize the GDT fixmap mechanism, introduce struct cpu_entry_area
  x86/entry/gdt: Put per-CPU GDT remaps in ascending order
  x86/dumpstack: Add get_stack_info() support for the SYSENTER stack
  ...
2017-12-18 08:59:15 -08:00
Andy Lutomirski
c482feefe1 x86/entry/64: Make cpu_entry_area.tss read-only
The TSS is a fairly juicy target for exploits, and, now that the TSS
is in the cpu_entry_area, it's no longer protected by kASLR.  Make it
read-only on x86_64.

On x86_32, it can't be RO because it's written by the CPU during task
switches, and we use a task gate for double faults.  I'd also be
nervous about errata if we tried to make it RO even on configurations
without double fault handling.

[ tglx: AMD confirmed that there is no problem on 64-bit with TSS RO.  So
  	it's probably safe to assume that it's a non issue, though Intel
  	might have been creative in that area. Still waiting for
  	confirmation. ]

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bpetkov@suse.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150606.733700132@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-17 14:27:52 +01:00
Andy Lutomirski
0f9a48100f x86/entry: Clean up the SYSENTER_stack code
The existing code was a mess, mainly because C arrays are nasty.  Turn
SYSENTER_stack into a struct, add a helper to find it, and do all the
obvious cleanups this enables.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bpetkov@suse.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150606.653244723@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-17 14:27:51 +01:00
Andy Lutomirski
3386bc8aed x86/entry/64: Create a per-CPU SYSCALL entry trampoline
Handling SYSCALL is tricky: the SYSCALL handler is entered with every
single register (except FLAGS), including RSP, live.  It somehow needs
to set RSP to point to a valid stack, which means it needs to save the
user RSP somewhere and find its own stack pointer.  The canonical way
to do this is with SWAPGS, which lets us access percpu data using the
%gs prefix.

With PAGE_TABLE_ISOLATION-like pagetable switching, this is
problematic.  Without a scratch register, switching CR3 is impossible, so
%gs-based percpu memory would need to be mapped in the user pagetables.
Doing that without information leaks is difficult or impossible.

Instead, use a different sneaky trick.  Map a copy of the first part
of the SYSCALL asm at a different address for each CPU.  Now RIP
varies depending on the CPU, so we can use RIP-relative memory access
to access percpu memory.  By putting the relevant information (one
scratch slot and the stack address) at a constant offset relative to
RIP, we can make SYSCALL work without relying on %gs.

A nice thing about this approach is that we can easily switch it on
and off if we want pagetable switching to be configurable.

The compat variant of SYSCALL doesn't have this problem in the first
place -- there are plenty of scratch registers, since we don't care
about preserving r8-r15.  This patch therefore doesn't touch SYSCALL32
at all.

This patch actually seems to be a small speedup.  With this patch,
SYSCALL touches an extra cache line and an extra virtual page, but
the pipeline no longer stalls waiting for SWAPGS.  It seems that, at
least in a tight loop, the latter outweights the former.

Thanks to David Laight for an optimization tip.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bpetkov@suse.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150606.403607157@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-17 14:27:50 +01:00
Andy Lutomirski
3e3b9293d3 x86/entry/64: Return to userspace from the trampoline stack
By itself, this is useless.  It gives us the ability to run some final code
before exit that cannnot run on the kernel stack.  This could include a CR3
switch a la PAGE_TABLE_ISOLATION or some kernel stack erasing, for
example.  (Or even weird things like *changing* which kernel stack gets
used as an ASLR-strengthening mechanism.)

The SYSRET32 path is not covered yet.  It could be in the future or
we could just ignore it and force the slow path if needed.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150606.306546484@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-17 14:27:50 +01:00
Andy Lutomirski
7f2590a110 x86/entry/64: Use a per-CPU trampoline stack for IDT entries
Historically, IDT entries from usermode have always gone directly
to the running task's kernel stack.  Rearrange it so that we enter on
a per-CPU trampoline stack and then manually switch to the task's stack.
This touches a couple of extra cachelines, but it gives us a chance
to run some code before we touch the kernel stack.

The asm isn't exactly beautiful, but I think that fully refactoring
it can wait.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150606.225330557@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-17 14:27:38 +01:00
Andy Lutomirski
72f5e08dbb x86/entry: Remap the TSS into the CPU entry area
This has a secondary purpose: it puts the entry stack into a region
with a well-controlled layout.  A subsequent patch will take
advantage of this to streamline the SYSCALL entry code to be able to
find it more easily.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bpetkov@suse.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150605.962042855@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-17 13:59:56 +01:00
Andy Lutomirski
1a79797b58 x86/entry/64: Allocate and enable the SYSENTER stack
This will simplify future changes that want scratch variables early in
the SYSENTER handler -- they'll be able to spill registers to the
stack.  It also lets us get rid of a SWAPGS_UNSAFE_STACK user.

This does not depend on CONFIG_IA32_EMULATION=y because we'll want the
stack space even without IA32 emulation.

As far as I can tell, the reason that this wasn't done from day 1 is
that we use IST for #DB and #BP, which is IMO rather nasty and causes
a lot more problems than it solves.  But, since #DB uses IST, we don't
actually need a real stack for SYSENTER (because SYSENTER with TF set
will invoke #DB on the IST stack rather than the SYSENTER stack).

I want to remove IST usage from these vectors some day, and this patch
is a prerequisite for that as well.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Link: https://lkml.kernel.org/r/20171204150605.312726423@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-17 13:59:53 +01:00
Boris Ostrovsky
e17f823453 x86/entry/64/paravirt: Use paravirt-safe macro to access eflags
Commit 1d3e53e8624a ("x86/entry/64: Refactor IRQ stacks and make them
NMI-safe") added DEBUG_ENTRY_ASSERT_IRQS_OFF macro that acceses eflags
using 'pushfq' instruction when testing for IF bit. On PV Xen guests
looking at IF flag directly will always see it set, resulting in 'ud2'.

Introduce SAVE_FLAGS() macro that will use appropriate save_fl pv op when
running paravirt.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Cc: xen-devel@lists.xenproject.org
Link: https://lkml.kernel.org/r/20171204150604.899457242@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-17 13:59:52 +01:00
Linus Torvalds
dd53a4214d Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 fixes from Ingo Molnar:

 - make CR4 handling irq-safe, which bug vmware guests ran into

 - don't crash on early IRQs in Xen guests

 - don't crash secondary CPU bringup if #UD assisted WARN()ings are
   triggered

 - make X86_BUG_FXSAVE_LEAK optional on newer AMD CPUs that have the fix

 - fix AMD Fam17h microcode loading

 - fix broadcom_postcore_init() if ACPI is disabled

 - fix resume regression in __restore_processor_context()

 - fix Sparse warnings

 - fix a GCC-8 warning

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/vdso: Change time() prototype to match __vdso_time()
  x86: Fix Sparse warnings about non-static functions
  x86/power: Fix some ordering bugs in __restore_processor_context()
  x86/PCI: Make broadcom_postcore_init() check acpi_disabled
  x86/microcode/AMD: Add support for fam17h microcode loading
  x86/cpufeatures: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD
  x86/idt: Load idt early in start_secondary
  x86/xen: Support early interrupts in xen pv guests
  x86/tlb: Disable interrupts when changing CR4
  x86/tlb: Refactor CR4 setting and shadow write
2017-12-06 17:47:29 -08:00
Arnd Bergmann
88edb57d1e x86/vdso: Change time() prototype to match __vdso_time()
gcc-8 warns that time() is an alias for __vdso_time() but the two
have different prototypes:

  arch/x86/entry/vdso/vclock_gettime.c:327:5: error: 'time' alias between functions of incompatible types 'int(time_t *)' {aka 'int(long int *)'} and 'time_t(time_t *)' {aka 'long int(long int *)'} [-Werror=attribute-alias]
   int time(time_t *t)
       ^~~~
  arch/x86/entry/vdso/vclock_gettime.c:318:16: note: aliased declaration here

I could not figure out whether this is intentional, but I see that
changing it to return time_t avoids the warning.

Returning 'int' from time() is also a bit questionable, as it causes an
overflow in y2038 even on 64-bit architectures that use a 64-bit time_t
type. On 32-bit architecture with 64-bit time_t, time() should always
be implement by the C library by calling a (to be added) clock_gettime()
variant that takes a sufficiently wide argument.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: http://lkml.kernel.org/r/20171204150203.852959-1-arnd@arndb.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-12-06 21:31:46 +01:00
Miroslav Benes
43347d56c8 livepatch: send a fake signal to all blocking tasks
Live patching consistency model is of LEAVE_PATCHED_SET and
SWITCH_THREAD. This means that all tasks in the system have to be marked
one by one as safe to call a new patched function. Safe means when a
task is not (sleeping) in a set of patched functions. That is, no
patched function is on the task's stack. Another clearly safe place is
the boundary between kernel and userspace. The patching waits for all
tasks to get outside of the patched set or to cross the boundary. The
transition is completed afterwards.

The problem is that a task can block the transition for quite a long
time, if not forever. It could sleep in a set of patched functions, for
example.  Luckily we can force the task to leave the set by sending it a
fake signal, that is a signal with no data in signal pending structures
(no handler, no sign of proper signal delivered). Suspend/freezer use
this to freeze the tasks as well. The task gets TIF_SIGPENDING set and
is woken up (if it has been sleeping in the kernel before) or kicked by
rescheduling IPI (if it was running on other CPU). This causes the task
to go to kernel/userspace boundary where the signal would be handled and
the task would be marked as safe in terms of live patching.

There are tasks which are not affected by this technique though. The
fake signal is not sent to kthreads. They should be handled differently.
They can be woken up so they leave the patched set and their
TIF_PATCH_PENDING can be cleared thanks to stack checking.

For the sake of completeness, if the task is in TASK_RUNNING state but
not currently running on some CPU it doesn't get the IPI, but it would
eventually handle the signal anyway. Second, if the task runs in the
kernel (in TASK_RUNNING state) it gets the IPI, but the signal is not
handled on return from the interrupt. It would be handled on return to
the userspace in the future when the fake signal is sent again. Stack
checking deals with these cases in a better way.

If the task was sleeping in a syscall it would be woken by our fake
signal, it would check if TIF_SIGPENDING is set (by calling
signal_pending() predicate) and return ERESTART* or EINTR. Syscalls with
ERESTART* return values are restarted in case of the fake signal (see
do_signal()). EINTR is propagated back to the userspace program. This
could disturb the program, but...

* each process dealing with signals should react accordingly to EINTR
  return values.
* syscalls returning EINTR happen to be quite common situation in the
  system even if no fake signal is sent.
* freezer sends the fake signal and does not deal with EINTR anyhow.
  Thus EINTR values are returned when the system is resumed.

The very safe marking is done in architectures' "entry" on syscall and
interrupt/exception exit paths, and in a stack checking functions of
livepatch.  TIF_PATCH_PENDING is cleared and the next
recalc_sigpending() drops TIF_SIGPENDING. In connection with this, also
call klp_update_patch_state() before do_signal(), so that
recalc_sigpending() in dequeue_signal() can clear TIF_PATCH_PENDING
immediately and thus prevent a double call of do_signal().

Note that the fake signal is not sent to stopped/traced tasks. Such task
prevents the patching to finish till it continues again (is not traced
anymore).

Last, sending the fake signal is not automatic. It is done only when
admin requests it by writing 1 to signal sysfs attribute in livepatch
sysfs directory.

Signed-off-by: Miroslav Benes <mbenes@suse.cz>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: x86@kernel.org
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-12-04 22:34:57 +01:00
Linus Torvalds
02fc87b117 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 fixes from Ingo Molnar:
 - topology enumeration fixes
 - KASAN fix
 - two entry fixes (not yet the big series related to KASLR)
 - remove obsolete code
 - instruction decoder fix
 - better /dev/mem sanity checks, hopefully working better this time
 - pkeys fixes
 - two ACPI fixes
 - 5-level paging related fixes
 - UMIP fixes that should make application visible faults more debuggable
 - boot fix for weird virtualization environment

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  x86/decoder: Add new TEST instruction pattern
  x86/PCI: Remove unused HyperTransport interrupt support
  x86/umip: Fix insn_get_code_seg_params()'s return value
  x86/boot/KASLR: Remove unused variable
  x86/entry/64: Add missing irqflags tracing to native_load_gs_index()
  x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow
  x86/entry/64: Fix entry_SYSCALL_64_after_hwframe() IRQ tracing
  x86/pkeys/selftests: Fix protection keys write() warning
  x86/pkeys/selftests: Rename 'si_pkey' to 'siginfo_pkey'
  x86/mpx/selftests: Fix up weird arrays
  x86/pkeys: Update documentation about availability
  x86/umip: Print a warning into the syslog if UMIP-protected instructions are used
  x86/smpboot: Fix __max_logical_packages estimate
  x86/topology: Avoid wasting 128k for package id array
  perf/x86/intel/uncore: Cache logical pkg id in uncore driver
  x86/acpi: Reduce code duplication in mp_override_legacy_irq()
  x86/acpi: Handle SCI interrupts above legacy space gracefully
  x86/boot: Fix boot failure when SMP MP-table is based at 0
  x86/mm: Limit mmap() of /dev/mem to valid physical addresses
  x86/selftests: Add test for mapping placement for 5-level paging
  ...
2017-11-26 14:11:54 -08:00
Andy Lutomirski
ca37e57bbe x86/entry/64: Add missing irqflags tracing to native_load_gs_index()
Running this code with IRQs enabled (where dummy_lock is a spinlock):

static void check_load_gs_index(void)
{
	/* This will fail. */
	load_gs_index(0xffff);

	spin_lock(&dummy_lock);
	spin_unlock(&dummy_lock);
}

Will generate a lockdep warning.  The issue is that the actual write
to %gs would cause an exception with IRQs disabled, and the exception
handler would, as an inadvertent side effect, update irqflag tracing
to reflect the IRQs-off status.  native_load_gs_index() would then
turn IRQs back on and return with irqflag tracing still thinking that
IRQs were off.  The dummy lock-and-unlock causes lockdep to notice the
error and warn.

Fix it by adding the missing tracing.

Apparently nothing did this in a context where it mattered.  I haven't
tried to find a code path that would actually exhibit the warning if
appropriately nasty user code were running.

I suspect that the security impact of this bug is very, very low --
production systems don't run with lockdep enabled, and the warning is
mostly harmless anyway.

Found during a quick audit of the entry code to try to track down an
unrelated bug that Ingo found in some still-in-development code.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/e1aeb0e6ba8dd430ec36c8a35e63b429698b4132.1511411918.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-23 07:54:25 +01:00
Andy Lutomirski
548c3050ea x86/entry/64: Fix entry_SYSCALL_64_after_hwframe() IRQ tracing
When I added entry_SYSCALL_64_after_hwframe(), I left TRACE_IRQS_OFF
before it.  This means that users of entry_SYSCALL_64_after_hwframe()
were responsible for invoking TRACE_IRQS_OFF, and the one and only
user (Xen, added in the same commit) got it wrong.

I think this would manifest as a warning if a Xen PV guest with
CONFIG_DEBUG_LOCKDEP=y were used with context tracking.  (The
context tracking bit is to cause lockdep to get invoked before we
turn IRQs back on.)  I haven't tested that for real yet because I
can't get a kernel configured like that to boot at all on Xen PV.

Move TRACE_IRQS_OFF below the label.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Fixes: 8a9949bc71a7 ("x86/xen/64: Rearrange the SYSCALL entries")
Link: http://lkml.kernel.org/r/9150aac013b7b95d62c2336751d5b6e91d2722aa.1511325444.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-22 06:35:48 +01:00
Linus Torvalds
09bd7c75e5 Kbuild updates for v4.15
One of the most remarkable improvements in this cycle is, Kbuild is
 now able to cache the result of shell commands.  Some variables are
 expensive to compute, for example, $(call cc-option,...) invokes the
 compiler.  It is not efficient to redo this computation every time,
 even when we are not actually building anything.  Kbuild creates a
 hidden file ".cache.mk" that contains invoked shell commands and
 their results.  The speed-up should be noticeable.
 
 Summary:
 
 - Fix arch build issues (hexagon, sh)
 
 - Clean up various Makefiles and scripts
 
 - Fix wrong usage of {CFLAGS,LDFLAGS}_MODULE in arch Makefiles
 
 - Cache variables that are expensive to compute
 
 - Improve cc-ldopton and ld-option for Clang
 
 - Optimize output directory creation
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJaDxaLAAoJED2LAQed4NsGIHQP/isMxxaIxIAWU56+ZcII74k7
 639VgrKi9n5y25d1dBRTQg+vReHE6E2JbkCqpVOu11t7m0LT7yUK8v3WwyLf1qTN
 GxnqZ/WMQU5/AYVqIWo8jN4FGHpivHJ6qbeiNJM9qN4RAkzG0sZUq746VaFZYmIR
 Lu0Gf4m4qjifkkhXsQdWT5i7yNTidPqaL6GNb+FcFkEHlVre8jma0kJlgfHxru84
 WmETpjQXvHAZ/R61vY6ekAWpqFhw3ecJY96A9npnx+SQVQdSNAdpaU0SK29jB0ON
 /SAfpHg9oa/gD0LFOKV6zkjnAkd4TEjrJEiHHhz5gjT/SbS3T1llBIGZ1oV4X7Y0
 Vlh9KWlm1FJJI4SIzc9qUaQMp6JtLfEfHKJCc45xVaN3fNrDnR8jl80x5+95ELga
 dCkZgnq5u82MtTysCbHBESwDYQaVPyIrh7In+mduglaCqhqj9KoDjoLoiGfCg7SA
 3tPflYVd629w5l5GrazJ40jWn1+ggMtgMOVooJNJ+dINCP+GxsUpH84Ww2Pdic+/
 qLdud6TeqxrZDGzWXqKNLu8alM8NGgSr101l9gIf1oqSyy63duBpMrxGDoIJS3FU
 rFDoFFUhlfkAXNbQHtVGNzKtcpCjURh992j9Fa1+NfMwSce5IHkMwTvPmNSRowi8
 0llLjXhD/bxK6FpdvlV8
 =zIdO
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:
 "One of the most remarkable improvements in this cycle is, Kbuild is
  now able to cache the result of shell commands. Some variables are
  expensive to compute, for example, $(call cc-option,...) invokes the
  compiler. It is not efficient to redo this computation every time,
  even when we are not actually building anything. Kbuild creates a
  hidden file ".cache.mk" that contains invoked shell commands and their
  results. The speed-up should be noticeable.

  Summary:

   - Fix arch build issues (hexagon, sh)

   - Clean up various Makefiles and scripts

   - Fix wrong usage of {CFLAGS,LDFLAGS}_MODULE in arch Makefiles

   - Cache variables that are expensive to compute

   - Improve cc-ldopton and ld-option for Clang

   - Optimize output directory creation"

* tag 'kbuild-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (30 commits)
  kbuild: move coccicheck help from scripts/Makefile.help to top Makefile
  sh: decompressor: add shipped files to .gitignore
  frv: .gitignore: ignore vmlinux.lds
  selinux: remove unnecessary assignment to subdir-
  kbuild: specify FORCE in Makefile.headersinst as .PHONY target
  kbuild: remove redundant mkdir from ./Kbuild
  kbuild: optimize object directory creation for incremental build
  kbuild: create object directories simpler and faster
  kbuild: filter-out PHONY targets from "targets"
  kbuild: remove redundant $(wildcard ...) for cmd_files calculation
  kbuild: create directory for make cache only when necessary
  sh: select KBUILD_DEFCONFIG depending on ARCH
  kbuild: fix linker feature test macros when cross compiling with Clang
  kbuild: shrink .cache.mk when it exceeds 1000 lines
  kbuild: do not call cc-option before KBUILD_CFLAGS initialization
  kbuild: Cache a few more calls to the compiler
  kbuild: Add a cache for generated variables
  kbuild: add forward declaration of default target to Makefile.asm-generic
  kbuild: remove KBUILD_SUBDIR_ASFLAGS and KBUILD_SUBDIR_CCFLAGS
  hexagon/kbuild: replace CFLAGS_MODULE with KBUILD_CFLAGS_MODULE
  ...
2017-11-17 17:45:29 -08:00
Linus Torvalds
051089a2ee xen: features and fixes for v4.15-rc1
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABAgAGBQJaDdh4AAoJELDendYovxMvPFAH/2QjTys2ydIAdmwke4odpJ7U
 xuy7HOQCzOeZ5YsZthzCBsN90VmnDM7X7CcB8weSdjcKlXMSAWD+J1RgkL2iAJhI
 8tzIEXECrlNuz4V5mX9TmMgtPCr4qzU3fsts0pZy4fYDq1PVWDefqOwEtbpbWabb
 wRSMq/nTb9iASTMgheSC0WfhJneqtJ+J20zrzkGPCBPRFcwfppeP8/7vpkmJslBi
 eH/pfchICM4w093T/BfavnsPvhLdjgRuwVzn6+e46s4tLnZAxnLRVQ7SXZXzBORq
 /dL/qC0XH3YXdU+XfIs//giZsmLns6SxZaMr4vs6TxFtuzZBKpLtkOKo9zndvxk=
 =sZY5
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-4.15-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:
 "Xen features and fixes for v4.15-rc1

  Apart from several small fixes it contains the following features:

   - a series by Joao Martins to add vdso support of the pv clock
     interface

   - a series by Juergen Gross to add support for Xen pv guests to be
     able to run on 5 level paging hosts

   - a series by Stefano Stabellini adding the Xen pvcalls frontend
     driver using a paravirtualized socket interface"

* tag 'for-linus-4.15-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (34 commits)
  xen/pvcalls: fix potential endless loop in pvcalls-front.c
  xen/pvcalls: Add MODULE_LICENSE()
  MAINTAINERS: xen, kvm: track pvclock-abi.h changes
  x86/xen/time: setup vcpu 0 time info page
  x86/xen/time: set pvclock flags on xen_time_init()
  x86/pvclock: add setter for pvclock_pvti_cpu0_va
  ptp_kvm: probe for kvm guest availability
  xen/privcmd: remove unused variable pageidx
  xen: select grant interface version
  xen: update arch/x86/include/asm/xen/cpuid.h
  xen: add grant interface version dependent constants to gnttab_ops
  xen: limit grant v2 interface to the v1 functionality
  xen: re-introduce support for grant v2 interface
  xen: support priv-mapping in an HVM tools domain
  xen/pvcalls: remove redundant check for irq >= 0
  xen/pvcalls: fix unsigned less than zero error check
  xen/time: Return -ENODEV from xen_get_wallclock()
  xen/pvcalls-front: mark expected switch fall-through
  xen: xenbus_probe_frontend: mark expected switch fall-throughs
  xen/time: do not decrease steal time after live migration on xen
  ...
2017-11-16 13:06:27 -08:00
Masahiro Yamada
8a78756eb5 kbuild: create object directories simpler and faster
For the out-of-tree build, scripts/Makefile.build creates output
directories, but this operation is not efficient.

scripts/Makefile.lib calculates obj-dirs as follows:

  obj-dirs := $(dir $(multi-objs) $(obj-y))

Please notice $(sort ...) is not used here.  Usually the result is
as many "./" as objects here.

For a lot of duplicated paths, the following command is invoked.

  _dummy := $(foreach d,$(obj-dirs), $(shell [ -d $(d) ] || mkdir -p $(d)))

Then, the costly shell command is run over and over again.

I see many points for optimization:

[1] Use $(sort ...) to cut down duplicated paths before passing them
    to system call
[2] Use single $(shell ...) instead of repeating it with $(foreach ...)
    This will reduce forking.
[3] We can calculate obj-dirs more simply.  Most of objects are already
    accumulated in $(targets).  So, $(dir $(targets)) is fine and more
    comprehensive.

I also removed ugly code in arch/x86/entry/vdso/Makefile.  This is now
really unnecessary.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Douglas Anderson <dianders@chromium.org>
2017-11-16 09:07:35 +09:00
Linus Torvalds
eb4d47c8ce Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Ingo Molnar:
 "Two changes: Propagate const/__initconst, and use ARRAY_SIZE() some
  more"

* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/events/amd/iommu: Make iommu_pmu const and __initconst
  x86: Use ARRAY_SIZE
2017-11-13 16:58:05 -08:00
Linus Torvalds
d6ec9d9a4d Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core updates from Ingo Molnar:
 "Note that in this cycle most of the x86 topics interacted at a level
  that caused them to be merged into tip:x86/asm - but this should be a
  temporary phenomenon, hopefully we'll back to the usual patterns in
  the next merge window.

  The main changes in this cycle were:

  Hardware enablement:

   - Add support for the Intel UMIP (User Mode Instruction Prevention)
     CPU feature. This is a security feature that disables certain
     instructions such as SGDT, SLDT, SIDT, SMSW and STR. (Ricardo Neri)

     [ Note that this is disabled by default for now, there are some
       smaller enhancements in the pipeline that I'll follow up with in
       the next 1-2 days, which allows this to be enabled by default.]

   - Add support for the AMD SEV (Secure Encrypted Virtualization) CPU
     feature, on top of SME (Secure Memory Encryption) support that was
     added in v4.14. (Tom Lendacky, Brijesh Singh)

   - Enable new SSE/AVX/AVX512 CPU features: AVX512_VBMI2, GFNI, VAES,
     VPCLMULQDQ, AVX512_VNNI, AVX512_BITALG. (Gayatri Kammela)

  Other changes:

   - A big series of entry code simplifications and enhancements (Andy
     Lutomirski)

   - Make the ORC unwinder default on x86 and various objtool
     enhancements. (Josh Poimboeuf)

   - 5-level paging enhancements (Kirill A. Shutemov)

   - Micro-optimize the entry code a bit (Borislav Petkov)

   - Improve the handling of interdependent CPU features in the early
     FPU init code (Andi Kleen)

   - Build system enhancements (Changbin Du, Masahiro Yamada)

   - ... plus misc enhancements, fixes and cleanups"

* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (118 commits)
  x86/build: Make the boot image generation less verbose
  selftests/x86: Add tests for the STR and SLDT instructions
  selftests/x86: Add tests for User-Mode Instruction Prevention
  x86/traps: Fix up general protection faults caused by UMIP
  x86/umip: Enable User-Mode Instruction Prevention at runtime
  x86/umip: Force a page fault when unable to copy emulated result to user
  x86/umip: Add emulation code for UMIP instructions
  x86/cpufeature: Add User-Mode Instruction Prevention definitions
  x86/insn-eval: Add support to resolve 16-bit address encodings
  x86/insn-eval: Handle 32-bit address encodings in virtual-8086 mode
  x86/insn-eval: Add wrapper function for 32 and 64-bit addresses
  x86/insn-eval: Add support to resolve 32-bit address encodings
  x86/insn-eval: Compute linear address in several utility functions
  resource: Fix resource_size.cocci warnings
  X86/KVM: Clear encryption attribute when SEV is active
  X86/KVM: Decrypt shared per-cpu variables when SEV is active
  percpu: Introduce DEFINE_PER_CPU_DECRYPTED
  x86: Add support for changing memory encryption attribute in early boot
  x86/io: Unroll string I/O when SEV is active
  x86/boot: Add early boot support when running with SEV active
  ...
2017-11-13 14:13:48 -08:00
Joao Martins
9f08890ab9 x86/pvclock: add setter for pvclock_pvti_cpu0_va
Right now there is only a pvclock_pvti_cpu0_va() which is defined
on kvmclock since:

commit dac16fba6fc5
("x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap")

The only user of this interface so far is kvm. This commit adds a
setter function for the pvti page and moves pvclock_pvti_cpu0_va
to pvclock, which is a more generic place to have it; and would
allow other PV clocksources to use it, such as Xen.

While moving pvclock_pvti_cpu0_va into pvclock, rename also this
function to pvclock_get_pvti_cpu0_va (including its call sites)
to be symmetric with the setter (pvclock_set_pvti_cpu0_va).

Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
2017-11-08 16:33:14 -05:00
Frederic Weisbecker
7a10e2a919 x86: Use lockdep to assert IRQs are disabled/enabled
Use lockdep to check that IRQs are enabled or disabled as expected. This
way the sanity check only shows overhead when concurrency correctness
debug code is enabled.

It also makes no more sense to fix the IRQ flags when a bug is detected
as the assertion is now pure config-dependent debugging. And to quote
Peter Zijlstra:

	The whole if !disabled, disable logic is uber paranoid programming,
	but I don't think we've ever seen that WARN trigger, and if it does
	(and then burns the kernel) we at least know what happend.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: David S . Miller <davem@davemloft.net>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/1509980490-4285-8-git-send-email-frederic@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-08 11:13:50 +01:00
Brijesh Singh
819aeee065 X86/KVM: Clear encryption attribute when SEV is active
The guest physical memory area holding the struct pvclock_wall_clock and
struct pvclock_vcpu_time_info are shared with the hypervisor. It
periodically updates the contents of the memory.

When SEV is active, the encryption attributes from the shared memory pages
must be cleared so that both hypervisor and guest can access the data.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Borislav Petkov <bp@suse.de>
Tested-by: Borislav Petkov <bp@suse.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: kvm@vger.kernel.org
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: https://lkml.kernel.org/r/20171020143059.3291-18-brijesh.singh@amd.com
2017-11-07 15:36:00 +01:00
Ingo Molnar
67caadc611 Merge branch 'x86/build' into x86/asm, to fix up conflicts
Conflicts:
	arch/x86/entry/syscalls/Makefile

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07 10:54:30 +01:00
Ingo Molnar
b3d9a13681 Merge branch 'linus' into x86/asm, to pick up fixes and resolve conflicts
Conflicts:
	arch/x86/kernel/cpu/Makefile

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07 10:53:06 +01:00
Ingo Molnar
8c5db92a70 Merge branch 'linus' into locking/core, to resolve conflicts
Conflicts:
	include/linux/compiler-clang.h
	include/linux/compiler-gcc.h
	include/linux/compiler-intel.h
	include/uapi/linux/stddef.h

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07 10:32:44 +01:00
Linus Torvalds
ead751507d License cleanup: add SPDX license identifiers to some files
Many source files in the tree are missing licensing information, which
 makes it harder for compliance tools to determine the correct license.
 
 By default all files without license information are under the default
 license of the kernel, which is GPL version 2.
 
 Update the files which contain no license information with the 'GPL-2.0'
 SPDX license identifier.  The SPDX identifier is a legally binding
 shorthand, which can be used instead of the full boiler plate text.
 
 This patch is based on work done by Thomas Gleixner and Kate Stewart and
 Philippe Ombredanne.
 
 How this work was done:
 
 Patches were generated and checked against linux-4.14-rc6 for a subset of
 the use cases:
  - file had no licensing information it it.
  - file was a */uapi/* one with no licensing information in it,
  - file was a */uapi/* one with existing licensing information,
 
 Further patches will be generated in subsequent months to fix up cases
 where non-standard license headers were used, and references to license
 had to be inferred by heuristics based on keywords.
 
 The analysis to determine which SPDX License Identifier to be applied to
 a file was done in a spreadsheet of side by side results from of the
 output of two independent scanners (ScanCode & Windriver) producing SPDX
 tag:value files created by Philippe Ombredanne.  Philippe prepared the
 base worksheet, and did an initial spot review of a few 1000 files.
 
 The 4.13 kernel was the starting point of the analysis with 60,537 files
 assessed.  Kate Stewart did a file by file comparison of the scanner
 results in the spreadsheet to determine which SPDX license identifier(s)
 to be applied to the file. She confirmed any determination that was not
 immediately clear with lawyers working with the Linux Foundation.
 
 Criteria used to select files for SPDX license identifier tagging was:
  - Files considered eligible had to be source code files.
  - Make and config files were included as candidates if they contained >5
    lines of source
  - File already had some variant of a license header in it (even if <5
    lines).
 
 All documentation files were explicitly excluded.
 
 The following heuristics were used to determine which SPDX license
 identifiers to apply.
 
  - when both scanners couldn't find any license traces, file was
    considered to have no license information in it, and the top level
    COPYING file license applied.
 
    For non */uapi/* files that summary was:
 
    SPDX license identifier                            # files
    ---------------------------------------------------|-------
    GPL-2.0                                              11139
 
    and resulted in the first patch in this series.
 
    If that file was a */uapi/* path one, it was "GPL-2.0 WITH
    Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:
 
    SPDX license identifier                            # files
    ---------------------------------------------------|-------
    GPL-2.0 WITH Linux-syscall-note                        930
 
    and resulted in the second patch in this series.
 
  - if a file had some form of licensing information in it, and was one
    of the */uapi/* ones, it was denoted with the Linux-syscall-note if
    any GPL family license was found in the file or had no licensing in
    it (per prior point).  Results summary:
 
    SPDX license identifier                            # files
    ---------------------------------------------------|------
    GPL-2.0 WITH Linux-syscall-note                       270
    GPL-2.0+ WITH Linux-syscall-note                      169
    ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
    ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
    LGPL-2.1+ WITH Linux-syscall-note                      15
    GPL-1.0+ WITH Linux-syscall-note                       14
    ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
    LGPL-2.0+ WITH Linux-syscall-note                       4
    LGPL-2.1 WITH Linux-syscall-note                        3
    ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
    ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
 
    and that resulted in the third patch in this series.
 
  - when the two scanners agreed on the detected license(s), that became
    the concluded license(s).
 
  - when there was disagreement between the two scanners (one detected a
    license but the other didn't, or they both detected different
    licenses) a manual inspection of the file occurred.
 
  - In most cases a manual inspection of the information in the file
    resulted in a clear resolution of the license that should apply (and
    which scanner probably needed to revisit its heuristics).
 
  - When it was not immediately clear, the license identifier was
    confirmed with lawyers working with the Linux Foundation.
 
  - If there was any question as to the appropriate license identifier,
    the file was flagged for further research and to be revisited later
    in time.
 
 In total, over 70 hours of logged manual review was done on the
 spreadsheet to determine the SPDX license identifiers to apply to the
 source files by Kate, Philippe, Thomas and, in some cases, confirmation
 by lawyers working with the Linux Foundation.
 
 Kate also obtained a third independent scan of the 4.13 code base from
 FOSSology, and compared selected files where the other two scanners
 disagreed against that SPDX file, to see if there was new insights.  The
 Windriver scanner is based on an older version of FOSSology in part, so
 they are related.
 
 Thomas did random spot checks in about 500 files from the spreadsheets
 for the uapi headers and agreed with SPDX license identifier in the
 files he inspected. For the non-uapi files Thomas did random spot checks
 in about 15000 files.
 
 In initial set of patches against 4.14-rc6, 3 files were found to have
 copy/paste license identifier errors, and have been fixed to reflect the
 correct identifier.
 
 Additionally Philippe spent 10 hours this week doing a detailed manual
 inspection and review of the 12,461 patched files from the initial patch
 version early this week with:
  - a full scancode scan run, collecting the matched texts, detected
    license ids and scores
  - reviewing anything where there was a license detected (about 500+
    files) to ensure that the applied SPDX license was correct
  - reviewing anything where there was no detection but the patch license
    was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
    SPDX license was correct
 
 This produced a worksheet with 20 files needing minor correction.  This
 worksheet was then exported into 3 different .csv files for the
 different types of files to be modified.
 
 These .csv files were then reviewed by Greg.  Thomas wrote a script to
 parse the csv files and add the proper SPDX tag to the file, in the
 format that the file expected.  This script was further refined by Greg
 based on the output to detect more types of files automatically and to
 distinguish between header and source .c files (which need different
 comment types.)  Finally Greg ran the script using the .csv files to
 generate the patches.
 
 Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
 Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
 Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWfswbQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ykvEwCfXU1MuYFQGgMdDmAZXEc+xFXZvqgAoKEcHDNA
 6dVh26uchcEQLN/XqUDt
 =x306
 -----END PGP SIGNATURE-----

Merge tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

Pull initial SPDX identifiers from Greg KH:
 "License cleanup: add SPDX license identifiers to some files

  Many source files in the tree are missing licensing information, which
  makes it harder for compliance tools to determine the correct license.

  By default all files without license information are under the default
  license of the kernel, which is GPL version 2.

  Update the files which contain no license information with the
  'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally
  binding shorthand, which can be used instead of the full boiler plate
  text.

  This patch is based on work done by Thomas Gleixner and Kate Stewart
  and Philippe Ombredanne.

  How this work was done:

  Patches were generated and checked against linux-4.14-rc6 for a subset
  of the use cases:

   - file had no licensing information it it.

   - file was a */uapi/* one with no licensing information in it,

   - file was a */uapi/* one with existing licensing information,

  Further patches will be generated in subsequent months to fix up cases
  where non-standard license headers were used, and references to
  license had to be inferred by heuristics based on keywords.

  The analysis to determine which SPDX License Identifier to be applied
  to a file was done in a spreadsheet of side by side results from of
  the output of two independent scanners (ScanCode & Windriver)
  producing SPDX tag:value files created by Philippe Ombredanne.
  Philippe prepared the base worksheet, and did an initial spot review
  of a few 1000 files.

  The 4.13 kernel was the starting point of the analysis with 60,537
  files assessed. Kate Stewart did a file by file comparison of the
  scanner results in the spreadsheet to determine which SPDX license
  identifier(s) to be applied to the file. She confirmed any
  determination that was not immediately clear with lawyers working with
  the Linux Foundation.

  Criteria used to select files for SPDX license identifier tagging was:

   - Files considered eligible had to be source code files.

   - Make and config files were included as candidates if they contained
     >5 lines of source

   - File already had some variant of a license header in it (even if <5
     lines).

  All documentation files were explicitly excluded.

  The following heuristics were used to determine which SPDX license
  identifiers to apply.

   - when both scanners couldn't find any license traces, file was
     considered to have no license information in it, and the top level
     COPYING file license applied.

     For non */uapi/* files that summary was:

       SPDX license identifier                            # files
       ---------------------------------------------------|-------
       GPL-2.0                                              11139

     and resulted in the first patch in this series.

     If that file was a */uapi/* path one, it was "GPL-2.0 WITH
     Linux-syscall-note" otherwise it was "GPL-2.0". Results of that
     was:

       SPDX license identifier                            # files
       ---------------------------------------------------|-------
       GPL-2.0 WITH Linux-syscall-note                        930

     and resulted in the second patch in this series.

   - if a file had some form of licensing information in it, and was one
     of the */uapi/* ones, it was denoted with the Linux-syscall-note if
     any GPL family license was found in the file or had no licensing in
     it (per prior point). Results summary:

       SPDX license identifier                            # files
       ---------------------------------------------------|------
       GPL-2.0 WITH Linux-syscall-note                       270
       GPL-2.0+ WITH Linux-syscall-note                      169
       ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
       ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
       LGPL-2.1+ WITH Linux-syscall-note                      15
       GPL-1.0+ WITH Linux-syscall-note                       14
       ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
       LGPL-2.0+ WITH Linux-syscall-note                       4
       LGPL-2.1 WITH Linux-syscall-note                        3
       ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
       ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

     and that resulted in the third patch in this series.

   - when the two scanners agreed on the detected license(s), that
     became the concluded license(s).

   - when there was disagreement between the two scanners (one detected
     a license but the other didn't, or they both detected different
     licenses) a manual inspection of the file occurred.

   - In most cases a manual inspection of the information in the file
     resulted in a clear resolution of the license that should apply
     (and which scanner probably needed to revisit its heuristics).

   - When it was not immediately clear, the license identifier was
     confirmed with lawyers working with the Linux Foundation.

   - If there was any question as to the appropriate license identifier,
     the file was flagged for further research and to be revisited later
     in time.

  In total, over 70 hours of logged manual review was done on the
  spreadsheet to determine the SPDX license identifiers to apply to the
  source files by Kate, Philippe, Thomas and, in some cases,
  confirmation by lawyers working with the Linux Foundation.

  Kate also obtained a third independent scan of the 4.13 code base from
  FOSSology, and compared selected files where the other two scanners
  disagreed against that SPDX file, to see if there was new insights.
  The Windriver scanner is based on an older version of FOSSology in
  part, so they are related.

  Thomas did random spot checks in about 500 files from the spreadsheets
  for the uapi headers and agreed with SPDX license identifier in the
  files he inspected. For the non-uapi files Thomas did random spot
  checks in about 15000 files.

  In initial set of patches against 4.14-rc6, 3 files were found to have
  copy/paste license identifier errors, and have been fixed to reflect
  the correct identifier.

  Additionally Philippe spent 10 hours this week doing a detailed manual
  inspection and review of the 12,461 patched files from the initial
  patch version early this week with:

   - a full scancode scan run, collecting the matched texts, detected
     license ids and scores

   - reviewing anything where there was a license detected (about 500+
     files) to ensure that the applied SPDX license was correct

   - reviewing anything where there was no detection but the patch
     license was not GPL-2.0 WITH Linux-syscall-note to ensure that the
     applied SPDX license was correct

  This produced a worksheet with 20 files needing minor correction. This
  worksheet was then exported into 3 different .csv files for the
  different types of files to be modified.

  These .csv files were then reviewed by Greg. Thomas wrote a script to
  parse the csv files and add the proper SPDX tag to the file, in the
  format that the file expected. This script was further refined by Greg
  based on the output to detect more types of files automatically and to
  distinguish between header and source .c files (which need different
  comment types.) Finally Greg ran the script using the .csv files to
  generate the patches.

  Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
  Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
  Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
  Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"

* tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  License cleanup: add SPDX license identifier to uapi header files with a license
  License cleanup: add SPDX license identifier to uapi header files with no license
  License cleanup: add SPDX GPL-2.0 license identifier to files with no license
2017-11-02 10:04:46 -07:00
Borislav Petkov
1e4c4f610f x86/entry/64: Shorten TEST instructions
Convert TESTL to TESTB and save 3 bytes per callsite.

No functionality change.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20171102120926.4srwerqrr7g72e2k@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-02 13:45:37 +01:00
Greg Kroah-Hartman
b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
Andy Lutomirski
929bacec21 x86/entry/64: De-Xen-ify our NMI code
Xen PV is fundamentally incompatible with our fancy NMI code: it
doesn't use IST at all, and Xen entries clobber two stack slots
below the hardware frame.

Drop Xen PV support from our NMI code entirely.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Acked-by: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/bfbe711b5ae03f672f8848999a8eb2711efc7f98.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-02 11:04:42 +01:00
Juergen Gross
43e4111086 xen, x86/entry/64: Add xen NMI trap entry
Instead of trying to execute any NMI via the bare metal's NMI trap
handler use a Xen specific one for PV domains, like we do for e.g.
debug traps. As in a PV domain the NMI is handled via the normal
kernel stack this is the correct thing to do.

This will enable us to get rid of the very fragile and questionable
dependencies between the bare metal NMI handler and Xen assumptions
believed to be broken anyway.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/5baf5c0528d58402441550c5770b98e7961e7680.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-02 11:04:42 +01:00
Andy Lutomirski
c39858de69 x86/entry/64: Remove the RESTORE_..._REGS infrastructure
All users of RESTORE_EXTRA_REGS, RESTORE_C_REGS and such, and
REMOVE_PT_GPREGS_FROM_STACK are gone.  Delete the macros.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/c32672f6e47c561893316d48e06c7656b1039a36.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-02 11:04:41 +01:00
Andy Lutomirski
471ee48322 x86/entry/64: Use POP instead of MOV to restore regs on NMI return
This gets rid of the last user of the old RESTORE_..._REGS infrastructure.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/652a260f17a160789bc6a41d997f98249b73e2ab.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-02 11:04:40 +01:00
Andy Lutomirski
a512210643 x86/entry/64: Merge the fast and slow SYSRET paths
They did almost the same thing.  Remove a bunch of pointless
instructions (mostly hidden in macros) and reduce cognitive load by
merging them.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1204e20233fcab9130a1ba80b3b1879b5db3fc1f.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-02 11:04:40 +01:00
Andy Lutomirski
4fbb39108f x86/entry/64: Use pop instead of movq in syscall_return_via_sysret
Saves 64 bytes.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/6609b7f74ab31c36604ad746e019ea8495aec76c.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-02 11:04:39 +01:00
Andy Lutomirski
e53178328c x86/entry/64: Shrink paranoid_exit_restore and make labels local
paranoid_exit_restore was a copy of restore_regs_and_return_to_kernel.
Merge them and make the paranoid_exit internal labels local.

Keeping .Lparanoid_exit makes the code a bit shorter because it
allows a 2-byte jnz instead of a 5-byte jnz.

Saves 96 bytes of text.

( This is still a bit suboptimal in a non-CONFIG_TRACE_IRQFLAGS
  kernel, but fixing that would make the code rather messy. )

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/510d66a1895cda9473c84b1086f0bb974f22de6a.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-02 11:04:39 +01:00
Andy Lutomirski
e872045bfd x86/entry/64: Simplify reg restore code in the standard IRET paths
The old code restored all the registers with movq instead of pop.

In theory, this was done because some CPUs have higher movq
throughput, but any gain there would be tiny and is almost certainly
outweighed by the higher text size.

This saves 96 bytes of text.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/ad82520a207ccd851b04ba613f4f752b33ac05f7.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-02 11:04:38 +01:00
Andy Lutomirski
8a055d7f41 x86/entry/64: Move SWAPGS into the common IRET-to-usermode path
All of the code paths that ended up doing IRET to usermode did
SWAPGS immediately beforehand.  Move the SWAPGS into the common
code.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/27fd6f45b7cd640de38fb9066fd0349bcd11f8e1.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-02 11:04:38 +01:00
Andy Lutomirski
26c4ef9c49 x86/entry/64: Split the IRET-to-user and IRET-to-kernel paths
These code paths will diverge soon.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/dccf8c7b3750199b4b30383c812d4e2931811509.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-02 11:04:37 +01:00
Andy Lutomirski
9da78ba6b4 x86/entry/64: Remove the restore_c_regs_and_iret label
The only user was the 64-bit opportunistic SYSRET failure path, and
that path didn't really need it.  This change makes the
opportunistic SYSRET code a bit more straightforward and gets rid of
the label.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/be3006a7ad3326e3458cf1cc55d416252cbe1986.1509609304.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-02 11:04:36 +01:00
Masahiro Yamada
af8e947079 x86/build: Beautify build log of syscall headers
This makes the build log look nicer.

Before:
  SYSTBL  arch/x86/entry/syscalls/../../include/generated/asm/syscalls_32.h
  SYSHDR  arch/x86/entry/syscalls/../../include/generated/asm/unistd_32_ia32.h
  SYSHDR  arch/x86/entry/syscalls/../../include/generated/asm/unistd_64_x32.h
  SYSTBL  arch/x86/entry/syscalls/../../include/generated/asm/syscalls_64.h
  SYSHDR  arch/x86/entry/syscalls/../../include/generated/uapi/asm/unistd_32.h
  SYSHDR  arch/x86/entry/syscalls/../../include/generated/uapi/asm/unistd_64.h
  SYSHDR  arch/x86/entry/syscalls/../../include/generated/uapi/asm/unistd_x32.h

After:
  SYSTBL  arch/x86/include/generated/asm/syscalls_32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_32_ia32.h
  SYSHDR  arch/x86/include/generated/asm/unistd_64_x32.h
  SYSTBL  arch/x86/include/generated/asm/syscalls_64.h
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_32.h
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_64.h
  SYSHDR  arch/x86/include/generated/uapi/asm/unistd_x32.h

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: linux-kbuild@vger.kernel.org
Link: http://lkml.kernel.org/r/1509077470-2735-1-git-send-email-yamada.masahiro@socionext.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-27 10:22:45 +02:00
Mark Rutland
6aa7de0591 locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE()
Please do not apply this to mainline directly, instead please re-run the
coccinelle script shown below and apply its output.

For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
preference to ACCESS_ONCE(), and new code is expected to use one of the
former. So far, there's been no reason to change most existing uses of
ACCESS_ONCE(), as these aren't harmful, and changing them results in
churn.

However, for some features, the read/write distinction is critical to
correct operation. To distinguish these cases, separate read/write
accessors must be used. This patch migrates (most) remaining
ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following
coccinelle script:

----
// Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
// WRITE_ONCE()

// $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch

virtual patch

@ depends on patch @
expression E1, E2;
@@

- ACCESS_ONCE(E1) = E2
+ WRITE_ONCE(E1, E2)

@ depends on patch @
expression E;
@@

- ACCESS_ONCE(E)
+ READ_ONCE(E)
----

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: davem@davemloft.net
Cc: linux-arch@vger.kernel.org
Cc: mpe@ellerman.id.au
Cc: shuah@kernel.org
Cc: snitzer@redhat.com
Cc: thor.thayer@linux.intel.com
Cc: tj@kernel.org
Cc: viro@zeniv.linux.org.uk
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-25 11:01:08 +02:00
Josh Poimboeuf
82c62fa0c4 x86/asm: Don't use the confusing '.ifeq' directive
I find the '.ifeq <expression>' directive to be confusing.  Reading it
quickly seems to suggest its opposite meaning, or that it's missing an
argument.

Improve readability by replacing all of its x86 uses with
'.if <expression> == 0'.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrei Vagin <avagin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/757da028e802c7e98d23fbab8d234b1063e161cf.1508516398.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-23 13:31:34 +02:00
Josh Poimboeuf
98990a33b7 x86/entry: Fix idtentry unwind hint
This fixes the following ORC warning in the 'int3' entry code:

  WARNING: can't dereference iret registers at ffff8801c5f17fe0 for ip ffffffff95f0d94b

The ORC metadata had the wrong stack offset for the iret registers.

Their location on the stack is dependent on whether the exception has an
error code.

Reported-and-tested-by: Andrei Vagin <avagin@virtuozzo.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 8c1f75587a18 ("x86/entry/64: Add unwind hint annotations")
Link: http://lkml.kernel.org/r/931d57f0551ed7979d5e7e05370d445c8e5137f8.1508516398.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-23 13:30:35 +02:00
Jérémy Lefaure
0cfe5b5fc0 x86: Use ARRAY_SIZE
Using the ARRAY_SIZE macro improves the readability of the code.

Found with Coccinelle with the following semantic patch:
@r depends on (org || report)@
type T;
T[] E;
position p;
@@
(
 (sizeof(E)@p /sizeof(*E))
|
 (sizeof(E)@p /sizeof(E[...]))
|
 (sizeof(E)@p /sizeof(T))
)

Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-video@atrey.karlin.mff.cuni.cz
Cc: Martin Mares <mj@ucw.cz>
Cc: Andy Lutomirski <luto@amacapital.net>
Link: https://lkml.kernel.org/r/20171001193101.8898-13-jeremy.lefaure@lse.epita.fr
2017-10-19 16:15:47 +02:00
Josh Poimboeuf
5c99b692cf x86/unwind: Use MSB for frame pointer encoding on 32-bit
On x86-32, Tetsuo Handa and Fengguang Wu reported unwinder warnings
like:

  WARNING: kernel stack regs at f60bb9c8 in swapper:1 has bad 'bp' value 0ba00000

And also there were some stack dumps with a bunch of unreliable '?'
symbols after an apic_timer_interrupt symbol, meaning the unwinder got
confused when it tried to read the regs.

The cause of those issues is that, with GCC 4.8 (and possibly older),
there are cases where GCC misaligns the stack pointer in a leaf function
for no apparent reason:

  c124a388 <acpi_rs_move_data>:
  c124a388:       55                      push   %ebp
  c124a389:       89 e5                   mov    %esp,%ebp
  c124a38b:       57                      push   %edi
  c124a38c:       56                      push   %esi
  c124a38d:       89 d6                   mov    %edx,%esi
  c124a38f:       53                      push   %ebx
  c124a390:       31 db                   xor    %ebx,%ebx
  c124a392:       83 ec 03                sub    $0x3,%esp
  ...
  c124a3e3:       83 c4 03                add    $0x3,%esp
  c124a3e6:       5b                      pop    %ebx
  c124a3e7:       5e                      pop    %esi
  c124a3e8:       5f                      pop    %edi
  c124a3e9:       5d                      pop    %ebp
  c124a3ea:       c3                      ret

If an interrupt occurs in such a function, the regs on the stack will be
unaligned, which breaks the frame pointer encoding assumption.  So on
32-bit, use the MSB instead of the LSB to encode the regs.

This isn't an issue on 64-bit, because interrupts align the stack before
writing to it.

Reported-and-tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: LKP <lkp@01.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/279a26996a482ca716605c7dbc7f2db9d8d91e81.1507597785.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-10 12:49:48 +02:00
Linus Torvalds
24e700e291 Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 apic updates from Thomas Gleixner:
 "This update provides:

   - Cleanup of the IDT management including the removal of the extra
     tracing IDT. A first step to cleanup the vector management code.

   - The removal of the paravirt op adjust_exception_frame. This is a
     XEN specific issue, but merged through this branch to avoid nasty
     merge collisions

   - Prevent dmesg spam about the TSC DEADLINE bug, when the CPU has
     disabled the TSC DEADLINE timer in CPUID.

   - Adjust a debug message in the ioapic code to print out the
     information correctly"

* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits)
  x86/idt: Fix the X86_TRAP_BP gate
  x86/xen: Get rid of paravirt op adjust_exception_frame
  x86/eisa: Add missing include
  x86/idt: Remove superfluous ALIGNment
  x86/apic: Silence "FW_BUG TSC_DEADLINE disabled due to Errata" on CPUs without the feature
  x86/idt: Remove the tracing IDT leftovers
  x86/idt: Hide set_intr_gate()
  x86/idt: Simplify alloc_intr_gate()
  x86/idt: Deinline setup functions
  x86/idt: Remove unused functions/inlines
  x86/idt: Move interrupt gate initialization to IDT code
  x86/idt: Move APIC gate initialization to tables
  x86/idt: Move regular trap init to tables
  x86/idt: Move IST stack based traps to table init
  x86/idt: Move debug stack init to table based
  x86/idt: Switch early trap init to IDT tables
  x86/idt: Prepare for table based init
  x86/idt: Move early IDT setup out of 32-bit asm
  x86/idt: Move early IDT handler setup to IDT code
  x86/idt: Consolidate IDT invalidation
  ...
2017-09-04 17:43:56 -07:00