565 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
Dave Hansen
|
4fe2d8b11a |
x86/entry: Rename SYSENTER_stack to CPU_ENTRY_AREA_entry_stack
If the kernel oopses while on the trampoline stack, it will print "<SYSENTER>" even if SYSENTER is not involved. That is rather confusing. The "SYSENTER" stack is used for a lot more than SYSENTER now. Give it a better string to display in stack dumps, and rename the kernel code to match. Also move the 32-bit code over to the new naming even though it still uses the entry stack only for SYSENTER. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Andy Lutomirski
|
4831b77940 |
x86/vsyscall/64: Warn and fail vsyscall emulation in NATIVE mode
If something goes wrong with pagetable setup, vsyscall=native will accidentally fall back to emulation. Make it warn and fail so that we notice. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Andy Lutomirski
|
49275fef98 |
x86/vsyscall/64: Explicitly set _PAGE_USER in the pagetable hierarchy
The kernel is very erratic as to which pagetables have _PAGE_USER set. The vsyscall page gets lucky: it seems that all of the relevant pagetables are among the apparently arbitrary ones that set _PAGE_USER. Rather than relying on chance, just explicitly set _PAGE_USER. This will let us clean up pagetable setup to stop setting _PAGE_USER. The added code can also be reused by pagetable isolation to manage the _PAGE_USER bit in the usermode tables. [ tglx: Folded paravirt fix from Juergen Gross ] Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Linus Torvalds
|
64a48099b3 |
Merge branch 'WIP.x86-pti.entry-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 syscall entry code changes for PTI from Ingo Molnar: "The main changes here are Andy Lutomirski's changes to switch the x86-64 entry code to use the 'per CPU entry trampoline stack'. This, besides helping fix KASLR leaks (the pending Page Table Isolation (PTI) work), also robustifies the x86 entry code" * 'WIP.x86-pti.entry-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits) x86/cpufeatures: Make CPU bugs sticky x86/paravirt: Provide a way to check for hypervisors x86/paravirt: Dont patch flush_tlb_single x86/entry/64: Make cpu_entry_area.tss read-only x86/entry: Clean up the SYSENTER_stack code x86/entry/64: Remove the SYSENTER stack canary x86/entry/64: Move the IST stacks into struct cpu_entry_area x86/entry/64: Create a per-CPU SYSCALL entry trampoline x86/entry/64: Return to userspace from the trampoline stack x86/entry/64: Use a per-CPU trampoline stack for IDT entries x86/espfix/64: Stop assuming that pt_regs is on the entry stack x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0 x86/entry: Remap the TSS into the CPU entry area x86/entry: Move SYSENTER_stack to the beginning of struct tss_struct x86/dumpstack: Handle stack overflow on all stacks x86/entry: Fix assumptions that the HW TSS is at the beginning of cpu_tss x86/kasan/64: Teach KASAN about the cpu_entry_area x86/mm/fixmap: Generalize the GDT fixmap mechanism, introduce struct cpu_entry_area x86/entry/gdt: Put per-CPU GDT remaps in ascending order x86/dumpstack: Add get_stack_info() support for the SYSENTER stack ... |
||
Andy Lutomirski
|
c482feefe1 |
x86/entry/64: Make cpu_entry_area.tss read-only
The TSS is a fairly juicy target for exploits, and, now that the TSS is in the cpu_entry_area, it's no longer protected by kASLR. Make it read-only on x86_64. On x86_32, it can't be RO because it's written by the CPU during task switches, and we use a task gate for double faults. I'd also be nervous about errata if we tried to make it RO even on configurations without double fault handling. [ tglx: AMD confirmed that there is no problem on 64-bit with TSS RO. So it's probably safe to assume that it's a non issue, though Intel might have been creative in that area. Still waiting for confirmation. ] Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bpetkov@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.733700132@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Andy Lutomirski
|
0f9a48100f |
x86/entry: Clean up the SYSENTER_stack code
The existing code was a mess, mainly because C arrays are nasty. Turn SYSENTER_stack into a struct, add a helper to find it, and do all the obvious cleanups this enables. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bpetkov@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.653244723@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Andy Lutomirski
|
3386bc8aed |
x86/entry/64: Create a per-CPU SYSCALL entry trampoline
Handling SYSCALL is tricky: the SYSCALL handler is entered with every single register (except FLAGS), including RSP, live. It somehow needs to set RSP to point to a valid stack, which means it needs to save the user RSP somewhere and find its own stack pointer. The canonical way to do this is with SWAPGS, which lets us access percpu data using the %gs prefix. With PAGE_TABLE_ISOLATION-like pagetable switching, this is problematic. Without a scratch register, switching CR3 is impossible, so %gs-based percpu memory would need to be mapped in the user pagetables. Doing that without information leaks is difficult or impossible. Instead, use a different sneaky trick. Map a copy of the first part of the SYSCALL asm at a different address for each CPU. Now RIP varies depending on the CPU, so we can use RIP-relative memory access to access percpu memory. By putting the relevant information (one scratch slot and the stack address) at a constant offset relative to RIP, we can make SYSCALL work without relying on %gs. A nice thing about this approach is that we can easily switch it on and off if we want pagetable switching to be configurable. The compat variant of SYSCALL doesn't have this problem in the first place -- there are plenty of scratch registers, since we don't care about preserving r8-r15. This patch therefore doesn't touch SYSCALL32 at all. This patch actually seems to be a small speedup. With this patch, SYSCALL touches an extra cache line and an extra virtual page, but the pipeline no longer stalls waiting for SWAPGS. It seems that, at least in a tight loop, the latter outweights the former. Thanks to David Laight for an optimization tip. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bpetkov@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.403607157@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Andy Lutomirski
|
3e3b9293d3 |
x86/entry/64: Return to userspace from the trampoline stack
By itself, this is useless. It gives us the ability to run some final code before exit that cannnot run on the kernel stack. This could include a CR3 switch a la PAGE_TABLE_ISOLATION or some kernel stack erasing, for example. (Or even weird things like *changing* which kernel stack gets used as an ASLR-strengthening mechanism.) The SYSRET32 path is not covered yet. It could be in the future or we could just ignore it and force the slow path if needed. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.306546484@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Andy Lutomirski
|
7f2590a110 |
x86/entry/64: Use a per-CPU trampoline stack for IDT entries
Historically, IDT entries from usermode have always gone directly to the running task's kernel stack. Rearrange it so that we enter on a per-CPU trampoline stack and then manually switch to the task's stack. This touches a couple of extra cachelines, but it gives us a chance to run some code before we touch the kernel stack. The asm isn't exactly beautiful, but I think that fully refactoring it can wait. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150606.225330557@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Andy Lutomirski
|
72f5e08dbb |
x86/entry: Remap the TSS into the CPU entry area
This has a secondary purpose: it puts the entry stack into a region with a well-controlled layout. A subsequent patch will take advantage of this to streamline the SYSCALL entry code to be able to find it more easily. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bpetkov@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.962042855@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Andy Lutomirski
|
1a79797b58 |
x86/entry/64: Allocate and enable the SYSENTER stack
This will simplify future changes that want scratch variables early in the SYSENTER handler -- they'll be able to spill registers to the stack. It also lets us get rid of a SWAPGS_UNSAFE_STACK user. This does not depend on CONFIG_IA32_EMULATION=y because we'll want the stack space even without IA32 emulation. As far as I can tell, the reason that this wasn't done from day 1 is that we use IST for #DB and #BP, which is IMO rather nasty and causes a lot more problems than it solves. But, since #DB uses IST, we don't actually need a real stack for SYSENTER (because SYSENTER with TF set will invoke #DB on the IST stack rather than the SYSENTER stack). I want to remove IST usage from these vectors some day, and this patch is a prerequisite for that as well. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Link: https://lkml.kernel.org/r/20171204150605.312726423@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Boris Ostrovsky
|
e17f823453 |
x86/entry/64/paravirt: Use paravirt-safe macro to access eflags
Commit 1d3e53e8624a ("x86/entry/64: Refactor IRQ stacks and make them NMI-safe") added DEBUG_ENTRY_ASSERT_IRQS_OFF macro that acceses eflags using 'pushfq' instruction when testing for IF bit. On PV Xen guests looking at IF flag directly will always see it set, resulting in 'ud2'. Introduce SAVE_FLAGS() macro that will use appropriate save_fl pv op when running paravirt. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Laight <David.Laight@aculab.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Eduardo Valentin <eduval@amazon.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Cc: aliguori@amazon.com Cc: daniel.gruss@iaik.tugraz.at Cc: hughd@google.com Cc: keescook@google.com Cc: xen-devel@lists.xenproject.org Link: https://lkml.kernel.org/r/20171204150604.899457242@linutronix.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Linus Torvalds
|
dd53a4214d |
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 fixes from Ingo Molnar: - make CR4 handling irq-safe, which bug vmware guests ran into - don't crash on early IRQs in Xen guests - don't crash secondary CPU bringup if #UD assisted WARN()ings are triggered - make X86_BUG_FXSAVE_LEAK optional on newer AMD CPUs that have the fix - fix AMD Fam17h microcode loading - fix broadcom_postcore_init() if ACPI is disabled - fix resume regression in __restore_processor_context() - fix Sparse warnings - fix a GCC-8 warning * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vdso: Change time() prototype to match __vdso_time() x86: Fix Sparse warnings about non-static functions x86/power: Fix some ordering bugs in __restore_processor_context() x86/PCI: Make broadcom_postcore_init() check acpi_disabled x86/microcode/AMD: Add support for fam17h microcode loading x86/cpufeatures: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD x86/idt: Load idt early in start_secondary x86/xen: Support early interrupts in xen pv guests x86/tlb: Disable interrupts when changing CR4 x86/tlb: Refactor CR4 setting and shadow write |
||
Arnd Bergmann
|
88edb57d1e |
x86/vdso: Change time() prototype to match __vdso_time()
gcc-8 warns that time() is an alias for __vdso_time() but the two have different prototypes: arch/x86/entry/vdso/vclock_gettime.c:327:5: error: 'time' alias between functions of incompatible types 'int(time_t *)' {aka 'int(long int *)'} and 'time_t(time_t *)' {aka 'long int(long int *)'} [-Werror=attribute-alias] int time(time_t *t) ^~~~ arch/x86/entry/vdso/vclock_gettime.c:318:16: note: aliased declaration here I could not figure out whether this is intentional, but I see that changing it to return time_t avoids the warning. Returning 'int' from time() is also a bit questionable, as it causes an overflow in y2038 even on 64-bit architectures that use a 64-bit time_t type. On 32-bit architecture with 64-bit time_t, time() should always be implement by the C library by calling a (to be added) clock_gettime() variant that takes a sufficiently wide argument. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Link: http://lkml.kernel.org/r/20171204150203.852959-1-arnd@arndb.de Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Miroslav Benes
|
43347d56c8 |
livepatch: send a fake signal to all blocking tasks
Live patching consistency model is of LEAVE_PATCHED_SET and SWITCH_THREAD. This means that all tasks in the system have to be marked one by one as safe to call a new patched function. Safe means when a task is not (sleeping) in a set of patched functions. That is, no patched function is on the task's stack. Another clearly safe place is the boundary between kernel and userspace. The patching waits for all tasks to get outside of the patched set or to cross the boundary. The transition is completed afterwards. The problem is that a task can block the transition for quite a long time, if not forever. It could sleep in a set of patched functions, for example. Luckily we can force the task to leave the set by sending it a fake signal, that is a signal with no data in signal pending structures (no handler, no sign of proper signal delivered). Suspend/freezer use this to freeze the tasks as well. The task gets TIF_SIGPENDING set and is woken up (if it has been sleeping in the kernel before) or kicked by rescheduling IPI (if it was running on other CPU). This causes the task to go to kernel/userspace boundary where the signal would be handled and the task would be marked as safe in terms of live patching. There are tasks which are not affected by this technique though. The fake signal is not sent to kthreads. They should be handled differently. They can be woken up so they leave the patched set and their TIF_PATCH_PENDING can be cleared thanks to stack checking. For the sake of completeness, if the task is in TASK_RUNNING state but not currently running on some CPU it doesn't get the IPI, but it would eventually handle the signal anyway. Second, if the task runs in the kernel (in TASK_RUNNING state) it gets the IPI, but the signal is not handled on return from the interrupt. It would be handled on return to the userspace in the future when the fake signal is sent again. Stack checking deals with these cases in a better way. If the task was sleeping in a syscall it would be woken by our fake signal, it would check if TIF_SIGPENDING is set (by calling signal_pending() predicate) and return ERESTART* or EINTR. Syscalls with ERESTART* return values are restarted in case of the fake signal (see do_signal()). EINTR is propagated back to the userspace program. This could disturb the program, but... * each process dealing with signals should react accordingly to EINTR return values. * syscalls returning EINTR happen to be quite common situation in the system even if no fake signal is sent. * freezer sends the fake signal and does not deal with EINTR anyhow. Thus EINTR values are returned when the system is resumed. The very safe marking is done in architectures' "entry" on syscall and interrupt/exception exit paths, and in a stack checking functions of livepatch. TIF_PATCH_PENDING is cleared and the next recalc_sigpending() drops TIF_SIGPENDING. In connection with this, also call klp_update_patch_state() before do_signal(), so that recalc_sigpending() in dequeue_signal() can clear TIF_PATCH_PENDING immediately and thus prevent a double call of do_signal(). Note that the fake signal is not sent to stopped/traced tasks. Such task prevents the patching to finish till it continues again (is not traced anymore). Last, sending the fake signal is not automatic. It is done only when admin requests it by writing 1 to signal sysfs attribute in livepatch sysfs directory. Signed-off-by: Miroslav Benes <mbenes@suse.cz> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: x86@kernel.org Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Signed-off-by: Jiri Kosina <jkosina@suse.cz> |
||
Linus Torvalds
|
02fc87b117 |
Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 fixes from Ingo Molnar: - topology enumeration fixes - KASAN fix - two entry fixes (not yet the big series related to KASLR) - remove obsolete code - instruction decoder fix - better /dev/mem sanity checks, hopefully working better this time - pkeys fixes - two ACPI fixes - 5-level paging related fixes - UMIP fixes that should make application visible faults more debuggable - boot fix for weird virtualization environment * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86/decoder: Add new TEST instruction pattern x86/PCI: Remove unused HyperTransport interrupt support x86/umip: Fix insn_get_code_seg_params()'s return value x86/boot/KASLR: Remove unused variable x86/entry/64: Add missing irqflags tracing to native_load_gs_index() x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow x86/entry/64: Fix entry_SYSCALL_64_after_hwframe() IRQ tracing x86/pkeys/selftests: Fix protection keys write() warning x86/pkeys/selftests: Rename 'si_pkey' to 'siginfo_pkey' x86/mpx/selftests: Fix up weird arrays x86/pkeys: Update documentation about availability x86/umip: Print a warning into the syslog if UMIP-protected instructions are used x86/smpboot: Fix __max_logical_packages estimate x86/topology: Avoid wasting 128k for package id array perf/x86/intel/uncore: Cache logical pkg id in uncore driver x86/acpi: Reduce code duplication in mp_override_legacy_irq() x86/acpi: Handle SCI interrupts above legacy space gracefully x86/boot: Fix boot failure when SMP MP-table is based at 0 x86/mm: Limit mmap() of /dev/mem to valid physical addresses x86/selftests: Add test for mapping placement for 5-level paging ... |
||
Andy Lutomirski
|
ca37e57bbe |
x86/entry/64: Add missing irqflags tracing to native_load_gs_index()
Running this code with IRQs enabled (where dummy_lock is a spinlock): static void check_load_gs_index(void) { /* This will fail. */ load_gs_index(0xffff); spin_lock(&dummy_lock); spin_unlock(&dummy_lock); } Will generate a lockdep warning. The issue is that the actual write to %gs would cause an exception with IRQs disabled, and the exception handler would, as an inadvertent side effect, update irqflag tracing to reflect the IRQs-off status. native_load_gs_index() would then turn IRQs back on and return with irqflag tracing still thinking that IRQs were off. The dummy lock-and-unlock causes lockdep to notice the error and warn. Fix it by adding the missing tracing. Apparently nothing did this in a context where it mattered. I haven't tried to find a code path that would actually exhibit the warning if appropriately nasty user code were running. I suspect that the security impact of this bug is very, very low -- production systems don't run with lockdep enabled, and the warning is mostly harmless anyway. Found during a quick audit of the entry code to try to track down an unrelated bug that Ingo found in some still-in-development code. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/e1aeb0e6ba8dd430ec36c8a35e63b429698b4132.1511411918.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Andy Lutomirski
|
548c3050ea |
x86/entry/64: Fix entry_SYSCALL_64_after_hwframe() IRQ tracing
When I added entry_SYSCALL_64_after_hwframe(), I left TRACE_IRQS_OFF before it. This means that users of entry_SYSCALL_64_after_hwframe() were responsible for invoking TRACE_IRQS_OFF, and the one and only user (Xen, added in the same commit) got it wrong. I think this would manifest as a warning if a Xen PV guest with CONFIG_DEBUG_LOCKDEP=y were used with context tracking. (The context tracking bit is to cause lockdep to get invoked before we turn IRQs back on.) I haven't tested that for real yet because I can't get a kernel configured like that to boot at all on Xen PV. Move TRACE_IRQS_OFF below the label. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Fixes: 8a9949bc71a7 ("x86/xen/64: Rearrange the SYSCALL entries") Link: http://lkml.kernel.org/r/9150aac013b7b95d62c2336751d5b6e91d2722aa.1511325444.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Linus Torvalds
|
09bd7c75e5 |
Kbuild updates for v4.15
One of the most remarkable improvements in this cycle is, Kbuild is now able to cache the result of shell commands. Some variables are expensive to compute, for example, $(call cc-option,...) invokes the compiler. It is not efficient to redo this computation every time, even when we are not actually building anything. Kbuild creates a hidden file ".cache.mk" that contains invoked shell commands and their results. The speed-up should be noticeable. Summary: - Fix arch build issues (hexagon, sh) - Clean up various Makefiles and scripts - Fix wrong usage of {CFLAGS,LDFLAGS}_MODULE in arch Makefiles - Cache variables that are expensive to compute - Improve cc-ldopton and ld-option for Clang - Optimize output directory creation -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJaDxaLAAoJED2LAQed4NsGIHQP/isMxxaIxIAWU56+ZcII74k7 639VgrKi9n5y25d1dBRTQg+vReHE6E2JbkCqpVOu11t7m0LT7yUK8v3WwyLf1qTN GxnqZ/WMQU5/AYVqIWo8jN4FGHpivHJ6qbeiNJM9qN4RAkzG0sZUq746VaFZYmIR Lu0Gf4m4qjifkkhXsQdWT5i7yNTidPqaL6GNb+FcFkEHlVre8jma0kJlgfHxru84 WmETpjQXvHAZ/R61vY6ekAWpqFhw3ecJY96A9npnx+SQVQdSNAdpaU0SK29jB0ON /SAfpHg9oa/gD0LFOKV6zkjnAkd4TEjrJEiHHhz5gjT/SbS3T1llBIGZ1oV4X7Y0 Vlh9KWlm1FJJI4SIzc9qUaQMp6JtLfEfHKJCc45xVaN3fNrDnR8jl80x5+95ELga dCkZgnq5u82MtTysCbHBESwDYQaVPyIrh7In+mduglaCqhqj9KoDjoLoiGfCg7SA 3tPflYVd629w5l5GrazJ40jWn1+ggMtgMOVooJNJ+dINCP+GxsUpH84Ww2Pdic+/ qLdud6TeqxrZDGzWXqKNLu8alM8NGgSr101l9gIf1oqSyy63duBpMrxGDoIJS3FU rFDoFFUhlfkAXNbQHtVGNzKtcpCjURh992j9Fa1+NfMwSce5IHkMwTvPmNSRowi8 0llLjXhD/bxK6FpdvlV8 =zIdO -----END PGP SIGNATURE----- Merge tag 'kbuild-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: "One of the most remarkable improvements in this cycle is, Kbuild is now able to cache the result of shell commands. Some variables are expensive to compute, for example, $(call cc-option,...) invokes the compiler. It is not efficient to redo this computation every time, even when we are not actually building anything. Kbuild creates a hidden file ".cache.mk" that contains invoked shell commands and their results. The speed-up should be noticeable. Summary: - Fix arch build issues (hexagon, sh) - Clean up various Makefiles and scripts - Fix wrong usage of {CFLAGS,LDFLAGS}_MODULE in arch Makefiles - Cache variables that are expensive to compute - Improve cc-ldopton and ld-option for Clang - Optimize output directory creation" * tag 'kbuild-v4.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (30 commits) kbuild: move coccicheck help from scripts/Makefile.help to top Makefile sh: decompressor: add shipped files to .gitignore frv: .gitignore: ignore vmlinux.lds selinux: remove unnecessary assignment to subdir- kbuild: specify FORCE in Makefile.headersinst as .PHONY target kbuild: remove redundant mkdir from ./Kbuild kbuild: optimize object directory creation for incremental build kbuild: create object directories simpler and faster kbuild: filter-out PHONY targets from "targets" kbuild: remove redundant $(wildcard ...) for cmd_files calculation kbuild: create directory for make cache only when necessary sh: select KBUILD_DEFCONFIG depending on ARCH kbuild: fix linker feature test macros when cross compiling with Clang kbuild: shrink .cache.mk when it exceeds 1000 lines kbuild: do not call cc-option before KBUILD_CFLAGS initialization kbuild: Cache a few more calls to the compiler kbuild: Add a cache for generated variables kbuild: add forward declaration of default target to Makefile.asm-generic kbuild: remove KBUILD_SUBDIR_ASFLAGS and KBUILD_SUBDIR_CCFLAGS hexagon/kbuild: replace CFLAGS_MODULE with KBUILD_CFLAGS_MODULE ... |
||
Linus Torvalds
|
051089a2ee |
xen: features and fixes for v4.15-rc1
-----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQEcBAABAgAGBQJaDdh4AAoJELDendYovxMvPFAH/2QjTys2ydIAdmwke4odpJ7U xuy7HOQCzOeZ5YsZthzCBsN90VmnDM7X7CcB8weSdjcKlXMSAWD+J1RgkL2iAJhI 8tzIEXECrlNuz4V5mX9TmMgtPCr4qzU3fsts0pZy4fYDq1PVWDefqOwEtbpbWabb wRSMq/nTb9iASTMgheSC0WfhJneqtJ+J20zrzkGPCBPRFcwfppeP8/7vpkmJslBi eH/pfchICM4w093T/BfavnsPvhLdjgRuwVzn6+e46s4tLnZAxnLRVQ7SXZXzBORq /dL/qC0XH3YXdU+XfIs//giZsmLns6SxZaMr4vs6TxFtuzZBKpLtkOKo9zndvxk= =sZY5 -----END PGP SIGNATURE----- Merge tag 'for-linus-4.15-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: "Xen features and fixes for v4.15-rc1 Apart from several small fixes it contains the following features: - a series by Joao Martins to add vdso support of the pv clock interface - a series by Juergen Gross to add support for Xen pv guests to be able to run on 5 level paging hosts - a series by Stefano Stabellini adding the Xen pvcalls frontend driver using a paravirtualized socket interface" * tag 'for-linus-4.15-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (34 commits) xen/pvcalls: fix potential endless loop in pvcalls-front.c xen/pvcalls: Add MODULE_LICENSE() MAINTAINERS: xen, kvm: track pvclock-abi.h changes x86/xen/time: setup vcpu 0 time info page x86/xen/time: set pvclock flags on xen_time_init() x86/pvclock: add setter for pvclock_pvti_cpu0_va ptp_kvm: probe for kvm guest availability xen/privcmd: remove unused variable pageidx xen: select grant interface version xen: update arch/x86/include/asm/xen/cpuid.h xen: add grant interface version dependent constants to gnttab_ops xen: limit grant v2 interface to the v1 functionality xen: re-introduce support for grant v2 interface xen: support priv-mapping in an HVM tools domain xen/pvcalls: remove redundant check for irq >= 0 xen/pvcalls: fix unsigned less than zero error check xen/time: Return -ENODEV from xen_get_wallclock() xen/pvcalls-front: mark expected switch fall-through xen: xenbus_probe_frontend: mark expected switch fall-throughs xen/time: do not decrease steal time after live migration on xen ... |
||
Masahiro Yamada
|
8a78756eb5 |
kbuild: create object directories simpler and faster
For the out-of-tree build, scripts/Makefile.build creates output directories, but this operation is not efficient. scripts/Makefile.lib calculates obj-dirs as follows: obj-dirs := $(dir $(multi-objs) $(obj-y)) Please notice $(sort ...) is not used here. Usually the result is as many "./" as objects here. For a lot of duplicated paths, the following command is invoked. _dummy := $(foreach d,$(obj-dirs), $(shell [ -d $(d) ] || mkdir -p $(d))) Then, the costly shell command is run over and over again. I see many points for optimization: [1] Use $(sort ...) to cut down duplicated paths before passing them to system call [2] Use single $(shell ...) instead of repeating it with $(foreach ...) This will reduce forking. [3] We can calculate obj-dirs more simply. Most of objects are already accumulated in $(targets). So, $(dir $(targets)) is fine and more comprehensive. I also removed ugly code in arch/x86/entry/vdso/Makefile. This is now really unnecessary. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Ingo Molnar <mingo@kernel.org> Tested-by: Douglas Anderson <dianders@chromium.org> |
||
Linus Torvalds
|
eb4d47c8ce |
Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Ingo Molnar: "Two changes: Propagate const/__initconst, and use ARRAY_SIZE() some more" * 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/events/amd/iommu: Make iommu_pmu const and __initconst x86: Use ARRAY_SIZE |
||
Linus Torvalds
|
d6ec9d9a4d |
Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core updates from Ingo Molnar: "Note that in this cycle most of the x86 topics interacted at a level that caused them to be merged into tip:x86/asm - but this should be a temporary phenomenon, hopefully we'll back to the usual patterns in the next merge window. The main changes in this cycle were: Hardware enablement: - Add support for the Intel UMIP (User Mode Instruction Prevention) CPU feature. This is a security feature that disables certain instructions such as SGDT, SLDT, SIDT, SMSW and STR. (Ricardo Neri) [ Note that this is disabled by default for now, there are some smaller enhancements in the pipeline that I'll follow up with in the next 1-2 days, which allows this to be enabled by default.] - Add support for the AMD SEV (Secure Encrypted Virtualization) CPU feature, on top of SME (Secure Memory Encryption) support that was added in v4.14. (Tom Lendacky, Brijesh Singh) - Enable new SSE/AVX/AVX512 CPU features: AVX512_VBMI2, GFNI, VAES, VPCLMULQDQ, AVX512_VNNI, AVX512_BITALG. (Gayatri Kammela) Other changes: - A big series of entry code simplifications and enhancements (Andy Lutomirski) - Make the ORC unwinder default on x86 and various objtool enhancements. (Josh Poimboeuf) - 5-level paging enhancements (Kirill A. Shutemov) - Micro-optimize the entry code a bit (Borislav Petkov) - Improve the handling of interdependent CPU features in the early FPU init code (Andi Kleen) - Build system enhancements (Changbin Du, Masahiro Yamada) - ... plus misc enhancements, fixes and cleanups" * 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (118 commits) x86/build: Make the boot image generation less verbose selftests/x86: Add tests for the STR and SLDT instructions selftests/x86: Add tests for User-Mode Instruction Prevention x86/traps: Fix up general protection faults caused by UMIP x86/umip: Enable User-Mode Instruction Prevention at runtime x86/umip: Force a page fault when unable to copy emulated result to user x86/umip: Add emulation code for UMIP instructions x86/cpufeature: Add User-Mode Instruction Prevention definitions x86/insn-eval: Add support to resolve 16-bit address encodings x86/insn-eval: Handle 32-bit address encodings in virtual-8086 mode x86/insn-eval: Add wrapper function for 32 and 64-bit addresses x86/insn-eval: Add support to resolve 32-bit address encodings x86/insn-eval: Compute linear address in several utility functions resource: Fix resource_size.cocci warnings X86/KVM: Clear encryption attribute when SEV is active X86/KVM: Decrypt shared per-cpu variables when SEV is active percpu: Introduce DEFINE_PER_CPU_DECRYPTED x86: Add support for changing memory encryption attribute in early boot x86/io: Unroll string I/O when SEV is active x86/boot: Add early boot support when running with SEV active ... |
||
Joao Martins
|
9f08890ab9 |
x86/pvclock: add setter for pvclock_pvti_cpu0_va
Right now there is only a pvclock_pvti_cpu0_va() which is defined on kvmclock since: commit dac16fba6fc5 ("x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap") The only user of this interface so far is kvm. This commit adds a setter function for the pvti page and moves pvclock_pvti_cpu0_va to pvclock, which is a more generic place to have it; and would allow other PV clocksources to use it, such as Xen. While moving pvclock_pvti_cpu0_va into pvclock, rename also this function to pvclock_get_pvti_cpu0_va (including its call sites) to be symmetric with the setter (pvclock_set_pvti_cpu0_va). Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Acked-by: Andy Lutomirski <luto@kernel.org> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> |
||
Frederic Weisbecker
|
7a10e2a919 |
x86: Use lockdep to assert IRQs are disabled/enabled
Use lockdep to check that IRQs are enabled or disabled as expected. This way the sanity check only shows overhead when concurrency correctness debug code is enabled. It also makes no more sense to fix the IRQ flags when a bug is detected as the assertion is now pure config-dependent debugging. And to quote Peter Zijlstra: The whole if !disabled, disable logic is uber paranoid programming, but I don't think we've ever seen that WARN trigger, and if it does (and then burns the kernel) we at least know what happend. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: David S . Miller <davem@davemloft.net> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Link: http://lkml.kernel.org/r/1509980490-4285-8-git-send-email-frederic@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Brijesh Singh
|
819aeee065 |
X86/KVM: Clear encryption attribute when SEV is active
The guest physical memory area holding the struct pvclock_wall_clock and struct pvclock_vcpu_time_info are shared with the hypervisor. It periodically updates the contents of the memory. When SEV is active, the encryption attributes from the shared memory pages must be cleared so that both hypervisor and guest can access the data. Signed-off-by: Brijesh Singh <brijesh.singh@amd.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Tested-by: Borislav Petkov <bp@suse.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: kvm@vger.kernel.org Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lkml.kernel.org/r/20171020143059.3291-18-brijesh.singh@amd.com |
||
Ingo Molnar
|
67caadc611 |
Merge branch 'x86/build' into x86/asm, to fix up conflicts
Conflicts: arch/x86/entry/syscalls/Makefile Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Ingo Molnar
|
b3d9a13681 |
Merge branch 'linus' into x86/asm, to pick up fixes and resolve conflicts
Conflicts: arch/x86/kernel/cpu/Makefile Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Ingo Molnar
|
8c5db92a70 |
Merge branch 'linus' into locking/core, to resolve conflicts
Conflicts: include/linux/compiler-clang.h include/linux/compiler-gcc.h include/linux/compiler-intel.h include/uapi/linux/stddef.h Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Linus Torvalds
|
ead751507d |
License cleanup: add SPDX license identifiers to some files
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWfswbQ8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ykvEwCfXU1MuYFQGgMdDmAZXEc+xFXZvqgAoKEcHDNA 6dVh26uchcEQLN/XqUDt =x306 -----END PGP SIGNATURE----- Merge tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull initial SPDX identifiers from Greg KH: "License cleanup: add SPDX license identifiers to some files Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" * tag 'spdx_identifiers-4.14-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: License cleanup: add SPDX license identifier to uapi header files with a license License cleanup: add SPDX license identifier to uapi header files with no license License cleanup: add SPDX GPL-2.0 license identifier to files with no license |
||
Borislav Petkov
|
1e4c4f610f |
x86/entry/64: Shorten TEST instructions
Convert TESTL to TESTB and save 3 bytes per callsite. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20171102120926.4srwerqrr7g72e2k@pd.tnic Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Greg Kroah-Hartman
|
b24413180f |
License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
||
Andy Lutomirski
|
929bacec21 |
x86/entry/64: De-Xen-ify our NMI code
Xen PV is fundamentally incompatible with our fancy NMI code: it doesn't use IST at all, and Xen entries clobber two stack slots below the hardware frame. Drop Xen PV support from our NMI code entirely. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/bfbe711b5ae03f672f8848999a8eb2711efc7f98.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Juergen Gross
|
43e4111086 |
xen, x86/entry/64: Add xen NMI trap entry
Instead of trying to execute any NMI via the bare metal's NMI trap handler use a Xen specific one for PV domains, like we do for e.g. debug traps. As in a PV domain the NMI is handled via the normal kernel stack this is the correct thing to do. This will enable us to get rid of the very fragile and questionable dependencies between the bare metal NMI handler and Xen assumptions believed to be broken anyway. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/5baf5c0528d58402441550c5770b98e7961e7680.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Andy Lutomirski
|
c39858de69 |
x86/entry/64: Remove the RESTORE_..._REGS infrastructure
All users of RESTORE_EXTRA_REGS, RESTORE_C_REGS and such, and REMOVE_PT_GPREGS_FROM_STACK are gone. Delete the macros. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/c32672f6e47c561893316d48e06c7656b1039a36.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Andy Lutomirski
|
471ee48322 |
x86/entry/64: Use POP instead of MOV to restore regs on NMI return
This gets rid of the last user of the old RESTORE_..._REGS infrastructure. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/652a260f17a160789bc6a41d997f98249b73e2ab.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Andy Lutomirski
|
a512210643 |
x86/entry/64: Merge the fast and slow SYSRET paths
They did almost the same thing. Remove a bunch of pointless instructions (mostly hidden in macros) and reduce cognitive load by merging them. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1204e20233fcab9130a1ba80b3b1879b5db3fc1f.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Andy Lutomirski
|
4fbb39108f |
x86/entry/64: Use pop instead of movq in syscall_return_via_sysret
Saves 64 bytes. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/6609b7f74ab31c36604ad746e019ea8495aec76c.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Andy Lutomirski
|
e53178328c |
x86/entry/64: Shrink paranoid_exit_restore and make labels local
paranoid_exit_restore was a copy of restore_regs_and_return_to_kernel. Merge them and make the paranoid_exit internal labels local. Keeping .Lparanoid_exit makes the code a bit shorter because it allows a 2-byte jnz instead of a 5-byte jnz. Saves 96 bytes of text. ( This is still a bit suboptimal in a non-CONFIG_TRACE_IRQFLAGS kernel, but fixing that would make the code rather messy. ) Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/510d66a1895cda9473c84b1086f0bb974f22de6a.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Andy Lutomirski
|
e872045bfd |
x86/entry/64: Simplify reg restore code in the standard IRET paths
The old code restored all the registers with movq instead of pop. In theory, this was done because some CPUs have higher movq throughput, but any gain there would be tiny and is almost certainly outweighed by the higher text size. This saves 96 bytes of text. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/ad82520a207ccd851b04ba613f4f752b33ac05f7.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Andy Lutomirski
|
8a055d7f41 |
x86/entry/64: Move SWAPGS into the common IRET-to-usermode path
All of the code paths that ended up doing IRET to usermode did SWAPGS immediately beforehand. Move the SWAPGS into the common code. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/27fd6f45b7cd640de38fb9066fd0349bcd11f8e1.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Andy Lutomirski
|
26c4ef9c49 |
x86/entry/64: Split the IRET-to-user and IRET-to-kernel paths
These code paths will diverge soon. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/dccf8c7b3750199b4b30383c812d4e2931811509.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Andy Lutomirski
|
9da78ba6b4 |
x86/entry/64: Remove the restore_c_regs_and_iret label
The only user was the 64-bit opportunistic SYSRET failure path, and that path didn't really need it. This change makes the opportunistic SYSRET code a bit more straightforward and gets rid of the label. Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Borislav Petkov <bpetkov@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/be3006a7ad3326e3458cf1cc55d416252cbe1986.1509609304.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Masahiro Yamada
|
af8e947079 |
x86/build: Beautify build log of syscall headers
This makes the build log look nicer. Before: SYSTBL arch/x86/entry/syscalls/../../include/generated/asm/syscalls_32.h SYSHDR arch/x86/entry/syscalls/../../include/generated/asm/unistd_32_ia32.h SYSHDR arch/x86/entry/syscalls/../../include/generated/asm/unistd_64_x32.h SYSTBL arch/x86/entry/syscalls/../../include/generated/asm/syscalls_64.h SYSHDR arch/x86/entry/syscalls/../../include/generated/uapi/asm/unistd_32.h SYSHDR arch/x86/entry/syscalls/../../include/generated/uapi/asm/unistd_64.h SYSHDR arch/x86/entry/syscalls/../../include/generated/uapi/asm/unistd_x32.h After: SYSTBL arch/x86/include/generated/asm/syscalls_32.h SYSHDR arch/x86/include/generated/asm/unistd_32_ia32.h SYSHDR arch/x86/include/generated/asm/unistd_64_x32.h SYSTBL arch/x86/include/generated/asm/syscalls_64.h SYSHDR arch/x86/include/generated/uapi/asm/unistd_32.h SYSHDR arch/x86/include/generated/uapi/asm/unistd_64.h SYSHDR arch/x86/include/generated/uapi/asm/unistd_x32.h Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: linux-kbuild@vger.kernel.org Link: http://lkml.kernel.org/r/1509077470-2735-1-git-send-email-yamada.masahiro@socionext.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Mark Rutland
|
6aa7de0591 |
locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE()
Please do not apply this to mainline directly, instead please re-run the coccinelle script shown below and apply its output. For several reasons, it is desirable to use {READ,WRITE}_ONCE() in preference to ACCESS_ONCE(), and new code is expected to use one of the former. So far, there's been no reason to change most existing uses of ACCESS_ONCE(), as these aren't harmful, and changing them results in churn. However, for some features, the read/write distinction is critical to correct operation. To distinguish these cases, separate read/write accessors must be used. This patch migrates (most) remaining ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following coccinelle script: ---- // Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and // WRITE_ONCE() // $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch virtual patch @ depends on patch @ expression E1, E2; @@ - ACCESS_ONCE(E1) = E2 + WRITE_ONCE(E1, E2) @ depends on patch @ expression E; @@ - ACCESS_ONCE(E) + READ_ONCE(E) ---- Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: davem@davemloft.net Cc: linux-arch@vger.kernel.org Cc: mpe@ellerman.id.au Cc: shuah@kernel.org Cc: snitzer@redhat.com Cc: thor.thayer@linux.intel.com Cc: tj@kernel.org Cc: viro@zeniv.linux.org.uk Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Josh Poimboeuf
|
82c62fa0c4 |
x86/asm: Don't use the confusing '.ifeq' directive
I find the '.ifeq <expression>' directive to be confusing. Reading it quickly seems to suggest its opposite meaning, or that it's missing an argument. Improve readability by replacing all of its x86 uses with '.if <expression> == 0'. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andrei Vagin <avagin@virtuozzo.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/757da028e802c7e98d23fbab8d234b1063e161cf.1508516398.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Josh Poimboeuf
|
98990a33b7 |
x86/entry: Fix idtentry unwind hint
This fixes the following ORC warning in the 'int3' entry code: WARNING: can't dereference iret registers at ffff8801c5f17fe0 for ip ffffffff95f0d94b The ORC metadata had the wrong stack offset for the iret registers. Their location on the stack is dependent on whether the exception has an error code. Reported-and-tested-by: Andrei Vagin <avagin@virtuozzo.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 8c1f75587a18 ("x86/entry/64: Add unwind hint annotations") Link: http://lkml.kernel.org/r/931d57f0551ed7979d5e7e05370d445c8e5137f8.1508516398.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Jérémy Lefaure
|
0cfe5b5fc0 |
x86: Use ARRAY_SIZE
Using the ARRAY_SIZE macro improves the readability of the code. Found with Coccinelle with the following semantic patch: @r depends on (org || report)@ type T; T[] E; position p; @@ ( (sizeof(E)@p /sizeof(*E)) | (sizeof(E)@p /sizeof(E[...])) | (sizeof(E)@p /sizeof(T)) ) Signed-off-by: Jérémy Lefaure <jeremy.lefaure@lse.epita.fr> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: linux-video@atrey.karlin.mff.cuni.cz Cc: Martin Mares <mj@ucw.cz> Cc: Andy Lutomirski <luto@amacapital.net> Link: https://lkml.kernel.org/r/20171001193101.8898-13-jeremy.lefaure@lse.epita.fr |
||
Josh Poimboeuf
|
5c99b692cf |
x86/unwind: Use MSB for frame pointer encoding on 32-bit
On x86-32, Tetsuo Handa and Fengguang Wu reported unwinder warnings like: WARNING: kernel stack regs at f60bb9c8 in swapper:1 has bad 'bp' value 0ba00000 And also there were some stack dumps with a bunch of unreliable '?' symbols after an apic_timer_interrupt symbol, meaning the unwinder got confused when it tried to read the regs. The cause of those issues is that, with GCC 4.8 (and possibly older), there are cases where GCC misaligns the stack pointer in a leaf function for no apparent reason: c124a388 <acpi_rs_move_data>: c124a388: 55 push %ebp c124a389: 89 e5 mov %esp,%ebp c124a38b: 57 push %edi c124a38c: 56 push %esi c124a38d: 89 d6 mov %edx,%esi c124a38f: 53 push %ebx c124a390: 31 db xor %ebx,%ebx c124a392: 83 ec 03 sub $0x3,%esp ... c124a3e3: 83 c4 03 add $0x3,%esp c124a3e6: 5b pop %ebx c124a3e7: 5e pop %esi c124a3e8: 5f pop %edi c124a3e9: 5d pop %ebp c124a3ea: c3 ret If an interrupt occurs in such a function, the regs on the stack will be unaligned, which breaks the frame pointer encoding assumption. So on 32-bit, use the MSB instead of the LSB to encode the regs. This isn't an issue on 64-bit, because interrupts align the stack before writing to it. Reported-and-tested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Byungchul Park <byungchul.park@lge.com> Cc: LKP <lkp@01.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/279a26996a482ca716605c7dbc7f2db9d8d91e81.1507597785.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org> |
||
Linus Torvalds
|
24e700e291 |
Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 apic updates from Thomas Gleixner: "This update provides: - Cleanup of the IDT management including the removal of the extra tracing IDT. A first step to cleanup the vector management code. - The removal of the paravirt op adjust_exception_frame. This is a XEN specific issue, but merged through this branch to avoid nasty merge collisions - Prevent dmesg spam about the TSC DEADLINE bug, when the CPU has disabled the TSC DEADLINE timer in CPUID. - Adjust a debug message in the ioapic code to print out the information correctly" * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (51 commits) x86/idt: Fix the X86_TRAP_BP gate x86/xen: Get rid of paravirt op adjust_exception_frame x86/eisa: Add missing include x86/idt: Remove superfluous ALIGNment x86/apic: Silence "FW_BUG TSC_DEADLINE disabled due to Errata" on CPUs without the feature x86/idt: Remove the tracing IDT leftovers x86/idt: Hide set_intr_gate() x86/idt: Simplify alloc_intr_gate() x86/idt: Deinline setup functions x86/idt: Remove unused functions/inlines x86/idt: Move interrupt gate initialization to IDT code x86/idt: Move APIC gate initialization to tables x86/idt: Move regular trap init to tables x86/idt: Move IST stack based traps to table init x86/idt: Move debug stack init to table based x86/idt: Switch early trap init to IDT tables x86/idt: Prepare for table based init x86/idt: Move early IDT setup out of 32-bit asm x86/idt: Move early IDT handler setup to IDT code x86/idt: Consolidate IDT invalidation ... |