Bin Gao
4635fdc696 x86/tsc: Mark Intel ATOM_GOLDMONT TSC reliable
On the Intel GOLDMONT Atom SoC the TSC is the only available clocksource, so
there is no way to do software calibration or have a watchdog clocksource for
it. Software calibration is already disabled via the TSC_KNOWN_FREQ flag, but
the watchdog requirement still persists, so such systems cannot switch to
high resolution/nohz mode.

Mark it reliable, so it becomes usable. Hardware teams confirmed that this
is safe on that SoC.
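
A minimal sketch of how such a quirk can be expressed (the hook name and
placement are assumptions; INTEL_FAM6_ATOM_GOLDMONT comes from
asm/intel-family.h):

  static void early_init_intel_tsc_quirks(struct cpuinfo_x86 *c)
  {
          /* Goldmont: TSC is the only clocksource, no watchdog possible */
          if (c->x86 == 6 && c->x86_model == INTEL_FAM6_ATOM_GOLDMONT)
                  set_cpu_cap(c, X86_FEATURE_TSC_RELIABLE);
  }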

Signed-off-by: Bin Gao <bin.gao@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1479241644-234277-4-git-send-email-bin.gao@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-18 10:58:30 +01:00
Bin Gao
4ca4df0b7e x86/tsc: Mark TSC frequency determined by CPUID as known
CPUs/SoCs with CPUID leaf 0x15 come with a known frequency and report it to
software via the CPUID instruction. This hardware-provided frequency is the
"real" frequency of the TSC.

Set the X86_FEATURE_TSC_KNOWN_FREQ flag for such systems to skip the
software calibration process.

A 24-hour test on one of the CPUID 0x15 capable platforms was
conducted. The PIT-calibrated frequency resulted in more than 3 seconds of
drift whereas the CPUID-determined frequency showed less than 0.5 seconds of
drift.
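
For reference, a sketch of how the frequency can be derived from leaf 0x15
(EAX = ratio denominator, EBX = ratio numerator, ECX = crystal clock in Hz);
the helper name is an assumption:

  static unsigned long cpuid_tsc_khz(void)
  {
          unsigned int eax_denominator, ebx_numerator, ecx_hz, edx;

          cpuid(0x15, &eax_denominator, &ebx_numerator, &ecx_hz, &edx);

          /* Leaf not fully populated: fall back to calibration */
          if (!eax_denominator || !ebx_numerator || !ecx_hz)
                  return 0;

          /* TSC frequency = crystal clock * (EBX / EAX) */
          return ecx_hz / 1000 * ebx_numerator / eax_denominator;
  }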

Signed-off-by: Bin Gao <bin.gao@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1479241644-234277-3-git-send-email-bin.gao@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-18 10:58:30 +01:00
Bin Gao
47c95a46d0 x86/tsc: Add X86_FEATURE_TSC_KNOWN_FREQ flag
The X86_FEATURE_TSC_RELIABLE flag in the Linux kernel implies both reliable
(at runtime) and trustable (at calibration). But reliable running and
trustable calibration are independent of each other.

Add a new flag X86_FEATURE_TSC_KNOWN_FREQ, which denotes that the frequency
is known (via MSR/CPUID). This flag is only meant to skip the long term
calibration on systems which have a known frequency.

Use X86_FEATURE_TSC_KNOWN_FREQ to skip the delayed calibration and
leave X86_FEATURE_TSC_RELIABLE in place.
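
A sketch of the intended use in the delayed calibration path (the exact
bail-out point is an assumption):

  /* Nothing to refine if the frequency is architecturally known */
  if (boot_cpu_has(X86_FEATURE_TSC_KNOWN_FREQ))
          return;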

After converting the existing users of X86_FEATURE_TSC_RELIABLE to use
either both flags or just X86_FEATURE_TSC_KNOWN_FREQ we can separate the
functionality.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Bin Gao <bin.gao@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1479241644-234277-2-git-send-email-bin.gao@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-18 10:58:30 +01:00
Janakarajan Natarajan
e40ed1542d perf/x86: Add perf support for AMD family-17h processors
This patch enables perf core PMU support for the new AMD family-17h processors.

In family-17h, there are no PMC-event constraints. All events, irrespective of
type, can be measured using any of the six generic performance counters.

Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1479399306-13375-1-git-send-email-Janakarajan.Natarajan@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-18 09:45:57 +01:00
Josh Poimboeuf
91e08ab0c8 x86/dumpstack: Prevent KASAN false positive warnings
The oops stack dump code scans the entire stack, which can cause KASAN
"stack-out-of-bounds" false positive warnings.  Tell KASAN to ignore it.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: davej@codemonkey.org.uk
Cc: dvyukov@google.com
Link: http://lkml.kernel.org/r/5f6e80c4b0c7f7f0b6211900847a247cdaad753c.1479398226.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-18 09:38:00 +01:00
Josh Poimboeuf
c2d75e03d6 x86/unwind: Prevent KASAN false positive warnings in guess unwinder
The guess unwinder scans the entire stack, which can cause KASAN
"stack-out-of-bounds" false positive warnings.  Tell KASAN to ignore it.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Stephane Eranian <eranian@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: davej@codemonkey.org.uk
Cc: dvyukov@google.com
Link: http://lkml.kernel.org/r/61939c0b2b2d63ce97ba59cba3b00fd47c2962cf.1479398226.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-18 09:38:00 +01:00
Greg Tucker
7a0b86b1b9 crypto: sha-mb - Fix total_len for correct hash when larger than 512MB
Current multi-buffer hash implementations restrict the total length of a
hash job to 512MB. Hashing larger buffers results in an incorrect hash.
This extends the limit to 2^62 - 1.

Signed-off-by: Greg Tucker <greg.b.tucker@intel.com>
Acked-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2016-11-17 23:35:00 +08:00
Daniel Vetter
3975797f3e Merge remote-tracking branch 'airlied/drm-next' into drm-intel-next-queued
Tvrtko needs

commit b3c11ac267d461d3d597967164ff7278a919a39f
Author: Eric Engestrom <eric@engestrom.ch>
Date:   Sat Nov 12 01:12:56 2016 +0000

    drm: move allocation out of drm_get_format_name()

to be able to apply his patches without conflicts.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2016-11-17 14:32:57 +01:00
Andy Lutomirski
a582c540ac x86/vdso: Use RDPID in preference to LSL when available
RDPID is a new instruction that reads MSR_TSC_AUX quickly.  This
should be considerably faster than reading the GDT.  Add a
cpufeature for it and use it from __vdso_getcpu() when available.
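
The resulting helper looks roughly like the sketch below (the RDPID opcode
bytes encode 'rdpid %eax/%rax'):

  static inline unsigned int __getcpu(void)
  {
          unsigned int p;

          /* Prefer RDPID (reads MSR_TSC_AUX directly) over the LSL trick */
          alternative_io("lsl %[seg],%[p]",
                         ".byte 0xf3,0x0f,0xc7,0xf8", /* RDPID %eax/%rax */
                         X86_FEATURE_RDPID,
                         [p] "=a" (p), [seg] "r" (__PER_CPU_SEG));
          return p;
  }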

Tested-by: Megha Dey <megha.dey@intel.com>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/4f6c3a22012d10f1c65b9ca15800e01b42c7d39d.1479320367.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-17 08:31:14 +01:00
Ingo Molnar
89a01c51cb Merge branch 'x86/cpufeature' into x86/asm, to pick up dependency
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-17 08:30:54 +01:00
Christian Borntraeger
6d0d287891 locking/core: Provide common cpu_relax_yield() definition
No need to duplicate the same define everywhere. Since
the only user is stop-machine and the only provider is
s390, we can use a default implementation of cpu_relax_yield()
in sched.h.
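
A sketch of such a default (the guard style is an assumption; s390 keeps its
own definition):

  /* Generic fallback: yielding degrades to a plain cpu_relax() */
  #ifndef cpu_relax_yield
  #define cpu_relax_yield() cpu_relax()
  #endif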

Suggested-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: kvm@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linux-s390 <linux-s390@vger.kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: sparclinux@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1479298985-191589-1-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-17 08:17:36 +01:00
Josh Poimboeuf
f4474c9f0b x86/dumpstack: Handle NULL stack pointer in show_trace_log_lvl()
When show_trace_log_lvl() is called from show_regs(), it completely
fails to dump the stack.  This bug was introduced when
show_stack_log_lvl() was removed with the following commit:

  0ee1dd9f5e7e ("x86/dumpstack: Remove raw stack dump")

Previous callers of that function now call show_trace_log_lvl()
directly.  That resulted in a subtle change, in that the 'stack'
argument can now be NULL in certain cases.

A NULL 'stack' pointer means that the stack dump should start from the
topmost stack frame unless 'regs' is valid, in which case it should
start from 'regs->sp'.
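
A sketch of those semantics at the top of show_trace_log_lvl() (the
get_stack_pointer() fallback is an assumption):

  if (!stack) {
          if (regs)
                  stack = (unsigned long *)regs->sp;
          else
                  stack = get_stack_pointer(task, NULL);
  }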

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 0ee1dd9f5e7e ("x86/dumpstack: Remove raw stack dump")
Link: http://lkml.kernel.org/r/c551842302a9c222d96a14e42e4003f059509f69.1479362652.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-17 07:48:39 +01:00
Arnd Bergmann
553bbc11aa x86/boot: Avoid warning for zero-filling .bss
The latest binutils are warning about a .fill directive with an explicit
value in a .bss section:

  arch/x86/kernel/head_32.S: Assembler messages:
  arch/x86/kernel/head_32.S:677: Warning: ignoring fill value in section `.bss..page_aligned'
  arch/x86/kernel/head_32.S:679: Warning: ignoring fill value in section `.bss..page_aligned'

This comes from the 'ENTRY()' macro padding the space between the symbols
with 'nop' via:

  .align 4,0x90

Open-coding the .globl directive without the padding avoids that warning,
as all the symbols are already page aligned.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20161116141726.2013389-1-arnd@arndb.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-17 07:34:58 +01:00
Gayatri Kammela
a8d9df5a50 x86/cpufeatures: Enable new AVX512 cpu features
Add a few new AVX512 instruction groups/features for enumeration in
/proc/cpuinfo: AVX512IFMA and AVX512VBMI.

Clear the flags in fpu_xstate_clear_all_cpu_caps().

CPUID.(EAX=7,ECX=0):EBX[bit 21] AVX512IFMA
CPUID.(EAX=7,ECX=0):ECX[bit 1]  AVX512VBMI
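
The corresponding cpufeatures.h entries would look like the sketch below
(the leaf-word positions are assumptions derived from the CPUID registers
above):

  #define X86_FEATURE_AVX512IFMA  ( 9*32+21) /* AVX-512 Integer FMA */
  #define X86_FEATURE_AVX512VBMI  (16*32+ 1) /* AVX-512 Vector Bit Manipulation */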

Detailed information of cpuid bits for the features can be found at
https://bugzilla.kernel.org/show_bug.cgi?id=187891

Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Ravi Shankar <ravi.v.shankar@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: mingo@elte.hu
Link: http://lkml.kernel.org/r/1479327060-18668-1-git-send-email-gayatri.kammela@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-17 01:09:40 +01:00
Luwei Kang
4504b5c941 kvm: x86: Add AVX512_4VNNIW and AVX512_4FMAPS support
Add two new AVX512 subfeatures support for KVM guest.

AVX512_4VNNIW:
Vector instructions for deep learning enhanced word variable precision.

AVX512_4FMAPS:
Vector instructions for deep learning floating-point single precision.
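
Both subfeatures are enumerated in CPUID.(EAX=7,ECX=0):EDX; a sketch of the
bit positions (the macro names are assumptions):

  #define KVM_CPUID_BIT_AVX512_4VNNIW     2  /* CPUID.(EAX=7,ECX=0):EDX[2] */
  #define KVM_CPUID_BIT_AVX512_4FMAPS     3  /* CPUID.(EAX=7,ECX=0):EDX[3] */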

Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: He Chen <he.chen@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
[Changed subject tags.]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-11-16 22:20:02 +01:00
Radim Krčmář
283c95d0e3 KVM: x86: emulate FXSAVE and FXRSTOR
Internal errors were reported on 16-bit fxsave and fxrstor with iPXE.
Old Intel CPUs don't have unrestricted_guest, so we have to emulate them.

The patch takes advantage of the hardware implementation.

AMD and Intel differ in saving and restoring other fields in the first 32
bytes.  A test wrote 0xff to the fxsave area, 0 to the upper bits of MXCSR
in the fxsave area, executed fxrstor, rewrote the fxsave area to 0xee,
and executed fxsave:

  Intel (Nehalem):
    7f 1f 7f 7f ff 00 ff 07 ff ff ff ff ff ff 00 00
    ff ff ff ff ff ff 00 00 ff ff 00 00 ff ff 00 00
  Intel (Haswell -- deprecated FPU CS and FPU DS):
    7f 1f 7f 7f ff 00 ff 07 ff ff ff ff 00 00 00 00
    ff ff ff ff 00 00 00 00 ff ff 00 00 ff ff 00 00
  AMD (Opteron 2300-series):
    7f 1f 7f 7f ff 00 ee ee ee ee ee ee ee ee ee ee
    ee ee ee ee ee ee ee ee ff ff 00 00 ff ff 02 00

fxsave/fxrstor will only be emulated on early Intels, so KVM can't do
much to improve the situation.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2016-11-16 22:09:46 +01:00
Radim Krčmář
aabba3c6ab KVM: x86: add asm_safe wrapper
Move the existing exception handling for inline assembly into a macro
and switch its return values to X86EMUL type.
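
A sketch of the wrapper (the fixup-section plumbing follows the usual x86
exception table pattern; the exact macro body is an assumption):

  #define asm_safe(insn, inoutclob...)                                  \
  ({                                                                    \
          int _fault = 0;                                               \
                                                                        \
          asm volatile("1:" insn "\n"                                   \
                       "2:\n"                                           \
                       ".pushsection .fixup, \"ax\"\n"                  \
                       "3: movl $1, %[_fault]\n"                        \
                       "   jmp  2b\n"                                   \
                       ".popsection\n"                                  \
                       _ASM_EXTABLE(1b, 3b)                             \
                       : [_fault] "+qm"(_fault) inoutclob);             \
                                                                        \
          _fault ? X86EMUL_UNHANDLEABLE : X86EMUL_CONTINUE;             \
  })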

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-16 22:09:45 +01:00
Radim Krčmář
4852018789 KVM: x86: save one bit in ctxt->d
Alignment modes are mutually exclusive, so the 5 modes can be encoded in
3 bits (2^3 = 8 >= 5) instead of one flag bit per mode.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-16 22:09:45 +01:00
Radim Krčmář
d3fe959f81 KVM: x86: add Align16 instruction flag
Needed for FXSAVE and FXRSTOR.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-16 22:09:45 +01:00
Jiang Biao
695151964d kvm: x86: remove unused but set variable
The local variable *gpa_offset* is set but not used afterwards,
which makes the compiler issue a warning with the
-Wunused-but-set-variable option. Remove it to avoid the warning.

Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-16 22:09:44 +01:00
Jiang Biao
ecd8a8c2b4 kvm: x86: hyperv: make function static to avoid compiling warning
synic_set_irq is only used in hyperv.c, and should be static to
avoid a compiler warning with the -Wmissing-prototypes option.

Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-16 22:09:44 +01:00
Jiang Biao
1e13175bd2 kvm: x86: cpuid: remove the unnecessary variable
The use of the local variable *function* is not necessary here. Remove
it to avoid a compiler warning with the -Wunused-but-set-variable option.

Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-16 22:09:44 +01:00
Jiang Biao
ae6a237560 kvm: x86: make a function in x86.c static to avoid compiling warning
kvm_emulate_wbinvd_noskip is only used in x86.c, and should be
static to avoid a compiler warning with the -Wmissing-prototypes
option.

Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-16 22:09:44 +01:00
Jiang Biao
33365e7a45 kvm: x86: make function static to avoid compiling warning
vmx_arm_hv_timer is only used in vmx.c, and should be static to
avoid a compiler warning with the -Wmissing-prototypes option.

Signed-off-by: Jiang Biao <jiang.biao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-16 22:09:43 +01:00
Radim Krčmář
813ae37e6a Merge branch 'x86/cpufeature' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into kvm/next
Topic branch for AVX512_4VNNIW and AVX512_4FMAPS support in KVM.
2016-11-16 22:07:36 +01:00
Yazen Ghannam
ddfe43cdc0 x86/amd_nb: Add SMN and Indirect Data Fabric access for AMD Fam17h
Some devices on Fam17h can only be accessed through the System Management
Network (SMN). The SMN is accessed by a pair of index/data registers in PCI
config space. Add a pair of functions to read from and write to the SMN.
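
A sketch of the access pattern (the index/data offsets 0x60/0x64 and the
helper names are assumptions):

  static int __amd_smn_rw(u16 node, u32 address, u32 *value, bool write)
  {
          struct pci_dev *root = node_to_amd_nb(node)->root;
          int err;

          /* Write the SMN address to the index register ... */
          err = pci_write_config_dword(root, 0x60, address);
          if (err)
                  return err;

          /* ... then move the data through the data register */
          return write ? pci_write_config_dword(root, 0x64, *value)
                       : pci_read_config_dword(root, 0x64, value);
  }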

The Data Fabric on Fam17h allows multiple devices to use the same register
space. The registers of a specific device are accessed indirectly using the
device's DF InstanceId. Currently, we only need to read from these devices,
so only define a read function for now.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/1478812257-5424-5-git-send-email-Yazen.Ghannam@amd.com
[ Boris: make __amd_smn_rw() even more compact. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-16 20:46:38 +01:00
Yazen Ghannam
b791c6b6a5 x86/amd_nb: Add Fam17h Data Fabric as "Northbridge"
AMD Fam17h uses a Data Fabric component instead of a traditional
Northbridge. However, the DF is similar to a NB in that there is one per
die and it uses PCI config D18Fx registers. So let's reuse the existing
AMD_NB infrastructure for Data Fabrics.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/1478812257-5424-4-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-16 20:46:38 +01:00
Yazen Ghannam
de6bd0835a x86/amd_nb: Make all exports EXPORT_SYMBOL_GPL
Make all EXPORT_SYMBOLs into EXPORT_SYMBOL_GPL. While we're at it,
fix some checkpatch warnings.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/1478812257-5424-3-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-16 20:46:38 +01:00
Yazen Ghannam
c7993890e7 x86/amd_nb: Make amd_northbridges internal to amd_nb.c
Hide amd_northbridges in amd_nb.c so that external callers will have to
use the exported accessor functions.

Also, fix some checkpatch.pl warnings.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/1478812257-5424-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-16 20:46:37 +01:00
Thomas Gleixner
7ce7f35b33 Merge branch 'x86/cpufeature' into x86/cache
Resolve the cpu/scattered conflict.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-16 14:19:34 +01:00
He Chen
47bdf3378d x86/cpuid: Provide get_scattered_cpuid_leaf()
Sparsely populated CPUID leaves are collected in a software-provided leaf to
avoid bloating the x86_capability array, but there is no way to rebuild the
real leaves (e.g. for KVM CPUID enumeration) other than rereading the CPUID
leaf from the CPU. While this is possible, it is problematic as it does not
take software-disabled features into account. If a feature is disabled on
the host it should not be exposed to a guest either.

Add get_scattered_cpuid_leaf() which rebuilds the leaf from the scattered
cpuid table information and the active CPU features.
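
A sketch of the accessor (the table and type names are assumptions):

  u32 get_scattered_cpuid_leaf(unsigned int level, unsigned int sub_leaf,
                               enum cpuid_regs_idx reg)
  {
          const struct cpuid_bit *cb;
          u32 cpuid_val = 0;

          for (cb = cpuid_bits; cb->feature; cb++) {
                  /* Only report bits of features the kernel kept enabled */
                  if (cb->level == level && cb->sub_leaf == sub_leaf &&
                      cb->reg == reg && boot_cpu_has(cb->feature))
                          cpuid_val |= BIT(cb->bit);
          }

          return cpuid_val;
  }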

[ tglx: Rewrote changelog ]

Signed-off-by: He Chen <he.chen@linux.intel.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Luwei Kang <luwei.kang@intel.com>
Cc: kvm@vger.kernel.org
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Piotr Luc <Piotr.Luc@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lkml.kernel.org/r/1478856336-9388-3-git-send-email-he.chen@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-16 11:13:09 +01:00
He Chen
47f10a3600 x86/cpuid: Cleanup cpuid_regs definitions
cpuid_regs is defined multiple times as structure and enum. Rename the enum
and move all of it to processor.h so we don't end up with more instances.

Rename the misnamed register enumeration from CR_* to the obvious CPUID_*.

[ tglx: Rewrote changelog ]

Signed-off-by: He Chen <he.chen@linux.intel.com>
Reviewed-by: Borislav Petkov <bp@alien8.de>
Cc: Luwei Kang <luwei.kang@intel.com>
Cc: kvm@vger.kernel.org
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Piotr Luc <Piotr.Luc@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lkml.kernel.org/r/1478856336-9388-2-git-send-email-he.chen@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-16 11:13:09 +01:00
Ed Blake
98838d9507 serial: 8250: Add IrDA to UART capabilities
Add an IrDA UART capability flag and change the type of
uart_8250_port.capabilities to be u32 rather than unsigned short to
accommodate the additional flag.
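
A sketch of the change (the bit position is an assumption):

  #define UART_CAP_IRDA   (1U << 16)      /* UART is IrDA-capable */

  struct uart_8250_port {
          /* ... */
          u32 capabilities;               /* was: unsigned short */
  };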

Signed-off-by: Ed Blake <ed.blake@imgtec.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2016-11-16 10:59:38 +01:00
Martin Schwidefsky
f285144f81 sched/x86: Do not clear PREEMPT_NEED_RESCHED on preempt count reset
The per-cpu preempt count of x86 contains two values, the actual preempt
count and the inverted PREEMPT_NEED_RESCHED bit. If a corrupted preempt
count is detected the preempt_count_set() function is used to reset the
preempt count.

In case the inverted PREEMPT_NEED_RESCHED bit is zero at the time of the
reset, the preemption indication is lost. Use raw_cpu_cmpxchg_4() to reset
only the count part and leave the PREEMPT_NEED_RESCHED bit as it is.

This improves the kernel's behavior when it runs into preempt count leaks
and tries to fix them up.
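
A sketch of the reset preserving the flag (names follow the x86 percpu
preempt count implementation; the exact body is an assumption):

  static __always_inline void preempt_count_set(int pc)
  {
          int old, new;

          do {
                  old = raw_cpu_read_4(__preempt_count);
                  /* Keep the inverted PREEMPT_NEED_RESCHED bit as it is */
                  new = (old & PREEMPT_NEED_RESCHED) |
                        (pc & ~PREEMPT_NEED_RESCHED);
          } while (raw_cpu_cmpxchg_4(__preempt_count, old, new) != old);
  }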

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1478523660-733-1-git-send-email-schwidefsky@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-16 10:29:04 +01:00
Borislav Petkov
5d07c2cc19 x86/msr: Cleanup/streamline MSR helpers
Make the MSR argument an unsigned int, both low and high u32, put
"notrace" last in the function signature. Reflow function signatures for
better readability and cleanup white space.

Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-16 10:23:02 +01:00
Ingo Molnar
0acbc7aa47 Merge branch 'linus' into sched/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-16 10:16:28 +01:00
Christian Borntraeger
5bd0b85ba8 locking/core, arch: Remove cpu_relax_lowlatency()
As there are no users left, we can remove cpu_relax_lowlatency()
implementations from every architecture.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Cc: <linux-arch@vger.kernel.org>
Link: http://lkml.kernel.org/r/1477386195-32736-6-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-16 10:15:11 +01:00
Christian Borntraeger
79ab11cdb9 locking/core: Introduce cpu_relax_yield()
For spinning loops people often use barrier() or cpu_relax().
For most architectures cpu_relax and barrier are the same, but on
some architectures cpu_relax can add some latency.
For example on power, sparc64 and arc, cpu_relax can shift the CPU
towards other hardware threads in an SMT environment.
On s390 cpu_relax does even more: it uses a hypercall to the
hypervisor to give up the timeslice.
In contrast to the SMT yielding this can result in larger latencies.
In some places this latency is unwanted, so another variant
"cpu_relax_lowlatency" was introduced. Before this is used in more
and more places, let's invert the logic and provide a cpu_relax_yield
that can be called in places where yielding is more important than
latency. By default this is the same as cpu_relax on all architectures.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Noam Camus <noamc@ezchip.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1477386195-32736-2-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-16 10:15:09 +01:00
Yazen Ghannam
18807ddb7f x86/mce/AMD: Reset Threshold Limit after logging error
The error count field in MCA_MISC does not get reset by hardware when the
threshold has been reached. Software is expected to reset it. Currently,
the threshold limit only gets reset during init or when a user writes to
sysfs.

If the user is not monitoring threshold interrupts and resetting
the limit then the user will only see 1 interrupt when the limit is first
hit. So if, for example, the limit is set to 10 then only 1 interrupt will
be recorded after 10 errors even if 100 errors have occurred. The user may
then assume that only 10 errors have occurred.

Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: x86-ml <x86@kernel.org>
Link: http://lkml.kernel.org/r/1479244433-69267-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-16 09:57:11 +01:00
Kan Liang
c499336cea perf/x86/uncore: Fix crash by removing bogus event_list[] handling for SNB client uncore IMC
Vince Weaver reported the following bug when KASAN is enabled:

 [  205.748005] BUG: KASAN: slab-out-of-bounds in snb_uncore_imc_event_del+0x6c/0xa0 at addr ffff8800caa43768
 [  205.758324] Read of size 8 by task perf_fuzzer/6618

It's caused by accessing box->event_list.

For client IMC, there are no generic counters. It defines its own fixed,
free-running counters. So event_list and n_events are unused.

They can be removed safely, which fixes the bug.

( There's still the separate question of how uninitialized state snuck into
  this data structure - but that's a separate fix. )

Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@kernel.org
Cc: davej@codemonkey.org.uk
Cc: dvyukov@google.com
Cc: eranian@gmail.com
Link: http://lkml.kernel.org/r/1479235210-29090-1-git-send-email-kan.liang@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-16 09:46:35 +01:00
David Herrmann
f96acec8c8 x86/sysfb: Fix lfb_size calculation
The screen_info.lfb_size field is shifted by 16 bits *only* in the case of
VBE. This is for historical reasons, since VBE advertised it similarly.
However, in case of EFI framebuffers, the size is no longer shifted. Fix
the x86 simple-framebuffer setup code to use the correct size in the
non-VBE case.
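
A sketch of the corrected size calculation (field access via the global
screen_info is an assumption):

  u64 length = screen_info.lfb_size;

  /* Only VBE advertises the size shifted by 16 bits */
  if (screen_info.orig_video_isVGA == VIDEO_TYPE_VLFB)
          length <<= 16;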

While at it, avoid variable abbreviations and rename 'len' to 'length',
and use the correct types matching the screen_info definition.

Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Gundersen <teg@jklm.no>
Link: http://lkml.kernel.org/r/20161115120158.15388-3-dh.herrmann@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-16 09:38:23 +01:00
David Herrmann
9164b4ceb7 x86/sysfb: Add support for 64bit EFI lfb_base
The screen_info object was extended to support 64-bit lfb_base addresses
in:

  ae2ee627dc87 ("efifb: Add support for 64-bit frame buffer addresses")

However, the x86 simple-framebuffer setup code never made use of it. Fix
it to properly assemble and verify the lfb_base before advertising
simple-framebuffer devices.

In particular, this means that if VIDEO_CAPABILITY_64BIT_BASE is set, the
screen_info->ext_lfb_base field contains the upper 32 bits of the actual
lfb_base. Make sure the address is not 0 (i.e., unset) and does not
overflow the physical address type.
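
A sketch of the base assembly and the checks described above:

  u64 base = screen_info.lfb_base;

  if (screen_info.capabilities & VIDEO_CAPABILITY_64BIT_BASE)
          base |= (u64)screen_info.ext_lfb_base << 32;

  /* Reject an unset base or one that overflows the phys addr type */
  if (!base || (u64)(resource_size_t)base != base)
          return false;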

Signed-off-by: David Herrmann <dh.herrmann@gmail.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Gundersen <teg@jklm.no>
Link: http://lkml.kernel.org/r/20161115120158.15388-2-dh.herrmann@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-16 09:38:22 +01:00
Sebastian Andrzej Siewior
0e285d36bd x86/mcheck: Move CPU_DEAD to hotplug state machine
This moves the last piece of the old hotplug notifier code in MCE to the
new hotplug state machine.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: rt@linutronix.de
Cc: linux-edac@vger.kernel.org
Link: http://lkml.kernel.org/r/20161110174447.11848-8-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-16 09:34:18 +01:00
Sebastian Andrzej Siewior
8c0eeac819 x86/mcheck: Move CPU_ONLINE and CPU_DOWN_PREPARE to hotplug state machine
The CPU_ONLINE and CPU_DOWN_PREPARE handlers look fully symmetrical and can be
moved to the hotplug state machine.
On a failure during registration the tear down callback
(mce_cpu_pre_down()) is invoked, so there should be no timer around and no
need to keep the notifier installed (which, according to the comment, was the
reason the notifier was registered despite of errors).

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: rt@linutronix.de
Cc: linux-edac@vger.kernel.org
Link: http://lkml.kernel.org/r/20161110174447.11848-7-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-16 09:34:18 +01:00
Sebastian Andrzej Siewior
39f152ffbf x86/mcheck: Reorganize the hotplug callbacks
Initially I wanted to remove mcheck_cpu_init() from identify_cpu() and let it
become an independent early hotplug callback. The main problem here was that
the init on the boot CPU may happen too late
(device_initcall_sync(mcheck_init_device)) and nobody wanted to risk receiving
an MCE event at boot time leading to a shutdown (if the MCE feature is not yet
enabled).

Here is attempt two: the timing stays as-is but the ordering of the functions
is changed:
- mcheck_cpu_init() (which is run from identify_cpu()) will set up the timer
  struct but won't fire the timer. This is moved to CPU_ONLINE since its
  cleanup part is in CPU_DOWN_PREPARE. So if it is okay to stop the timer early
  in the shutdown phase, it should be okay to start it late in the bring-up
  phase.

- CPU_DOWN_PREPARE disables the MCE feature flags for !INTEL CPUs in
  mce_disable_cpu(). If a failure occurs it would be re-enabled on all vendor
  CPUs (including Intel where it was not disabled during shutdown). To keep this
  working I am moving it to CPU_ONLINE. smp_call_function_single() is dropped
  because the notifier nowadays runs on the target CPU.

- CPU_ONLINE is invoking mce_device_create() + mce_threshold_create_device()
  but its cleanup part is in CPU_DEAD (mce_threshold_remove_device() and
  mce_device_remove()). In order to keep this symmetrical I am moving the
  cleanup from CPU_DEAD to CPU_DOWN_PREPARE.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: rt@linutronix.de
Cc: linux-edac@vger.kernel.org
Link: http://lkml.kernel.org/r/20161110174447.11848-6-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-16 09:34:18 +01:00
Sebastian Andrzej Siewior
4d7b02d58c x86/mcheck: Split threshold_cpu_callback into two callbacks
The threshold_cpu_callback callback looks like a notifier and its
arguments are almost the same. Split it out and have one ONLINE
and one DEAD callback. This will come in handy later once the main code
gets changed to use the callback mechanism.
Also, handle the threshold_cpu_callback_online() return value so we don't
continue if the function fails.

Boris Petkov removed the callback pointer and replaced it with proper
functions.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: rt@linutronix.de
Cc: linux-edac@vger.kernel.org
Link: http://lkml.kernel.org/r/20161110174447.11848-5-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-16 09:34:17 +01:00
Sebastian Andrzej Siewior
7f34b935e8 x86/mcheck: Be prepared for a rollback to the ONLINE state
If we try a CPU down and fail in the middle then we roll back to the
online state. This means we would perform CPU_ONLINE / mce_device_create()
without invoking CPU_DEAD / mce_device_remove() for the cleanup of what was
allocated in CPU_ONLINE.

Be prepared for this and don't allocate the struct if we have it
already.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: rt@linutronix.de
Cc: linux-edac@vger.kernel.org
Link: http://lkml.kernel.org/r/20161110174447.11848-4-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-16 09:34:17 +01:00
Sebastian Andrzej Siewior
ec553abb31 x86/mcheck: Explicit cleanup on failure in mce_amd
If the ONLINE callback fails, the driver does not do any cleanup right
away; instead it waits to get to the DEAD stage to do it. Yes, it waits.
Since we don't pass the error code back to the caller, no one knows.

Do the cleanup right away so it does not look like a leak.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: rt@linutronix.de
Cc: linux-edac@vger.kernel.org
Link: http://lkml.kernel.org/r/20161110174447.11848-3-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-16 09:34:17 +01:00
Sebastian Andrzej Siewior
0943637293 x86/mcheck: Move threshold_create_device()
Move the threshold_create_device() so it can use
threshold_remove_device() without a forward declaration.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: rt@linutronix.de
Cc: linux-edac@vger.kernel.org
Link: http://lkml.kernel.org/r/20161110174447.11848-2-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-16 09:34:16 +01:00
Fenghua Yu
f410770293 x86/intel_rdt: Update percpu closid immediately on CPUs affected by change
If CPUs are moved to or removed from a rdtgroup, the percpu closid storage
is updated. If tasks running on an affected CPU use the percpu closid then
the PQR_ASSOC MSR is only updated when the task runs through a context
switch. Up to the context switch the CPUs operate on the wrong closid. This
state is potentially unbounded.

Make the change immediately effective by invoking an SMP function call on
the affected CPUs which stores the new closid in the percpu storage and
calls the rdt_sched_in() function which updates the MSR, if the current
task uses the percpu closid.

[ tglx: Made it work and massaged changelog once more ]

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Cc: "Ravi V Shankar" <ravi.v.shankar@intel.com>
Cc: "Tony Luck" <tony.luck@intel.com>
Cc: "Sai Prakhya" <sai.praneeth.prakhya@intel.com>
Cc: "Vikas Shivappa" <vikas.shivappa@linux.intel.com>
Cc: "Ingo Molnar" <mingo@elte.hu>
Cc: "H. Peter Anvin" <h.peter.anvin@intel.com>
Link: http://lkml.kernel.org/r/1478912558-55514-3-git-send-email-fenghua.yu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2016-11-15 18:35:50 +01:00