24968 Commits

Author SHA1 Message Date
Vladimir Davydov
3e79ec7ddc arch: x86: charge page tables to kmemcg
Page tables can bite a relatively big chunk off system memory and their
allocations are easy to trigger from userspace, so they should be
accounted to kmemcg.

This patch marks page table allocations as __GFP_ACCOUNT for x86.  Note
we must not charge allocations of kernel page tables, because they can
be shared among processes from different cgroups so accounting them to a
particular one can pin other cgroups for indefinitely long.  So we clear
__GFP_ACCOUNT flag if a page table is allocated for the kernel.

Link: http://lkml.kernel.org/r/7d5c54f6a2bcbe76f03171689440003d87e6c742.1464079538.git.vdavydov@virtuozzo.com
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kees Cook
c965b105bf kbuild: abort build on bad stack protector flag
Before, the stack protector flag was sanity checked before .config had
been reprocessed.  This meant the build couldn't be aborted early, and
only a warning could be emitted followed later by the compiler blowing
up with an unknown flag.  This has caused a lot of confusion over time,
so this splits the flag selection from sanity checking and performs the
sanity checking after the make has been restarted from a reprocessed
.config, so builds can be aborted as early as possible now.

Additionally moves the x86-specific sanity check to the same location,
since it suffered from the same warn-then-wait-for-compiler-failure
problem.

Link: http://lkml.kernel.org/r/20160712223043.GA11664@www.outflux.net
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Michal Marek <mmarek@suse.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-07-26 16:19:19 -07:00
Kees Cook
228d96c603 kbuild: Abort build on bad stack protector flag
Before, the stack protector flag was sanity checked before .config had
been reprocessed. This meant the build couldn't be aborted early, and
only a warning could be emitted followed later by the compiler blowing
up with an unknown flag. This has caused a lot of confusion over time,
so this splits the flag selection from sanity checking and performs the
sanity checking after the make has been restarted from a reprocessed
.config, so builds can be aborted as early as possible now.

Additionally moves the x86-specific sanity check to the same location,
since it suffered from the same warn-then-wait-for-compiler-failure
problem.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Michal Marek <mmarek@suse.com>
2016-07-26 23:50:59 +02:00
Linus Torvalds
bbce2ad2d7 Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "Here is the crypto update for 4.8:

  API:
   - first part of skcipher low-level conversions
   - add KPP (Key-agreement Protocol Primitives) interface.

  Algorithms:
   - fix IPsec/cryptd reordering issues that affects aesni
   - RSA no longer does explicit leading zero removal
   - add SHA3
   - add DH
   - add ECDH
   - improve DRBG performance by not doing CTR by hand

  Drivers:
   - add x86 AVX2 multibuffer SHA256/512
   - add POWER8 optimised crc32c
   - add xts support to vmx
   - add DH support to qat
   - add RSA support to caam
   - add Layerscape support to caam
   - add SEC1 AEAD support to talitos
   - improve performance by chaining requests in marvell/cesa
   - add support for Araneus Alea I USB RNG
   - add support for Broadcom BCM5301 RNG
   - add support for Amlogic Meson RNG
   - add support Broadcom NSP SoC RNG"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (180 commits)
  crypto: vmx - Fix aes_p8_xts_decrypt build failure
  crypto: vmx - Ignore generated files
  crypto: vmx - Adding support for XTS
  crypto: vmx - Adding asm subroutines for XTS
  crypto: skcipher - add comment for skcipher_alg->base
  crypto: testmgr - Print akcipher algorithm name
  crypto: marvell - Fix wrong flag used for GFP in mv_cesa_dma_add_iv_op
  crypto: nx - off by one bug in nx_of_update_msc()
  crypto: rsa-pkcs1pad - fix rsa-pkcs1pad request struct
  crypto: scatterwalk - Inline start/map/done
  crypto: scatterwalk - Remove unnecessary BUG in scatterwalk_start
  crypto: scatterwalk - Remove unnecessary advance in scatterwalk_pagedone
  crypto: scatterwalk - Fix test in scatterwalk_done
  crypto: api - Optimise away crypto_yield when hard preemption is on
  crypto: scatterwalk - add no-copy support to copychunks
  crypto: scatterwalk - Remove scatterwalk_bytes_sglen
  crypto: omap - Stop using crypto scatterwalk_bytes_sglen
  crypto: skcipher - Remove top-level givcipher interface
  crypto: user - Remove crypto_lookup_skcipher call
  crypto: cts - Convert to skcipher
  ...
2016-07-26 13:40:17 -07:00
Linus Torvalds
85802a49a8 This merge introduces three patches that are later reverted,
- Switching of MSR_TSC_AUX in SVM was thought to cause a host
    misbehavior, but it was later cleared of those doubts and the patch
    moved code to a hot path, so we reverted it.  That patch also needed
    a fix for 32 bit builds and both were reverted in one go.
 
  - Al Viro noticed that a fix for a leak in an error path was not valid
    with the given API and provided a better fix, so the original patch
    was reverted.
 
 Then there are two VMX fixes that move code around because VMCS was not
 accessed between vcpu_load() and vcpu_put(), a simple ARM VHE fix, and
 two one-liners for PML and MTRR.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCAAGBQJXljRwAAoJEED/6hsPKofoIZMIAIMm2h5HKplmpT007dVCt1zw
 dG8gO9hxOxstXfGVkNEZvdxyUb0ilFMO5AYySS1ctENpswVAZlKWlyc+aNGsHXOS
 KFylAUNHlibua1xgB64Sitgub8M9Ct5mDfvvqWL79aCgHTLcDxnb/0NprTqB2P3O
 TfsLaiKMDOeZs4nTcs62vNqpPJzoFc6DK2x1RltFGF9RpR7bOD7gnp7KypDWJx7S
 1LleWPHboxHQ40qf8dxAb7HwEARfXndlP6ZoCkf2stoWTwuexHJfesUnsNgEuXnX
 6YJ9mO7np/bHfSDpGMJbb9pPI5g7UDwOzmgvYQvzhak3LRmvjsZePpWchlb0yCs=
 =3VD4
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM leftovers from Radim Krčmář:
 "This is a combination of two pull requests for 4.7-rc8 that were not
  merged due to looking hairy.  I have changed the tag message to focus
  on circumstances of contained reverts as they were likely the reason
  behind rejection.

  This merge introduces three patches that are later reverted,

   - Switching of MSR_TSC_AUX in SVM was thought to cause a host
     misbehavior, but it was later cleared of those doubts and the patch
     moved code to a hot path, so we reverted it.  That patch also
     needed a fix for 32 bit builds and both were reverted in one go.

   - Al Viro noticed that a fix for a leak in an error path was not
     valid with the given API and provided a better fix, so the original
     patch was reverted.

  Then there are two VMX fixes that move code around because VMCS was
  not accessed between vcpu_load() and vcpu_put(), a simple ARM VHE fix,
  and two one-liners for PML and MTRR"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  arm64: KVM: VHE: Context switch MDSCR_EL1
  KVM: VMX: handle PML full VMEXIT that occurs during event delivery
  Revert "KVM: SVM: fix trashing of MSR_TSC_AUX"
  KVM: SVM: do not set MSR_TSC_AUX on 32-bit builds
  KVM: don't use anon_inode_getfd() before possible failures
  Revert "KVM: release anon file in failure path of vm creation"
  KVM: release anon file in failure path of vm creation
  KVM: nVMX: Fix memory corruption when using VMCS shadowing
  kvm: vmx: ensure VMCS is current while enabling PML
  KVM: SVM: fix trashing of MSR_TSC_AUX
  KVM: MTRR: fix kvm_mtrr_check_gfn_range_consistency page fault
2016-07-26 11:50:42 -07:00
Borislav Petkov
efaad554b4 x86/microcode/intel: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y
CONFIG_RANDOMIZE_MEMORY=y randomizes the physical memmap and thus the
address where the initrd is located. Therefore, we need to add the
offset KASLR put us to in order to find the initrd again on the AP path.

In the future, we will get rid of the initrd address caching and query
the address on both the BSP and AP paths but that would need more work.

Thanks to Nicolai Stange for the good bisection and debugging work.

Reported-and-tested-by: Nicolai Stange <nicstange@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160726095138.3470-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-26 19:32:57 +02:00
Linus Torvalds
37e13a1ebe Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "This tree contains tooling fixes plus some additions:

   - fixes to the vdso2c build environment that Stephen Rothwell is
     using for the linux-next build (Arnaldo Carvalho de Melo)

   - AVX-512 instruction mappings (Adrian Hunter)

   - misc fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Revert "perf tools: event.h needs asm/perf_regs.h"
  x86: Make the vdso2c compiler use the host architecture headers
  tools build: Fix objtool build with ARCH=x86_64
  objtool: Always use host headers
  objtool: Use tools/scripts/Makefile.arch to get ARCH and HOSTARCH
  tools build: Add HOSTARCH Makefile variable
  perf tests kmod-path: Fix build on ubuntu:16.04-x-armhf
  perf tools: Add AVX-512 instructions to the new instructions test
  perf tools: Add AVX-512 support to the instruction decoder used by Intel PT
  x86/insn: Add AVX-512 support to the instruction decoder
  x86/insn: perf tools: Fix vcvtph2ps instruction decoding
2016-07-26 10:26:29 -07:00
Juergen Gross
d34c30cc1f xen: add static initialization of steal_clock op to xen_time_ops
pv_time_ops might be overwritten with xen_time_ops after the
steal_clock operation has been initialized already. To prevent calling
a now uninitialized function pointer add the steal_clock static
initialization to xen_time_ops.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-26 14:07:06 +01:00
Dave Airlie
5e580523d9 Linux 4.7
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXlRXSAAoJEHm+PkMAQRiGG/gH/0Z8O4zWOsrwO+X1mRToRDBH
 joFOjAmCVe83T1VpF5LYNB+9+owL/dEDt6+ZIswnhH7AfQPjs4RqwS4PcuMbCDVO
 +mDm0PmfcKaYcQZrB2Z2OwIzRNnfCTVcsDPhIHwuIHk0m4z/xuGZonD8KoAj0+tO
 3yJF6sbE1KubDVjOb+lmZZSP3cXA0pDXrNhkYhE4Tsr8fiihGjeXSNJ8t2zPLjxo
 W3MPqo0rzDvQsOwoF4TWHHagVaFSJlhLBBgqu33fI7uO3jtfQD2G8wG68JCND1j3
 qbMoBfTLFV/yQmSIJUt0Wv1axaCcwnjpweEB35A/GEeZ0mNB1rDdoBeI1eKEQkc=
 =DGFC
 -----END PGP SIGNATURE-----

Backmerge tag 'v4.7' into drm-next

Linux 4.7

As requested by Daniel Vetter as the conflicts were getting messy.
2016-07-26 17:26:29 +10:00
Linus Torvalds
e65805251f Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq updates from Thomas Gleixner:
 "The irq department delivers:

   - new core infrastructure to allow better management of multi-queue
     devices (interrupt spreading, node aware descriptor allocation ...)

   - a new interrupt flow handler to support the new fangled Intel VMD
     devices.

   - yet another new interrupt controller driver.

   - a series of fixes which addresses sparse warnings, missing
     includes, missing static declarations etc from Ben Dooks.

   - a fix for the error handling in the hierarchical domain allocation
     code.

   - the usual pile of small updates to core and driver code"

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits)
  genirq: Fix missing irq allocation affinity hint
  irqdomain: Fix irq_domain_alloc_irqs_recursive() error handling
  irq/Documentation: Correct result of echnoing 5 to smp_affinity
  MAINTAINERS: Remove Jiang Liu from irq domains
  genirq/msi: Fix broken debug output
  genirq: Add a helper to spread an affinity mask for MSI/MSI-X vectors
  genirq/msi: Make use of affinity aware allocations
  genirq: Use affinity hint in irqdesc allocation
  genirq: Add affinity hint to irq allocation
  genirq: Introduce IRQD_AFFINITY_MANAGED flag
  genirq/msi: Remove unused MSI_FLAG_IDENTITY_MAP
  irqchip/s3c24xx: Fixup IO accessors for big endian
  irqchip/exynos-combiner: Fix usage of __raw IO
  irqdomain: Fix disposal of mappings for interrupt hierarchies
  irqchip/aspeed-vic: Add irq controller for Aspeed
  doc/devicetree: Add Aspeed VIC bindings
  x86/PCI/VMD: Use untracked irq handler
  genirq: Add untracked irq handler
  irqchip/mips-gic: Populate irq_domain names
  irqchip/gicv3-its: Implement two-level(indirect) device table support
  ...
2016-07-25 21:35:03 -07:00
Linus Torvalds
55392c4c06 Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer updates from Thomas Gleixner:
 "This update provides the following changes:

   - The rework of the timer wheel which addresses the shortcomings of
     the current wheel (cascading, slow search for next expiring timer,
     etc).  That's the first major change of the wheel in almost 20
     years since Finn implemted it.

   - A large overhaul of the clocksource drivers init functions to
     consolidate the Device Tree initialization

   - Some more Y2038 updates

   - A capability fix for timerfd

   - Yet another clock chip driver

   - The usual pile of updates, comment improvements all over the place"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (130 commits)
  tick/nohz: Optimize nohz idle enter
  clockevents: Make clockevents_subsys static
  clocksource/drivers/time-armada-370-xp: Fix return value check
  timers: Implement optimization for same expiry time in mod_timer()
  timers: Split out index calculation
  timers: Only wake softirq if necessary
  timers: Forward the wheel clock whenever possible
  timers/nohz: Remove pointless tick_nohz_kick_tick() function
  timers: Optimize collect_expired_timers() for NOHZ
  timers: Move __run_timers() function
  timers: Remove set_timer_slack() leftovers
  timers: Switch to a non-cascading wheel
  timers: Reduce the CPU index space to 256k
  timers: Give a few structs and members proper names
  hlist: Add hlist_is_singular_node() helper
  signals: Use hrtimer for sigtimedwait()
  timers: Remove the deprecated mod_timer_pinned() API
  timers, net/ipv4/inet: Initialize connection request timers as pinned
  timers, drivers/tty/mips_ejtag: Initialize the poll timer as pinned
  timers, drivers/tty/metag_da: Initialize the poll timer as pinned
  ...
2016-07-25 20:43:12 -07:00
Linus Torvalds
c410614c90 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Ingo Molnar:
 "Leftover fix from the v4.7 cycle: adds a reboot quirk"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/reboot: Add Dell Optiplex 7450 AIO reboot quirk
2016-07-25 20:06:38 -07:00
Linus Torvalds
5f22004ba9 Merge branch 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 timer updates from Ingo Molnar:
 "The main change in this tree is the reworking, fixing and extension of
  the TSC frequency enumeration code (by Len Brown)"

* 'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/tsc: Remove the unused check_tsc_disabled()
  x86/tsc: Enumerate BXT tsc_khz via CPUID
  x86/tsc: Enumerate SKL cpu_khz and tsc_khz via CPUID
  x86/tsc_msr: Remove irqoff around MSR-based TSC enumeration
  x86/tsc_msr: Add Airmont reference clock values
  x86/tsc_msr: Correct Silvermont reference clock values
  x86/tsc_msr: Update comments, expand definitions
  x86/tsc_msr: Remove debugging messages
  x86/tsc_msr: Identify Intel-specific code
  Revert "x86/tsc: Add missing Cherrytrail frequency to the table"
2016-07-25 19:41:35 -07:00
Linus Torvalds
8e466955d6 Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 platform updates from Ingo Molnar:
 "The main changes in this cycle were:

   - Intel-SoC enhancements (Andy Shevchenko)

   - Intel CPU symbolic model definition rework (Dave Hansen)

   - ... other misc changes"

* 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits)
  x86/sfi: Enable enumeration of SD devices
  x86/pci: Use MRFLD abbreviation for Merrifield
  x86/platform/intel-mid: Make vertical indentation consistent
  x86/platform/intel-mid: Mark regulators explicitly defined
  x86/platform/intel-mid: Rename mrfl.c to mrfld.c
  x86/platform/intel-mid: Enable spidev on Intel Edison boards
  x86/platform/intel-mid: Extend PWRMU to support Penwell
  x86/pci, x86/platform/intel_mid_pci: Remove duplicate power off code
  x86/platform/intel-mid: Add pinctrl for Intel Merrifield
  x86/platform/intel-mid: Enable GPIO expanders on Edison
  x86/platform/intel-mid: Add Power Management Unit driver
  x86/platform/atom/punit: Enable support for Merrifield
  x86/platform/intel_mid_pci: Rework IRQ0 workaround
  x86, thermal: Clean up and fix CPU model detection for intel_soc_dts_thermal
  x86, mmc: Use Intel family name macros for mmc driver
  x86/intel_telemetry: Use Intel family name macros for telemetry driver
  x86/acpi/lss: Use Intel family name macros for the acpi_lpss driver
  x86/cpufreq: Use Intel family name macros for the intel_pstate cpufreq driver
  x86/platform: Use new Intel model number macros
  x86/intel_idle: Use Intel family macros for intel_idle
  ...
2016-07-25 19:15:35 -07:00
Linus Torvalds
2d724ffddd Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fpu updates from Ingo Molnar:
 "The main x86 FPU changes in this cycle were:

   - a large series of cleanups, fixes and enhancements to re-enable the
     XSAVES instruction on Intel CPUs - which is the most advanced
     instruction to do FPU context switches (Yu-cheng Yu, Fenghua Yu)

   - Add FPU tracepoints for the FPU state machine (Dave Hansen)"

* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu: Do not BUG_ON() in early FPU code
  x86/fpu/xstate: Re-enable XSAVES
  x86/fpu/xstate: Fix fpstate_init() for XRSTORS
  x86/fpu/xstate: Return NULL for disabled xstate component address
  x86/fpu/xstate: Fix __fpu_restore_sig() for XSAVES
  x86/fpu/xstate: Fix xstate_offsets, xstate_sizes for non-extended xstates
  x86/fpu/xstate: Fix XSTATE component offset print out
  x86/fpu/xstate: Fix PTRACE frames for XSAVES
  x86/fpu/xstate: Fix supervisor xstate component offset
  x86/fpu/xstate: Align xstate components according to CPUID
  x86/fpu/xstate: Copy xstate registers directly to the signal frame when compacted format is in use
  x86/fpu/xstate: Keep init_fpstate.xsave.header.xfeatures as zero for init optimization
  x86/fpu/xstate: Rename 'xstate_size' to 'fpu_kernel_xstate_size', to distinguish it from 'fpu_user_xstate_size'
  x86/fpu/xstate: Define and use 'fpu_user_xstate_size'
  x86/fpu: Add tracepoints to dump FPU state at key points
2016-07-25 18:48:27 -07:00
Linus Torvalds
36e635cb21 Merge branch 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 stackdump update from Ingo Molnar:
 "A number of stackdump enhancements"

* 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/dumpstack: Add show_stack_regs() and use it
  printk: Make the printk*once() variants return a value
  x86/dumpstack: Honor supplied @regs arg
2016-07-25 18:18:04 -07:00
Linus Torvalds
c265cc5c3c Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Ingo Molnar:
 "Three small cleanups"

* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  lguest: Read offset of device_cap later
  lguest: Read length of device_cap later
  x86: Do away with ARCH_[WANT_OPTIONAL|REQUIRE]_GPIOLIB
2016-07-25 18:06:00 -07:00
Linus Torvalds
80f09cf5c1 Merge branch 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 build updates from Ingo Molnar:
 "A build system fix and a cleanup"

* 'x86-build-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  kbuild: Remove stale asm-generic wrappers
  kbuild, x86: Track generated headers with generated-y
2016-07-25 18:00:18 -07:00
Linus Torvalds
77cd3d0c43 Merge branch 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot updates from Ingo Molnar:
 "The main changes:

   - add initial commits to randomize kernel memory section virtual
     addresses, enabled via a new kernel option: RANDOMIZE_MEMORY
     (Thomas Garnier, Kees Cook, Baoquan He, Yinghai Lu)

   - enhance KASLR (RANDOMIZE_BASE) physical memory randomization (Kees
     Cook)

   - EBDA/BIOS region boot quirk cleanups (Andy Lutomirski, Ingo Molnar)

   - misc cleanups/fixes"

* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Simplify EBDA-vs-BIOS reservation logic
  x86/boot: Clarify what x86_legacy_features.reserve_bios_regions does
  x86/boot: Reorganize and clean up the BIOS area reservation code
  x86/mm: Do not reference phys addr beyond kernel
  x86/mm: Add memory hotplug support for KASLR memory randomization
  x86/mm: Enable KASLR for vmalloc memory regions
  x86/mm: Enable KASLR for physical mapping memory regions
  x86/mm: Implement ASLR for kernel memory regions
  x86/mm: Separate variable for trampoline PGD
  x86/mm: Add PUD VA support for physical mapping
  x86/mm: Update physical mapping variable names
  x86/mm: Refactor KASLR entropy functions
  x86/KASLR: Fix boot crash with certain memory configurations
  x86/boot/64: Add forgotten end of function marker
  x86/KASLR: Allow randomization below the load address
  x86/KASLR: Extend kernel image physical address randomization to addresses larger than 4G
  x86/KASLR: Randomize virtual address separately
  x86/KASLR: Clarify identity map interface
  x86/boot: Refuse to build with data relocations
  x86/KASLR, x86/power: Remove x86 hibernation restrictions
2016-07-25 17:32:28 -07:00
Linus Torvalds
0f657262d5 Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 mm updates from Ingo Molnar:
 "Various x86 low level modifications:

   - preparatory work to support virtually mapped kernel stacks (Andy
     Lutomirski)

   - support for 64-bit __get_user() on 32-bit kernels (Benjamin
     LaHaise)

   - (involved) workaround for Knights Landing CPU erratum (Dave Hansen)

   - MPX enhancements (Dave Hansen)

   - mremap() extension to allow remapping of the special VDSO vma, for
     purposes of user level context save/restore (Dmitry Safonov)

   - hweight and entry code cleanups (Borislav Petkov)

   - bitops code generation optimizations and cleanups with modern GCC
     (H. Peter Anvin)

   - syscall entry code optimizations (Paolo Bonzini)"

* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits)
  x86/mm/cpa: Add missing comment in populate_pdg()
  x86/mm/cpa: Fix populate_pgd(): Stop trying to deallocate failed PUDs
  x86/syscalls: Add compat_sys_preadv64v2/compat_sys_pwritev64v2
  x86/smp: Remove unnecessary initialization of thread_info::cpu
  x86/smp: Remove stack_smp_processor_id()
  x86/uaccess: Move thread_info::addr_limit to thread_struct
  x86/dumpstack: Rename thread_struct::sig_on_uaccess_error to sig_on_uaccess_err
  x86/uaccess: Move thread_info::uaccess_err and thread_info::sig_on_uaccess_err to thread_struct
  x86/dumpstack: When OOPSing, rewind the stack before do_exit()
  x86/mm/64: In vmalloc_fault(), use CR3 instead of current->active_mm
  x86/dumpstack/64: Handle faults when printing the "Stack: " part of an OOPS
  x86/dumpstack: Try harder to get a call trace on stack overflow
  x86/mm: Remove kernel_unmap_pages_in_pgd() and efi_cleanup_page_tables()
  x86/mm/cpa: In populate_pgd(), don't set the PGD entry until it's populated
  x86/mm/hotplug: Don't remove PGD entries in remove_pagetable()
  x86/mm: Use pte_none() to test for empty PTE
  x86/mm: Disallow running with 32-bit PTEs to work around erratum
  x86/mm: Ignore A/D bits in pte/pmd/pud_none()
  x86/mm: Move swap offset/type up in PTE to work around erratum
  x86/entry: Inline enter_from_user_mode()
  ...
2016-07-25 15:34:18 -07:00
Linus Torvalds
425dbc6db3 Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/apic updates from Ingo Molnar:
 "Misc cleanups and a small fix"

* 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/apic: Remove the unused struct apic::apic_id_mask field
  x86/apic: Fix misspelled APIC
  x86/ioapic: Simplify ioapic_setup_resources()
2016-07-25 15:09:08 -07:00
Linus Torvalds
cca08cd66c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:

 - introduce and use task_rcu_dereference()/try_get_task_struct() to fix
   and generalize task_struct handling (Oleg Nesterov)

 - do various per entity load tracking (PELT) fixes and optimizations
   (Peter Zijlstra)

 - cputime virt-steal time accounting enhancements/fixes (Wanpeng Li)

 - introduce consolidated cputime output file cpuacct.usage_all and
   related refactorings (Zhao Lei)

 - ... plus misc fixes and enhancements

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/core: Panic on scheduling while atomic bugs if kernel.panic_on_warn is set
  sched/cpuacct: Introduce cpuacct.usage_all to show all CPU stats together
  sched/cpuacct: Use loop to consolidate code in cpuacct_stats_show()
  sched/cpuacct: Merge cpuacct_usage_index and cpuacct_stat_index enums
  sched/fair: Rework throttle_count sync
  sched/core: Fix sched_getaffinity() return value kerneldoc comment
  sched/fair: Reorder cgroup creation code
  sched/fair: Apply more PELT fixes
  sched/fair: Fix PELT integrity for new tasks
  sched/cgroup: Fix cpu_cgroup_fork() handling
  sched/fair: Fix PELT integrity for new groups
  sched/fair: Fix and optimize the fork() path
  sched/cputime: Add steal time support to full dynticks CPU time accounting
  sched/cputime: Fix prev steal time accouting during CPU hotplug
  KVM: Fix steal clock warp during guest CPU hotplug
  sched/debug: Always show 'nr_migrations'
  sched/fair: Use task_rcu_dereference()
  sched/api: Introduce task_rcu_dereference() and try_get_task_struct()
  sched/idle: Optimize the generic idle loop
  sched/fair: Fix the wrong throttled clock time for cfs_rq_clock_task()
2016-07-25 13:59:34 -07:00
Linus Torvalds
7e4dc77b28 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "With over 300 commits it's been a busy cycle - with most of the work
  concentrated on the tooling side (as it should).

  The main kernel side enhancements were:

   - Add per event callchain limit: Recently we introduced a sysctl to
     tune the max-stack for all events for which callchains were
     requested:

       $ sysctl kernel.perf_event_max_stack
       kernel.perf_event_max_stack = 127

     Now this patch introduces a way to configure this per event, i.e.
     this becomes possible:

       $ perf record -e sched:*/max-stack=2/ -e block:*/max-stack=10/ -a

     allowing finer tuning of how much buffer space callchains use.

     This uses an u16 from the reserved space at the end, leaving
     another u16 for future use.

     There has been interest in even finer tuning, namely to control the
     max stack for kernel and userspace callchains separately.  Further
     discussion is needed, we may for instance use the remaining u16 for
     that and when it is present, assume that the sample_max_stack
     introduced in this patch applies for the kernel, and the u16 left
     is used for limiting the userspace callchain (Arnaldo Carvalho de
     Melo)

   - Optimize AUX event (hardware assisted side-band event) delivery
     (Kan Liang)

   - Rework Intel family name macro usage (this is partially x86 arch
     work) (Dave Hansen)

   - Refine and fix Intel LBR support (David Carrillo-Cisneros)

   - Add support for Intel 'TopDown' events (Andi Kleen)

   - Intel uncore PMU driver fixes and enhancements (Kan Liang)

   - ... other misc changes.

  Here's an incomplete list of the tooling enhancements (but there's
  much more, see the shortlog and the git log for details):

   - Support cross unwinding, i.e.  collecting '--call-graph dwarf'
     perf.data files in one machine and then doing analysis in another
     machine of a different hardware architecture.  This enables, for
     instance, to do:

       $ perf record -a --call-graph dwarf

     on a x86-32 or aarch64 system and then do 'perf report' on it on a
     x86_64 workstation (He Kuang)

   - Allow reading from a backward ring buffer (one setup via
     sys_perf_event_open() with perf_event_attr.write_backward = 1)
     (Wang Nan)

   - Finish merging initial SDT (Statically Defined Traces) support, see
     cset comments for details about how it all works (Masami Hiramatsu)

   - Support attaching eBPF programs to tracepoints (Wang Nan)

   - Add demangling of symbols in programs written in the Rust language
     (David Tolnay)

   - Add support for tracepoints in the python binding, including an
     example, that sets up and parses sched:sched_switch events,
     tools/perf/python/tracepoint.py (Jiri Olsa)

   - Introduce --stdio-color to set up the color output mode selection
     in 'annotate' and 'report', allowing emit color escape sequences
     when redirecting the output of these tools (Arnaldo Carvalho de
     Melo)

   - Add 'callindent' option to 'perf script -F', to indent the Intel PT
     call stack, making this output more ftrace-like (Adrian Hunter,
     Andi Kleen)

   - Allow dumping the object files generated by llvm when processing
     eBPF scriptlet events (Wang Nan)

   - Add stackcollapse.py script to help generating flame graphs (Paolo
     Bonzini)

   - Add --ldlat option to 'perf mem' to specify load latency for loads
     event (e.g. cpu/mem-loads/ ) (Jiri Olsa)

   - Tooling support for Intel TopDown counters, recently added to the
     kernel (Andi Kleen)"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (303 commits)
  perf tests: Add is_printable_array test
  perf tools: Make is_printable_array global
  perf script python: Fix string vs byte array resolving
  perf probe: Warn unmatched function filter correctly
  perf cpu_map: Add more helpers
  perf stat: Balance opening and reading events
  tools: Copy linux/{hash,poison}.h and check for drift
  perf tools: Remove include/linux/list.h from perf's MANIFEST
  tools: Copy the bitops files accessed from the kernel and check for drift
  Remove: kernel unistd*h files from perf's MANIFEST, not used
  perf tools: Remove tools/perf/util/include/linux/const.h
  perf tools: Remove tools/perf/util/include/asm/byteorder.h
  perf tools: Add missing linux/compiler.h include to perf-sys.h
  perf jit: Remove some no-op error handling
  perf jit: Add missing curly braces
  objtool: Initialize variable to silence old compiler
  objtool: Add -I$(srctree)/tools/arch/$(ARCH)/include/uapi
  perf record: Add --tail-synthesize option
  perf session: Don't warn about out of order event if write_backward is used
  perf tools: Enable overwrite settings
  ...
2016-07-25 13:20:41 -07:00
Linus Torvalds
89e7eb098a Merge branch 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS updates from Ingo Molnar:
 "The biggest change in this cycle was an enhancement by Yazen Ghannam
  to reduce the number of MCE error injection related IPIs.

  The rest are smaller fixes"

* 'ras-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Fix mce_rdmsrl() warning message
  x86/RAS/AMD: Reduce the number of IPIs when prepping error injection
  x86/mce/AMD: Increase size of the bank_map type
  x86/mce: Do not use bank 1 for APEI generated error logs
2016-07-25 13:13:19 -07:00
Linus Torvalds
c86ad14d30 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "The locking tree was busier in this cycle than the usual pattern - a
  couple of major projects happened to coincide.

  The main changes are:

   - implement the atomic_fetch_{add,sub,and,or,xor}() API natively
     across all SMP architectures (Peter Zijlstra)

   - add atomic_fetch_{inc/dec}() as well, using the generic primitives
     (Davidlohr Bueso)

   - optimize various aspects of rwsems (Jason Low, Davidlohr Bueso,
     Waiman Long)

   - optimize smp_cond_load_acquire() on arm64 and implement LSE based
     atomic{,64}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}()
     on arm64 (Will Deacon)

   - introduce smp_acquire__after_ctrl_dep() and fix various barrier
     mis-uses and bugs (Peter Zijlstra)

   - after discovering ancient spin_unlock_wait() barrier bugs in its
     implementation and usage, strengthen its semantics and update/fix
     usage sites (Peter Zijlstra)

   - optimize mutex_trylock() fastpath (Peter Zijlstra)

   - ... misc fixes and cleanups"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (67 commits)
  locking/atomic: Introduce inc/dec variants for the atomic_fetch_$op() API
  locking/barriers, arch/arm64: Implement LDXR+WFE based smp_cond_load_acquire()
  locking/static_keys: Fix non static symbol Sparse warning
  locking/qspinlock: Use __this_cpu_dec() instead of full-blown this_cpu_dec()
  locking/atomic, arch/tile: Fix tilepro build
  locking/atomic, arch/m68k: Remove comment
  locking/atomic, arch/arc: Fix build
  locking/Documentation: Clarify limited control-dependency scope
  locking/atomic, arch/rwsem: Employ atomic_long_fetch_add()
  locking/atomic, arch/qrwlock: Employ atomic_fetch_add_acquire()
  locking/atomic, arch/mips: Convert to _relaxed atomics
  locking/atomic, arch/alpha: Convert to _relaxed atomics
  locking/atomic: Remove the deprecated atomic_{set,clear}_mask() functions
  locking/atomic: Remove linux/atomic.h:atomic_fetch_or()
  locking/atomic: Implement atomic{,64,_long}_fetch_{add,sub,and,andnot,or,xor}{,_relaxed,_acquire,_release}()
  locking/atomic: Fix atomic64_relaxed() bits
  locking/atomic, arch/xtensa: Implement atomic_fetch_{add,sub,and,or,xor}()
  locking/atomic, arch/x86: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  locking/atomic, arch/tile: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  locking/atomic, arch/sparc: Implement atomic{,64}_fetch_{add,sub,and,or,xor}()
  ...
2016-07-25 12:41:29 -07:00
Linus Torvalds
a2303849a6 Merge branch 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI updates from Ingo Molnar:
 "The biggest change in this cycle were SGI/UV related changes that
  clean up and fix UV boot quirks and problems.

  There's also various smaller cleanups and refinements"

* 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi: Reorganize the GUID table to make it easier to read
  x86/efi: Remove the unused efi_get_time() function
  x86/efi: Update efi_thunk() to use the the arch_efi_call_virt*() macros
  x86/uv: Update uv_bios_call() to use efi_call_virt_pointer()
  efi: Convert efi_call_virt() to efi_call_virt_pointer()
  x86/efi: Remove unused variable 'efi'
  efi: Document #define FOO_PROTOCOL_GUID layout
  efibc: Report more information in the error messages
2016-07-25 12:30:01 -07:00
Stephen Rothwell
d51306f1a3 x86: Make the vdso2c compiler use the host architecture headers
To be clear: this is a ppc64le hosted, x86_64 target cross build.

Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Acked-by: Andy Lutomirski <luto@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160723150845.3af8e452@canb.auug.org.au
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-07-25 10:02:21 -03:00
Vitaly Kuznetsov
ee42d665d3 xen/pvhvm: run xen_vcpu_setup() for the boot CPU
Historically we didn't call VCPUOP_register_vcpu_info for CPU0 for
PVHVM guests (while we had it for PV and ARM guests). This is usually
fine as we can use vcpu info in the shared_info page but when we try
booting on a vCPU with Xen's vCPU id > 31 (e.g. when we try to kdump
after crashing on this CPU) we're not able to boot.

Switch to always doing VCPUOP_register_vcpu_info for the boot CPU.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-25 13:34:23 +01:00
Vitaly Kuznetsov
e15a862193 x86/xen: use xen_vcpu_id mapping when pointing vcpu_info to shared_info
shared_info page has space for 32 vcpu info slots for first 32 vCPUs
but these are the first 32 vCPUs from Xen's perspective and we should
map them accordingly with the newly introduced xen_vcpu_id mapping.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-25 13:33:40 +01:00
Vitaly Kuznetsov
ad5475f9fa x86/xen: use xen_vcpu_id mapping for HYPERVISOR_vcpu_op
HYPERVISOR_vcpu_op() passes Linux's idea of vCPU id as a parameter
while Xen's idea is expected. In some cases these ideas diverge so we
need to do remapping.

Convert all callers of HYPERVISOR_vcpu_op() to use xen_vcpu_nr().

Leave xen_fill_possible_map() and xen_filter_cpu_maps() intact as
they're only being called by PV guests before perpu areas are
initialized. While the issue could be solved by switching to
early_percpu for xen_vcpu_id I think it's not worth it: PV guests will
probably never get to the point where their idea of vCPU id diverges
from Xen's.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-25 13:32:34 +01:00
Vitaly Kuznetsov
88e957d6e4 xen: introduce xen_vcpu_id mapping
It may happen that Xen's and Linux's ideas of vCPU id diverge. In
particular, when we crash on a secondary vCPU we may want to do kdump
and unlike plain kexec where we do migrate_to_reboot_cpu() we try
booting on the vCPU which crashed. This doesn't work very well for
PVHVM guests as we have a number of hypercalls where we pass vCPU id
as a parameter. These hypercalls either fail or do something
unexpected.

To solve the issue introduce percpu xen_vcpu_id mapping. ARM and PV
guests get direct mapping for now. Boot CPU for PVHVM guest gets its
id from CPUID. With secondary CPUs it is a bit more
trickier. Currently, we initialize IPI vectors before these CPUs boot
so we can't use CPUID. Use ACPI ids from MADT instead.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-25 13:31:16 +01:00
Vitaly Kuznetsov
3e9e57fad3 x86/acpi: store ACPI ids from MADT for future usage
Currently we don't save ACPI ids (unlike LAPIC ids which go to
x86_cpu_to_apicid) from MADT and we may need this information later.
Particularly, ACPI ids is the only existent way for a PVHVM Xen guest
to figure out Xen's idea of its vCPUs ids before these CPUs boot and
in some cases these ids diverge from Linux's cpu ids.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-25 13:30:53 +01:00
Vitaly Kuznetsov
de2f5537b3 x86/xen: update cpuid.h from Xen-4.7
Update cpuid.h header from xen hypervisor tree to get
XEN_HVM_CPUID_VCPU_ID_PRESENT definition.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
2016-07-25 13:30:45 +01:00
Rafael J. Wysocki
bc841e260c Merge branch 'pm-cpu'
* pm-cpu:
  x86: remove duplicate turbo ratio limit MSRs
  tools/power turbostat: Replace MSR_NHM_TURBO_RATIO_LIMIT
  cpufreq: intel_pstate: Replace MSR_NHM_TURBO_RATIO_LIMIT
2016-07-25 13:46:30 +02:00
Rafael J. Wysocki
9fedbb3b6b Merge branch 'x86/cpu' from tip 2016-07-25 13:45:39 +02:00
Rafael J. Wysocki
7f234a4d8a Merge branches 'pm-sleep' and 'pm-tools'
* pm-sleep:
  PM / hibernate: Introduce test_resume mode for hibernation
  x86 / hibernate: Use hlt_play_dead() when resuming from hibernation
  PM / hibernate: Image data protection during restoration
  PM / hibernate: Add missing braces in __register_nosave_region()
  PM / hibernate: Clean up comments in snapshot.c
  PM / hibernate: Clean up function headers in snapshot.c
  PM / hibernate: Add missing braces in hibernate_setup()
  PM / hibernate: Recycle safe pages after image restoration
  PM / hibernate: Simplify mark_unsafe_pages()
  PM / hibernate: Do not free preallocated safe pages during image restore
  PM / suspend: show workqueue state in suspend flow
  PM / sleep: make PM notifiers called symmetrically
  PM / sleep: Make pm_prepare_console() return void
  PM / Hibernate: Don't let kasan instrument snapshot.c

* pm-tools:
  PM / tools: scripts: AnalyzeSuspend v4.2
  tools/turbostat: allow user to alter DESTDIR and PREFIX
2016-07-25 13:44:32 +02:00
Rafael J. Wysocki
d5f017b796 Merge branch 'acpi-tables'
* acpi-tables:
  ACPI: Rename configfs.c to acpi_configfs.c to prevent link error
  ACPI: add support for loading SSDTs via configfs
  ACPI: add support for configfs
  efi / ACPI: load SSTDs from EFI variables
  spi / ACPI: add support for ACPI reconfigure notifications
  i2c / ACPI: add support for ACPI reconfigure notifications
  ACPI: add support for ACPI reconfiguration notifiers
  ACPI / scan: fix enumeration (visited) flags for bus rescans
  ACPI / documentation: add SSDT overlays documentation
  ACPI: ARM64: support for ACPI_TABLE_UPGRADE
  ACPI / tables: introduce ARCH_HAS_ACPI_TABLE_UPGRADE
  ACPI / tables: move arch-specific symbol to asm/acpi.h
  ACPI / tables: table upgrade: refactor function definitions
  ACPI / tables: table upgrade: use cacheable map for tables

Conflicts:
	arch/arm64/include/asm/acpi.h
2016-07-25 13:41:01 +02:00
Rafael J. Wysocki
d85f4eb699 Merge branch 'acpi-numa'
* acpi-numa:
  ACPI / NUMA: Enable ACPI based NUMA on ARM64
  arm64, ACPI, NUMA: NUMA support based on SRAT and SLIT
  ACPI / processor: Add acpi_map_madt_entry()
  ACPI / NUMA: Improve SRAT error detection and add messages
  ACPI / NUMA: Move acpi_numa_memory_affinity_init() to drivers/acpi/numa.c
  ACPI / NUMA: remove unneeded acpi_numa=1
  ACPI / NUMA: move bad_srat() and srat_disabled() to drivers/acpi/numa.c
  x86 / ACPI / NUMA: cleanup acpi_numa_processor_affinity_init()
  arm64, NUMA: Cleanup NUMA disabled messages
  arm64, NUMA: rework numa_add_memblk()
  ACPI / NUMA: move acpi_numa_slit_init() to drivers/acpi/numa.c
  ACPI / NUMA: Move acpi_numa_arch_fixup() to ia64 only
  ACPI / NUMA: remove duplicate NULL check
  ACPI / NUMA: Replace ACPI_DEBUG_PRINT() with pr_debug()
  ACPI / NUMA: Use pr_fmt() instead of printk
2016-07-25 13:40:39 +02:00
Dan Williams
0606263f24 Merge branch 'for-4.8/libnvdimm' into libnvdimm-for-next 2016-07-24 08:05:44 -07:00
David S. Miller
de0ba9a0d8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Just several instances of overlapping changes.

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-07-24 00:53:32 -04:00
Andy Lutomirski
55920d31f1 x86/mm/cpa: Add missing comment in populate_pdg()
In commit:

  21cbc2822aa1 ("x86/mm/cpa: Unbreak populate_pgd(): stop trying to deallocate failed PUDs")

I intended to add this comment, but I failed at using git.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/242baf8612394f4e31216f96d13c4d2e9b90d1b7.1469293159.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-23 21:17:10 +02:00
Andy Lutomirski
530dd8d4b9 x86/mm/cpa: Fix populate_pgd(): Stop trying to deallocate failed PUDs
Valdis Kletnieks bisected a boot failure back to this recent commit:

  360cb4d15567 ("x86/mm/cpa: In populate_pgd(), don't set the PGD entry until it's populated")

I broke the case where a PUD table got allocated -- populate_pud()
would wander off a pgd_none entry and get lost.  I'm not sure how
this survived my testing.

Fix the original issue in a much simpler way.  The problem
was that, if we allocated a PUD table, failed to populate it, and
freed it, another CPU could potentially keep using the PGD entry we
installed (either by copying it via vmalloc_fault or by speculatively
caching it).  There's a straightforward fix: simply leave the
top-level entry in place if this happens.  This can't waste any
significant amount of memory -- there are at most 256 entries like
this systemwide and, as a practical matter, if we hit this failure
path repeatedly, we're likely to reuse the same page anyway.

For context, this is a reversion with this hunk added in:

	if (ret < 0) {
+		/*
+		 * Leave the PUD page in place in case some other CPU or thread
+		 * already found it, but remove any useless entries we just
+		 * added to it.
+		 */
-		unmap_pgd_range(cpa->pgd, addr,
+		unmap_pud_range(pgd_entry, addr,
			        addr + (cpa->numpages << PAGE_SHIFT));
		return ret;
	}

This effectively open-codes what the now-deleted unmap_pgd_range()
function used to do except that unmap_pgd_range() used to try to
free the page as well.

Reported-by: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Mike Krinkin <krinkin.m.u@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Link: http://lkml.kernel.org/r/21cbc2822aa18aa812c0215f4231dbf5f65afa7f.1469249789.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-23 21:13:25 +02:00
Dan Williams
fd1d961dd6 x86/insn: remove pcommit
The pcommit instruction is being deprecated in favor of either ADR
(asynchronous DRAM refresh: flush-on-power-fail) at the platform level, or
posted-write-queue flush addresses as defined by the ACPI 6.x NFIT (NVDIMM
Firmware Interface Table).

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Acked-by: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-07-23 11:04:23 -07:00
Dan Williams
dfa169bbee Revert "KVM: x86: add pcommit support"
This reverts commit 8b3e34e46aca9b6d349b331cd9cf71ccbdc91b2e.

Given the deprecation of the pcommit instruction, the relevant VMX
features and CPUID bits are not going to be rolled into the SDM.  Remove
their usage from KVM.

Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2016-07-23 11:04:23 -07:00
Eric Auger
d9565a7399 KVM: Move kvm_setup_default/empty_irq_routing declaration in arch specific header
kvm_setup_default_irq_routing and kvm_setup_empty_irq_routing are
not used by generic code. So let's move the declarations in x86 irq.h
header instead of kvm_host.h.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Suggested-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2016-07-22 18:52:00 +01:00
Andy Lutomirski
6a79296cb1 x86/boot: Simplify EBDA-vs-BIOS reservation logic
Both the intent and the effect of reserve_bios_regions() is simple:
reserve the range from the apparent BIOS start (suitably filtered)
through 1MB and, if the EBDA start address is sensible, extend that
reservation downward to cover the EBDA as well.

The code is overcomplicated, though, and contains head-scratchers
like:

	if (ebda_start < BIOS_START_MIN)
		ebda_start = BIOS_START_MAX;

That snipped is trying to say "if ebda_start < BIOS_START_MIN,
ignore it".

Simplify it: reorder the code so that it makes sense.  This should
have no functional effect under any circumstances.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Mario Limonciello <mario_limonciello@dell.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Link: http://lkml.kernel.org/r/ef89c0c761be20ead8bd9a3275743e6259b6092a.1469135598.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-22 11:46:01 +02:00
Andy Lutomirski
30f027398b x86/boot: Clarify what x86_legacy_features.reserve_bios_regions does
It doesn't just control probing for the EBDA -- it controls whether we
detect and reserve the <1MB BIOS regions in general.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Mario Limonciello <mario_limonciello@dell.com>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Link: http://lkml.kernel.org/r/55bd591115498440d461857a7b64f349a5d911f3.1469135598.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-22 11:46:01 +02:00
Dave Hansen
ec3ed4a210 x86/fpu: Do not BUG_ON() in early FPU code
I don't think it is really possible to have a system where CPUID
enumerates support for XSAVE but that it does not have FP/SSE
(they are "legacy" features and always present).

But, I did manage to hit this case in qemu when I enabled its
somewhat shaky XSAVE support.  The bummer is that the FPU is set
up before we parse the command-line or have *any* console support
including earlyprintk.  That turned what should have been an easy
thing to debug in to a bit more of an odyssey.

So a BUG() here is worthless.  All it does it guarantee that
if/when we hit this case we have an empty console.  So, remove
the BUG() and try to limp along by disabling XSAVE and trying to
continue.  Add a comment on why we are doing this, and also add
a common "out_disable" path for leaving fpu__init_system_xstate().

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave@sr71.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20160720194551.63BB2B58@viggo.jf.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-21 18:18:45 +02:00
Adrian Hunter
25af37f4e1 x86/insn: Add AVX-512 support to the instruction decoder
Add support for Intel's AVX-512 instructions to the instruction decoder.

AVX-512 instructions are documented in Intel Architecture Instruction
Set Extensions Programming Reference (February 2016).

AVX-512 instructions are identified by a EVEX prefix which, for the
purpose of instruction decoding, can be treated as though it were a
4-byte VEX prefix.

Existing instructions which can now accept an EVEX prefix need not be
further annotated in the op code map (x86-opcode-map.txt). In the case
of new instructions, the op code map is updated accordingly.

Also add associated Mask Instructions that are used to manipulate mask
registers used in AVX-512 instructions.

The 'perf tools' instruction decoder is updated in a subsequent patch.
And a representative set of instructions is added to the perf tools new
instructions test in a subsequent patch.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: X86 ML <x86@kernel.org>
Link: http://lkml.kernel.org/r/1469003437-32706-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-07-21 09:37:11 -03:00
Ingo Molnar
edce21216a x86/boot: Reorganize and clean up the BIOS area reservation code
So the reserve_ebda_region() code has accumulated a number of
problems over the years that make it really difficult to read
and understand:

- The calculation of 'lowmem' and 'ebda_addr' is an unnecessarily
  interleaved mess of first lowmem, then ebda_addr, then lowmem tweaks...

- 'lowmem' here means 'super low mem' - i.e. 16-bit addressable memory. In other
  parts of the x86 code 'lowmem' means 32-bit addressable memory... This makes it
  super confusing to read.

- It does not help at all that we have various memory range markers, half of which
  are 'start of range', half of which are 'end of range' - but this crucial
  property is not obvious in the naming at all ... gave me a headache trying to
  understand all this.

- Also, the 'ebda_addr' name sucks: it highlights that it's an address (which is
  obvious, all values here are addresses!), while it does not highlight that it's
  the _start_ of the EBDA region ...

- 'BIOS_LOWMEM_KILOBYTES' says a lot of things, except that this is the only value
  that is a pointer to a value, not a memory range address!

- The function name itself is a misnomer: it says 'reserve_ebda_region()' while
  its main purpose is to reserve all the firmware ROM typically between 640K and
  1MB, while the 'EBDA' part is only a small part of that ...

- Likewise, the paravirt quirk flag name 'ebda_search' is misleading as well: this
  too should be about whether to reserve firmware areas in the paravirt case.

- In fact thinking about this as 'end of RAM' is confusing: what this function
  *really* wants to reserve is firmware data and code areas! Once the thinking is
  inverted from a mixed 'ram' and 'reserved firmware area' notion to a pure
  'reserved area' notion everything becomes a lot clearer.

To improve all this rewrite the whole code (without changing the logic):

- Firstly invert the naming from 'lowmem end' to 'BIOS reserved area start'
  and propagate this concept through all the variable names and constants.

	BIOS_RAM_SIZE_KB_PTR		// was: BIOS_LOWMEM_KILOBYTES

	BIOS_START_MIN			// was: INSANE_CUTOFF

	ebda_start			// was: ebda_addr
	bios_start			// was: lowmem

	BIOS_START_MAX			// was: LOWMEM_CAP

- Then clean up the name of the function itself by renaming it
  to reserve_bios_regions() and renaming the ::ebda_search paravirt
  flag to ::reserve_bios_regions.

- Fix up all the comments (fix typos), harmonize and simplify their
  formulation and remove comments that become unnecessary due to
  the much better naming all around.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-07-21 10:11:57 +02:00