110210 Commits

Author SHA1 Message Date
Linus Torvalds
35ffccdb7e Merge branch 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pul x86 microcode updates from Ingo Molnar:
 "x86 microcode loader updates from Borislav Petkov:

   - early parsing of the built-in microcode

   - cleanups

   - misc smaller fixes"

* 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/microcode: Correct CPU family related variable types
  x86/microcode: Disable builtin microcode loading on 32-bit for now
  x86/microcode/intel: Rename get_matching_sig()
  x86/microcode/intel: Simplify get_matching_sig()
  x86/microcode/intel: Simplify update_match_cpu()
  x86/microcode/intel: Rename get_matching_microcode
  x86/cpu/microcode: Zap changelog
  x86/microcode: Parse built-in microcode early
  x86/microcode/intel: Remove unused @rev arg of get_matching_sig()
  x86/microcode/intel: Get rid of revision_is_newer()
2015-06-22 17:46:14 -07:00
Linus Torvalds
e2172d8fd5 Merge branch 'x86-kdump-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 kdump updates from Ingo Molnar:
 "Three kdump robustness related improvements (Joerg Roedel)"

* 'x86-kdump-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/crash: Allocate enough low memory when crashkernel=high
  x86/swiotlb: Try coherent allocations with __GFP_NOWARN
  swiotlb: Warn on allocation failure in swiotlb_alloc_coherent()
2015-06-22 17:40:55 -07:00
Linus Torvalds
e75c73ad64 Merge branch 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 FPU updates from Ingo Molnar:
 "This tree contains two main changes:

   - The big FPU code rewrite: wide reaching cleanups and reorganization
     that pulls all the FPU code together into a clean base in
     arch/x86/fpu/.

     The resulting code is leaner and faster, and much easier to
     understand.  This enables future work to further simplify the FPU
     code (such as removing lazy FPU restores).

     By its nature these changes have a substantial regression risk: FPU
     code related bugs are long lived, because races are often subtle
     and bugs mask as user-space failures that are difficult to track
     back to kernel side backs.  I'm aware of no unfixed (or even
     suspected) FPU related regression so far.

   - MPX support rework/fixes.  As this is still not a released CPU
     feature, there were some buglets in the code - should be much more
     robust now (Dave Hansen)"

* 'x86-fpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (250 commits)
  x86/fpu: Fix double-increment in setup_xstate_features()
  x86/mpx: Allow 32-bit binaries on 64-bit kernels again
  x86/mpx: Do not count MPX VMAs as neighbors when unmapping
  x86/mpx: Rewrite the unmap code
  x86/mpx: Support 32-bit binaries on 64-bit kernels
  x86/mpx: Use 32-bit-only cmpxchg() for 32-bit apps
  x86/mpx: Introduce new 'directory entry' to 'addr' helper function
  x86/mpx: Add temporary variable to reduce masking
  x86: Make is_64bit_mm() widely available
  x86/mpx: Trace allocation of new bounds tables
  x86/mpx: Trace the attempts to find bounds tables
  x86/mpx: Trace entry to bounds exception paths
  x86/mpx: Trace #BR exceptions
  x86/mpx: Introduce a boot-time disable flag
  x86/mpx: Restrict the mmap() size check to bounds tables
  x86/mpx: Remove redundant MPX_BNDCFG_ADDR_MASK
  x86/mpx: Clean up the code by not passing a task pointer around when unnecessary
  x86/mpx: Use the new get_xsave_field_ptr()API
  x86/fpu/xstate: Wrap get_xsave_addr() to make it safer
  x86/fpu/xstate: Fix up bad get_xsave_addr() assumptions
  ...
2015-06-22 17:16:11 -07:00
Linus Torvalds
cfe3eceb7a Merge branch 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 EFI updates from Ingo Molnar:
 "EFI changes:

   - Use idiomatic negative error values in efivar_create_sysfs_entry()
     instead of returning '1' to indicate error (Dan Carpenter)

   - Implement new support to expose the EFI System Resource Tables in
     sysfs, which provides information for performing firmware updates
     (Peter Jones)

   - Documentation cleanup in the EFI handover protocol section which
     falsely claimed that 'cmdline_size' needed to be filled out by the
     boot loader (Alex Smith)

   - Align the order of SMBIOS tables in /sys/firmware/efi/systab to
     match the way that we do things for ACPI and add documentation to
     Documentation/ABI (Jean Delvare)"

* 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi: Work around ia64 build problem with ESRT driver
  efi: Add 'systab' information to Documentation/ABI
  efi: dmi: List SMBIOS3 table before SMBIOS table
  efi/esrt: Fix some compiler warnings
  x86, doc: Remove cmdline_size from list of fields to be filled in for EFI handover
  efi: Add esrt support
  efi: efivar_create_sysfs_entry() should return negative error codes
2015-06-22 17:10:44 -07:00
Linus Torvalds
b3ba283d83 Merge branch 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 CPU features from Ingo Molnar:
 "Various CPU feature support related changes: in particular the
  /proc/cpuinfo model name sanitization change should be monitored, it
  has a chance to break stuff.  (but really shouldn't and there are no
  regression reports)"

* 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/cpu/amd: Give access to the number of nodes in a physical package
  x86/cpu: Trim model ID whitespace
  x86/cpu: Strip any /proc/cpuinfo model name field whitespace
  x86/cpu/amd: Set X86_FEATURE_EXTD_APICID for future processors
  x86/gart: Check for GART support before accessing GART registers
2015-06-22 16:43:01 -07:00
Linus Torvalds
d43e4f44ba Merge branch 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Ingo Molnar:
 "Misc cleanups"

* 'x86-cleanups-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm: Clean up types in xlate_dev_mem_ptr() some more
  x86: Deinline dma_free_attrs()
  x86: Deinline dma_alloc_attrs()
  x86: Remove unused TI_cpu
  x86: Merge common 32-bit values in asm-offsets.c
2015-06-22 16:23:00 -07:00
Linus Torvalds
23b7776290 Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler updates from Ingo Molnar:
 "The main changes are:

   - lockless wakeup support for futexes and IPC message queues
     (Davidlohr Bueso, Peter Zijlstra)

   - Replace spinlocks with atomics in thread_group_cputimer(), to
     improve scalability (Jason Low)

   - NUMA balancing improvements (Rik van Riel)

   - SCHED_DEADLINE improvements (Wanpeng Li)

   - clean up and reorganize preemption helpers (Frederic Weisbecker)

   - decouple page fault disabling machinery from the preemption
     counter, to improve debuggability and robustness (David
     Hildenbrand)

   - SCHED_DEADLINE documentation updates (Luca Abeni)

   - topology CPU masks cleanups (Bartosz Golaszewski)

   - /proc/sched_debug improvements (Srikar Dronamraju)"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (79 commits)
  sched/deadline: Remove needless parameter in dl_runtime_exceeded()
  sched: Remove superfluous resetting of the p->dl_throttled flag
  sched/deadline: Drop duplicate init_sched_dl_class() declaration
  sched/deadline: Reduce rq lock contention by eliminating locking of non-feasible target
  sched/deadline: Make init_sched_dl_class() __init
  sched/deadline: Optimize pull_dl_task()
  sched/preempt: Add static_key() to preempt_notifiers
  sched/preempt: Fix preempt notifiers documentation about hlist_del() within unsafe iteration
  sched/stop_machine: Fix deadlock between multiple stop_two_cpus()
  sched/debug: Add sum_sleep_runtime to /proc/<pid>/sched
  sched/debug: Replace vruntime with wait_sum in /proc/sched_debug
  sched/debug: Properly format runnable tasks in /proc/sched_debug
  sched/numa: Only consider less busy nodes as numa balancing destinations
  Revert 095bebf61a46 ("sched/numa: Do not move past the balance point if unbalanced")
  sched/fair: Prevent throttling in early pick_next_task_fair()
  preempt: Reorganize the notrace definitions a bit
  preempt: Use preempt_schedule_context() as the official tracing preemption point
  sched: Make preempt_schedule_context() function-tracing safe
  x86: Remove cpu_sibling_mask() and cpu_core_mask()
  x86: Replace cpu_**_mask() with topology_**_cpumask()
  ...
2015-06-22 15:52:04 -07:00
Linus Torvalds
6bc4c3ad36 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "These are the left over fixes from the v4.1 cycle"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf tools: Fix build breakage if prefix= is specified
  perf/x86: Honor the architectural performance monitoring version
  perf/x86/intel: Fix PMI handling for Intel PT
  perf/x86/intel/bts: Fix DS area sharing with x86_pmu events
  perf/x86: Add more Broadwell model numbers
  perf: Fix ring_buffer_attach() RCU sync, again
2015-06-22 15:45:41 -07:00
Linus Torvalds
c58267e9fa Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
 "Kernel side changes mostly consist of work on x86 PMU drivers:

   - x86 Intel PT (hardware CPU tracer) improvements (Alexander
     Shishkin)

   - x86 Intel CQM (cache quality monitoring) improvements (Thomas
     Gleixner)

   - x86 Intel PEBSv3 support (Peter Zijlstra)

   - x86 Intel PEBS interrupt batching support for lower overhead
     sampling (Zheng Yan, Kan Liang)

   - x86 PMU scheduler fixes and improvements (Peter Zijlstra)

  There's too many tooling improvements to list them all - here are a
  few select highlights:

  'perf bench':

      - Introduce new 'perf bench futex' benchmark: 'wake-parallel', to
        measure parallel waker threads generating contention for kernel
        locks (hb->lock). (Davidlohr Bueso)

  'perf top', 'perf report':

      - Allow disabling/enabling events dynamicaly in 'perf top':
        a 'perf top' session can instantly become a 'perf report'
        one, i.e. going from dynamic analysis to a static one,
        returning to a dynamic one is possible, to toogle the
        modes, just press 'f' to 'freeze/unfreeze' the sampling. (Arnaldo Carvalho de Melo)

      - Make Ctrl-C stop processing on TUI, allowing interrupting the load of big
        perf.data files (Namhyung Kim)

  'perf probe': (Masami Hiramatsu)

      - Support glob wildcards for function name
      - Support $params special probe argument: Collect all function arguments
      - Make --line checks validate C-style function name.
      - Add --no-inlines option to avoid searching inline functions
      - Greatly speed up 'perf probe --list' by caching debuginfo.
      - Improve --filter support for 'perf probe', allowing using its arguments
        on other commands, as --add, --del, etc.

  'perf sched':

      - Add option in 'perf sched' to merge like comms to lat output (Josef Bacik)

  Plus tons of infrastructure work - in particular preparation for
  upcoming threaded perf report support, but also lots of other work -
  and fixes and other improvements.  See (much) more details in the
  shortlog and in the git log"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (305 commits)
  perf tools: Configurable per thread proc map processing time out
  perf tools: Add time out to force stop proc map processing
  perf report: Fix sort__sym_cmp to also compare end of symbol
  perf hists browser: React to unassigned hotkey pressing
  perf top: Tell the user how to unfreeze events after pressing 'f'
  perf hists browser: Honour the help line provided by builtin-{top,report}.c
  perf hists browser: Do not exit when 'f' is pressed in 'report' mode
  perf top: Replace CTRL+z with 'f' as hotkey for enable/disable events
  perf annotate: Rename source_line_percent to source_line_samples
  perf annotate: Display total number of samples with --show-total-period
  perf tools: Ensure thread-stack is flushed
  perf top: Allow disabling/enabling events dynamicly
  perf evlist: Add toggle_enable() method
  perf trace: Fix race condition at the end of started workloads
  perf probe: Speed up perf probe --list by caching debuginfo
  perf probe: Show usage even if the last event is skipped
  perf tools: Move libtraceevent dynamic list to separated LDFLAGS variable
  perf tools: Fix a problem when opening old perf.data with different byte order
  perf tools: Ignore .config-detected in .gitignore
  perf probe: Fix to return error if no probe is added
  ...
2015-06-22 15:19:21 -07:00
Linus Torvalds
1bf7067c6e Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking updates from Ingo Molnar:
 "The main changes are:

   - 'qspinlock' support, enabled on x86: queued spinlocks - these are
     now the spinlock variant used by x86 as they outperform ticket
     spinlocks in every category.  (Waiman Long)

   - 'pvqspinlock' support on x86: paravirtualized variant of queued
     spinlocks.  (Waiman Long, Peter Zijlstra)

   - 'qrwlock' support, enabled on x86: queued rwlocks.  Similar to
     queued spinlocks, they are now the variant used by x86:

       CONFIG_ARCH_USE_QUEUED_SPINLOCKS=y
       CONFIG_QUEUED_SPINLOCKS=y
       CONFIG_ARCH_USE_QUEUED_RWLOCKS=y
       CONFIG_QUEUED_RWLOCKS=y

   - various lockdep fixlets

   - various locking primitives cleanups, further WRITE_ONCE()
     propagation"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
  locking/lockdep: Remove hard coded array size dependency
  locking/qrwlock: Don't contend with readers when setting _QW_WAITING
  lockdep: Do not break user-visible string
  locking/arch: Rename set_mb() to smp_store_mb()
  locking/arch: Add WRITE_ONCE() to set_mb()
  rtmutex: Warn if trylock is called from hard/softirq context
  arch: Remove __ARCH_HAVE_CMPXCHG
  locking/rtmutex: Drop usage of __HAVE_ARCH_CMPXCHG
  locking/qrwlock: Rename QUEUE_RWLOCK to QUEUED_RWLOCKS
  locking/pvqspinlock: Rename QUEUED_SPINLOCK to QUEUED_SPINLOCKS
  locking/pvqspinlock: Replace xchg() by the more descriptive set_mb()
  locking/pvqspinlock, x86: Enable PV qspinlock for Xen
  locking/pvqspinlock, x86: Enable PV qspinlock for KVM
  locking/pvqspinlock, x86: Implement the paravirt qspinlock call patching
  locking/pvqspinlock: Implement simple paravirt support for the qspinlock
  locking/qspinlock: Revert to test-and-set on hypervisors
  locking/qspinlock: Use a simple write to grab the lock
  locking/qspinlock: Optimize for smaller NR_CPUS
  locking/qspinlock: Extract out code snippets for the next patch
  locking/qspinlock: Add pending bit
  ...
2015-06-22 14:54:22 -07:00
Linus Torvalds
fc934d4017 Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RCU updates from Ingo Molnar:

 - Continued initialization/Kconfig updates: hide most Kconfig options
   from unsuspecting users.

   There's now a single high level configuration option:

        *
        * RCU Subsystem
        *
        Make expert-level adjustments to RCU configuration (RCU_EXPERT) [N/y/?] (NEW)

   Which if answered in the negative, leaves us with a single
   interactive configuration option:

        Offload RCU callback processing from boot-selected CPUs (RCU_NOCB_CPU) [N/y/?] (NEW)

   All the rest of the RCU options are configured automatically.  Later
   on we'll remove this single leftover configuration option as well.

 - Remove all uses of RCU-protected array indexes: replace the
   rcu_[access|dereference]_index_check() APIs with READ_ONCE() and
   rcu_lockdep_assert()

 - RCU CPU-hotplug cleanups

 - Updates to Tiny RCU: a race fix and further code shrinkage.

 - RCU torture-testing updates: fixes, speedups, cleanups and
   documentation updates.

 - Miscellaneous fixes

 - Documentation updates

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
  rcutorture: Allow repetition factors in Kconfig-fragment lists
  rcutorture: Display "make oldconfig" errors
  rcutorture: Update TREE_RCU-kconfig.txt
  rcutorture: Make rcutorture scripts force RCU_EXPERT
  rcutorture: Update configuration fragments for rcutree.rcu_fanout_exact
  rcutorture: TASKS_RCU set directly, so don't explicitly set it
  rcutorture: Test SRCU cleanup code path
  rcutorture: Replace barriers with smp_store_release() and smp_load_acquire()
  locktorture: Change longdelay_us to longdelay_ms
  rcutorture: Allow negative values of nreaders to oversubscribe
  rcutorture: Exchange TREE03 and TREE08 NR_CPUS, speed up CPU hotplug
  rcutorture: Exchange TREE03 and TREE04 geometries
  locktorture: fix deadlock in 'rw_lock_irq' type
  rcu: Correctly handle non-empty Tiny RCU callback list with none ready
  rcutorture: Test both RCU-sched and RCU-bh for Tiny RCU
  rcu: Further shrink Tiny RCU by making empty functions static inlines
  rcu: Conditionally compile RCU's eqs warnings
  rcu: Remove prompt for RCU implementation
  rcu: Make RCU able to tolerate undefined CONFIG_RCU_KTHREAD_PRIO
  rcu: Make RCU able to tolerate undefined CONFIG_RCU_FANOUT_LEAF
  ...
2015-06-22 14:01:01 -07:00
Rafael J. Wysocki
3bcda76d9d Merge branch 'pm-sleep'
* pm-sleep:
  x86: Load __USER_DS into DS/ES after resume
2015-06-22 14:40:28 +02:00
Ingo Molnar
ffa64eff95 x86: Load __USER_DS into DS/ES after resume
Srinivas Pandruvada reported a problem with system resume from
suspend-to-RAM on 32-bit x86 systems where the DS register of
the CPU is set to __KERNEL_DS instead of __USER_DS on return
to user space which cases a General Protection Fault to occur.

The issue is that DS is set to __KERNEL_DS by the ACPI resume code
path while the SYSEXIT path never reloads DS/ES.  It assumes they
are still __USER_DS set at the SYSENTER time (Brian Gerst), so if
the return to user space happens to be through SYSEXIT, it will lead
to the reported GPF.

Fix the problem by setting the DS and ES registers to __USER_DS
as expected by the SYSEXIT path.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=61781
Link: http://marc.info/?l=linux-pm&m=143406648920385&w=2
Acked-by: Pavel Machek <pavel@ucw.cz>
Tested-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Ingo Molnar <mingo@kernel.org>

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-06-22 14:40:03 +02:00
Ingo Molnar
7ef3d7d58d Merge branches 'x86/apic', 'x86/asm', 'x86/mm' and 'x86/platform' into x86/core, to merge last updates
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-22 09:15:03 +02:00
Thomas Gleixner
cb17b2a674 x86/hpet: Use proper hpet device number for MSI allocation
hpet_assign_irq() is called with hpet_device->num as "hardware
interrupt number", but hpet_device->num is initialized after the
interrupt has been assigned, so it's always 0. As a consequence only
the first MSI allocation succeeds, the following ones fail because the
"hardware interrupt number" already exists.

Move the initialization of dev->num and other fields before the call
to hpet_assign_irq(), which is the ordering before the offending
commit which introduced that regression.

Fixes: "3cb96f0c9733 x86/hpet: Enhance HPET IRQ to support hierarchical irqdomains"
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1506211635010.4107@nanos
Cc: Jiang Liu <jiang.liu@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
2015-06-21 16:38:40 +02:00
Jiang Liu
bafac298fb x86/hpet: Check for irq==0 when allocating hpet MSI interrupts
irq == 0 is not a valid irq for a irqdomain MSI allocation, but hpet
code checks only for negative return values.

Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/r/558447AF.30703@linux.intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2015-06-20 12:00:58 +02:00
Vladimir Murzin
86dca36e6b arm64: use private ratelimit state along with show_unhandled_signals
printk_ratelimit() shares the ratelimiting state with other callers what
may lead to scenarios where at the time we want to print out debug
information we already limited, so nothing appears in the dmesg - this
makes exception-trace quite poor helper in debugging.

Additionally, we have imbalance with some messages limited with global
ratelimit state and other messages limited with their private state
defined via pr_*_ratelimited().

To address this inconsistency show_unhandled_signals_ratelimited()
macro is introduced and caller sites are converted to use it.

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-06-19 16:26:15 +01:00
Vladimir Murzin
9e793ab84e arm64: show unhandled SP/PC alignment faults
Report unhandled SP/PC alignment faults if the show_unhandled_signals
variable is set (via /proc/sys/debug/exception-trace).

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-06-19 16:20:10 +01:00
Wei Huang
e5af058aac KVM: x86/vPMU: reorder PMU functions
Keep called functions closer to their callers, and init/destroy
functions next to each other.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 17:16:30 +02:00
Wei Huang
e84cfe4ce0 KVM: x86/vPMU: whitespace and stylistic adjustments in PMU code
Signed-off-by: Wei Huang <wei@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 17:16:30 +02:00
Wei Huang
212dba1267 KVM: x86/vPMU: use the new macros to go between PMC, PMU and VCPU
Signed-off-by: Wei Huang <wei@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 17:16:30 +02:00
Wei Huang
474a5bb944 KVM: x86/vPMU: introduce pmu.h header
This will be used for private function used by AMD- and Intel-specific
PMU implementations.

Signed-off-by: Wei Huang <wei@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 17:16:29 +02:00
Wei Huang
c6702c9dcf KVM: x86/vPMU: rename a few PMU functions
Before introducing a pmu.h header for them, make the naming more
consistent.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 17:16:29 +02:00
Xiao Guangrong
6a39bbc5da KVM: MTRR: do not map huge page for non-consistent range
Based on Intel's SDM, mapping huge page which do not have consistent
memory cache for each 4k page will cause undefined behavior

In order to avoiding this kind of undefined behavior, we force to use
4k pages under this case

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 17:16:29 +02:00
Xiao Guangrong
fa61213746 KVM: MTRR: simplify kvm_mtrr_get_guest_memory_type
mtrr_for_each_mem_type() is ready now, use it to simplify
kvm_mtrr_get_guest_memory_type()

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 17:16:29 +02:00
Xiao Guangrong
f571c0973e KVM: MTRR: introduce mtrr_for_each_mem_type
It walks all MTRRs and gets all the memory cache type setting for the
specified range also it checks if the range is fully covered by MTRRs

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
[Adjust for range_size->range_shift change. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 17:16:29 +02:00
Xiao Guangrong
f7bfb57b3e KVM: MTRR: introduce fixed_mtrr_addr_* functions
Two functions are introduced:
- fixed_mtrr_addr_to_seg() translates the address to the fixed
  MTRR segment

- fixed_mtrr_addr_seg_to_range_index() translates the address to
  the index of kvm_mtrr.fixed_ranges[]

They will be used in the later patch

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
[Adjust for range_size->range_shift change. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 17:16:28 +02:00
Xiao Guangrong
19efffa244 KVM: MTRR: sort variable MTRRs
Sort all valid variable MTRRs based on its base address, it will help us to
check a range to see if it's fully contained in variable MTRRs

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
[Fix list insertion sort, simplify var_mtrr_range_is_valid to just
 test the V bit. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 17:16:28 +02:00
Xiao Guangrong
a13842dc66 KVM: MTRR: introduce var_mtrr_range
It gets the range for the specified variable MTRR

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
[Simplify boolean operations. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 17:16:28 +02:00
Xiao Guangrong
de9aef5e1a KVM: MTRR: introduce fixed_mtrr_segment table
This table summarizes the information of fixed MTRRs and introduce some APIs
to abstract its operation which helps us to clean up the code and will be
used in later patches

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
[Change range_size to range_shift, in order to avoid udivdi3 errors.
 - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 17:16:28 +02:00
Xiao Guangrong
3f3f78b614 KVM: MTRR: improve kvm_mtrr_get_guest_memory_type
- kvm_mtrr_get_guest_memory_type() only checks one page in MTRRs so
   that it's unnecessary to check to see if the range is partially
   covered in MTRR

 - optimize the check of overlap memory type and add some comments
   to explain the precedence

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 17:16:28 +02:00
Xiao Guangrong
86fd52701c KVM: MTRR: do not split 64 bits MSR content
Variable MTRR MSRs are 64 bits which are directly accessed with full length,
no reason to split them to two 32 bits

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 17:16:27 +02:00
Xiao Guangrong
10fac2dc2b KVM: MTRR: clean up mtrr default type
Drop kvm_mtrr->enable, omit the decode/code workload and get rid of
all the hard code

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 17:16:27 +02:00
Xiao Guangrong
910a6aae4e KVM: MTRR: exactly define the size of variable MTRRs
Only KVM_NR_VAR_MTRR variable MTRRs are available in KVM guest

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 17:16:27 +02:00
Xiao Guangrong
70109e7d9d KVM: MTRR: remove mtrr_state.have_fixed
vMTRR does not depend on any host MTRR feature and fixed MTRRs have always
been implemented, so drop this field

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 17:16:27 +02:00
Xiao Guangrong
eb839917a7 KVM: MTRR: handle MSR_MTRRcap in kvm_mtrr_get_msr
MSR_MTRRcap is a MTRR msr so move the handler to the common place, also
add some comments to make the hard code more readable

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 17:16:27 +02:00
Xiao Guangrong
ff53604b40 KVM: x86: move MTRR related code to a separate file
MTRR code locates in x86.c and mmu.c so that move them to a separate file to
make the organization more clearer and it will be the place where we fully
implement vMTRR

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 17:16:26 +02:00
Xiao Guangrong
b18d5431ac KVM: x86: fix CR0.CD virtualization
Currently, CR0.CD is not checked when we virtualize memory cache type for
noncoherent_dma guests, this patch fixes it by :

- setting UC for all memory if CR0.CD = 1
- zapping all the last sptes in MMU if CR0.CD is changed

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 17:16:26 +02:00
Bandan Das
f104765b4f KVM: nSVM: Check for NRIPS support before updating control field
If hardware doesn't support DecodeAssist - a feature that provides
more information about the intercept in the VMCB, KVM decodes the
instruction and then updates the next_rip vmcb control field.
However, NRIP support itself depends on cpuid Fn8000_000A_EDX[NRIPS].
Since skip_emulated_instruction() doesn't verify nrip support
before accepting control.next_rip as valid, avoid writing this
field if support isn't present.

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-06-19 17:16:26 +02:00
Paolo Bonzini
05fe125fa3 KVM/ARM changes for v4.2:
- Proper guest time accounting
 - FP access fix for 32bit
 - The usual pile of GIC fixes
 - PSCI fixes
 - Random cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVg/c2AAoJECPQ0LrRPXpDjSoQALjxhdY0ROAz2yjVqcYD5Goj
 8XHLbwDxlXQIqXulEakpx/2CtQs3tYsnt7gg3HoO3aY0TqxKY+0nh9RuZ/b15KcP
 hailUK/hbdLo9sDmoPBRBBP26gqjnXVMYTNqSFk4u2KADYUYgl5auhXcTNARH6Gx
 +c3LJG9HUbbvKkTG+RB+Xgl3f9N0uaIaV15oTX+WsolNIQVmA8Sh5X7nXPnFXV4r
 klb9pcR3FPWA9m+fDW8VszoLZ39a12kfQsK0yshKy5ep/AA1liPEYDVc6Wp/cEqz
 ybDfGuO9GPoPLfzhYq1Cw/SeKaAFruu6QD1ugsSlh1DVZ2evme+OsEvzRsfCqEYc
 VkKkX2Rc8DQTjsJoaTCOmNfU+v0dWwriCXmfLPBj01+55M4oCDcntTasrEsDAc3o
 YkTk68T9HJxsNesWpkZDo0pYzGeu3GDS1Hg/ElESminCNfK/S/GBTtidLq7Rs7KU
 VJCaUeob6CoQnRdmNEm+nHInh5pSoOlKzdduFM9IcvOO5B12YlNMcVnr48Y16ea7
 ElYGkxR/n/uLt33A3Yvaj8+PeJbwxQ1RcWRDJy8Fw77AFLEaT2VTr2kZgJIqTatQ
 ENnCLMxNP2uAt1TrGhEempndYZdWK/RiuaA9MjnxSscFkrXPfn4bMJzV58ghF9L+
 62hk2S8I8q69SvqbPTgv
 =VQWP
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/ARM changes for v4.2:

- Proper guest time accounting
- FP access fix for 32bit
- The usual pile of GIC fixes
- PSCI fixes
- Random cleanups
2015-06-19 17:15:24 +02:00
Herbert Xu
c0b59fafe3 Merge branch 'mvebu/drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Merge the mvebu/drivers branch of the arm-soc tree which contains
just a single patch bfa1ce5f38938cc9e6c7f2d1011f88eba2b9e2b2 ("bus:
mvebu-mbus: add mv_mbus_dram_info_nooverlap()") that happens to be
a prerequisite of the new marvell/cesa crypto driver.
2015-06-19 22:07:07 +08:00
Borislav Petkov
04c17341b4 x86/boot: Fix overflow warning with 32-bit binutils
When building the kernel with 32-bit binutils built with support
only for the i386 target, we get the following warning:

  arch/x86/kernel/head_32.S:66: Warning: shift count out of range (32 is not between 0 and 31)

The problem is that in that case, binutils' internal type
representation is 32-bit wide and the shift range overflows.

In order to fix this, manipulate the shift expression which
creates the 4GiB constant to not overflow the shift count.

Suggested-by: Michael Matz <matz@suse.de>
Reported-and-tested-by: Enrico Mioso <mrkiko.rs@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-19 16:03:26 +02:00
Will Deacon
6f1a6ae87c arm64: vdso: work-around broken ELF toolchains in Makefile
When building the kernel with a bare-metal (ELF) toolchain, the -shared
option may not be passed down to collect2, resulting in silent corruption
of the vDSO image (in particular, the DYNAMIC section is omitted).

The effect of this corruption is that the dynamic linker fails to find
the vDSO symbols and libc is instead used for the syscalls that we
intended to optimise (e.g. gettimeofday). Functionally, there is no
issue as the sigreturn trampoline is still intact and located by the
kernel.

This patch fixes the problem by explicitly passing -shared to the linker
when building the vDSO.

Cc: <stable@vger.kernel.org>
Reported-by: Szabolcs Nagy <Szabolcs.Nagy@arm.com>
Reported-by: James Greenlaigh <james.greenhalgh@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-06-19 14:54:10 +01:00
Sudeep Holla
af391b15f7 arm64: kernel: rename __cpu_suspend to keep it aligned with arm
This patch renames __cpu_suspend to cpu_suspend so that it's aligned
with ARM32. It also removes the redundant wrapper created.

This is in preparation to implement generic PSCI system suspend using
the cpu_{suspend,resume} which now has the same interface on both ARM
and ARM64.

Cc: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Ashwin Chaugule <ashwin.chaugule@linaro.org>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2015-06-19 14:46:39 +01:00
Palik, Imre
2c33645d36 perf/x86: Honor the architectural performance monitoring version
Architectural performance monitoring, version 1, doesn't support fixed counters.

Currently, even if a hypervisor advertises support for architectural
performance monitoring version 1, perf may still try to use the fixed
counters, as the constraints are set up based on the CPU model.

This patch ensures that perf honors the architectural performance monitoring
version returned by CPUID, and it only uses the fixed counters for version 2
and above.

(Some of the ideas in this patch came from Peter Zijlstra.)

Signed-off-by: Imre Palik <imrep@amazon.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Anthony Liguori <aliguori@amazon.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1433767609-1039-1-git-send-email-imrep.amz@gmail.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-19 09:38:48 +02:00
Alexander Shishkin
1b7b938f18 perf/x86/intel: Fix PMI handling for Intel PT
Intel PT is a separate PMU and it is not using any of the x86_pmu
code paths, which means in particular that the active_events counter
remains intact when new PT events are created.

However, PT uses the generic x86_pmu PMI handler for its PMI handling needs.

The problem here is that the latter checks active_events and in case of it
being zero, exits without calling the actual x86_pmu.handle_nmi(), which
results in unknown NMI errors and massive data loss for PT.

The effect is not visible if there are other perf events in the system
at the same time that keep active_events counter non-zero, for instance
if the NMI watchdog is running, so one needs to disable it to reproduce
the problem.

At the same time, the active_events counter besides doing what the name
suggests also implicitly serves as a PMC hardware and DS area reference
counter.

This patch adds a separate reference counter for the PMC hardware, leaving
active_events for actually counting the events and makes sure it also
counts PT and BTS events.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@infradead.org
Cc: adrian.hunter@intel.com
Link: http://lkml.kernel.org/r/87k2v92t0s.fsf@ashishki-desk.ger.corp.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-19 09:38:47 +02:00
Alexander Shishkin
6b099d9b04 perf/x86/intel/bts: Fix DS area sharing with x86_pmu events
Currently, the intel_bts driver relies on the DS area allocated by the x86_pmu
code in its event_init() path, which is a bug: creating a BTS event while
no x86_pmu events are present results in a NULL pointer dereference.

The same DS area is also used by PEBS sampling, which makes it quite a bit
trickier to have a separate one for intel_bts' purposes.

This patch makes intel_bts driver use the same DS allocation and reference
counting code as x86_pmu to make sure it is always present when either
intel_bts or x86_pmu need it.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: acme@infradead.org
Cc: adrian.hunter@intel.com
Link: http://lkml.kernel.org/r/1434024837-9916-2-git-send-email-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-19 09:38:47 +02:00
Andi Kleen
4b36f1a413 perf/x86: Add more Broadwell model numbers
This patch adds additional model numbers for Broadwell to perf.
Support for Broadwell with Iris Pro (Intel Core i7-57xxC)
and support for Broadwell Server Xeon.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1434055942-28253-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-06-19 09:38:46 +02:00
Michael Ellerman
6096f88451 Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/scottwood/linux into next
Freescale updates from Scott:

"Highlights include more 8xx optimizations, an e6500 hugetlb optimization,
QMan device tree nodes, t1024/t1023 support, and various fixes and
cleanup."
2015-06-19 17:23:48 +10:00
Alexey Kardashevskiy
5c89a87d13 powerpc/powernv: Fix wrong IOMMU table in pnv_ioda_setup_bus_dma()
When pnv_pci_ioda_fixup() is called during PHB fixup time, each PE in
the sorted list of PEs (phb::pe_dma_list) is iterated to setup the PE's
DMA32 space by pnv_ioda_setup_bus_dma() if the PE's DMA32 weight is bigger
than zero. The function also assigns all the subordinate PCI devices of
the PE's primary bus with the PE's DMA32 IOMMU table. It causes the PCI
devicess in the child PEs, which don't have DMA weight, receives wrong
IOMMU table and then IOMMU group.

The patch fixes above issue by more check on the PE's coverage and don't
assign IOMMU table to those PCI devices, which belong to the child PEs.
The problem was found on Firestone platform initially.

Suggested-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2015-06-19 17:10:29 +10:00