IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Pull perf fixes from Thomas Gleixner:
"A couple of small fixes to x86 perf drivers:
- Measure L2 for HW_CACHE* events on AMD
- Fix the address filter handling in the intel/pt driver
- Handle the BTS disabling at the proper place"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/amd: Make HW_CACHE_REFERENCES and HW_CACHE_MISSES measure L2
perf/x86/intel/pt: Do validate the size of a kernel address filter
perf/x86/intel/pt: Fix kernel address filter's offset validation
perf/x86/intel/pt: Fix an off-by-one in address filter configuration
perf/x86/intel: Don't disable "intel_bts" around "intel" event batching
This commit exports the following information to
user-space via the newly created per-vcpu debugfs
directory:
- TSC offset (as a signed number)
- TSC scaling ratio
- TSC scaling ratio fractinal bits
The original intention of this commit was to
export only the TSC offset, but the TSC scaling
information is exported for completeness.
We need to retrieve the TSC offset from user-space
in order to support the merging of host and guest
traces in trace-cmd. Today, we use the kvm_write_tsc_offset
tracepoint, but it has a number of problems (mainly,
it requires a running VM to be rebooted, ftrace setup,
and also tracepoints are not supposed to be ABIs).
The merging of host and guest traces is explained
in more detail in this thread:
[Qemu-devel] [RFC] host and guest kernel trace merging
https://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg00887.html
This commit creates the following files in debugfs:
/sys/kernel/debug/kvm/66828-10/vcpu0/tsc-offset
/sys/kernel/debug/kvm/66828-10/vcpu0/tsc-scaling-ratio
/sys/kernel/debug/kvm/66828-10/vcpu0/tsc-scaling-ratio-frac-bits
The last two are only created if TSC scaling is supported.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Two stubs are added:
o kvm_arch_has_vcpu_debugfs(): must return true if the arch
supports creating debugfs entries in the vcpu debugfs dir
(which will be implemented by the next commit)
o kvm_arch_create_vcpu_debugfs(): code that creates debugfs
entries in the vcpu debugfs dir
For x86, this commit introduces a new file to avoid growing
arch/x86/kvm/x86.c even more.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The TSC offset can now be read directly from struct kvm_arch_vcpu.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A future commit will want to easily read a vCPU's TSC offset,
so we store it in struct kvm_arch_vcpu_arch for easy access.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
show_stack_log_lvl() and friends allow a NULL pointer for the
task_struct to indicate the current task. This creates confusion and
can cause sneaky bugs.
Instead require the caller to pass 'current' directly.
This only changes the internal workings of the dumpstack code. The
dump_trace() and show_stack() interfaces still allow a NULL task
pointer. Those interfaces should also probably be fixed as well.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
While the Intel PMU monitors the LLC when perf enables the
HW_CACHE_REFERENCES and HW_CACHE_MISSES events, these events monitor
L1 instruction cache fetches (0x0080) and instruction cache misses
(0x0081) on the AMD PMU.
This is extremely confusing when monitoring the same workload across
Intel and AMD machines, since parameters like,
$ perf stat -e cache-references,cache-misses
measure completely different things.
Instead, make the AMD PMU measure instruction/data cache and TLB fill
requests to the L2 and instruction/data cache and TLB misses in the L2
when HW_CACHE_REFERENCES and HW_CACHE_MISSES are enabled,
respectively. That way the events measure unified caches on both
platforms.
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1472044328-21302-1-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Right now, the kernel address filters in PT are prone to integer overflow
that may happen in adding filter's size to its offset to obtain the end
of the range. Such an overflow would also throw a #GP in the PT event
configuration path.
Fix this by explicitly validating the result of this calculation.
Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: stable@vger.kernel.org # v4.7
Cc: stable@vger.kernel.org#v4.7
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160915151352.21306-4-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The kernel_ip() filter is used mostly by the DS/LBR code to look at the
branch addresses, but Intel PT also uses it to validate the address
filter offsets for kernel addresses, for which it is not sufficient:
supplying something in bits 64:48 that's not a sign extension of the lower
address bits (like 0xf00d000000000000) throws a #GP.
This patch adds address validation for the user supplied kernel filters.
Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: stable@vger.kernel.org # v4.7
Cc: stable@vger.kernel.org#v4.7
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160915151352.21306-3-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
PT address filter configuration requires that a range is specified by
its first and last address, but at the moment we're obtaining the end
of the range by adding user specified size to its start, which is off
by one from what it actually needs to be.
Fix this and make sure that zero-sized filters don't pass the filter
validation.
Reported-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: stable@vger.kernel.org # v4.7
Cc: stable@vger.kernel.org#v4.7
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160915151352.21306-2-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This will prevent a crash if get_wchan() runs after the task stack
is freed.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/337aeca8614024aa4d8d9c81053bbf8fcffbe4ad.1474003868.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Specifically, pin the stack in save_stack_trace_tsk() and
show_trace_log_lvl().
This will prevent a crash if the target task dies before or while
dumping its stack once we start freeing task stacks early.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/cf0082cde65d1941a996d026f2b2cdbfaca17bfa.1474003868.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When I rebased my thread_info changes onto Brian's switch_to()
changes, I carefully checked that I fixed up all the code correctly,
but I missed a comment :(
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: 15f4eae70d36 ("x86: Move thread_info into task_struct")
Link: http://lkml.kernel.org/r/089fe1e1cbe8b258b064fccbb1a5a5fd23861031.1474003868.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Windows because it uses RTC periodic interrupts.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJX2xpHAAoJEL/70l94x66D9NsIAIw+9oRA86qjehVnguV3fRKA
ITZ4OGFDiXPWuxqDaw8mHHXr0RYx8KcMTzfFNbV+YL5U0cq9xYzdaNhchKPpyF+3
7H5wL8Ku9wkYZ930kdCf5Q+LNCfg8d/wKlibPEbX0MDx4jL99kkcxLzEkmIRqFlq
bpXaQe/KR1xCWR6gI/a6aRJWLfGuFMV82YSnk/dCSjwotbAwjJUSt+IPhLwhx28o
7ddcxW3CxQqelJorcu2lvRiGnCvEzDhIdOvHJqibCjo3uzqbcI4PA2gs3rozbs9s
VMEzqZpNgK0XsyKyccSw75npViIHYPkjMzxoyHMDhgiP3eIwp/tJquxAfLjK4WE=
=h4P4
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fix from Paolo Bonzini:
"One fix for an x86 regression in VM migration, mostly visible with
Windows because it uses RTC periodic interrupts"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: x86: correctly reset dest_map->vector when restoring LAPIC state
get_user_ex(x, ptr) should zero x on failure. It's not a lot of a leak
(at most we are leaking uninitialized 64bit value off the kernel stack,
and in a fairly constrained situation, at that), but the fix is trivial,
so...
Cc: stable@vger.kernel.org
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
[ This sat in different branch from the uaccess fixes since mid-August ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When userspace sends KVM_SET_LAPIC, KVM schedules a check between
the vCPU's IRR and ISR and the IOAPIC redirection table, in order
to re-establish the IOAPIC's dest_map (the list of CPUs servicing
the real-time clock interrupt with the corresponding vectors).
However, __rtc_irq_eoi_tracking_restore_one was forgetting to
set dest_map->vectors. Because of this, the IOAPIC did not process
the real-time clock interrupt EOI, ioapic->rtc_status.pending_eoi
got stuck at a non-zero value, and further RTC interrupts were
reported to userspace as coalesced.
Fixes: 9e4aabe2bb3454c83dac8139cf9974503ee044db
Fixes: 4d99ba898dd0c521ca6cdfdde55c9b58aea3cb3d
Cc: stable@vger.kernel.org
Cc: Joerg Roedel <jroedel@suse.de>
Cc: David Gilbert <dgilbert@redhat.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
At the moment, intel_bts events get disabled from intel PMU's disable
callback, which includes event scheduling transactions of said PMU,
which have nothing to do with intel_bts events.
We do want to keep intel_bts events off inside the PMI handler to
avoid filling up their buffer too soon.
This patch moves intel_bts enabling/disabling directly to the PMI
handler.
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: vince@deater.net
Link: http://lkml.kernel.org/r/20160915082233.11065-1-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
... otherwise the compiler complains:
arch/x86/entry/vdso/vma.c:252:12: warning: ‘map_vdso_randomized’ defined but not used [-Wunused-function]
But the #ifdeffery here is getting pretty ugly, so move around
vdso_addr() as well to cluster the dependencies a bit more.
It's still not particulary pretty though ...
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: gorcunov@openvz.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: oleg@redhat.com
Cc: xemul@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Now that most of the thread_info users have been cleaned up,
this is straightforward.
Most of this code was written by Linus.
Originally-from: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a50eab40abeaec9cb9a9e3cbdeafd32190206654.1473801993.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
thread_info may move in the future, so use the accessors.
[ Andy Lutomirski wrote this changelog message and changed
"task_thread_info(child)->cpu" to "task_cpu(child)". ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/3439705d9838940cc82733a7335fa8c654c37db8.1473801993.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
It was a nice optimization while it lasted, but thread_info is moving
and this optimization will no longer work.
Quoting Linus:
Oh Gods, Andy. That pt_regs_to_thread_info() thing made me want
to do unspeakable acts on a poor innocent wax figure that looked
_exactly_ like you.
[ Changelog written by Andy. ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/6376aa81c68798cc81631673f52bd91a3e078944.1473801993.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Because sched.h and thread_info.h are a tangled mess, I turned
in_compat_syscall() into a macro. If we had current_thread_struct()
or similar and we could use it from thread_info.h, then this would
be a bit cleaner.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jann Horn <jann@thejh.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/ccc8a1b2f41f9c264a41f771bb4a6539a642ad72.1473801993.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
in_exception_stack() has some recursion checking which makes sure the
stack trace code never traverses a given exception stack more than once.
This prevents an infinite loop if corruption somehow causes a stack's
"next stack" pointer to point to itself (directly or indirectly).
The recursion checking can be useful for other stacks in addition to the
exception stack, so extend it to work for all stacks.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/95de5db4cfe111754845a5cef04e20630d01423f.1473905218.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
When an interrupt happens in entry code while running on a software IRQ
stack, and the IRQ stack was empty, regs->sp will contain the stack end
address (e.g., irq_stack_ptr). If the regs are passed to dump_trace(),
get_stack_info() will report STACK_TYPE_UNKNOWN, causing dump_trace() to
return prematurely without trying to go to the next stack.
Update the bounds checking for software interrupt stacks so that the
ending address is now considered part of the stack.
This means that it's now possible for the 'walk_stack' callbacks --
print_context_stack() and print_context_stack_bp() -- to be called with
an empty stack. But that's fine; they're already prepared to deal with
that due to their on_stack() checks.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/5a5e5de92dcf11e8dc6b6e8e50ad7639d067830b.1473905218.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
valid_stack_ptr() is buggy: it assumes that all stacks are of size
THREAD_SIZE, which is not true for exception stacks. So the
walk_stack() callbacks will need to know the location of the beginning
of the stack as well as the end.
Another issue is that in general the various features of a stack (type,
size, next stack pointer, description string) are scattered around in
various places throughout the stack dump code.
Encapsulate all that information in a single place with a new stack_info
struct and a get_stack_info() interface.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/8164dd0db96b7e6a279fa17ae5e6dc375eecb4a9.1473905218.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
in_exception_stack() does some bad, bad things just so the unwinder can
print different values for different areas of the debug exception stack.
There's no need to clarify where exactly on the stack it is. Just print
"#DB" and be done with it.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/e91cb410169dd576678dd427c35efb716fd0cee1.1473905218.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Enumeration
Mark Haswell Power Control Unit as having non-compliant BARs (Bjorn Helgaas)
Power management
Fix bridge_d3 update on device removal (Lukas Wunner)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJX2az0AAoJEFmIoMA60/r8/EoP/28mGRiKi8mlqNAR3MYN3F0n
VSIm7WyxWNawH1gRJXKQBzNqgMJnj4qRGXSIvP3AIYyBDcJs/X7j91/eKOARNfQr
55A+gfSz4jUKlw+0WgPY8/U2/xQ4yoom1zhbsAYcIVeljZo/3JUg+wHpPjhIMkH0
2slTerHRDExrS43jxQi225toiEaO6lcY8EVmHCDo+jYlQz3sCEwIXg9hn1rwTbvG
sJI0zyUwHF+oWowgJqlwYxsbPPnelPAN5YAx7KrHuVmBdL0Bgo3oIRtbb3JZZ9Up
L9bQ6NpRjSARvijaZ2TAhueqIIDv2HGgvwNB01l4Yggw7Sm1dFCuUS6vj/e5tpZA
xntE3F6s2Z+I4I1D7pAX3jMYCdYx/QltiTCeGRp8pJv+f4ewW3jcel3FAksY3BEg
0NCjDrGFqGYai4hGRROpt/aXlW/Pn53eQLlu4Xg2qgkj0NMh0ODMrTjMnABB39ae
eGqIXab7WeVBxt10eU19J1u1RTqpUO2LJW+cMnvYdCfKAYby/gj8SD8vIsn3oDjZ
hQS/4fSHurc7LZwDmwOfaiHlGnvcQWV9EKwgScS0v8AxPQnC8pNUgYpzZcXY8Q6I
YXtyK7suFriSZPS0Qs4FrdfJrmTBaBQ55aZu9aftb5v4YqacN5qZo+HaXiONBy3v
RzsFK6xIbnIgb35g8vKy
=PQml
-----END PGP SIGNATURE-----
Merge tag 'pci-v4.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fixes from Bjorn Helgaas:
"Here are two changes for v4.8. The first fixes a "[Firmware Bug]: reg
0x10: invalid BAR (can't size)" warning on Haswell, and the second
fixes a problem in some new runtime suspend functionality we merged
for v4.8. Summary:
Enumeration:
Mark Haswell Power Control Unit as having non-compliant BARs (Bjorn Helgaas)
Power management:
Fix bridge_d3 update on device removal (Lukas Wunner)"
* tag 'pci-v4.8-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
PCI: Fix bridge_d3 update on device removal
PCI: Mark Haswell Power Control Unit as having non-compliant BARs
Introduce new flags that defines which ABI to use on creating sigframe.
Those flags kernel will set according to sigaction syscall ABI,
which set handler for the signal being delivered.
So that will drop the dependency on TIF_IA32/TIF_X32 flags on signal deliver.
Those flags will be used only under CONFIG_COMPAT.
Similar way ARM uses sa_flags to differ in which mode deliver signal
for 26-bit applications (look at SA_THIRYTWO).
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Cc: 0x7f454c46@gmail.com
Cc: oleg@redhat.com
Cc: linux-mm@kvack.org
Cc: gorcunov@openvz.org
Cc: xemul@virtuozzo.com
Link: http://lkml.kernel.org/r/20160905133308.28234-7-dsafonov@virtuozzo.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
As the task isn't executing at the moment of {GET,SET}REGS,
return regset that corresponds to code selector, rather than
value of TIF_IA32 flag.
I.e. if we ptrace i386 elf binary that has just changed it's
code selector to __USER_CS, than GET_REGS will return
full x86_64 register set.
Note, that this will work only if application has changed it's CS.
If the application does 32-bit syscall with __USER_CS, ptrace
will still return 64-bit register set. Which might be still confusing
for tools that expect TS_COMPACT to be exposed [1, 2].
So this this change should make PTRACE_GETREGSET more reliable and
this will be another step to drop TIF_{IA32,X32} flags.
[1]: https://sourceforge.net/p/strace/mailman/message/30471411/
[2]: https://lkml.org/lkml/2012/1/18/320
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: 0x7f454c46@gmail.com
Cc: oleg@redhat.com
Cc: linux-mm@kvack.org
Cc: luto@kernel.org
Cc: Pedro Alves <palves@redhat.com>
Cc: gorcunov@openvz.org
Cc: xemul@virtuozzo.com
Link: http://lkml.kernel.org/r/20160905133308.28234-6-dsafonov@virtuozzo.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
show_stack_log_lvl() and dump_trace() are already preemption safe:
- If they're running in irq or exception context, preemption is already
disabled and the percpu stack pointers can be trusted.
- If they're running with preemption enabled, they must be running on
the task stack anyway, so it doesn't matter if they're comparing the
stack pointer against a percpu stack pointer from this CPU or another
one: either way it won't match.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Byungchul Park <byungchul.park@lge.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Nilay Vaish <nilayvaish@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/a0ca0b1044eca97d4f0ec7c1619cf80b3b65560d.1473371307.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull x86 fixes from Ingo Molnar:
"Three fixes:
- AMD microcode loading fix with randomization
- an lguest tooling fix
- and an APIC enumeration boundary condition fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/apic: Fix num_processors value in case of failure
tools/lguest: Don't bork the terminal in case of wrong args
x86/microcode/AMD: Fix load of builtin microcode with randomized memory
Pull perf fixes from Ingo Molnar:
"This contains:
- a set of fixes found by directed-random perf fuzzing efforts by
Vince Weaver, Alexander Shishkin and Peter Zijlstra
- a cqm driver crash fix
- an AMD uncore driver use after free fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel: Fix PEBSv3 record drain
perf/x86/intel/bts: Kill a silly warning
perf/x86/intel/bts: Fix BTS PMI detection
perf/x86/intel/bts: Fix confused ordering of PMU callbacks
perf/core: Fix aux_mmap_count vs aux_refcount order
perf/core: Fix a race between mmap_close() and set_output() of AUX events
perf/x86/amd/uncore: Prevent use after free
perf/x86/intel/cqm: Check cqm/mbm enabled state in event init
perf/core: Remove WARN from perf_event_read()
Pull EFI fixes from Ingo Molnar:
"This contains a Xen fix, an arm64 fix and a race condition /
robustization set of fixes related to ExitBootServices() usage and
boundary conditions"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/efi: Use efi_exit_boot_services()
efi/libstub: Use efi_exit_boot_services() in FDT
efi/libstub: Introduce ExitBootServices helper
efi/libstub: Allocate headspace in efi_get_memory_map()
efi: Fix handling error value in fdt_find_uefi_params
efi: Make for_each_efi_memory_desc_in_map() cope with running on Xen
The espfix64 setup code was a bit inscrutible and contained an
unnecessary push of RAX. Remove that push, update all the stack
offsets to match, and document the whole mess.
Reported-By: Borislav Petkov <bp@alien8.de>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/e5459eb10cf1175c8b36b840bc425f210d045f35.1473717910.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
and allow drivers to permanently reserve EFI boot services regions
on x86, as well as ARM/arm64 - Matt Fleming
* Add ARM support for the EFI esrt driver - Ard Biesheuvel
* Make the EFI runtime services and efivar API interruptible by
swapping spinlocks for semaphores - Sylvain Chouleur
* Provide the EFI identity mapping for kexec which allows kexec to
work on SGI/UV platforms with requiring the "noefi" kernel command
line parameter - Alex Thorlton
* Add debugfs node to dump EFI page tables on arm64 - Ard Biesheuvel
* Merge the EFI test driver being carried out of tree until now in
the FWTS project - Ivan Hu
* Expand the list of flags for classifying EFI regions as "RAM" on
arm64 so we align with the UEFI spec - Ard Biesheuvel
* Optimise out the EFI mixed mode if it's unsupported (CONFIG_X86_32)
or disabled (CONFIG_EFI_MIXED=n) and switch the early EFI boot
services function table for direct calls, alleviating us from
having to maintain the custom function table - Lukas Wunner
* Miscellaneous cleanups and fixes
-----BEGIN PGP SIGNATURE-----
iQI2BAABCAAgBQJX0tCTGRxtYXR0QGNvZGVibHVlcHJpbnQuY28udWsACgkQLzhZ
wI0jPVWLVBAAn/iM91Vmhggdk3t0wCMJzrNGonw61VJ9TZJVbCUJyiH0qdDUThhj
R4rO+6Vf6yOuyswu+mGmae61tfsjwJHH+IPpB8nRLIGQRwzoxk+aGC7FzmQ0ISVO
wIdv5shsmeWhFAyNB1D4hzlp1NxOZaqcU/0cfUVGe4HmK0Js3tUpWWx8VgJ7yvW+
X1PBbfyChArGqiwV6FJz/mJxRAgByUfhvYMcX9HhQkou6F4U5Y8l3vlhUMbuAZAi
ZfG2LWGYCQ+F4XKxMq2QDAtdUgBzlYWw6W60o55x9WO4cEVSzNVRgedto5o1Zea9
2QGEr94gim+e5cJ/HeDIEmbWZhAqIdcNDqXSSBd1CDVQytp4PNAn6rxk+2S9kxoe
T9Mk523HEabo+AZvDAPPJlzcsnIe83JYy69M1xFvcP25ebk7y2BwQtd1jwWPrPDQ
Q/llzF93aezUFR/guvIw0oHckhQl0ZkNedL9Tq4+UKL0ibp2X4gSX636/x4PkBSP
5+pyfmO1SAqTiiMQGQMnp4+ngPQeQrxkmVnh1P7cKlTNXg1IoS03t46Xn2Pj10cd
3KneVDeN9DKIAOn7wPKuPnjTho+9FH36xbwTaIgbt0cWuFFfu090rmqOQfjAJEDN
foHzsMZ7S6CmeOJnj97NNR8sMQDcc+p9bh1KXpJIHaZAgrKmvqPZpMk=
=G7L6
-----END PGP SIGNATURE-----
Merge tag 'efi-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into efi/core
Pull EFI updates from Matt Fleming:
"* Refactor the EFI memory map code into architecture neutral files
and allow drivers to permanently reserve EFI boot services regions
on x86, as well as ARM/arm64 - Matt Fleming
* Add ARM support for the EFI esrt driver - Ard Biesheuvel
* Make the EFI runtime services and efivar API interruptible by
swapping spinlocks for semaphores - Sylvain Chouleur
* Provide the EFI identity mapping for kexec which allows kexec to
work on SGI/UV platforms with requiring the "noefi" kernel command
line parameter - Alex Thorlton
* Add debugfs node to dump EFI page tables on arm64 - Ard Biesheuvel
* Merge the EFI test driver being carried out of tree until now in
the FWTS project - Ivan Hu
* Expand the list of flags for classifying EFI regions as "RAM" on
arm64 so we align with the UEFI spec - Ard Biesheuvel
* Optimise out the EFI mixed mode if it's unsupported (CONFIG_X86_32)
or disabled (CONFIG_EFI_MIXED=n) and switch the early EFI boot
services function table for direct calls, alleviating us from
having to maintain the custom function table - Lukas Wunner
* Miscellaneous cleanups and fixes"
Signed-off-by: Ingo Molnar <mingo@kernel.org>
__show_regs() fails to dump the PKRU state when the debug registers are in
their default state because there is a return statement on the debug
register state.
Change the logic to report PKRU value even when debug registers are in
their default state.
Fixes:c0b17b5bd4b7 ("x86/mm/pkeys: Dump PKRU with other kernel registers")
Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: http://lkml.kernel.org/r/20160910183045.4618-1-nicolas.iooss_linux@m4x.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
In particular:
arch/x86/ras/mce_amd_inj.c: In function ‘prepare_msrs’:
arch/x86/ras/mce_amd_inj.c:249:13: warning: declaration of ‘i_mce’ shadows a global declaration [-Wshadow]
struct mce i_mce = *(struct mce *)info;
^~~~~
arch/x86/ras/mce_amd_inj.c: In function ‘init_mce_inject’:
arch/x86/ras/mce_amd_inj.c:453:16: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (i = 0; i < ARRAY_SIZE(dfs_fls); i++) {
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/20160912075941.24699-16-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Bank 4 is reserved on family 0x17 and shouldn't generate any MCE
records. However, broken hardware and software is not something unheard
of so warn about bank 4 errors. They shouldn't be coming from bank 4
naturally but users can still use mce_amd_inj to simulate errors from it
for testing purposed.
Also, avoid special handling in the injector mce_amd_inj like it is
being done on the older families.
[ bp: Rewrite commit message and merge into one patch. Use boot_cpu_data. ]
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Link: http://lkml.kernel.org/r/1473384591-5323-1-git-send-email-Yazen.Ghannam@amd.com
Link: http://lkml.kernel.org/r/1473384591-5323-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The MCA_ADDR registers on Scalable MCA systems contain the ErrorAddr
in bits [55:0] and the least significant bit of the address in bits
[61:56]. We should extract the valid ErrorAddr bits from the MCA_ADDR
register rather than saving the raw value to struct mce.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1473275643-1721-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The MCA_SYND and MCA_IPID registers contain valuable information and
should be included in MCE output. The MCA_SYND register contains
syndrome and other error information, and the MCA_IPID register will
uniquely identify the MCA bank's type without having to rely on system
software.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1472680624-34221-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The MCA_IPID register uniquely identifies a bank's type and instance
on Scalable MCA systems. We should save the value of this register
in struct mce along with the other relevant error information. This
ensures that we can decode errors without relying on system software to
correlate the bank to the type.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1472680624-34221-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The Deferred Error Interrupt Type is set per bank on Scalable MCA
systems. This is done in a bitfield in the MCA_CONFIG register of each
bank. We should set its type to APIC-based interrupt and not assume BIOS
has set it for us.
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: http://lkml.kernel.org/r/1472737486-1720-1-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Define a bank's sysfs filename based on its IP type and InstanceId.
Credits go to Aravind for:
* The general idea and proto- get_name().
* Defining smca_umc_block_names[] and buf_mcatype[].
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Aravind Gopalakrishnan <aravindksg.lkml@gmail.com>
Link: http://lkml.kernel.org/r/1473193490-3291-2-git-send-email-Yazen.Ghannam@amd.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>