linux

iv/linux

Author	SHA1	Message	Date
Borislav Petkov	57639bedd2	x86, MCE: Remove unused defines Those were sitting there unused since the dawn of time, drop them. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>	2012-09-17 19:33:38 +02:00
Borislav Petkov	e57dbaf77f	x86, mce: Enable MCA support by default MCA is the basic support for hardware error logging and reporting, and it is majorly unwise to run without it so enable machine check software support by default on x86. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Acked-by: Tony Luck <tony.luck@intel.com>	2012-09-17 19:33:13 +02:00
Konrad Rzeszutek Wilk	2a3bce8f6a	xen/swiotlb: Fix compile warnings when using plain integer instead of NULL pointer. arch/x86/xen/pci-swiotlb-xen.c:96:1: warning: Using plain integer as NULL pointer arch/x86/xen/pci-swiotlb-xen.c:96:1: warning: Using plain integer as NULL pointer Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-09-17 13:00:43 -04:00
Konrad Rzeszutek Wilk	b827760053	xen/swiotlb: Use the swiotlb_late_init_with_tbl to init Xen-SWIOTLB late when PV PCI is used. With this patch we provide the functionality to initialize the Xen-SWIOTLB late in the bootup cycle - specifically for Xen PCI-frontend. We still will work if the user had supplied 'iommu=soft' on the Linux command line. Note: We cannot depend on after_bootmem to automatically determine whether this is early or not. This is because when PCI IOMMUs are initialized it is after after_bootmem but before a lot of "other" subsystems are initialized. CC: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> [v1: Fix smatch warnings] [v2: Added check for xen_swiotlb] [v3: Rebased with new xen-swiotlb changes] [v4: squashed xen/swiotlb: Depending on after_bootmem is not correct in] Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-09-17 12:58:16 -04:00
Michael S. Tsirkin	9fc77441e5	KVM: make processes waiting on vcpu mutex killable vcpu mutex can be held for unlimited time so taking it with mutex_lock on an ioctl is wrong: one process could be passed a vcpu fd and call this ioctl on the vcpu used by another process, it will then be unkillable until the owner exits. Call mutex_lock_killable instead and return status. Note: mutex_lock_interruptible would be even nicer, but I am not sure all users are prepared to handle EINTR from these ioctls. They might misinterpret it as an error. Cleanup paths expect a vcpu that can't be used by any userspace so this will always succeed - catch bugs by calling BUG_ON. Catch callers that don't check return state by adding __must_check. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2012-09-17 13:46:32 -03:00
Yan, Zheng	314d9f63f3	perf/x86: Add cpumask for uncore pmu This patch adds a cpumask file to the uncore pmu sysfs directory. The cpumask file contains one active cpu for every socket. Signed-off-by: "Yan, Zheng" <zheng.z.yan@intel.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Ingo Molnar <mingo@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: "Yan, Zheng" <zheng.z.yan@intel.com> Link: http://lkml.kernel.org/r/1347263631-23175-2-git-send-email-zheng.z.yan@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2012-09-17 13:11:43 -03:00
Avi Kivity	7454766f7b	KVM: SVM: Make use of asm.h Use macros for bitness-insensitive register names, instead of rolling our own. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2012-09-17 10:38:05 -03:00
Avi Kivity	b188c81f2e	KVM: VMX: Make use of asm.h Use macros for bitness-insensitive register names, instead of rolling our own. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2012-09-17 10:38:04 -03:00
Avi Kivity	83287ea420	KVM: VMX: Make lto-friendly LTO (link-time optimization) doesn't like local labels to be referred to from a different function, since the two functions may be built in separate compilation units. Use an external variable instead. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2012-09-17 10:38:03 -03:00
Matthew Garrett	e9b10953ed	x86, EFI: Calculate the EFI framebuffer size instead of trusting the firmware Seth Forshee reported that his system was reporting that the EFI framebuffer stretched from 0x90010000-0xb0010000 despite the GPU's BAR only covering 0x90000000-0x9ffffff. It's safer to calculate this value from the pixel stride and screen height (values we already depend on) rather than face potential problems with resource allocation later on. Signed-off-by: Matthew Garrett <mjg@redhat.com> Tested-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>	2012-09-17 13:29:24 +01:00
Matthew Garrett	f462ed939d	efifb: Skip DMI checks if the bootloader knows what it's doing The majority of the DMI checks in efifb are for cases where the bootloader has provided invalid information. However, on some machines the overrides may do more harm than good due to configuration differences between machines with the same machine identifier. It turns out that it's possible for the bootloader to get the correct information on GOP-based systems, but we can't guarantee that the kernel's being booted with one that's been updated to do so. Add support for a capabilities flag that can be set by the bootloader, and skip the DMI checks in that case. Additionally, set this flag in the UEFI stub code. Signed-off-by: Matthew Garrett <mjg@redhat.com> Acked-by: Peter Jones <pjones@redhat.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>	2012-09-17 13:29:23 +01:00
Seiji Aguchi	d6cf86d8f2	efi: initialize efi.runtime_version to make query_variable_info/update_capsule workable A value of efi.runtime_version is checked before calling update_capsule()/query_variable_info() as follows. But it isn't initialized anywhere. <snip> static efi_status_t virt_efi_query_variable_info(u32 attr, u64 storage_space, u64 remaining_space, u64 *max_variable_size) { if (efi.runtime_version < EFI_2_00_SYSTEM_TABLE_REVISION) return EFI_UNSUPPORTED; <snip> This patch initializes a value of efi.runtime_version at boot time. Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com> Acked-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>	2012-09-17 13:29:22 +01:00
Matthew Garrett	9dead5bbb8	efi: Build EFI stub with EFI-appropriate options We can't assume the presence of the red zone while we're still in a boot services environment, so we should build with -fno-red-zone to avoid problems. Change the size of wchar at the same time to make string handling simpler. Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>	2012-09-17 13:29:21 +01:00
Matthew Garrett	38cb5ef447	X86: Improve GOP detection in the EFI boot stub We currently use the PCI IO protocol as a proxy for a functional GOP. This is less than ideal, since some platforms will put the GOP on output devices rather than the GPU itself. Move to using the conout protocol. This is not guaranteed per-spec, but is part of the consplitter implementation that causes this problem in the first place and so should be reliable. Signed-off-by: Matthew Garrett <mjg@redhat.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>	2012-09-17 13:29:20 +01:00
David S. Miller	b48b63a1f6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: net/netfilter/nfnetlink_log.c net/netfilter/xt_LOG.c Rather easy conflict resolution, the 'net' tree had bug fixes to make sure we checked if a socket is a time-wait one or not and elide the logging code if so. Whereas on the 'net-next' side we are calculating the UID and GID from the creds using different interfaces due to the user namespace changes from Eric Biederman. Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-15 11:43:53 -04:00
Oleg Nesterov	baedbf02b1	uprobes: Make arch_uprobe_task->saved_trap_nr "unsigned int" Make arch_uprobe_task->saved_trap_nr "unsigned int" and move it down after ->saved_scratch_register, this changes sizeof() from 24 to 16. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>	2012-09-15 17:37:32 +02:00
Oleg Nesterov	d6a00b35e4	uprobes/x86: Fix arch_uprobe_disable_step() && UTASK_SSTEP_TRAPPED interaction arch_uprobe_disable_step() should also take UTASK_SSTEP_TRAPPED into account. In this case the probed insn was not executed, we need to clear X86_EFLAGS_TF if it was set by us and that is all. Again, this code will look more clean when we move it into arch_uprobe_post_xol() and arch_uprobe_abort_xol(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>	2012-09-15 17:37:32 +02:00
Oleg Nesterov	3a4664aa83	uprobes/x86: Xol should send SIGTRAP if X86_EFLAGS_TF was set arch_uprobe_disable_step() correctly preserves X86_EFLAGS_TF and returns to user-mode. But this means the application gets SIGTRAP only after the next insn. This means that UPROBE_CLEAR_TF logic is not really right. _enable should only record the state of X86_EFLAGS_TF, and _disable should check it separately from UPROBE_FIX_SETF. Remove arch_uprobe_task->restore_flags, add ->saved_tf instead, and change enable/disable accordingly. This assumes that the probed insn was not trapped, see the next patch. arch_uprobe_skip_sstep() logic has the same problem, change it to check X86_EFLAGS_TF and send SIGTRAP as well. We will cleanup this all after we fold enable/disable_step into pre/post_hol hooks. Note: send_sig(SIGTRAP) is not actually right, we need send_sigtrap(). But this needs more changes, handle_swbp() does the same and this is equally wrong. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>	2012-09-15 17:37:31 +02:00
Oleg Nesterov	9bd1190a11	uprobes/x86: Do not (ab)use TIF_SINGLESTEP/user_*_single_step() for single-stepping user_enable/disable_single_step() was designed for ptrace, it assumes a single user and does unnecessary and wrong things for uprobes. For example: - arch_uprobe_enable_step() can't trust TIF_SINGLESTEP, an application itself can set X86_EFLAGS_TF which must be preserved after arch_uprobe_disable_step(). - we do not want to set TIF_SINGLESTEP/TIF_FORCED_TF in arch_uprobe_enable_step(), this only makes sense for ptrace. - otoh we leak TIF_SINGLESTEP if arch_uprobe_disable_step() doesn't do user_disable_single_step(), the application will be killed after the next syscall. - arch_uprobe_enable_step() does access_process_vm() we do not need/want. Change arch_uprobe_enable/disable_step() to set/clear X86_EFLAGS_TF directly, this is much simpler and more correct. However, we need to clear TIF_BLOCKSTEP/DEBUGCTLMSR_BTF before executing the probed insn, add set_task_blockstep(false). Note: with or without this patch, there is another (hopefully minor) problem. A probed "pushf" insn can see the wrong X86_EFLAGS_TF set by uprobes. Perhaps we should change _disable to update the stack, or teach arch_uprobe_skip_sstep() to emulate this insn. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>	2012-09-15 17:37:30 +02:00
Oleg Nesterov	95cf00fa5d	ptrace/x86: Partly fix set_task_blockstep()->update_debugctlmsr() logic Afaics the usage of update_debugctlmsr() and TIF_BLOCKSTEP in step.c was always very wrong. 1. update_debugctlmsr() was simply unneeded. The child sleeps TASK_TRACED, __switch_to_xtra(next_p => child) should notice TIF_BLOCKSTEP and set/clear DEBUGCTLMSR_BTF after resume if needed. 2. It is wrong. The state of DEBUGCTLMSR_BTF bit in CPU register should always match the state of current's TIF_BLOCKSTEP bit. 3. Even get_debugctlmsr() + update_debugctlmsr() itself does not look right. Irq can change other bits in MSR_IA32_DEBUGCTLMSR register or the caller can be preempted in between. 4. It is not safe to play with TIF_BLOCKSTEP if task != current. DEBUGCTLMSR_BTF and TIF_BLOCKSTEP should always match each other if the task is running. The tracee is stopped but it can be SIGKILL'ed right before set/clear_tsk_thread_flag(). However, now that uprobes uses user_enable_single_step(current) we can't simply remove update_debugctlmsr(). So this patch adds the additional "task == current" check and disables irqs to avoid the race with interrupts/preemption. Unfortunately this patch doesn't solve the last problem, we need another fix. Probably we should teach ptrace_stop() to set/clear single/block stepping after resume. And afaics there is yet another problem: perf can play with MSR_IA32_DEBUGCTLMSR from nmi, this obviously means that even __switch_to_xtra() has problems. Signed-off-by: Oleg Nesterov <oleg@redhat.com>	2012-09-15 17:37:29 +02:00
Oleg Nesterov	848e8f5f0a	ptrace/x86: Introduce set_task_blockstep() helper No functional changes, preparation for the next fix and for uprobes single-step fixes. Move the code playing with TIF_BLOCKSTEP/DEBUGCTLMSR_BTF into the new helper, set_task_blockstep(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>	2012-09-15 17:37:28 +02:00
Sebastian Andrzej Siewior	bdc1e47217	uprobes/x86: Implement x86 specific arch_uprobe_*_step The arch specific implementation behaves like user_enable_single_step() except that it does not disable single stepping if it was already enabled by ptrace. This allows the debugger to single step over an uprobe. The state of block stepping is not restored. It makes only sense together with TF and if that was enabled then the debugger is notified. Note: this is still not correct. For example, TIF_SINGLESTEP check is not right, the application itself can set X86_EFLAGS_TF. And otoh we leak TIF_SINGLESTEP (set by enable) if the probed insn is "popf". See the next patches, we need the changes in arch/x86/kernel/step.c first. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>	2012-09-15 17:37:28 +02:00
Linus Torvalds	7ef6e97380	Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "This tree includes various fixes" Ingo really needs to improve on the whole "explain git pull" part. "Various fixes" indeed. * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/hwpb: Invoke __perf_event_disable() if interrupts are already disabled perf/x86: Enable Intel Cedarview Atom suppport perf_event: Switch to internal refcount, fix race with close() oprofile, s390: Fix uninitialized memory access when writing to oprofilefs perf/x86: Fix microcode revision check for SNB-PEBS	2012-09-14 17:43:45 -07:00
Ingo Molnar	26f45274af	Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core Pull tracing updates from Steve Rostedt. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-09-14 10:06:51 +02:00
Masami Hiramatsu	c6aaf4d0bb	kprobes/x86: Fix to support jprobes on ftrace-based kprobe Fix kprobes/x86 to support jprobes on ftrace-based kprobes. Because of -mfentry support of ftrace, ftrace is now put on the beginning of function where jprobes are put. Originally ftrace-based kprobes doesn't support jprobe because it will change regs->ip and ftrace doesn't support changing IP and ftrace itself doesn't conflict jprobe. However, ftrace -mfentry support moves mcount call on the top of functions where jprobes are put. This means that jprobe always conflicts with ftrace-based kprobe and fails. This patch allows ftrace-based kprobes to support jprobes by allowing to modify regs->ip and kprobes breakpoint handler also allows to skip singlestepping because there is a ftrace call (not an original instruction). Link: http://lkml.kernel.org/r/20120905143125.10329.90836.stgit@localhost.localdomain Reported-by: Fengguang Wu <fengguang.wu@intel.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-09-13 22:52:11 -04:00
Steven Rostedt	47d5a5f88b	ftrace/x86-64: Allow to change RIP in handlers Allow ftrace handlers to change RIP register (regs->ip) in handlers. This will allow handlers to call another function instead of original function. Link: http://lkml.kernel.org/r/20120905143118.10329.5078.stgit@localhost.localdomain Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-09-13 22:52:10 -04:00
Masami Hiramatsu	4b036d54bf	kprobes/x86: Fix kprobes to collectly handle IP on ftrace Current kprobe_ftrace_handler expects regs->ip == ip, but it is incorrect (originally on x86-64). Actually, ftrace handler sets regs->ip = ip + MCOUNT_INSN_SIZE. kprobe_ftrace_handler must take care for that. Link: http://lkml.kernel.org/r/20120905143112.10329.72069.stgit@localhost.localdomain Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-09-13 22:52:09 -04:00
Masami Hiramatsu	a5e37863ab	ftrace/x86: Adjust x86 regs.ip as like as x86-64 Adjust x86 regs.ip to ip + MCOUNT_INSN_SIZE as like as on x86-64. This helps us to consolidate codes which use regs->ip on both of x86/x86-64. Link: http://lkml.kernel.org/r/20120905143100.10329.60109.stgit@localhost.localdomain Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-09-13 22:52:09 -04:00
Jan Beulich	3120e25efd	x86/Kconfig: Clean up Kconfig defaults The main goal here is to have the resulting .config no carry any options that aren't enabled and can't be (i.e such where the default is "no" and can't be changed), so that if any such option later gets a user visible prompt, the user will actually be prompted on a "make ...oldconfig" rather than keeping the previously invisible option disabled. There's a little bit of other trivial cleanup mixed in here. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/504DEE19020000780009A285@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-09-13 17:45:33 +02:00
Jan Beulich	5870661c09	x86: Prefer TZCNT over BFS Following a relatively recent compiler change, make use of the fact that for non-zero input BSF and TZCNT produce the same result, and that CPUs not knowing of TZCNT will treat the instruction as BSF (i.e. ignore what looks like a REP prefix to them). The assumption here is that TZCNT would never have worse performance than BSF. For the moment, only do this when the respective generic-CPU option is selected (as there are no specific-CPU options covering the CPUs supporting TZCNT), and don't do that when size optimization was requested. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/504DEA1B020000780009A277@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-09-13 17:44:01 +02:00
Jan Beulich	1edfbb4153	x86/64: Adjust types of temporaries used by ffs()/fls()/fls64() The 64-bit special cases of the former two (the thrird one is 64-bit only anyway) don't need to use "long" temporaries, as the result will always fit in a 32-bit variable, and the functions return plain "int". This avoids a few REX prefixes, i.e. minimally reduces code size. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/504DE550020000780009A258@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-09-13 17:43:58 +02:00
T Makphaibulchoke	73e8f3d7e2	x86/mm/init.c: Fix devmem_is_allowed() off by one Fixing an off-by-one error in devmem_is_allowed(), which allows accesses to physical addresses 0x100000-0x100fff, an extra page past 1MB. Signed-off-by: T Makphaibulchoke <tmac@hp.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Cc: yinghai@kernel.org Cc: tiwai@suse.de Cc: dhowells@redhat.com Link: http://lkml.kernel.org/r/1346210503-14276-1-git-send-email-tmac@hp.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-09-13 17:35:54 +02:00
Ian Campbell	6eebdda35e	x86: Drop unnecessary kernel_eflags variable on 64-bit On 64 bit x86 we save the current eflags in cpu_init for use in ret_from_fork. Strictly speaking reserved bits in EFLAGS should be read as written but in practise it is unlikely that EFLAGS could ever be extended in this way and the kernel alread clears any undefined flags early on. The equivalent 32 bit code simply hard codes 0x0202 as the new EFLAGS. This change makes 64 bit use the same mechanism to setup the initial EFLAGS on fork. Note that 64 bit resets EFLAGS before calling schedule_tail() as opposed to 32 bit which calls schedule_tail() first. Therefore the correct value for EFLAGS has opposite IF bit. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Acked-by: Andi Kleen <ak@linux.intel.com> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Pekka Enberg <penberg@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Link: http://lkml.kernel.org/r/20120824195847.GA31628@moon Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-09-13 17:32:47 +02:00
Ingo Molnar	4553f0b90e	Merge branch 'core/rcu' into perf/core Steve Rostedt asked for the merge of a single commit, into both the RCU and the perf/tracing tree: \| Josh made a change to the tracing code that affects both the \| work Paul McKenney and I are currently doing. At the last \| Kernel Summit back in August, Linus said when such a case \| exists, it is best to make a separate branch based off of his \| tree and place the change there. This way, the repositories \| that need to share the change can both pull them in and the \| SHA1 will match for both. Whichever branch is pulled in first \| by Linus will also pull in the necessary change for the other \| branch as well. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-09-13 17:18:38 +02:00
Robert Richter	bad9ac2d7f	perf/x86/ibs: Check syscall attribute flags Current implementation simply ignores attribute flags. Thus, there is no notification to userland of unsupported features. Check syscall's attribute flags to let userland know if a feature is supported by the kernel. This is also needed to distinguish between future kernels what might support a feature. Cc: <stable@vger.kernel.org> v3.5.. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20120910093018.GO8285@erda.amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-09-13 16:59:48 +02:00
Stephane Eranian	35534b201c	perf/x86: Export Sandy Bridge uncore clockticks event in sysfs This patch exports the clockticks event and its encoding to user level. The clockticks event was exported for Nehalem/Westmere but not for Sandy Bridge (client). Given that it uses a special encoding, it needs to be exported to user tools, so users can do: # perf stat -a -C 0 -e uncore_cbox_0/clockticks/ sleep 1 Signed-off-by: Stephane Eranian <eranian@google.com> Acked-by: Yan, Zheng <zheng.z.yan@intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20120829130122.GA32336@quad Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-09-13 16:59:46 +02:00
Takuya Yoshikawa	ecba9a52ac	KVM: x86: lapic: Clean up find_highest_vector() and count_vectors() find_highest_vector() and count_vectors(): - Instead of using magic values, define and use proper macros. find_highest_vector(): - Remove likely() which is there only for historical reasons and not doing correct branch predictions anymore. Using such heuristics to optimize this function is not worth it now. Let CPUs predict things instead. - Stop checking word[0] separately. This was only needed for doing likely() optimization. - Use for loop, not while, to iterate over the register array to make the code clearer. Note that we actually confirmed that the likely() did wrong predictions by inserting debug code. Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2012-09-12 13:38:23 -03:00
Stefano Stabellini	2fc136eecd	xen/m2p: do not reuse kmap_op->dev_bus_addr If the caller passes a valid kmap_op to m2p_add_override, we use kmap_op->dev_bus_addr to store the original mfn, but dev_bus_addr is part of the interface with Xen and if we are batching the hypercalls it might not have been written by the hypervisor yet. That means that later on Xen will write to it and we'll think that the original mfn is actually what Xen has written to it. Rather than "stealing" struct members from kmap_op, keep using page->index to store the original mfn and add another parameter to m2p_remove_override to get the corresponding kmap_op instead. It is now responsibility of the caller to keep track of which kmap_op corresponds to a particular page in the m2p_override (gntdev, the only user of this interface that passes a valid kmap_op, is already doing that). CC: stable@kernel.org Reported-and-Tested-By: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-09-12 11:21:40 -04:00
Konrad Rzeszutek Wilk	98104c3480	Merge branch 'stable/128gb.v5.1' into stable/for-linus-3.7 * stable/128gb.v5.1: xen/mmu: If the revector fails, don't attempt to revector anything else. xen/p2m: When revectoring deal with holes in the P2M array. xen/mmu: Release just the MFN list, not MFN list and part of pagetables. xen/mmu: Remove from __ka space PMD entries for pagetables. xen/mmu: Copy and revector the P2M tree. xen/p2m: Add logic to revector a P2M tree to use __va leafs. xen/mmu: Recycle the Xen provided L4, L3, and L2 pages xen/mmu: For 64-bit do not call xen_map_identity_early xen/mmu: use copy_page instead of memcpy. xen/mmu: Provide comments describing the _ka and _va aliasing issue xen/mmu: The xen_setup_kernel_pagetable doesn't need to return anything. Revert "xen/x86: Workaround 64-bit hypervisor and 32-bit initial domain." and "xen/x86: Use memblock_reserve for sensitive areas." xen/x86: Workaround 64-bit hypervisor and 32-bit initial domain. xen/x86: Use memblock_reserve for sensitive areas. xen/p2m: Fix the comment describing the P2M tree. Conflicts: arch/x86/xen/mmu.c The pagetable_init is the old xen_pagetable_setup_done and xen_pagetable_setup_start rolled in one. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-09-12 11:18:57 -04:00
Konrad Rzeszutek Wilk	25a765b7f0	Merge branch 'x86/platform' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into stable/for-linus-3.7 * 'x86/platform' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (9690 commits) x86: Document x86_init.paging.pagetable_init() x86: xen: Cleanup and remove x86_init.paging.pagetable_setup_done() x86: Move paging_init() call to x86_init.paging.pagetable_init() x86: Rename pagetable_setup_start() to pagetable_init() x86: Remove base argument from x86_init.paging.pagetable_setup_start Linux 3.6-rc5 HID: tpkbd: work even if the new Lenovo Keyboard driver is not configured Remove user-triggerable BUG from mpol_to_str xen/pciback: Fix proper FLR steps. uml: fix compile error in deliver_alarm() dj: memory scribble in logi_dj Fix order of arguments to compat_put_time[spec\|val] xen: Use correct masking in xen_swiotlb_alloc_coherent. xen: fix logical error in tlb flushing xen/p2m: Fix one-off error in checking the P2M tree directory. powerpc: Don't use __put_user() in patch_instruction powerpc: Make sure IPI handlers see data written by IPI senders powerpc: Restore correct DSCR in context switch powerpc: Fix DSCR inheritance in copy_thread() powerpc: Keep thread.dscr and thread.dscr_inherit in sync ...	2012-09-12 11:14:33 -04:00
Attilio Rao	6428227898	x86: Document x86_init.paging.pagetable_init() Signed-off-by: Attilio Rao <attilio.rao@citrix.com> Acked-by: <konrad.wilk@oracle.com> Cc: <Ian.Campbell@citrix.com> Cc: <Stefano.Stabellini@eu.citrix.com> Cc: <xen-devel@lists.xensource.com> Link: http://lkml.kernel.org/r/1345580561-8506-6-git-send-email-attilio.rao@citrix.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-09-12 15:33:07 +02:00
Attilio Rao	c711288727	x86: xen: Cleanup and remove x86_init.paging.pagetable_setup_done() At this stage x86_init.paging.pagetable_setup_done is only used in the XEN case. Move its content in the x86_init.paging.pagetable_init setup function and remove the now unused x86_init.paging.pagetable_setup_done remaining infrastructure. Signed-off-by: Attilio Rao <attilio.rao@citrix.com> Acked-by: <konrad.wilk@oracle.com> Cc: <Ian.Campbell@citrix.com> Cc: <Stefano.Stabellini@eu.citrix.com> Cc: <xen-devel@lists.xensource.com> Link: http://lkml.kernel.org/r/1345580561-8506-5-git-send-email-attilio.rao@citrix.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-09-12 15:33:06 +02:00
Attilio Rao	843b8ed2ec	x86: Move paging_init() call to x86_init.paging.pagetable_init() Move the paging_init() call to the platform specific pagetable_init() function, so we can get rid of the extra pagetable_setup_done() function pointer. Signed-off-by: Attilio Rao <attilio.rao@citrix.com> Acked-by: <konrad.wilk@oracle.com> Cc: <Ian.Campbell@citrix.com> Cc: <Stefano.Stabellini@eu.citrix.com> Cc: <xen-devel@lists.xensource.com> Link: http://lkml.kernel.org/r/1345580561-8506-4-git-send-email-attilio.rao@citrix.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-09-12 15:33:06 +02:00
Attilio Rao	7737b215ad	x86: Rename pagetable_setup_start() to pagetable_init() In preparation for unifying the pagetable_setup_start() and pagetable_setup_done() setup functions, rename appropriately all the infrastructure related to pagetable_setup_start(). Signed-off-by: Attilio Rao <attilio.rao@citrix.com> Ackedd-by: <konrad.wilk@oracle.com> Cc: <Ian.Campbell@citrix.com> Cc: <Stefano.Stabellini@eu.citrix.com> Cc: <xen-devel@lists.xensource.com> Link: http://lkml.kernel.org/r/1345580561-8506-3-git-send-email-attilio.rao@citrix.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-09-12 15:33:06 +02:00
Attilio Rao	73090f8993	x86: Remove base argument from x86_init.paging.pagetable_setup_start We either use swapper_pg_dir or the argument is unused. Preparatory patch to simplify platform pagetable setup further. Signed-off-by: Attilio Rao <attilio.rao@citrix.com> Ackedb-by: <konrad.wilk@oracle.com> Cc: <Ian.Campbell@citrix.com> Cc: <Stefano.Stabellini@eu.citrix.com> Cc: <xen-devel@lists.xensource.com> Link: http://lkml.kernel.org/r/1345580561-8506-2-git-send-email-attilio.rao@citrix.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-09-12 15:33:06 +02:00
Linus Torvalds	ffc2964918	KVM updates for 3.6-rc5 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJQTgR7AAoJEI7yEDeUysxlLnsP/jsGydL5WWsujpK9jdEbNa+r Y2D8aPgKRe6Z6ZmWpe7jxymXA4haXh1kKO8nIVgfFOgtvN+Po8rZMOQE0TpuulFy ANYa2N9Mi4+tpi9pqjcVgafjNa7xxRer04jZr5a789ry0MXjKspoUv/rFPpOsxrL XlaPnRu1MLnCWmnLv3nvgatth3fIjo4X31FGmUwl4dRc59NIQREek7z3gAV0x4td +JK6qvwzYOz0LhwV/Ssk84Sm97OOfXVaKWvxfC8y0H6nt0nXqmfGI/d5tWL8F9xz 92c0lelvbm6AMgx5yY5onRYDE6SkdlnAS563umIB58jMp44Fj+0MMPSfPQM1rBXC NZ7mh2BJJ8Dsfi2NVneWW089o+uxSfI9HkuXGEiTzcu2mjISjuJ5ih7efAAmr7xc e8dMM6fVTpNTazd6VGvsgno8f/Hr+D6qOZwZ8DsH4QLX0NJC41Dm6loHCE3saR2x tjGwcbJEYqnboXEd09YOGqVOExR111cBkhVmf1qvOB71ENXoMnJ1L3e+AWojc1xN klaMM8SQd0MsgjqTx4AOpPMKkzmONN1v44sFuQfARR1iwyY6dmLcJTqMQtDG9vJP YTpZdkmswQrAnE2xVyMWkUd3vFj2d3JRU+aELYfIcVEuxx3hsRZdR8ezcrTRZLST s4ii8XXmb9MGlggwrmk5 =BPP1 -----END PGP SIGNATURE----- Merge tag 'kvm-3.6-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM updates from Avi Kivity: "A trio of KVM fixes: incorrect lookup of guest cpuid, an uninitialized variable fix, and error path cleanup fix." * tag 'kvm-3.6-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: fix error paths for failed gfn_to_page() calls KVM: x86: Check INVPCID feature bit in EBX of leaf 7 KVM: PIC: fix use of uninitialised variable.	2012-09-11 09:30:08 +08:00
Eric Dumazet	280050cc81	x86 bpf_jit: support MOD operation commit b6069a9570 (filter: add MOD operation) added generic support for modulus operation in BPF. This patch brings JIT support for x86_64 Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: George Bakos <gbakos@alpinista.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-09-10 17:08:48 -04:00
Xiao Guangrong	4484141a94	KVM: fix error paths for failed gfn_to_page() calls This bug was triggered: [ 4220.198458] BUG: unable to handle kernel paging request at fffffffffffffffe [ 4220.203907] IP: [<ffffffff81104d85>] put_page+0xf/0x34 ...... [ 4220.237326] Call Trace: [ 4220.237361] [<ffffffffa03830d0>] kvm_arch_destroy_vm+0xf9/0x101 [kvm] [ 4220.237382] [<ffffffffa036fe53>] kvm_put_kvm+0xcc/0x127 [kvm] [ 4220.237401] [<ffffffffa03702bc>] kvm_vcpu_release+0x18/0x1c [kvm] [ 4220.237407] [<ffffffff81145425>] __fput+0x111/0x1ed [ 4220.237411] [<ffffffff8114550f>] ____fput+0xe/0x10 [ 4220.237418] [<ffffffff81063511>] task_work_run+0x5d/0x88 [ 4220.237424] [<ffffffff8104c3f7>] do_exit+0x2bf/0x7ca The test case: printf(fmt, ##args); \ exit(-1);} while (0) static int create_vm(void) { int sys_fd, vm_fd; sys_fd = open("/dev/kvm", O_RDWR); if (sys_fd < 0) die("open /dev/kvm fail.\n"); vm_fd = ioctl(sys_fd, KVM_CREATE_VM, 0); if (vm_fd < 0) die("KVM_CREATE_VM fail.\n"); return vm_fd; } static int create_vcpu(int vm_fd) { int vcpu_fd; vcpu_fd = ioctl(vm_fd, KVM_CREATE_VCPU, 0); if (vcpu_fd < 0) die("KVM_CREATE_VCPU ioctl.\n"); printf("Create vcpu.\n"); return vcpu_fd; } static void vcpu_thread(void arg) { int vm_fd = (int)(long)arg; create_vcpu(vm_fd); return NULL; } int main(int argc, char argv[]) { pthread_t thread; int vm_fd; (void)argc; (void)argv; vm_fd = create_vm(); pthread_create(&thread, NULL, vcpu_thread, (void )(long)vm_fd); printf("Exit.\n"); return 0; } It caused by release kvm->arch.ept_identity_map_addr which is the error page. The parent thread can send KILL signal to the vcpu thread when it was exiting which stops faulting pages and potentially allocating memory. So gfn_to_pfn/gfn_to_page may fail at this time Fixed by checking the page before it is used Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-09-10 11:34:11 +03:00
Xiao Guangrong	7de5bdc96c	KVM: MMU: remove unnecessary check Checking the return of kvm_mmu_get_page is unnecessary since it is guaranteed by memory cache Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-09-10 11:26:16 +03:00
Liu, Jinsong	92b5265d38	KVM: Depend on HIGH_RES_TIMERS KVM lapic timer and tsc deadline timer based on hrtimer, setting a leftmost node to rb tree and then do hrtimer reprogram. If hrtimer not configured as high resolution, hrtimer_enqueue_reprogram do nothing and then make kvm lapic timer and tsc deadline timer fail. Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-09-10 11:10:03 +03:00

... 4 5 6 7 8 ...

16286 Commits