linux

iv/linux

Author	SHA1	Message	Date
Thomas Gleixner	18a67d32c3	x86, irq: Remove pointless irq_reserve_irqs() call That's a leftover from the time where x86 supported SPARSE_IRQ=n. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Grant Likely <grant.likely@linaro.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/20140507154338.967285614@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-05-16 14:05:21 +02:00
Thomas Gleixner	54859f59fc	x86: Remove create/destroy_irq() No more users. Remove the cruft Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Grant Likely <grant.likely@linaro.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/20140507154336.760446122@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-05-16 14:05:20 +02:00
Thomas Gleixner	a553b142b8	iommu: dmar: Provide arch specific irq allocation ia64 and x86 share this driver. x86 is moving to a different irq allocation and ia64 keeps its private irq_create/destroy stuff. Use macros to redirect to one or the other. Yes, macros to avoid include hell. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Grant Likely <grant.likely@linaro.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Fenghua Yu <fenghua.yu@intel.com> Acked-by: Joerg Roedel <joro@8bytes.org> Cc: x86@kernel.org Cc: linux-ia64@vger.kernel.org Cc: iommu@lists.linux-foundation.org Link: http://lkml.kernel.org/r/20140507154336.372289825@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-05-16 14:05:19 +02:00
Thomas Gleixner	d07c9f1875	x86: Get rid of get_nr_irqs_gsi() No need to expose this outside of the ioapic code. The dynamic allocations are guaranteed not to happen in the gsi space. See commit 62a08ae2a. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Grant Likely <grant.likely@linaro.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: x86@kernel.org Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/20140507154335.959870037@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-05-16 14:05:19 +02:00
Thomas Gleixner	be47be6c28	x86: ioapic: Use irq_alloc/free_hwirq() No functional change just less crap. This does not replace the requirement to move x86 to irq domains, but it limits the mess to some degree. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Grant Likely <grant.likely@linaro.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/20140507154335.749579081@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-05-16 14:05:19 +02:00
Thomas Gleixner	0a2db49dc4	x86: uv: Use irq_alloc/free_hwirq() No functional change. The request to allocate the irq above NR_IRQS_LEGACY is completely pointless as the implementation enforces that the dynamic allocations are above the GSI interrupts, which includes the legacy PIT irqs. This does not replace the requirement to move x86 to irq domains, but it limits the mess to some degree. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Grant Likely <grant.likely@linaro.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/20140507154335.252789823@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-05-16 14:05:19 +02:00
Thomas Gleixner	499c2b75e9	x86: hpet: Use irq_alloc/free_hwirq() Use the new interfaces. No functional change. This does not replace the requirement to move x86 to irq domains, but it limits the mess to some degree. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Grant Likely <grant.likely@linaro.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/20140507154334.991589924@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-05-16 14:05:18 +02:00
Thomas Gleixner	b1ee544174	x86: Implement arch_setup/teardown_hwirq() This is just a cleanup to get rid of the create/destroy_irq variants which were designed in hell. The long term solution for x86 is to switch over to irq domains and cleanup the whole vector allocation mess. The generic irq_alloc_hwirqs() interface deliberately prevents multi-MSI vector allocation to further enforce the irq domain conversion (aside of the desire to support ioapic hotplug). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Grant Likely <grant.likely@linaro.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: x86@kernel.org Link: http://lkml.kernel.org/r/20140507154334.482904047@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2014-05-16 14:05:18 +02:00
Alexei Starovoitov	622582786c	net: filter: x86: internal BPF JIT Maps all internal BPF instructions into x86_64 instructions. This patch replaces original BPF x64 JIT with internal BPF x64 JIT. sysctl net.core.bpf_jit_enable is reused as on/off switch. Performance: 1. old BPF JIT and internal BPF JIT generate equivalent x86_64 code. No performance difference is observed for filters that were JIT-able before Example assembler code for BPF filter "tcpdump port 22" original BPF -> old JIT: original BPF -> internal BPF -> new JIT: 0: push %rbp 0: push %rbp 1: mov %rsp,%rbp 1: mov %rsp,%rbp 4: sub $0x60,%rsp 4: sub $0x228,%rsp 8: mov %rbx,-0x8(%rbp) b: mov %rbx,-0x228(%rbp) // prologue 12: mov %r13,-0x220(%rbp) 19: mov %r14,-0x218(%rbp) 20: mov %r15,-0x210(%rbp) 27: xor %eax,%eax // clear A c: xor %ebx,%ebx 29: xor %r13,%r13 // clear X e: mov 0x68(%rdi),%r9d 2c: mov 0x68(%rdi),%r9d 12: sub 0x6c(%rdi),%r9d 30: sub 0x6c(%rdi),%r9d 16: mov 0xd8(%rdi),%r8 34: mov 0xd8(%rdi),%r10 3b: mov %rdi,%rbx 1d: mov $0xc,%esi 3e: mov $0xc,%esi 22: callq 0xffffffffe1021e15 43: callq 0xffffffffe102bd75 27: cmp $0x86dd,%eax 48: cmp $0x86dd,%rax 2c: jne 0x0000000000000069 4f: jne 0x000000000000009a 2e: mov $0x14,%esi 51: mov $0x14,%esi 33: callq 0xffffffffe1021e31 56: callq 0xffffffffe102bd91 38: cmp $0x84,%eax 5b: cmp $0x84,%rax 3d: je 0x0000000000000049 62: je 0x0000000000000074 3f: cmp $0x6,%eax 64: cmp $0x6,%rax 42: je 0x0000000000000049 68: je 0x0000000000000074 44: cmp $0x11,%eax 6a: cmp $0x11,%rax 47: jne 0x00000000000000c6 6e: jne 0x0000000000000117 49: mov $0x36,%esi 74: mov $0x36,%esi 4e: callq 0xffffffffe1021e15 79: callq 0xffffffffe102bd75 53: cmp $0x16,%eax 7e: cmp $0x16,%rax 56: je 0x00000000000000bf 82: je 0x0000000000000110 58: mov $0x38,%esi 88: mov $0x38,%esi 5d: callq 0xffffffffe1021e15 8d: callq 0xffffffffe102bd75 62: cmp $0x16,%eax 92: cmp $0x16,%rax 65: je 0x00000000000000bf 96: je 0x0000000000000110 67: jmp 0x00000000000000c6 98: jmp 0x0000000000000117 69: cmp $0x800,%eax 9a: cmp $0x800,%rax 6e: jne 0x00000000000000c6 a1: jne 0x0000000000000117 70: mov $0x17,%esi a3: mov $0x17,%esi 75: callq 0xffffffffe1021e31 a8: callq 0xffffffffe102bd91 7a: cmp $0x84,%eax ad: cmp $0x84,%rax 7f: je 0x000000000000008b b4: je 0x00000000000000c2 81: cmp $0x6,%eax b6: cmp $0x6,%rax 84: je 0x000000000000008b ba: je 0x00000000000000c2 86: cmp $0x11,%eax bc: cmp $0x11,%rax 89: jne 0x00000000000000c6 c0: jne 0x0000000000000117 8b: mov $0x14,%esi c2: mov $0x14,%esi 90: callq 0xffffffffe1021e15 c7: callq 0xffffffffe102bd75 95: test $0x1fff,%ax cc: test $0x1fff,%rax 99: jne 0x00000000000000c6 d3: jne 0x0000000000000117 d5: mov %rax,%r14 9b: mov $0xe,%esi d8: mov $0xe,%esi a0: callq 0xffffffffe1021e44 dd: callq 0xffffffffe102bd91 // MSH e2: and $0xf,%eax e5: shl $0x2,%eax e8: mov %rax,%r13 eb: mov %r14,%rax ee: mov %r13,%rsi a5: lea 0xe(%rbx),%esi f1: add $0xe,%esi a8: callq 0xffffffffe1021e0d f4: callq 0xffffffffe102bd6d ad: cmp $0x16,%eax f9: cmp $0x16,%rax b0: je 0x00000000000000bf fd: je 0x0000000000000110 ff: mov %r13,%rsi b2: lea 0x10(%rbx),%esi 102: add $0x10,%esi b5: callq 0xffffffffe1021e0d 105: callq 0xffffffffe102bd6d ba: cmp $0x16,%eax 10a: cmp $0x16,%rax bd: jne 0x00000000000000c6 10e: jne 0x0000000000000117 bf: mov $0xffff,%eax 110: mov $0xffff,%eax c4: jmp 0x00000000000000c8 115: jmp 0x000000000000011c c6: xor %eax,%eax 117: mov $0x0,%eax c8: mov -0x8(%rbp),%rbx 11c: mov -0x228(%rbp),%rbx // epilogue cc: leaveq 123: mov -0x220(%rbp),%r13 cd: retq 12a: mov -0x218(%rbp),%r14 131: mov -0x210(%rbp),%r15 138: leaveq 139: retq On fully cached SKBs both JITed functions take 12 nsec to execute. BPF interpreter executes the program in 30 nsec. The difference in generated assembler is due to the following: Old BPF imlements LDX_MSH instruction via sk_load_byte_msh() helper function inside bpf_jit.S. New JIT removes the helper and does it explicitly, so ldx_msh cost is the same for both JITs, but generated code looks longer. New JIT has 4 registers to save, so prologue/epilogue are larger, but the cost is within noise on x64. Old JIT checks whether first insn clears A and if not emits 'xor %eax,%eax'. New JIT clears %rax unconditionally. 2. old BPF JIT doesn't support ANC_NLATTR, ANC_PAY_OFFSET, ANC_RANDOM extensions. New JIT supports all BPF extensions. Performance of such filters improves 2-4 times depending on a filter. The longer the filter the higher performance gain. Synthetic benchmarks with many ancillary loads see 20x speedup which seems to be the maximum gain from JIT Notes: . net.core.bpf_jit_enable=2 + tools/net/bpf_jit_disasm is still functional and can be used to see generated assembler . there are two jit_compile() functions and code flow for classic filters is: sk_attach_filter() - load classic BPF bpf_jit_compile() - try to JIT from classic BPF sk_convert_filter() - convert classic to internal bpf_int_jit_compile() - JIT from internal BPF seccomp and tracing filters will just call bpf_int_jit_compile() Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-15 16:31:30 -04:00
Alexei Starovoitov	f3c2af7ba1	net: filter: x86: split bpf_jit_compile() Split bpf_jit_compile() into two functions to improve readability of for(pass++) loop. The change follows similar style of JIT compilers for arm, powerpc, s390 The body of new do_jit() was not reformatted to reduce noise in this patch, since the following patch replaces most of it. Tested with BPF testsuite. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-15 16:31:30 -04:00
Alan	9b17aeec23	goldfish: Allow 64bit builds We can now enable the 64bit option for the Goldfish 64bit emulator. Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2014-05-15 13:19:01 -07:00
David Vrabel	f59c5145dc	x86/xen: do not use _PAGE_IOMAP in xen_remap_domain_mfn_range() _PAGE_IOMAP is used in xen_remap_domain_mfn_range() to prevent the pfn_pte() call in remap_area_mfn_pte_fn() from using the p2m to translate the MFN. If mfn_pte() is used instead, the p2m look up is avoided and the use of _PAGE_IOMAP is no longer needed. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-05-15 16:16:40 +01:00
David Vrabel	25b884a83d	x86/xen: set regions above the end of RAM as 1:1 PCI devices may have BARs located above the end of RAM so mark such frames as identity frames in the p2m (instead of the default of missing). PFNs outside the p2m (above MAX_P2M_PFN) are also considered to be identity frames for the same reason. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-05-15 16:15:18 +01:00
David Vrabel	2dcc9a3de1	x86/xen: only warn once if bad MFNs are found during setup In xen_add_extra_mem(), if the WARN() checks for bad MFNs trigger it is likely that they will trigger at lot, spamming the log. Use WARN_ONCE() instead. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-05-15 16:15:01 +01:00
David Vrabel	3cb83e46d0	x86/xen: compactly store large identity ranges in the p2m Large (multi-GB) identity ranges currently require a unique middle page (filled with p2m_identity entries) per 1 GB region. Similar to the common p2m_mid_missing middle page for large missing regions, introduce a p2m_mid_identity page (filled with p2m_identity entries) which can be used instead. set_phys_range_identity() thus only needs to allocate new middle pages at the beginning and end of the range. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-05-15 16:14:44 +01:00
David Vrabel	a9b5bff66b	x86/xen: fix set_phys_range_identity() if pfn_e > MAX_P2M_PFN Allow set_phys_range_identity() to work with a range that overlaps MAX_P2M_PFN by clamping pfn_e to MAX_P2M_PFN. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-05-15 16:13:23 +01:00
David Vrabel	fcca2e3119	x86/xen: rename early_p2m_alloc() and early_p2m_alloc_middle() early_p2m_alloc_middle() allocates a new leaf page and early_p2m_alloc() allocates a new middle page. This is confusing. Swap the names so they match what the functions actually do. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2014-05-15 16:12:25 +01:00
Radim Krčmář	bc5eb20161	xen/x86: set panic notifier priority to minimum Execution is not going to continue after telling Xen about the crash. Let other panic notifiers run by postponing the final hypercall as much as possible. Signed-off-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com>	2014-05-15 15:54:57 +01:00
Linus Torvalds	fa81511bb0	x86-64, modify_ldt: Make support for 16-bit segments a runtime option Checkin: b3b42ac2cbae x86-64, modify_ldt: Ban 16-bit segments on 64-bit kernels disabled 16-bit segments on 64-bit kernels due to an information leak. However, it does seem that people are genuinely using Wine to run old 16-bit Windows programs on Linux. A proper fix for this ("espfix64") is coming in the upcoming merge window, but as a temporary fix, create a sysctl to allow the administrator to re-enable support for 16-bit segments. It adds a "/proc/sys/abi/ldt16" sysctl that defaults to zero (off). If you hit this issue and care about your old Windows program more than you care about a kernel stack address information leak, you can do echo 1 > /proc/sys/abi/ldt16 as root (add it to your startup scripts), and you should be ok. The sysctl table is only added if you have COMPAT support enabled on x86-64, but I assume anybody who runs old windows binaries very much does that ;) Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/CA%2B55aFw9BPoD10U1LfHbOMpHWZkvJTkMcfCs9s3urPr1YyWBxw@mail.gmail.com Cc: <stable@vger.kernel.org>	2014-05-14 16:33:54 -07:00
Marcelo Tosatti	16a9602158	KVM: x86: disable master clock if TSC is reset during suspend Updating system_time from the kernel clock once master clock has been enabled can result in time backwards event, in case kernel clock frequency is lower than TSC frequency. Disable master clock in case it is necessary to update it from the resume path. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-14 17:59:21 +02:00
Steven Rostedt	e18eead3c3	ftrace/x86: Move the mcount/fentry code out of entry_64.S As the mcount code gets more complex, it really does not belong in the entry.S file. By moving it into its own file "mcount.S" keeps things a bit cleaner. Link: http://lkml.kernel.org/p/20140508152152.2130e8cf@gandalf.local.home Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-05-14 11:37:31 -04:00
Steven Rostedt (Red Hat)	f1b2f2bd58	ftrace: Remove FTRACE_UPDATE_MODIFY_CALL_REGS flag As the decision to what needs to be done (converting a call to the ftrace_caller to ftrace_caller_regs or to convert from ftrace_caller_regs to ftrace_caller) can easily be determined from the rec->flags of FTRACE_FL_REGS and FTRACE_FL_REGS_EN, there's no need to have the ftrace_check_record() return either a UPDATE_MODIFY_CALL_REGS or a UPDATE_MODIFY_CALL. Just he latter is enough. This added flag causes more complexity than is required. Remove it. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-05-14 11:37:30 -04:00
Steven Rostedt (Red Hat)	7413af1fb7	ftrace: Make get_ftrace_addr() and get_ftrace_addr_old() global Move and rename get_ftrace_addr() and get_ftrace_addr_old() to ftrace_get_addr_new() and ftrace_get_addr_curr() respectively. This moves these two helper functions in the generic code out from the arch specific code, and renames them to have a better generic name. This will allow other archs to use them as well as makes it a bit easier to work on getting separate trampolines for different functions. ftrace_get_addr_new() returns the trampoline address that the mcount call address will be converted to. ftrace_get_addr_curr() returns the trampoline address of what the mcount call address currently jumps to. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-05-14 11:37:29 -04:00
Steven Rostedt (Red Hat)	94792ea07c	ftrace/x86: Get the current mcount addr for add_breakpoint() The add_breakpoint() code in the ftrace updating gets the address of what the call will become, but if the mcount address is changing from regs to non-regs ftrace_caller or vice versa, it will use what the record currently is. This is rather silly as the code should always use what is currently there regardless of if it's changing the regs function or just converting to a nop. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2014-05-14 11:37:28 -04:00
Oleg Nesterov	b02ef20a9f	uprobes/x86: Fix the wrong ->si_addr when xol triggers a trap If the probed insn triggers a trap, ->si_addr = regs->ip is technically correct, but this is not what the signal handler wants; we need to pass the address of the probed insn, not the address of xol slot. Add the new arch-agnostic helper, uprobe_get_trap_addr(), and change fill_trap_info() and math_error() to use it. !CONFIG_UPROBES case in uprobes.h uses a macro to avoid include hell and ensure that it can be compiled even if an architecture doesn't define instruction_pointer(). Test-case: #include <signal.h> #include <stdio.h> #include <unistd.h> extern void probe_div(void); void sigh(int sig, siginfo_t info, void c) { int passed = (info->si_addr == probe_div); printf(passed ? "PASS\n" : "FAIL\n"); _exit(!passed); } int main(void) { struct sigaction sa = { .sa_sigaction = sigh, .sa_flags = SA_SIGINFO, }; sigaction(SIGFPE, &sa, NULL); asm ( "xor %ecx,%ecx\n" ".globl probe_div; probe_div:\n" "idiv %ecx\n" ); return 0; } it fails if probe_div() is probed. Note: show_unhandled_signals users should probably use this helper too, but we need to cleanup them first. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>	2014-05-14 13:57:28 +02:00
Oleg Nesterov	0eb14833d5	x86/traps: Kill DO_ERROR_INFO() Now that DO_ERROR_INFO() doesn't differ from DO_ERROR() we can remove it and use DO_ERROR() instead. Signed-off-by: Oleg Nesterov <oleg@redhat.com>	2014-05-14 13:57:28 +02:00
Oleg Nesterov	1c326c4dfe	x86/traps: Shift fill_trap_info() from DO_ERROR_INFO() to do_error_trap() Move the callsite of fill_trap_info() into do_error_trap() and remove the "siginfo_t info" argument. This obviously breaks DO_ERROR() which passed info == NULL, we simply change fill_trap_info() to return "siginfo_t " and add the "default" case which returns SEND_SIG_PRIV. Signed-off-by: Oleg Nesterov <oleg@redhat.com>	2014-05-14 13:57:27 +02:00
Oleg Nesterov	958d3d7298	x86/traps: Introduce fill_trap_info(), simplify DO_ERROR_INFO() Extract the fill-siginfo code from DO_ERROR_INFO() into the new helper, fill_trap_info(). It can calculate si_code and si_addr looking at trapnr, so we can remove these arguments from DO_ERROR_INFO() and simplify the source code. The generated code is the same, __builtin_constant_p(trapnr) == T. Signed-off-by: Oleg Nesterov <oleg@redhat.com>	2014-05-14 13:57:27 +02:00
Oleg Nesterov	dff0796e53	x86/traps: Introduce do_error_trap() Move the common code from DO_ERROR() and DO_ERROR_INFO() into the new helper, do_error_trap(). This simplifies define's and shaves 527 bytes from traps.o. Signed-off-by: Oleg Nesterov <oleg@redhat.com>	2014-05-14 13:57:27 +02:00
Oleg Nesterov	38cad57be9	x86/traps: Use SEND_SIG_PRIV instead of force_sig() force_sig() is just force_sig_info(SEND_SIG_PRIV). Imho it should die, we have too many ugly "send signal" helpers. And do_trap() looks just ugly because it uses force_sig_info() or force_sig() depending on info != NULL. Signed-off-by: Oleg Nesterov <oleg@redhat.com>	2014-05-14 13:57:26 +02:00
Oleg Nesterov	5e1b05beec	x86/traps: Make math_error() static Trivial, make math_error() static. Signed-off-by: Oleg Nesterov <oleg@redhat.com>	2014-05-14 13:57:26 +02:00
Denys Vlasenko	1ea30fb645	uprobes/x86: Fix scratch register selection for rip-relative fixups Before this patch, instructions such as div, mul, shifts with count in CL, cmpxchg are mishandled. This patch adds vex prefix handling. In particular, it avoids colliding with register operand encoded in vex.vvvv field. Since we need to avoid two possible register operands, the selection of scratch register needs to be from at least three registers. After looking through a lot of CPU docs, it looks like the safest choice is SI,DI,BX. Selecting BX needs care to not collide with implicit use of BX by cmpxchg8b. Test-case: #include <stdio.h> static const char const pass[] = { "FAIL", "pass" }; long two = 2; void test1(void) { long ax = 0, dx = 0; asm volatile("\n" " xor %%edx,%%edx\n" " lea 2(%%edx),%%eax\n" // We divide 2 by 2. Result (in eax) should be 1: " probe1: .globl probe1\n" " divl two(%%rip)\n" // If we have a bug (eax mangled on entry) the result will be 2, // because eax gets restored by probe machinery. : "=a" (ax), "=d" (dx) /out/ : "0" (ax), "1" (dx) /in/ : "memory" /clobber/ ); dprintf(2, "%s: %s\n", __func__, pass[ax == 1] ); } long val2 = 0; void test2(void) { long old_val = val2; long ax = 0, dx = 0; asm volatile("\n" " mov val2,%%eax\n" // eax := val2 " lea 1(%%eax),%%edx\n" // edx := eax+1 // eax is equal to val2. cmpxchg should store edx to val2: " probe2: .globl probe2\n" " cmpxchg %%edx,val2(%%rip)\n" // If we have a bug (eax mangled on entry), val2 will stay unchanged : "=a" (ax), "=d" (dx) /out/ : "0" (ax), "1" (dx) /in/ : "memory" /clobber/ ); dprintf(2, "%s: %s\n", __func__, pass[val2 == old_val + 1] ); } long val3[2] = {0,0}; void test3(void) { long old_val = val3[0]; long ax = 0, dx = 0; asm volatile("\n" " mov val3,%%eax\n" // edx:eax := val3 " mov val3+4,%%edx\n" " mov %%eax,%%ebx\n" // ecx:ebx := edx:eax + 1 " mov %%edx,%%ecx\n" " add $1,%%ebx\n" " adc $0,%%ecx\n" // edx:eax is equal to val3. cmpxchg8b should store ecx:ebx to val3: " probe3: .globl probe3\n" " cmpxchg8b val3(%%rip)\n" // If we have a bug (edx:eax mangled on entry), val3 will stay unchanged. // If ecx:edx in mangled, val3 will get wrong value. : "=a" (ax), "=d" (dx) /out/ : "0" (ax), "1" (dx) /in/ : "cx", "bx", "memory" /clobber/ ); dprintf(2, "%s: %s\n", __func__, pass[val3[0] == old_val + 1 && val3[1] == 0] ); } int main(int argc, char *argv) { test1(); test2(); test3(); return 0; } Before this change all tests fail if probe{1,2,3} are probed. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Reviewed-by: Jim Keniston <jkenisto@us.ibm.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com>	2014-05-14 13:57:25 +02:00
Denys Vlasenko	50204c6f6d	uprobes/x86: Simplify rip-relative handling It is possible to replace rip-relative addressing mode with addressing mode of the same length: (reg+disp32). This eliminates the need to fix up immediate and correct for changing instruction length. And we can kill arch_uprobe->def.riprel_target. Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Reviewed-by: Jim Keniston <jkenisto@us.ibm.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com>	2014-05-14 13:57:25 +02:00
Rob Herring	eafd370dfe	Merge branch 'dt-bus-name' into for-next	2014-05-13 18:34:35 -05:00
Anthony Iliopoulos	9844f54623	x86, mm, hugetlb: Add missing TLB page invalidation for hugetlb_cow() The invalidation is required in order to maintain proper semantics under CoW conditions. In scenarios where a process clones several threads, a thread operating on a core whose DTLB entry for a particular hugepage has not been invalidated, will be reading from the hugepage that belongs to the forked child process, even after hugetlb_cow(). The thread will not see the updated page as long as the stale DTLB entry remains cached, the thread attempts to write into the page, the child process exits, or the thread gets migrated to a different processor. Signed-off-by: Anthony Iliopoulos <anthony.iliopoulos@huawei.com> Link: http://lkml.kernel.org/r/20140514092948.GA17391@server-36.huawei.corp Suggested-by: Shay Goikhman <shay.goikhman@huawei.com> Acked-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org> # v2.6.16+ (!)	2014-05-13 16:34:09 -07:00
Alexei Starovoitov	773cd38f40	net: filter: x86: fix JIT address randomization bpf_alloc_binary() adds 128 bytes of room to JITed program image and rounds it up to the nearest page size. If image size is close to page size (like 4000), it is rounded to two pages: round_up(4000 + 4 + 128) == 8192 then 'hole' is computed as 8192 - (4000 + 4) = 4188 If prandom_u32() % hole selects a number >= PAGE_SIZE - sizeof(header) then kernel will crash during bpf_jit_free(): kernel BUG at arch/x86/mm/pageattr.c:887! Call Trace: [<ffffffff81037285>] change_page_attr_set_clr+0x135/0x460 [<ffffffff81694cc0>] ? _raw_spin_unlock_irq+0x30/0x50 [<ffffffff810378ff>] set_memory_rw+0x2f/0x40 [<ffffffffa01a0d8d>] bpf_jit_free_deferred+0x2d/0x60 [<ffffffff8106bf98>] process_one_work+0x1d8/0x6a0 [<ffffffff8106bf38>] ? process_one_work+0x178/0x6a0 [<ffffffff8106c90c>] worker_thread+0x11c/0x370 since bpf_jit_free() does: unsigned long addr = (unsigned long)fp->bpf_func & PAGE_MASK; struct bpf_binary_header header = (void *)addr; to compute start address of 'bpf_binary_header' and header->pages will pass junk to: set_memory_rw(addr, header->pages); Fix it by making sure that &header->image[prandom_u32() % hole] and &header are in the same page Fixes: 314beb9bcabfd ("x86: bpf_jit_comp: secure bpf jit against spraying attacks") Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-13 18:31:13 -04:00
Ville Syrjälä	36dfcea47a	x86/gpu: Sprinkle const, __init and __initconst to stolen memory quirks gen8_stolen_size() is missing __init, so add it. Also all the intel_stolen_funcs structures can be marked __initconst. intel_stolen_ids[] can also be made const if we replace the __initdata with __initconst. Cc: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-13 14:13:23 +02:00
Damien Lespiau	3e3b2c3908	x86/gpu: Implement stolen memory size early quirk for CHV CHV uses the same bits as SNB/VLV to code the Graphics Mode Select field (GFX stolen memory size) with the addition of finer granularity modes: 4MB increments from 0x11 (8MB) to 0x1d. Values strictly above 0x1d are either reserved or not supported. v2: 4MB increments, not 8MB. 32MB has been omitted from the list of new values (Ville Syrjälä) v3: Also correctly interpret GGMS (GTT Graphics Memory Size) (Ville Syrjälä) v4: Don't assign a value that needs 20bits or more to a u16 (Rafael Barbalho) [vsyrjala: v5: Split from i915 changes and add chv_stolen_funcs] Cc: Ingo Molnar <mingo@kernel.org> Cc: H. Peter Anvin <hpa@zytor.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Rafael Barbalho <rafael.barbalho@intel.com> Tested-by: Rafael Barbalho <rafael.barbalho@intel.com> Signed-off-by: Damien Lespiau <damien.lespiau@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-05-13 14:13:23 +02:00
Jan Kiszka	d9f89b88f5	KVM: x86: Fix CR3 reserved bits check in long mode Regression of 346874c9: PAE is set in long mode, but that does not mean we have valid PDPTRs. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2014-05-12 20:04:01 +02:00
David S. Miller	5f013c9bc7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/altera/altera_sgdma.c net/netlink/af_netlink.c net/sched/cls_api.c net/sched/sch_api.c The netlink conflict dealt with moving to netlink_capable() and netlink_ns_capable() in the 'net' tree vs. supporting 'tc' operations in non-init namespaces. These were simple transformations from netlink_capable to netlink_ns_capable. The Altera driver conflict was simply code removal overlapping some void pointer cast cleanups in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>	2014-05-12 13:19:14 -04:00
David Vrabel	aa8532c322	xen: refactor suspend pre/post hooks New architectures currently have to provide implementations of 5 different functions: xen_arch_pre_suspend(), xen_arch_post_suspend(), xen_arch_hvm_post_suspend(), xen_mm_pin_all(), and xen_mm_unpin_all(). Refactor the suspend code to only require xen_arch_pre_suspend() and xen_arch_post_suspend(). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>	2014-05-12 17:19:56 +01:00
H. Peter Anvin	7a5091d584	x86, rdrand: When nordrand is specified, disable RDSEED as well One can logically expect that when the user has specified "nordrand", the user doesn't want any use of the CPU random number generator, neither RDRAND nor RDSEED, so disable both. Reported-by: Stephan Mueller <smueller@chronox.de> Cc: Theodore Ts'o <tytso@mit.edu> Link: http://lkml.kernel.org/r/21542339.0lFnPSyGRS@myon.chronox.de Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-05-11 20:25:20 -07:00
Ong Boon Leong	04725ad594	x86, iosf: Add PCI ID macros for better readability Introduce PCI IDs macro for the list of supported product: BayTrail & Quark X1000. Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Link: http://lkml.kernel.org/r/1399668248-24199-5-git-send-email-david.e.box@linux.intel.com Signed-off-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-05-09 14:57:35 -07:00
Ong Boon Leong	90916e048c	x86, iosf: Add Quark X1000 PCI ID Add PCI device ID, i.e. that of the Host Bridge, for IOSF MBI driver. Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Link: http://lkml.kernel.org/r/1399668248-24199-4-git-send-email-david.e.box@linux.intel.com Signed-off-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-05-09 14:57:23 -07:00
Ong Boon Leong	7ef1def800	x86, iosf: Added Quark MBI identifiers Added all the MBI units below and their associated read/write opcodes: - Host Bridge Arbiter - Host Bridge - Remote Management Unit - Memory Manager & eSRAM - SoC Unit Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com> Link: http://lkml.kernel.org/r/1399668248-24199-3-git-send-email-david.e.box@linux.intel.com Signed-off-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-05-09 14:57:08 -07:00
David E. Box	6b8f0c8780	x86, iosf: Make IOSF driver modular and usable by more drivers Currently drivers that run on non-IOSF systems (Core/Xeon) can't use the IOSF driver on SOC's without selecting it which forces an unnecessary and limiting dependency. Provides dummy functions to allow these modules to conditionally use the driver on IOSF equipped platforms without impacting their ability to compile and load on non-IOSF platforms. Build default m to ensure availability on x86 SOC's. Signed-off-by: David E. Box <david.e.box@linux.intel.com> Link: http://lkml.kernel.org/r/1399668248-24199-2-git-send-email-david.e.box@linux.intel.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2014-05-09 14:56:15 -07:00
Boris Ostrovsky	28b92e09e2	x86, vdso, time: Cast tv_nsec to u64 for proper shifting in update_vsyscall() With tk->wall_to_monotonic.tv_nsec being a 32-bit value on 32-bit systems, (tk->wall_to_monotonic.tv_nsec << tk->shift) in update_vsyscall() may lose upper bits or, worse, add them since compiler will do this: (u64)(tk->wall_to_monotonic.tv_nsec << tk->shift) instead of ((u64)tk->wall_to_monotonic.tv_nsec << tk->shift) So if, for example, tv_nsec is 0x800000 and shift is 8 we will end up with 0xffffffff80000000 instead of 0x80000000. And then we are stuck in the subsequent 'while' loop. We need an explicit cast. Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: http://lkml.kernel.org/r/1399648287-15178-1-git-send-email-boris.ostrovsky@oracle.com Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: <stable@vger.kernel.org> # v3.14 Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2014-05-09 08:45:52 -07:00
Andres Freund	c45f77364b	x86: Fix typo in MSR_IA32_MISC_ENABLE_LIMIT_CPUID macro The spuriously added semicolon didn't have any effect because the macro isn't currently in use. c0a639ad0bc6b178b46996bd1f821a04643e2bde Signed-off-by: Andres Freund <andres@anarazel.de> Link: http://lkml.kernel.org/r/1399598957-7011-3-git-send-email-andres@anarazel.de Cc: Borislav Petkov <bp@suse.de> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2014-05-09 08:42:47 -07:00
Andres Freund	722a0d22d0	x86: Fix typo preventing msr_set/clear_bit from having an effect Due to a typo the msr accessor function introduced in 22085a66c2fab6cf9b9393c056a3600a6b4735de didn't have any lasting effects because they accidentally wrote the old value back. After c0a639ad0bc6b178b46996bd1f821a04643e2bde this at the very least this causes cpuid limits not to be lifted on some cpus leading to missing capabilities for those. Signed-off-by: Andres Freund <andres@anarazel.de> Link: http://lkml.kernel.org/r/1399598957-7011-2-git-send-email-andres@anarazel.de Cc: Borislav Petkov <bp@suse.de> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2014-05-09 08:42:32 -07:00
Vivek Goyal	a9a17104a1	x86, boot: Remove misc.h inclusion from compressed/string.c Given the fact that we removed inclusion of boot.h from boot/string.c does not look like we need misc.h inclusion in compressed/string.c. So remove it. misc.h was also pulling in string_32.h which in turn had macros for memcmp and memcpy. So we don't need to #undef memcmp and memcpy anymore. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Link: http://lkml.kernel.org/r/1398447972-27896-3-git-send-email-vgoyal@redhat.com Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2014-05-08 08:00:06 -07:00

... 3 4 5 6 7 ...

19460 Commits