linux

iv/linux

Author	SHA1	Message	Date
Al Viro	3b7d15bde5	um: ->restart_block.fn needs to be reset on sigreturn Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-05-21 14:19:31 -04:00
Stefano Stabellini	68c2c39a76	xen: do not map the same GSI twice in PVHVM guests. PV on HVM guests map GSIs into event channels. At restore time the event channels are resumed by restore_pirqs. Device drivers might try to register the same GSI again through ACPI at restore time, but the GSI has already been mapped and bound by restore_pirqs. This patch detects these situations and avoids mapping the same GSI multiple times. Without this patch we get: (XEN) irq.c:2235: dom4: pirq 23 or emuirq 28 already mapped and waste a pirq. CC: stable@kernel.org Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-05-21 14:11:36 -04:00
Sasha Levin	29d679ffd8	x86, printk: Add missing KERN_CONT to NMI selftest Fix this behaviour: ---------------- \| NMI testsuite: -------------------- remote IPI: ok \| local IPI: ok \| Revealed due to a new modification to printk(). Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Link: http://lkml.kernel.org/r/1336492573-17530-3-git-send-email-levinsasha928@gmail.com Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-05-21 10:13:04 -07:00
Linus Torvalds	7e5cb5e151	Merge branch 'vfs-cleanups' (random vfs cleanups) This teaches vfs_fstat() to use the appropriate f[get\|put]_light functions, allowing it to avoid some unnecessary locking for the common case. More noticeably, it also cleans up and simplifies the "getname_flags()" function, which now relies on the architecture strncpy_from_user() doing all the user access checks properly, instead of hacking around the fact that on x86 it didn't use to do it right (see commit 92ae03f2ef99: "x86: merge 32/64-bit versions of 'strncpy_from_user()' and speed it up"). * vfs-cleanups: VFS: make vfs_fstat() use f[get\|put]_light() VFS: clean up and simplify getname_flags() x86: make word-at-a-time strncpy_from_user clear bytes at the end	2012-05-21 08:46:08 -07:00
Linus Torvalds	8c12fec90c	Merge branch 'stat-cleanups' (clean up copying of stat info to user space) This makes cp_new_stat() a bit more readable, and avoids having to memset() the whole structure just to fill in a couple of padding fields. This is another result of me looking at code generation of functions that show up high on certain kernel profiles, and just going "Oh, let's just clean that up". Architectures that don't supply the #define to fill just the padding fields will still fall back to memset(). * stat-cleanups: vfs: don't force a big memset of stat data just to clear padding fields vfs: de-crapify "cp_new_stat()" function	2012-05-21 08:41:38 -07:00
Konrad Rzeszutek Wilk	2f1bd67d54	xen/smp: unbind irqworkX when unplugging vCPUs. The git commit 1ff2b0c303698e486f1e0886b4d9876200ef8ca5 "xen: implement IRQ_WORK_VECTOR handler" added the functionality to have a per-cpu "irqworkX" for the IPI APIC functionality. However it missed the unbind when a vCPU is unplugged resulting in an orphaned per-cpu interrupt line for unplugged vCPU: 30: 216 0 xen-dyn-event hvc_console 31: 810 4 xen-dyn-event eth0 32: 29 0 xen-dyn-event blkif - 36: 0 0 xen-percpu-ipi irqwork2 - 37: 287 0 xen-dyn-event xenbus + 36: 287 0 xen-dyn-event xenbus NMI: 0 0 Non-maskable interrupts LOC: 0 0 Local timer interrupts SPU: 0 0 Spurious interrupts Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2012-05-21 09:26:04 -04:00
Marek Szyprowski	0a2b9a6ea9	X86: integrate CMA with DMA-mapping subsystem This patch adds support for CMA to dma-mapping subsystem for x86 architecture that uses common pci-dma/pci-nommu implementation. This allows to test CMA on KVM/QEMU and a lot of common x86 boxes. Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> CC: Michal Nazarewicz <mina86@mina86.com> Acked-by: Arnd Bergmann <arnd@arndb.de>	2012-05-21 15:09:38 +02:00
Thomas Gleixner	bdebaf80a0	x86: Use generic time config Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Anna-Maria Gleixner <anna-maria@glx-um.de> Link: http://lkml.kernel.org/r/20120518163104.630579708@glx-um.de Cc: x86@kernel.org	2012-05-21 11:01:44 +02:00
Shuah Khan	74bc491795	x86/pci-calgary_64.c: Remove obsoleted simple_strtoul() usage Change calgary_parse_options() to call kstrtoul() instead of calling obsoleted simple_strtoul(). Signed-off-by: Shuah Khan <shuahkhan@gmail.com> Acked-by: Muli Ben-Yehuda <muli@cs.technion.ac.il> Cc: jdmason@kudzu.us Link: http://lkml.kernel.org/r/1337556268.3126.5.camel@lorien2 Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-21 10:29:40 +02:00
Ingo Molnar	bb27f55eb9	Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Fixes for perf/core: - Rename some perf_target methods to avoid double negation, from Namhyung Kim. - Revert change to use per task events with inheritance, from Namhyung Kim. - Events should start disabled till children starts running, from David Ahern. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-21 09:17:50 +02:00
H. Peter Anvin	61f5446169	x86, realmode: Move end signature into header.S The end signature was defined in wakeup_asm.S as it originally came from the ACPI wakeup code. However, we rely on the existence of the .signature section to expand .bss, otherwise we would have to include code to explicitly zero the .bss depending on the configuration. Since the expanded .bss is just in .init.data anyway, it's easier to always have it expanded. This fixes failures when compiled without CONFIG_ACPI_SLEEP. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com>	2012-05-21 00:02:45 -07:00
Linus Torvalds	5d1204582e	Merge branch 'x86/ld-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 linker bug workarounds from Peter Anvin. GNU ld-2.22.52.0.[12] () has an unfortunate bug where it incorrectly turns certain relocation entries absolute. Section-relative symbols that are part of otherwise empty sections are silently changed them to absolute. We rely on section-relative symbols staying section-relative, and actually have several sections in the linker script solely for this purpose. See for example http://sourceware.org/bugzilla/show_bug.cgi?id=14052 We could just black-list the buggy linker, but it appears that it got shipped in at least F17, and possibly other distros too, so it's sadly not some rare unusual case. This backports the workaround from the x86/trampoline branch, and as Peter says: "This is not a minimal fix, not at all, but it is a tested code base." 'x86/ld-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, relocs: When printing an error, say relative or absolute x86, relocs: Workaround for binutils 2.22.52.0.1 section bug x86, realmode: 16-bit real-mode code support for relocs tool (*) That's a manly release numbering system. Stupid, sure. But manly.	2012-05-19 15:28:22 -07:00
H. Peter Anvin	24ab82bd9b	x86, relocs: When printing an error, say relative or absolute When the relocs tool throws an error, let the error message say if it is an absolute or relative symbol. This should make it a lot more clear what action the programmer needs to take and should help us find the reason if additional symbol bugs show up. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: <stable@vger.kernel.org>	2012-05-18 19:50:02 -07:00
H. Peter Anvin	a3e854d95a	x86, relocs: Workaround for binutils 2.22.52.0.1 section bug GNU ld 2.22.52.0.1 has a bug that it blindly changes symbols from section-relative to absolute if they are in a section of zero length. This turns the symbols __init_begin and __init_end into absolute symbols. Let the relocs program know that those should be treated as relative symbols. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: H.J. Lu <hjl.tools@gmail.com> Cc: <stable@vger.kernel.org> Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com>	2012-05-18 19:50:00 -07:00
H. Peter Anvin	6520fe5564	x86, realmode: 16-bit real-mode code support for relocs tool A new option is added to the relocs tool called '--realmode'. This option causes the generation of 16-bit segment relocations and 32-bit linear relocations for the real-mode code. When the real-mode code is moved to the low-memory during kernel initialization, these relocation entries can be used to relocate the code properly. In the assembly code 16-bit segment relocations must be relative to the 'real_mode_seg' absolute symbol. Linear relocations must be relative to a symbol prefixed with 'pa_'. 16-bit segment relocation is used to load cs:ip in 16-bit code. Linear relocations are used in the 32-bit code for relocatable data references. They are declared in the linker script of the real-mode code. The relocs tool is moved to arch/x86/tools/relocs.c, and added new target archscripts that can be used to build scripts needed building an architecture. be compiled before building the arch/x86 tree. [ hpa: accelerating this because it detects invalid absolute relocations, a serious bug in binutils 2.22.52.0.x which currently produces bad kernels. ] Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1336501366-28617-2-git-send-email-jarkko.sakkinen@intel.com Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: <stable@vger.kernel.org>	2012-05-18 19:49:40 -07:00
H. Peter Anvin	8a3b947c40	x86, relocs: When printing an error, say relative or absolute When the relocs tool throws an error, let the error message say if it is an absolute or relative symbol. This should make it a lot more clear what action the programmer needs to take. Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2012-05-18 09:52:01 -07:00
Linus Torvalds	3d9944978e	Fix machine check recovery -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPtS5qAAoJEKurIx+X31iB+bAP/1FjUa2Nd53X89HFc6DoLktA 4AshM/JENhAfSbpTfGhg10ZOuwUa8sQ85Sf6yz1CsW6mEiJK/bDFrR1g2KrmejyL owgQvV6ukPABzfB27tWyXSVVBPmLkJviedLDautVpgPEPVuqauntmpe7fW51b5pf 92MxvYZ6AzgbDIjVaXP7e+kPomvgyM1C/UEvCgoyEcw81h5dchU9NSdXNBS67JS/ uOsMiMJyoNI46haYYbyFgMq3RmpYuxTLFj7qFDlUltyjP+vIvyLs38Ae/vkRMNfV sYXWRUQlRpvqg4MDIFVZx8FWTufzTm0BMS+Be7JkXKWdF3DAksq6FprOWIxfYi+d PMxwTFeSJzTINb9n9MiLt3TmuRy3mu37QWd28qJaJciNMkYWbclPqyJmjwsuAMKg hKSy2FvewIDHTAGOkwaVjS+L8O7j3TNRIAbweFA1d1K4rt6oSfwdn6GZrAb6MTvx oV0Fe1nAyY9mucyjknBTim3RYZ9qQ7H9SjL8JoaGihSzi988MgBun+iEQOQiwjNl YJNGIv0wb31qCKFtxU0A8rA0sFdhQRoLgigJfETb5a+2ghtG323oH6ZVWTnB8bp/ g1XOpus222/iJdL2AizMiJeUNV1IZ0SeLn2GoTFQIy9Uf11Zdgu5wLJumUva8EC6 JAACbpBlYufftIn278OH =tk2M -----END PGP SIGNATURE----- Merge tag 'linus-mce-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull a machine check recovery fix from Tony Luck. I really don't like how the MCE code does some of the things it does, but this does seem to be an improvement. * tag 'linus-mce-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: x86/mce: Only restart instruction after machine check recovery if it is safe	2012-05-18 09:42:20 -07:00
Arnaldo Carvalho de Melo	16ee6576e2	Merge remote-tracking branch 'tip/perf/urgent' into perf/core Merge reason: We are going to queue up a dependent patch: "perf tools: Move parse event automated tests to separated object" That depends on: commit e7c72d8 perf tools: Add 'G' and 'H' modifiers to event parsing Conflicts: tools/perf/builtin-stat.c Conflicted with the recent 'perf_target' patches when checking the result of perf_evsel open routines to see if a retry is needed to cope with older kernels where the exclude guest/host perf_event_attr bits were not used. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2012-05-18 13:13:33 -03:00
H. Peter Anvin	c54a354c18	x86, relocs: More relocations which may end up as absolute GNU ld 2.22.52.0.1 has a bug that it blindly changes symbols from section-relative to absolute if they are in a section of zero length. This turns the symbols __init_begin and __init_end into absolute symbols. Let the relocs program know that those should be treated as relative symbols. This bug is exposed by checkin 433de739bbc2 x86, realmode: 16-bit real-mode code support for relocs tool only in the sense that that checkin changes the relocs tool to report an error instead of silently generating a kernel which is broken if relocated. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: H.J. Lu <hjl.tools@gmail.com> Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com>	2012-05-18 08:31:44 -07:00
Robert Richter	5bcdf5e4fe	perf/x86: Update event scheduling constraints for AMD family 15h models This update is for newer family 15h cpu models from 0x02 to 0x1f. Signed-off-by: Robert Richter <robert.richter@amd.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: stable@vger.kernel.org # v2.6.39+ Link: http://lkml.kernel.org/r/1337337642-1621-1-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-18 14:07:46 +02:00
Jan Beulich	20167d3421	x86-64: Fix accounting in kernel_physical_mapping_init() When finding a present and acceptable 2M/1G mapping, the number of pages mapped this way shouldn't be incremented (as it was already incremented when the earlier part of the mapping was established). Instead, last_map_addr needs to be updated in this case. Further, address increments were wrong in one place each in both phys_pmd_init() and phys_pud_init() (lacking the aligning down to the respective page boundary). As we're now doing the same calculation several times, fold it into a single instance using a local variable (matching how kernel_physical_mapping_init() itself does it at the PGD level). Observed during code inspection, not because of an actual problem. Signed-off-by: Jan Beulich <jbeulich@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/4FB3C27202000078000841A0@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-18 10:13:37 +02:00
Alex Shi	3e7f3db001	x86/tlb: Clean up and unify TLB_FLUSH_ALL definition Since sizeof(long) is 4 in x86_32 mode, and it's 8 in x86_64 mode, sizeof(long long) is also 8 byte in x86_64 mode. use long mode can fit TLB_FLUSH_ALL defination here both in 32 or 64 bits mode. Signed-off-by: Alex Shi <alex.shi@intel.com> Link: http://lkml.kernel.org/n/tip-evv5bekiipi2pmyzdsy8lkkw@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-18 10:13:37 +02:00
Michael S. Tsirkin	0ab711ae6a	x86/apic: Implement EIO micro-optimization We know both register and value for eoi beforehand, so there's no need to check it and no need to do math to calculate the msr. Saves instructions/branches on each EOI when using x2apic. I looked at the objdump output to verify that the generated code looks right and actually is shorter. The real improvemements will be on the KVM guest side though, those come in a later patch. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Cc: Avi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: gleb@redhat.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/e019d1a125316f10d3e3a4b2f6bda41473f4fb72.1337184153.git.mst@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-18 09:46:09 +02:00
Michael S. Tsirkin	2a43195d83	x86/apic: Add apic->eoi_write() callback Add eoi_write callback so that kvm can override eoi accesses without touching the rest of the apic. As a side-effect, this will enable a micro-optimization for apics using msr. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Cc: Avi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: gleb@redhat.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/0df425d746c49ac2ecc405174df87752869629d2.1337184153.git.mst@redhat.com [ tidied it up a bit ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-18 09:46:08 +02:00
Michael S. Tsirkin	4ebcc24390	x86/apic: Use symbolic APIC_EOI_ACK Use the symbol instead of hard-coded numbers, now that the reason for the value is documented where the constant is defined we don't need to duplicate this explanation in code. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Cc: Avi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: gleb@redhat.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/ecbe4c79d69c172378e47e5a587ff5cd10293c9f.1337184153.git.mst@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-18 09:46:08 +02:00
Michael S. Tsirkin	c8f64bf7df	x86/apic: Fix typo EIO_ACK -> EOI_ACK and document it Fix typo in the macro name and document the reason it has this value. Update users. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Cc: Avi Kivity <avi@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: gleb@redhat.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/37867b31b9330690af2e60a2a7c4cb4b1b070caf.1337184153.git.mst@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-18 09:46:07 +02:00
Ingo Molnar	87e4baacae	x86/xen/apic: Add missing #include <xen/xen.h> This file depends on <xen/xen.h>, but the dependency was hidden due to: <asm/acpi.h> -> <asm/trampoline.h> -> <asm/io.h> -> <xen/xen.h> With the removal of <asm/trampoline.h>, this exposed the missing Cc: Len Brown <lenb@kernel.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/n/tip-7ccybvue6mw6wje3uxzzcglj@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-18 09:34:45 +02:00
H. Peter Anvin	bea3f8781e	x86, relocs: Workaround for binutils 2.22.52.0.1 section bug GNU ld 2.22.52.0.1 has a bug that it blindly changes symbols from section-relative to absolute if they are in a section of zero length. This turns the symbols __init_begin and __init_end into absolute symbols. Let the relocs program know that those should be treated as relative symbols. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: H.J. Lu <hjl.tools@gmail.com>	2012-05-18 00:24:09 -07:00
Paul Gortmaker	bb8187d35f	MCA: delete all remaining traces of microchannel bus support. Hardware with MCA bus is limited to 386 and 486 class machines that are now 20+ years old and typically with less than 32MB of memory. A quick search on the internet, and you see that even the MCA hobbyist/enthusiast community has lost interest in the early 2000 era and never really even moved ahead from the 2.4 kernels to the 2.6 series. This deletes anything remaining related to CONFIG_MCA from core kernel code and from the x86 architecture. There is no point in carrying this any further into the future. One complication to watch for is inadvertently scooping up stuff relating to machine check, since there is overlap in the TLA name space (e.g. arch/x86/boot/mca.c). Cc: Thomas Gleixner <tglx@linutronix.de> Cc: James Bottomley <JBottomley@Parallels.com> Cc: x86@kernel.org Acked-by: Ingo Molnar <mingo@elte.hu> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-05-17 19:06:13 -04:00
Alan Cox	80b3e55737	x86: Fix boot on Twinhead H12Y Despite lots of investigation into why this is needed we don't know or have an elegant cure. The only answer found on this laptop is to mark a problem region as used so that Linux doesn't put anything there. Currently all the users add reserve= command lines and anyone not knowing this needs to find the magic page that documents it. Automate it instead. Signed-off-by: Alan Cox <alan@linux.intel.com> Tested-and-bugfixed-by: Arne Fitzenreiter <arne@fitzenreiter.de> Resolves-bug: https://bugzilla.kernel.org/show_bug.cgi?id=10231 Link: http://lkml.kernel.org/r/20120515174347.5109.94551.stgit@bluebook Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-17 20:04:00 +02:00
Linus Torvalds	31ae98359d	Merge branches 'perf-urgent-for-linus', 'x86-urgent-for-linus' and 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf, x86 and scheduler updates from Ingo Molnar. * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tracing: Do not enable function event with enable perf stat: handle ENXIO error for perf_event_open perf: Turn off compiler warnings for flex and bison generated files perf stat: Fix case where guest/host monitoring is not supported by kernel perf build-id: Fix filename size calculation * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, kvm: KVM paravirt kernels don't check for CPUID being unavailable x86: Fix section annotation of acpi_map_cpu2node() x86/microcode: Ensure that module is only loaded on supported Intel CPUs * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix KVM and ia64 boot crash due to sched_groups circular linked list assumption	2012-05-17 09:35:17 -07:00
Peter Zijlstra	8e7fbcbc22	sched: Remove stale power aware scheduling remnants and dysfunctional knobs It's been broken forever (i.e. it's not scheduling in a power aware fashion), as reported by Suresh and others sending patches, and nobody cares enough to fix it properly ... so remove it to make space free for something better. There's various problems with the code as it stands today, first and foremost the user interface which is bound to topology levels and has multiple values per level. This results in a state explosion which the administrator or distro needs to master and almost nobody does. Furthermore large configuration state spaces aren't good, it means the thing doesn't just work right because it's either under so many impossibe to meet constraints, or even if there's an achievable state workloads have to be aware of it precisely and can never meet it for dynamic workloads. So pushing this kind of decision to user-space was a bad idea even with a single knob - it's exponentially worse with knobs on every node of the topology. There is a proposal to replace the user interface with a single 3 state knob: sched_balance_policy := { performance, power, auto } where 'auto' would be the preferred default which looks at things like Battery/AC mode and possible cpufreq state or whatever the hw exposes to show us power use expectations - but there's been no progress on it in the past many months. Aside from that, the actual implementation of the various knobs is known to be broken. There have been sporadic attempts at fixing things but these always stop short of reaching a mergable state. Therefore this wholesale removal with the hopes of spurring people who care to come forward once again and work on a coherent replacement. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1326104915.2442.53.camel@twins Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-17 13:48:56 +02:00
Dave Airlie	db2e034d2c	x86/vga: fix build with efi disabled. Reported by sfr on -next merge. Signed-off-by: Dave Airlie <airlied@redhat.com>	2012-05-17 08:32:50 +01:00
Steven Rostedt	e4f5d5440b	ftrace/x86: Have x86 ftrace use the ftrace_modify_all_code() To remove duplicate code, have the ftrace arch_ftrace_update_code() use the generic ftrace_modify_all_code(). This requires that the default ftrace_replace_code() becomes a weak function so that an arch may override it. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2012-05-16 20:00:27 -04:00
Suresh Siddha	1dcc8d7ba2	x86, fpu: drop the fpu state during thread exit There is no need to save any active fpu state to the task structure memory if the task is dead. Just drop the state instead. For example, this saved some 1770 xsave's during the system boot of a two socket Xeon system. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1336692811-30576-4-git-send-email-suresh.b.siddha@intel.com Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-16 15:20:59 -07:00
Suresh Siddha	d75f1b391f	x86, xsave: remove thread_has_fpu() bug check in __sanitize_i387_state() Code paths like fork(), exit() and signal handling flush the fpu state explicitly to the structures in memory. BUG_ON() in __sanitize_i387_state() is checking that the fpu state is not live any more. But for preempt kernels, task can be scheduled out and in at any place and the preload_fpu logic during context switch can make the fpu registers live again. For example, consider a 64-bit Task which uses fpu frequently and as such you will find its fpu_counter mostly non-zero. During its time slice, kernel used fpu by doing kernel_fpu_begin/kernel_fpu_end(). After this, in the same scheduling slice, task-A got a signal to handle. Then during the signal setup path we got preempted when we are just before the sanitize_i387_state() in arch/x86/kernel/xsave.c:save_i387_xstate(). And when we come back we will have the fpu registers live that can hit the bug_on. Similarly during core dump, other threads can context-switch in and out (because of spurious wakeups while waiting for the coredump to finish in kernel/exit.c:exit_mm()) and the main thread dumping core can run into this bug when it finds some other thread with its fpu registers live on some other cpu. So remove the paranoid check for now, even though it caught a bug in the multi-threaded core dump case (fixed in the previous patch). Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1336692811-30576-3-git-send-email-suresh.b.siddha@intel.com Cc: Oleg Nesterov <oleg@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-16 15:17:17 -07:00
Suresh Siddha	55ccf3fe3f	fork: move the real prepare_to_copy() users to arch_dup_task_struct() Historical prepare_to_copy() is mostly a no-op, duplicated for majority of the architectures and the rest following the x86 model of flushing the extended register state like fpu there. Remove it and use the arch_dup_task_struct() instead. Suggested-by: Oleg Nesterov <oleg@redhat.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Link: http://lkml.kernel.org/r/1336692811-30576-1-git-send-email-suresh.b.siddha@intel.com Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: David Howells <dhowells@redhat.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Chris Zankel <chris@zankel.net> Cc: Richard Henderson <rth@twiddle.net> Cc: Russell King <linux@arm.linux.org.uk> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Mark Salter <msalter@redhat.com> Cc: Aurelien Jacquiot <a-jacquiot@ti.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Tony Luck <tony.luck@intel.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Jonas Bonn <jonas@southpole.se> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Chen Liqin <liqin.chen@sunplusct.com> Cc: Lennox Wu <lennox.wu@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-16 15:16:26 -07:00
Avi Kivity	d8368af8b4	KVM: Fix mmu_reload() clash with nested vmx event injection Currently the inject_pending_event() call during guest entry happens after kvm_mmu_reload(). This is for historical reasons - we used to inject_pending_event() in atomic context, while kvm_mmu_reload() needs task context. A problem is that nested vmx can cause the mmu context to be reset, if event injection is intercepted and causes a #VMEXIT instead (the #VMEXIT resets CR0/CR3/CR4). If this happens, we end up with invalid root_hpa, and since kvm_mmu_reload() has already run, no one will fix it and we end up entering the guest this way. Fix by reordering event injection to be before kvm_mmu_reload(). Use ->cancel_injection() to undo if kvm_mmu_reload() fails. https://bugzilla.kernel.org/show_bug.cgi?id=42980 Reported-by: Luke-Jr <luke-jr+linuxbugs@utopios.org> Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2012-05-16 18:09:26 -03:00
H. Peter Anvin	638d957b51	x86, realmode: Change EFER to a single u64 field Change EFER to be a single u64 field instead of two u32 fields; change the order to maintain alignment. Note that on x86-64 cr4 is really also a 64-bit quantity, although we can only set the low 32 bits from the trampoline code since it is still executing in 32-bit mode at that point. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com>	2012-05-16 14:02:05 -07:00
Peter Jones	ab7b64e9ee	x86: Don't continue booting if we can't load the specified initrd If we've determined we can't do what the user asked, trying to do something else isn't going to make the user's life better. Without this the screen scrolls a bit and then you get a panic anyway, and it's nice not to have so much scroll after the real problem in bug reports. Link: http://lkml.kernel.org/r/1337190206-12121-1-git-send-email-pjones@redhat.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-16 13:59:52 -07:00
H. Peter Anvin	1371270188	x86, realmode: Move kernel/realmode.c to realmode/init.c Keep all the realmode code together, including initialization (only the rm/ subdirectory is actually built as real-mode code, anyway.) Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com>	2012-05-16 13:49:10 -07:00
H. Peter Anvin	51edbe6a2f	x86, realmode: Move not-common bits out of trampoline_common.S Move the bits that aren't actually common out of trampoline_common.S and into the arch-specific files. Furthermore, make sure the page directory is first in the .bss section for trampoline_64.S in order to not waste an entire page of memory. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com>	2012-05-16 13:44:10 -07:00
H. Peter Anvin	796038799a	x86, realmode: Mask out EFER.LMA when saving trampoline EFER Some AMD processors apparently #GP(0) if EFER.LMA is set in WRMSR, rather than ignoring it. Thus, we need to mask it out. Reported-by: Ingo Molnar <mingo@kernel.org> Tested-by: Borislav Petkov <bp@alien8.de> Cc: Jarkko Sakkinen <jarkko.sakkinen@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Link: http://lkml.kernel.org/r/1336501366-28617-24-git-send-email-jarkko.sakkinen@intel.com	2012-05-16 13:22:41 -07:00
Avi Kivity	c142786c62	KVM: MMU: Don't use RCU for lockless shadow walking Using RCU for lockless shadow walking can increase the amount of memory in use by the system, since RCU grace periods are unpredictable. We also have an unconditional write to a shared variable (reader_counter), which isn't good for scaling. Replace that with a scheme similar to x86's get_user_pages_fast(): disable interrupts during lockless shadow walk to force the freer (kvm_mmu_commit_zap_page()) to wait for the TLB flush IPI to find the processor with interrupts enabled. We also add a new vcpu->mode, READING_SHADOW_PAGE_TABLES, to prevent kvm_flush_remote_tlbs() from avoiding the IPI. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2012-05-16 16:08:28 -03:00
Avi Kivity	b2da15ac26	KVM: VMX: Optimize %ds, %es reload On x86_64, we can defer %ds and %es reload to the heavyweight context switch, since nothing in the lightweight paths uses the host %ds or %es (they are ignored by the processor). Furthermore we can avoid the load if the segments are null, by letting the hardware load the null segments for us. This is the expected case. On i386, we could avoid the reload entirely, since the entry.S paths take care of reload, except for the SYSEXIT path which leaves %ds and %es set to __USER_DS. So we set them to the same values as well. Saves about 70 cycles out of 1600 (around 4%; noisy measurements). Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2012-05-16 16:03:19 -03:00
Avi Kivity	512d5649e8	KVM: VMX: Fix %ds/%es clobber The vmx exit code unconditionally restores %ds and %es to __USER_DS. This can override the user's values, since %ds and %es are not saved and restored in x86_64 syscalls. In practice, this isn't dangerous since nobody uses segment registers in long mode, least of all programs that use KVM. Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>	2012-05-16 16:03:19 -03:00
Dave Airlie	6cf20beec4	x86/vga: set the default device from the fixup. Since Matthew's efi/vga changes on non-EFI machines we were failing to tell the vgaarb/switcheroo what the default device was, this sets the default device in the quirk if none has been set before. This fixes the switcheroo on my T410s. Cc: Matthew Garrett <mjg@redhat.com> Acked-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2012-05-16 10:22:16 +01:00
Bjorn Helgaas	867aae6ebe	x86/PCI: only check for spinlock being held in SMP kernels spin_is_locked() is always false on UP kernels: spin_lock_irqsave() does no locking, so we can't tell whether the lock is held or not. Therefore, this warning is only valid for SMP kernels. CC: Myron Stowe <myron.stowe@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2012-05-15 17:01:09 -06:00
Shuah Khan	363f7ce325	x86: kernel/dumpstack.c simple_strtoul cleanup Change kstack_setup() and code_bytes_setup() in kernel/dumpstack.c to call kstrtoul() instead of calling obsoleted simple_strtoul(). Signed-off-by: Shuah Khan <shuahkhan@gmail.com> Link: http://lkml.kernel.org/r/1336327084.2897.15.camel@lorien2 Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-15 15:36:42 -07:00
Shuah Khan	5abe68e493	x86: kernel/check.c simple_strtoul cleanup Change set_corruption_check() and set_corruption_check_period() in kernel/check.c to call kstrtoul() instead of calling obsoleted simple_strtoul(). Signed-off-by: Shuah Khan <shuahkhan@gmail.com> Link: http://lkml.kernel.org/r/1336326908.2897.12.camel@lorien2 Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-05-15 15:36:41 -07:00

1 2 3 4 5 ...

15397 Commits