linux

Author	SHA1	Message	Date
Dave Hansen	ae0def05ed	perf/x86: Only print PMU state when also WARN()'ing intel_pmu_handle_irq() has a warning in it if it does too many loops. It is a WARN_ONCE(), but the perf_event_print_debug() call beneath it is unconditional. For the first warning, you get a nice backtrace and message, but subsequent ones just dump the PMU state with no leading messages. I doubt this is what was intended. This patch will only print the PMU state when paired with the WARN_ON() text. It effectively open-codes WARN_ONCE()'s one-time-only logic. My suspicion is that the code really just wants to make sure we do not sit in the loop and spit out a warning for every loop iteration after the 100th. From what I've seen, this is very unlikely to happen since we also clear the PMU state. After this patch, instead of seeing the PMU state dumped each time, you will just see: [57494.894540] perf_event_intel: clearing PMU state on CPU#129 [57579.539668] perf_event_intel: clearing PMU state on CPU#10 [57587.137762] perf_event_intel: clearing PMU state on CPU#134 [57623.039912] perf_event_intel: clearing PMU state on CPU#114 [57644.559943] perf_event_intel: clearing PMU state on CPU#118 ... Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20130530174559.0DB049F4@viggo.jf.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-06-19 12:50:47 +02:00
Andrew Hunter	43b4578071	perf/x86: Reduce stack usage of x86_schedule_events() x86_schedule_events() caches event constraints on the stack during scheduling. Given the number of possible events, this is 512 bytes of stack; since it can be invoked under schedule() under god-knows-what, this is causing stack blowouts. Trade some space usage for stack safety: add a place to cache the constraint pointer to struct perf_event. For 8 bytes per event (1% of its size) we can save the giant stack frame. This shouldn't change any aspect of scheduling whatsoever and while in theory the locality's a tiny bit worse, I doubt we'll see any performance impact either. Tested: `perf stat whatever` does not blow up and produces results that aren't hugely obviously wrong. I'm not sure how to run particularly good tests of perf code, but this should not produce any functional change whatsoever. Signed-off-by: Andrew Hunter <ahh@google.com> Reviewed-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1369332423-4400-1-git-send-email-ahh@google.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-06-19 12:50:44 +02:00
Ingo Molnar	eff2108f02	Merge branch 'perf/urgent' into perf/core Merge in the latest fixes, to avoid conflicts with ongoing work. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-06-19 12:44:41 +02:00
Stephane Eranian	f1a527899e	perf/x86: Fix broken PEBS-LL support on SNB-EP/IVB-EP This patch fixes broken support of PEBS-LL on SNB-EP/IVB-EP. For some reason, the LDLAT extra reg definition for snb_ep showed up as duplicate in the snb table. This patch moves the definition of LDLAT back into the snb_ep table. Thanks to Don Zickus for tracking this one down. Signed-off-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20130607212210.GA11849@quad Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-06-19 12:44:16 +02:00
Igor Mammedov	07868fc6aa	x86: kvmclock: zero initialize pvclock shared memory area The kernel might hang in pvclock_clocksource_read() because uninitialized memory might contain an odd version value, which keeps the following loop spinning: do { version = __pvclock_read_cycles(src, &ret, &flags); } while ((src->version & 1) || version != src->version); This happens if a secondary kvmclock is accessed before it is registered with KVM. Clear garbage in the pvclock shared memory area right after it is allocated to avoid this issue. Ref: https://bugzilla.kernel.org/show_bug.cgi?id=59521 Signed-off-by: Igor Mammedov <imammedo@redhat.com> [See BZ for analysis. We may want a different fix for 3.11, but this is the safest for now - Paolo] Cc: <stable@vger.kernel.org> # 3.8 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-06-19 12:25:28 +02:00
Randy Dunlap	d1603990ea	x86: fix build error and kconfig for ia32_emulation and binfmt Fix kconfig warning and build errors on x86_64 by selecting BINFMT_ELF when COMPAT_BINFMT_ELF is being selected. warning: (IA32_EMULATION) selects COMPAT_BINFMT_ELF which has unmet direct dependencies (COMPAT && BINFMT_ELF) fs/built-in.o: In function `elf_core_dump': compat_binfmt_elf.c:(.text+0x3e093): undefined reference to `elf_core_extra_phdrs' compat_binfmt_elf.c:(.text+0x3ebcd): undefined reference to `elf_core_extra_data_size' compat_binfmt_elf.c:(.text+0x3eddd): undefined reference to `elf_core_write_extra_phdrs' compat_binfmt_elf.c:(.text+0x3f004): undefined reference to `elf_core_write_extra_data' [ hpa: This was sent to me for -next but it is a low risk build fix ] Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: http://lkml.kernel.org/r/51C0B614.5000708@infradead.org Cc: <stable@vger.kernel.org> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-06-18 16:20:32 -05:00
Yinghai Lu	d8d386c106	x86, mtrr: Fix original mtrr range get for mtrr_cleanup Joshua reported: Commit cd7b304dfaf1 (x86, range: fix missing merge during add range) broke mtrr cleanup on his setup in 3.9.5. The corresponding upstream commit is fbe06b7bae7c. BADgran_size: 64K chunk_size: 16M num_reg: 6 lose cover RAM: -0G https://bugzilla.kernel.org/show_bug.cgi?id=59491 So it rejects the new var mtrr layout. It turns out we have a problem with the initial mtrr range retrieval. The current sequence is: x86_get_mtrr_mem_range ==> bunches of add_range_with_merge ==> bunches of subtract_range ==> clean_sort_range add_range_with_merge for [0,1M) sort_range() add_range_with_merge can leave blank slots, so we cannot just sort; the final result would have an extra blank slot at the head. So move the add_range_with_merge call for [0,1M); with that we can avoid the extra clean_sort_range call. Reported-by: Joshua Covington <joshuacov@googlemail.com> Tested-by: Joshua Covington <joshuacov@googlemail.com> Signed-off-by: Yinghai Lu <yinghai@kernel.org> Link: http://lkml.kernel.org/r/1371154622-8929-2-git-send-email-yinghai@kernel.org Cc: <stable@vger.kernel.org> v3.9 Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-06-18 11:32:02 -05:00
Zhanghaoyu (A)	764bcbc5a6	KVM: x86: remove vcpu's CPL check in host-invoked XCR set The __kvm_set_xcr function does the CPL check when setting an XCR. __kvm_set_xcr is called in two flows. One is invoked by the guest, with the call stack shown below: handle_xsetbv (or xsetbv_interception) kvm_set_xcr __kvm_set_xcr The other is invoked by the host, for example during system reset: kvm_arch_vcpu_ioctl kvm_vcpu_ioctl_x86_set_xcrs __kvm_set_xcr The former does need the CPL check, but the latter does not. Cc: stable@vger.kernel.org Signed-off-by: Zhang Haoyu <haoyu.zhang@huawei.com> [Tweaks to commit message. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-06-18 09:55:35 +02:00
Greg Kroah-Hartman	bb07b00be7	Merge 3.10-rc6 into driver-core-next We want these fixes here too. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-06-17 16:57:20 -07:00
Fenghua Yu	6bb2ff846f	x86 thermal: Disable power limit notification interrupt by default The package power limit notification interrupt is primarily for system diagnosis, and should not be blindly enabled on every system by default -- particularly since Linux does nothing in the handler except count how many times it has been called... Add a new kernel cmdline parameter "int_pln_enable" for situations where users want to observe these events via existing system counters: $ grep TRM /proc/interrupts $ grep . /sys/devices/system/cpu/cpu/thermal_throttle/ https://bugzilla.kernel.org/show_bug.cgi?id=36182 Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2013-06-14 14:49:00 -07:00
Fenghua Yu	c81147483e	x86 thermal: Delete power-limit-notification console messages Package power limits are common on some systems under some conditions -- so printing console messages when limits are reached causes unnecessary customer concern and support calls. Note that even with these console messages gone, the events can still be observed via system counters: $ grep TRM /proc/interrupts Shows total thermal interrupts, which includes both power limit notifications and thermal throttling interrupts. $ grep . /sys/devices/system/cpu/cpu/thermal_throttle/ Will show what caused those interrupts, core and package throttling and power limit notifications. https://bugzilla.kernel.org/show_bug.cgi?id=36182 Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2013-06-14 14:48:37 -07:00
Steve Capper	53313b2c91	x86: mm: Remove general hugetlb code from x86. huge_pte_alloc, huge_pte_offset and follow_huge_p[mu]d have already been copied over to mm. This patch removes the x86 copies of these functions and activates the general ones by enabling: CONFIG_ARCH_WANT_GENERAL_HUGETLB Signed-off-by: Steve Capper <steve.capper@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Andrew Morton <akpm@linux-foundation.org>	2013-06-14 09:40:15 +01:00
Steve Capper	cfe28c5d63	x86: mm: Remove x86 version of huge_pmd_share. The huge_pmd_share code has been copied over to mm/hugetlb.c to make it accessible to other architectures. Remove the x86 copy of the huge_pmd_share code and enable the ARCH_WANT_HUGE_PMD_SHARE config flag. That way we reference the general one. Signed-off-by: Steve Capper <steve.capper@linaro.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Andrew Morton <akpm@linux-foundation.org>	2013-06-14 09:39:46 +01:00
Linus Torvalds	cb03dc094a	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Peter Anvin: "Another set of fixes, the biggest bit of this is yet another tweak to the UEFI anti-bricking code; apparently we finally got some feedback from Samsung as to what makes at least their systems fail. This set should actually fix the boot regressions that some other systems (e.g. SGI) have exhibited. Other than that, there is a patch to avoid a panic with particularly unhappy memory layouts and two minor protocol fixes which may or may not be manifest bugs" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86: Fix typo in kexec register clearing x86, relocs: Move __vvar_page from S_ABS to S_REL Modify UEFI anti-bricking code x86: Fix adjust_range_size_mask calling position	2013-06-13 13:08:51 -07:00
H. Peter Anvin	45df901cc8	Merge tag 'efi-urgent' into x86/urgent * More tweaking to the EFI variable anti-bricking algorithm. Quite a few users were reporting boot regressions in v3.9. This has now been fixed with a more accurate "minimum storage requirement to avoid bricking" value from Samsung (5K instead of 50%) and code to trigger garbage collection when we near our limit - Matthew Garrett. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-06-13 08:59:23 -07:00
Jussi Kivilinna	fe6510b5d6	crypto: aesni_intel - fix accessing of unaligned memory The new XTS code for aesni_intel uses input buffers directly as memory operands for pxor instructions, which causes a crash if those buffers are not aligned to 16 bytes. This patch changes the XTS code to handle unaligned memory correctly, by loading memory with movdqu instead. Reported-by: Dave Jones <davej@redhat.com> Tested-by: Dave Jones <davej@redhat.com> Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2013-06-13 14:57:42 +08:00
Srinivas Pandruvada	25cdce170d	x86, mcheck, therm_throt: Process package thresholds Added callback registration for package threshold reports. Also added a callback to check whether rate control is implemented in the callback. If no rate control is implemented, a default rate control applies, similar to core threshold notification, delaying for CHECK_INTERVAL (5 minutes) between reports. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Zhang Rui <rui.zhang@intel.com>	2013-06-13 09:59:14 +08:00
Kees Cook	c8a22d19dd	x86: Fix typo in kexec register clearing Fixes a typo in register clearing code. Thanks to PaX Team for fixing this originally, and James Troup for pointing it out. Signed-off-by: Kees Cook <keescook@chromium.org> Link: http://lkml.kernel.org/r/20130605184718.GA8396@www.outflux.net Cc: <stable@vger.kernel.org> v2.6.30+ Cc: PaX Team <pageexec@freemail.hu> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-06-12 15:16:18 -07:00
Kees Cook	b1983b0a75	x86, relocs: Move __vvar_page from S_ABS to S_REL The __vvar_page relocation should actually be listed in S_REL instead of S_ABS. Oddly, this didn't always cause things to break, presumably because there are no users for relocation information on 64 bits yet. [ hpa: Not for stable - new code in 3.10 ] Signed-off-by: Kees Cook <keescook@chromium.org> Link: http://lkml.kernel.org/r/20130611185652.GA23674@www.outflux.net Reported-by: Michael Davidson <md@google.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-06-12 15:14:57 -07:00
Marcelo Tosatti	8915aa27d5	KVM: x86: handle idiv overflow at kvm_write_tsc It's possible that idivl overflows (due to a large delta stored in usdiff, a valid scenario). Create an exception handler to catch the overflow exception (division by zero is protected by the vcpu->arch.virtual_tsc_khz check), and interpret it accordingly (delta is larger than USEC_PER_SEC). Fixes https://bugzilla.redhat.com/show_bug.cgi?id=969644 Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-06-12 14:24:11 +03:00
Thomas Gleixner	d7880812b3	idle: Add the stack canary init to cpu_startup_entry() Moving x86 to the generic idle implementation (commit 7d1a9417 "x86: Use generic idle loop") wrecked the stack protector. I stupidly missed that boot_init_stack_canary() must be inlined from a function which never returns, but I put that call into arch_cpu_idle_prepare() which of course returns. I pondered playing tricks with arch_cpu_idle_prepare() first, but then I noticed that the other archs which have implemented the stackprotector (ARM and SH) do not initialize the canary for the non-boot cpus. So I decided to move the boot_init_stack_canary() call into cpu_startup_entry() ifdeffed with a CONFIG_X86 for now. This #ifdef is just a temporary measure as I don't want to inflict the boot_init_stack_canary() call on ARM and SH that late in the cycle. I'll queue a patch for 3.11 which removes the #ifdef if the ARM/SH maintainers have no objection. Reported-by: Wouter van Kesteren <woutershep@gmail.com> Cc: x86@kernel.org Cc: Russell King <linux@arm.linux.org.uk> Cc: Paul Mundt <lethal@linux-sh.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2013-06-11 22:04:47 +02:00
Zach Bobroff	d3768d885c	x86, efi: retry ExitBootServices() on failure ExitBootServices is absolutely supposed to return a failure if any ExitBootServices event handler changes the memory map. Basically the get_map loop should run again if ExitBootServices returns an error the first time. I would say it would be fair that if ExitBootServices gives an error the second time then Linux would be fine in returning control back to BIOS. The second change is the following line: again: size += sizeof(*mem_map) * 2; Originally you were incrementing it by the size of one memory map entry. The issue here is all related to the low_alloc routine you are using. In this routine you are making allocations to get the memory map itself. Doing this allocation or allocations can affect the memory map by more than one record. [ mfleming - changelog, code style ] Signed-off-by: Zach Bobroff <zacharyb@ami.com> Cc: <stable@vger.kernel.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com>	2013-06-11 07:51:54 +01:00
Borislav Petkov	43ab0476a6	efi: Convert runtime services function ptrs ... to void * like the boot services and lose all the void * casts. No functionality change. Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Matt Fleming <matt.fleming@intel.com>	2013-06-11 07:39:26 +01:00
Matthew Garrett	f8b8404337	Modify UEFI anti-bricking code This patch reworks the UEFI anti-bricking code, including an effective reversion of cc5a080c and 31ff2f20. It turns out that calling QueryVariableInfo() from boot services results in some firmware implementations jumping to physical addresses even after entering virtual mode, so until we have 1:1 mappings for UEFI runtime space this isn't going to work so well. Reverting these gets us back to the situation where we'd refuse to create variables on some systems because they classify deleted variables as "used" until the firmware triggers a garbage collection run, which they won't do until they reach a lower threshold. This results in it being impossible to install a bootloader, which is unhelpful. Feedback from Samsung indicates that the firmware doesn't need more than 5KB of storage space for its own purposes, so that seems like a reasonable threshold. However, there's still no guarantee that a platform will attempt garbage collection merely because it drops below this threshold. It seems that this is often only triggered if an attempt to write generates a genuine EFI_OUT_OF_RESOURCES error. We can force that by attempting to create a variable larger than the remaining space. This should fail, but if it somehow succeeds we can then immediately delete it. I've tested this on the UEFI machines I have available, but I don't have a Samsung and so can't verify that it avoids the bricking problem. Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com> Signed-off-by: Lee, Chun-Y <jlee@suse.com> [ dummy variable cleanup ] Cc: <stable@vger.kernel.org> Signed-off-by: Matt Fleming <matt.fleming@intel.com>	2013-06-10 21:59:37 +01:00
Linus Torvalds	50e6f8511a	Merge tag 'stable/for-linus-3.10-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull xen fixes from Konrad Rzeszutek Wilk: "Two bug-fixes for regressions: - xen/tmem stopped working after a certain combination of modprobe/swapon was used - cpu online/offlining would trigger WARN_ON." * tag 'stable/for-linus-3.10-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/tmem: Don't over-write tmem_frontswap_poolid after tmem_frontswap_init set it. xen/smp: Fixup NOHZ per cpu data when onlining an offline CPU.	2013-06-10 13:27:46 -07:00
Konrad Rzeszutek Wilk	09e99da766	xen/time: Free onlined per-cpu data structure if we want to online it again. If the per-cpu time data structure has been onlined already and we are trying to online it again, then free the previous copy before blindly overwriting it. A developer naturally should not call this function multiple times but just in case. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-06-10 08:43:37 -04:00
Konrad Rzeszutek Wilk	a05e2c371f	xen/time: Check that the per_cpu data structure has data before freeing. We don't check whether the per_cpu data structure has actually been freed in the past. This checks it and if it has been freed in the past then just continues on without double-freeing. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-06-10 08:43:36 -04:00
Konrad Rzeszutek Wilk	c9d76a24a2	xen/time: Don't leak interrupt name when offlining. When the user does: echo 0 > /sys/devices/system/cpu/cpu1/online echo 1 > /sys/devices/system/cpu/cpu1/online kmemleak reports: kmemleak: 7 new suspected memory leaks (see /sys/kernel/debug/kmemleak) One of the leaks is from xen/time: unreferenced object 0xffff88003fa51280 (size 32): comm "swapper/0", pid 1, jiffies 4294667339 (age 1027.789s) hex dump (first 32 bytes): 74 69 6d 65 72 31 00 00 00 00 00 00 00 00 00 00 timer1.......... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff81660721>] kmemleak_alloc+0x21/0x50 [<ffffffff81190aac>] __kmalloc_track_caller+0xec/0x2a0 [<ffffffff812fe1bb>] kvasprintf+0x5b/0x90 [<ffffffff812fe228>] kasprintf+0x38/0x40 [<ffffffff81041ec1>] xen_setup_timer+0x51/0xf0 [<ffffffff8166339f>] xen_cpu_up+0x5f/0x3e8 [<ffffffff8166bbf5>] _cpu_up+0xd1/0x14b [<ffffffff8166bd48>] cpu_up+0xd9/0xec [<ffffffff81ae6e4a>] smp_init+0x4b/0xa3 [<ffffffff81ac4981>] kernel_init_freeable+0xdb/0x1e6 [<ffffffff8165ce39>] kernel_init+0x9/0xf0 [<ffffffff8167edfc>] ret_from_fork+0x7c/0xb0 [<ffffffffffffffff>] 0xffffffffffffffff This patch fixes it by stashing away the 'name' in the per-cpu data structure and freeing it when offlining the CPU. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-06-10 08:43:35 -04:00
Konrad Rzeszutek Wilk	31620a198c	xen/time: Encapsulate the struct clock_event_device in another structure. We don't do any code movement. We just encapsulate the struct clock_event_device in a new structure which contains said structure and a pointer to a char *name. The 'name' will be used in 'xen/time: Don't leak interrupt name when offlining'. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-06-10 08:43:34 -04:00
Konrad Rzeszutek Wilk	354e7b7619	xen/spinlock: Don't leak interrupt name when offlining. When the user does: echo 0 > /sys/devices/system/cpu/cpu1/online echo 1 > /sys/devices/system/cpu/cpu1/online kmemleak reports: kmemleak: 7 new suspected memory leaks (see /sys/kernel/debug/kmemleak) unreferenced object 0xffff88003fa51260 (size 32): comm "swapper/0", pid 1, jiffies 4294667339 (age 1027.789s) hex dump (first 32 bytes): 73 70 69 6e 6c 6f 63 6b 31 00 00 00 00 00 00 00 spinlock1....... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff81660721>] kmemleak_alloc+0x21/0x50 [<ffffffff81190aac>] __kmalloc_track_caller+0xec/0x2a0 [<ffffffff812fe1bb>] kvasprintf+0x5b/0x90 [<ffffffff812fe228>] kasprintf+0x38/0x40 [<ffffffff81663789>] xen_init_lock_cpu+0x61/0xbe [<ffffffff816633a6>] xen_cpu_up+0x66/0x3e8 [<ffffffff8166bbf5>] _cpu_up+0xd1/0x14b [<ffffffff8166bd48>] cpu_up+0xd9/0xec [<ffffffff81ae6e4a>] smp_init+0x4b/0xa3 [<ffffffff81ac4981>] kernel_init_freeable+0xdb/0x1e6 [<ffffffff8165ce39>] kernel_init+0x9/0xf0 [<ffffffff8167edfc>] ret_from_fork+0x7c/0xb0 [<ffffffffffffffff>] 0xffffffffffffffff Instead of doing it like the "xen/smp: Don't leak interrupt name when offlining" patch did (which has a per-cpu structure which contains both the IRQ number and char) we use per-cpu pointers to a char. The reason is that the "__this_cpu_read(lock_kicker_irq);" macro blows up with "__bad_size_call_parameter()" as the size of the returned structure is not within the parameters of what it expects and optimizes for. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-06-10 08:43:33 -04:00
Konrad Rzeszutek Wilk	b85fffec7f	xen/smp: Don't leak interrupt name when offlining. When the user does: echo 0 > /sys/devices/system/cpu/cpu1/online echo 1 > /sys/devices/system/cpu/cpu1/online kmemleak reports: kmemleak: 7 new suspected memory leaks (see /sys/kernel/debug/kmemleak) unreferenced object 0xffff88003fa51240 (size 32): comm "swapper/0", pid 1, jiffies 4294667339 (age 1027.789s) hex dump (first 32 bytes): 72 65 73 63 68 65 64 31 00 00 00 00 00 00 00 00 resched1........ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff81660721>] kmemleak_alloc+0x21/0x50 [<ffffffff81190aac>] __kmalloc_track_caller+0xec/0x2a0 [<ffffffff812fe1bb>] kvasprintf+0x5b/0x90 [<ffffffff812fe228>] kasprintf+0x38/0x40 [<ffffffff81047ed1>] xen_smp_intr_init+0x41/0x2c0 [<ffffffff816636d3>] xen_cpu_up+0x393/0x3e8 [<ffffffff8166bbf5>] _cpu_up+0xd1/0x14b [<ffffffff8166bd48>] cpu_up+0xd9/0xec [<ffffffff81ae6e4a>] smp_init+0x4b/0xa3 [<ffffffff81ac4981>] kernel_init_freeable+0xdb/0x1e6 [<ffffffff8165ce39>] kernel_init+0x9/0xf0 [<ffffffff8167edfc>] ret_from_fork+0x7c/0xb0 [<ffffffffffffffff>] 0xffffffffffffffff This patch fixes some of it by using the 'struct xen_common_irq->name' field to stash away the char so that it can be freed when the interrupt line is destroyed. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-06-10 08:43:32 -04:00
Konrad Rzeszutek Wilk	ee336e10d5	xen/smp: Set the per-cpu IRQ number to a valid default. When we free it we want to make sure to set it to a default value of -1 so that we don't double-free it (in case somebody calls us twice). Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-06-10 08:43:31 -04:00
Konrad Rzeszutek Wilk	9547689fcd	xen/smp: Introduce a common structure to contain the IRQ name and interrupt line. This patch adds a new structure to contain the common two things that each of the per-cpu interrupts need: - an interrupt number, - and the name of the interrupt (to be added in 'xen/smp: Don't leak interrupt name when offlining'). This allows us to carry the tuple of the per-cpu interrupt data structure and expand it as we need in the future. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-06-10 08:43:30 -04:00
Konrad Rzeszutek Wilk	53b94fdc8f	xen/smp: Coalesce the free_irq calls in one function. There are two functions that do a bunch of 'free_irq' on the per_cpu IRQ. Instead of having duplicate code just move it to one function. This is just code movement. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-06-10 08:43:29 -04:00
David S. Miller	93a306aef5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Merge 'net' into 'net-next' to get the MSG_CMSG_COMPAT regression fix. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-06 23:39:26 -07:00
Linus Torvalds	c51aa6db2a	Merge tag 'pci-v3.10-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: "This fixes a crash when booting a 32-bit kernel via the EFI boot stub. PCI ROM from EFI x86/PCI: Map PCI setup data with ioremap() so it can be in highmem" * tag 'pci-v3.10-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: x86/PCI: Map PCI setup data with ioremap() so it can be in highmem	2013-06-06 16:28:15 -07:00
H. Peter Anvin	60e019eb37	x86: Get rid of ->hard_math and all the FPU asm fu Reimplement FPU detection code in C and drop old, not-so-recommended detection method in asm. Move all the relevant stuff into i387.c where it conceptually belongs. Finally drop cpuinfo_x86.hard_math. [ hpa: huge thanks to Borislav for taking my original concept patch and productizing it ] [ Boris, note to self: do not use static_cpu_has before alternatives! ] Signed-off-by: H. Peter Anvin <hpa@zytor.com> Link: http://lkml.kernel.org/r/1367244262-29511-2-git-send-email-bp@alien8.de Link: http://lkml.kernel.org/r/1365436666-9837-2-git-send-email-bp@alien8.de Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-06-06 14:32:04 -07:00
Matthew Garrett	1acba98f81	UEFI: Don't pass boot services regions to SetVirtualAddressMap() We need to map boot services regions during startup in order to avoid firmware bugs, but we shouldn't be passing those regions to SetVirtualAddressMap(). Ensure that we're only passing regions that are marked as being mapped at runtime. Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com>	2013-06-06 14:28:11 +01:00
David S. Miller	6bc19fb82d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Merge 'net' bug fixes into 'net-next' as we have patches that will build on top of them. This merge commit includes a change from Emil Goode (emilgoode@gmail.com) that fixes a warning that would have been introduced by this merge. Specifically it fixes the pingv6_ops method ipv6_chk_addr() to add a "const" to the "struct net_device *dev" argument and likewise update the dummy_ipv6_chk_addr() declaration. Signed-off-by: David S. Miller <davem@davemloft.net>	2013-06-05 16:37:30 -07:00
Jacob Shin	cd1c32ca96	x86, microcode, amd: Allow multiple families' bin files appended together Add support for parsing through multiple families' microcode patch container binary files appended together when early loading. This is already supported on Intel. Reported-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br> Signed-off-by: Jacob Shin <jacob.shin@amd.com> Link: http://lkml.kernel.org/r/1370463236-2115-3-git-send-email-jacob.shin@amd.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-06-05 13:56:55 -07:00
Jacob Shin	275bbe2e29	x86, microcode, amd: Make find_ucode_in_initrd() __init Change find_ucode_in_initrd() to __init and only let BSP call it during cold boot. This is the right thing to do because only BSP will see initrd loaded by the boot loader. APs will offset into initrd_start to find the microcode patch binary. Reported-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Jacob Shin <jacob.shin@amd.com> Link: http://lkml.kernel.org/r/1370463236-2115-2-git-send-email-jacob.shin@amd.com Cc: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2013-06-05 13:56:47 -07:00
Matt Fleming	65694c5aad	x86/PCI: Map PCI setup data with ioremap() so it can be in highmem f9a37be0f0 ("x86: Use PCI setup data") added support for using PCI ROM images from setup_data. This used phys_to_virt(), which is not valid for highmem addresses, and can cause a crash when booting a 32-bit kernel via the EFI boot stub. pcibios_add_device() assumes that the physical addresses stored in setup_data are accessible via the direct kernel mapping, and that calling phys_to_virt() is valid. This isn't guaranteed to be true on x86 where the direct mapping range is much smaller than on x86-64. Calling phys_to_virt() on a highmem address results in the following: BUG: unable to handle kernel paging request at 39a3c198 IP: [<c262be0f>] pcibios_add_device+0x2f/0x90 ... Call Trace: [<c2370c73>] pci_device_add+0xe3/0x130 [<c274640b>] pci_scan_single_device+0x8b/0xb0 [<c2370d08>] pci_scan_slot+0x48/0x100 [<c2371904>] pci_scan_child_bus+0x24/0xc0 [<c262a7b0>] pci_acpi_scan_root+0x2c0/0x490 [<c23b7203>] acpi_pci_root_add+0x312/0x42f ... The solution is to use ioremap() instead of phys_to_virt() to map the setup data into the kernel address space. [bhelgaas: changelog] Tested-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Matthew Garrett <mjg59@srcf.ucam.org> Cc: Seth Forshee <seth.forshee@canonical.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Cc: stable@vger.kernel.org # v3.8+	2013-06-05 10:50:04 -06:00
Mathias Krause	a90936845d	x86, mce: Fix "braodcast" typo Fix the typo in MCJ_IRQ_BRAODCAST. Signed-off-by: Mathias Krause <minipli@googlemail.com> Signed-off-by: Borislav Petkov <bp@suse.de>	2013-06-05 11:59:17 +02:00
Gleb Natapov	05988d728d	KVM: MMU: reduce KVM_REQ_MMU_RELOAD when root page is zapped Quote Gleb's mail: | why don't we check for sp->role.invalid in | kvm_mmu_prepare_zap_page before calling kvm_reload_remote_mmus()? and | Actually we can add check for is_obsolete_sp() there too since | kvm_mmu_invalidate_all_pages() already calls kvm_reload_remote_mmus() | after incrementing mmu_valid_gen. [ Xiao: add some comments and the check of is_obsolete_sp() ] Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-06-05 12:34:02 +03:00
Xiao Guangrong	365c886860	KVM: MMU: reclaim the zapped-obsolete page first As Marcelo pointed out that | "(retention of large number of pages while zapping) | can be fatal, it can lead to OOM and host crash" We introduce a list, kvm->arch.zapped_obsolete_pages, to link all the pages which are deleted from the mmu cache but not actually freed. When page reclaiming is needed, we always zap this kind of pages first. Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-06-05 12:33:33 +03:00
Xiao Guangrong	f34d251d66	KVM: MMU: collapse TLB flushes when zap all pages kvm_zap_obsolete_pages uses lock-break technique to zap pages, it will flush tlb every time when it does lock-break We can reload mmu on all vcpus after updating the generation number so that the obsolete pages are not used on any vcpus, after that we do not need to flush tlb when obsolete pages are zapped It will do kvm_mmu_prepare_zap_page many times and use one kvm_mmu_commit_zap_page to collapse tlb flush, the side-effects is that causes obsolete pages unlinked from active_list but leave on hash-list, so we add the comment around the hash list walker Note: kvm_mmu_commit_zap_page is still needed before free the pages since other vcpus may be doing locklessly shadow page walking Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-06-05 12:33:18 +03:00
Xiao Guangrong	e7d11c7a89	KVM: MMU: zap pages in batch Zap at lease 10 pages before releasing mmu-lock to reduce the overload caused by requiring lock After the patch, kvm_zap_obsolete_pages can forward progress anyway, so update the comments [ It improves the case 0.6% ~ 1% that do kernel building meanwhile read PCI ROM. ] Note: i am not sure that "10" is the best speculative value, i just guessed that '10' can make vcpu do not spend long time on kvm_zap_obsolete_pages and do not cause mmu-lock too hungry. Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-06-05 12:33:10 +03:00
Xiao Guangrong	7f52af7412	KVM: MMU: do not reuse the obsolete page The obsolete page will be zapped soon, do not reuse it to reduce future page fault Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-06-05 12:33:04 +03:00
Xiao Guangrong	35006126f0	KVM: MMU: add tracepoint for kvm_mmu_invalidate_all_pages It is good for debug and development Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-06-05 12:32:57 +03:00
Xiao Guangrong	2248b02321	KVM: MMU: show mmu_valid_gen in shadow page related tracepoints Show sp->mmu_valid_gen Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-06-05 12:32:49 +03:00

17764 Commits