linux

iv/linux

Author	SHA1	Message	Date
Xiao Guangrong	f8f559422b	KVM: MMU: fast invalidate all mmio sptes This patch tries to introduce a very simple and scale way to invalidate all mmio sptes - it need not walk any shadow pages and hold mmu-lock KVM maintains a global mmio valid generation-number which is stored in kvm->memslots.generation and every mmio spte stores the current global generation-number into his available bits when it is created When KVM need zap all mmio sptes, it just simply increase the global generation-number. When guests do mmio access, KVM intercepts a MMIO #PF then it walks the shadow page table and get the mmio spte. If the generation-number on the spte does not equal the global generation-number, it will go to the normal #PF handler to update the mmio spte Since 19 bits are used to store generation-number on mmio spte, we zap all mmio sptes when the number is round Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-06-27 14:20:36 +03:00
Xiao Guangrong	b37fbea6ce	KVM: MMU: make return value of mmio page fault handler more readable Define some meaningful names instead of raw code Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-06-27 14:20:17 +03:00
Xiao Guangrong	f2fd125d32	KVM: MMU: store generation-number into mmio spte Store the generation-number into bit3 ~ bit11 and bit52 ~ bit61, totally 19 bits can be used, it should be enough for nearly all most common cases In this patch, the generation-number is always 0, it will be changed in the later patch [Gleb: masking generation bits from spte in get_mmio_spte_gfn() and get_mmio_spte_access()] Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-06-27 14:18:15 +03:00
Xiao Guangrong	885032b910	KVM: MMU: retain more available bits on mmio spte Let mmio spte only use bit62 and bit63 on upper 32 bits, then bit 52 ~ bit 61 can be used for other purposes Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Gleb Natapov <gleb@redhat.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-06-20 23:33:20 +02:00
Marcelo Tosatti	8915aa27d5	KVM: x86: handle idiv overflow at kvm_write_tsc Its possible that idivl overflows (due to large delta stored in usdiff, valid scenario). Create an exception handler to catch the overflow exception (division by zero is protected by vcpu->arch.virtual_tsc_khz check), and interpret it accordingly (delta is larger than USEC_PER_SEC). Fixes https://bugzilla.redhat.com/show_bug.cgi?id=969644 Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-06-12 14:24:11 +03:00
Gleb Natapov	05988d728d	KVM: MMU: reduce KVM_REQ_MMU_RELOAD when root page is zapped Quote Gleb's mail: \| why don't we check for sp->role.invalid in \| kvm_mmu_prepare_zap_page before calling kvm_reload_remote_mmus()? and \| Actually we can add check for is_obsolete_sp() there too since \| kvm_mmu_invalidate_all_pages() already calls kvm_reload_remote_mmus() \| after incrementing mmu_valid_gen. [ Xiao: add some comments and the check of is_obsolete_sp() ] Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-06-05 12:34:02 +03:00
Xiao Guangrong	365c886860	KVM: MMU: reclaim the zapped-obsolete page first As Marcelo pointed out that \| "(retention of large number of pages while zapping) \| can be fatal, it can lead to OOM and host crash" We introduce a list, kvm->arch.zapped_obsolete_pages, to link all the pages which are deleted from the mmu cache but not actually freed. When page reclaiming is needed, we always zap this kind of pages first. Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-06-05 12:33:33 +03:00
Xiao Guangrong	f34d251d66	KVM: MMU: collapse TLB flushes when zap all pages kvm_zap_obsolete_pages uses lock-break technique to zap pages, it will flush tlb every time when it does lock-break We can reload mmu on all vcpus after updating the generation number so that the obsolete pages are not used on any vcpus, after that we do not need to flush tlb when obsolete pages are zapped It will do kvm_mmu_prepare_zap_page many times and use one kvm_mmu_commit_zap_page to collapse tlb flush, the side-effects is that causes obsolete pages unlinked from active_list but leave on hash-list, so we add the comment around the hash list walker Note: kvm_mmu_commit_zap_page is still needed before free the pages since other vcpus may be doing locklessly shadow page walking Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-06-05 12:33:18 +03:00
Xiao Guangrong	e7d11c7a89	KVM: MMU: zap pages in batch Zap at lease 10 pages before releasing mmu-lock to reduce the overload caused by requiring lock After the patch, kvm_zap_obsolete_pages can forward progress anyway, so update the comments [ It improves the case 0.6% ~ 1% that do kernel building meanwhile read PCI ROM. ] Note: i am not sure that "10" is the best speculative value, i just guessed that '10' can make vcpu do not spend long time on kvm_zap_obsolete_pages and do not cause mmu-lock too hungry. Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-06-05 12:33:10 +03:00
Xiao Guangrong	7f52af7412	KVM: MMU: do not reuse the obsolete page The obsolete page will be zapped soon, do not reuse it to reduce future page fault Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-06-05 12:33:04 +03:00
Xiao Guangrong	35006126f0	KVM: MMU: add tracepoint for kvm_mmu_invalidate_all_pages It is good for debug and development Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-06-05 12:32:57 +03:00
Xiao Guangrong	2248b02321	KVM: MMU: show mmu_valid_gen in shadow page related tracepoints Show sp->mmu_valid_gen Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-06-05 12:32:49 +03:00
Xiao Guangrong	6ca18b6950	KVM: x86: use the fast way to invalidate all pages Replace kvm_mmu_zap_all by kvm_mmu_invalidate_zap_all_pages Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-06-05 12:32:42 +03:00
Xiao Guangrong	5304b8d37c	KVM: MMU: fast invalidate all pages The current kvm_mmu_zap_all is really slow - it is holding mmu-lock to walk and zap all shadow pages one by one, also it need to zap all guest page's rmap and all shadow page's parent spte list. Particularly, things become worse if guest uses more memory or vcpus. It is not good for scalability In this patch, we introduce a faster way to invalidate all shadow pages. KVM maintains a global mmu invalid generation-number which is stored in kvm->arch.mmu_valid_gen and every shadow page stores the current global generation-number into sp->mmu_valid_gen when it is created When KVM need zap all shadow pages sptes, it just simply increase the global generation-number then reload root shadow pages on all vcpus. Vcpu will create a new shadow page table according to current kvm's generation-number. It ensures the old pages are not used any more. Then the obsolete pages (sp->mmu_valid_gen != kvm->arch.mmu_valid_gen) are zapped by using lock-break technique Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-06-05 12:32:33 +03:00
Xiao Guangrong	a2ae162265	KVM: MMU: drop unnecessary kvm_reload_remote_mmus It is the responsibility of kvm_mmu_zap_all that keeps the consistent of mmu and tlbs. And it is also unnecessary after zap all mmio sptes since no mmio spte exists on root shadow page and it can not be cached into tlb Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-06-05 12:32:24 +03:00
Xiao Guangrong	758ccc89b8	KVM: x86: drop calling kvm_mmu_zap_all in emulator_fix_hypercall Quote Gleb's mail: \| Back then kvm->lock protected memslot access so code like: \| \| mutex_lock(&vcpu->kvm->lock); \| kvm_mmu_zap_all(vcpu->kvm); \| mutex_unlock(&vcpu->kvm->lock); \| \| which is what `7aa81cc0` does was enough to guaranty that no vcpu will \| run while code is patched. This is no longer the case and \| mutex_lock(&vcpu->kvm->lock); is gone from that code path long time ago, \| so now kvm_mmu_zap_all() there is useless and the code is incorrect. So we drop it and it will be fixed later Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-06-05 12:32:00 +03:00
Avi Kivity	e47a5f5fb7	KVM: x86 emulator: convert XADD to fastop Signed-off-by: Avi Kivity <avi.kivity@gmail.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-05-21 15:43:24 +03:00
Avi Kivity	203831e8e4	KVM: x86 emulator: drop unused old-style inline emulation Signed-off-by: Avi Kivity <avi.kivity@gmail.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-05-21 15:43:23 +03:00
Avi Kivity	b8c0b6ae49	KVM: x86 emulator: convert DIV/IDIV to fastop Since DIV and IDIV can generate exceptions, we need an additional output parameter indicating whether an execption has occured. To avoid increasing register pressure on i386, we use %rsi, which is already allocated for the fastop code pointer. Gleb: added comment about fop usage as exception indication. Signed-off-by: Avi Kivity <avi.kivity@gmail.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-05-21 15:43:21 +03:00
Avi Kivity	b9fa409b00	KVM: x86 emulator: convert single-operand MUL/IMUL to fastop Signed-off-by: Avi Kivity <avi.kivity@gmail.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-05-21 15:43:20 +03:00
Avi Kivity	017da7b604	KVM: x86 emulator: Switch fastop src operand to RDX This makes OpAccHi useful. Signed-off-by: Avi Kivity <avi.kivity@gmail.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-05-21 15:43:19 +03:00
Avi Kivity	ab2c5ce666	KVM: x86 emulator: switch MUL/DIV to DstXacc Signed-off-by: Avi Kivity <avi.kivity@gmail.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-05-21 15:43:17 +03:00
Avi Kivity	820207c8fc	KVM: x86 emulator: decode extended accumulator explicity Single-operand MUL and DIV access an extended accumulator: AX for byte instructions, and DX:AX, EDX:EAX, or RDX:RAX for larger-sized instructions. Add support for fetching the extended accumulator. In order not to change things too much, RDX is loaded into Src2, which is already loaded by fastop(). This avoids increasing register pressure on i386. Gleb: disable src writeback for ByteOp div/mul. Signed-off-by: Avi Kivity <avi.kivity@gmail.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-05-21 15:43:16 +03:00
Avi Kivity	fb32b1eda2	KVM: x86 emulator: add support for writing back the source operand Some instructions write back the source operand, not just the destination. Add support for doing this via the decode flags. Gleb: add BUG_ON() to prevent source to be memory operand. Signed-off-by: Avi Kivity <avi.kivity@gmail.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-05-21 15:43:14 +03:00
Marc Zyngier	535cf7b3b1	KVM: get rid of $(addprefix ../../../virt/kvm/, ...) in Makefiles As requested by the KVM maintainers, remove the addprefix used to refer to the main KVM code from the arch code, and replace it with a KVM variable that does the same thing. Tested-by: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Gleb Natapov <gleb@redhat.com> Cc: Christoffer Dall <cdall@cs.columbia.edu> Acked-by: Xiantao Zhang <xiantao.zhang@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Alexander Graf <agraf@suse.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-05-19 15:14:00 +03:00
Gleb Natapov	35af577aac	KVM: MMU: clenaup locking in mmu_free_roots() Do locking around each case separately instead of having one lock and two unlocks. Move root_hpa assignment out of the lock. Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-05-16 11:55:51 +03:00
Marcelo Tosatti	0061d53daf	KVM: x86: limit difference between kvmclock updates kvmclock updates which are isolated to a given vcpu, such as vcpu->cpu migration, should not allow system_timestamp from the rest of the vcpus to remain static. Otherwise ntp frequency correction applies to one vcpu's system_timestamp but not the others. So in those cases, request a kvmclock update for all vcpus. The worst case for a remote vcpu to update its kvmclock is then bounded by maximum nohz sleep latency. Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-05-15 20:36:09 +03:00
Jan Kiszka	f1ed0450a5	KVM: x86: Remove support for reporting coalesced APIC IRQs Since the arrival of posted interrupt support we can no longer guarantee that coalesced IRQs are always reported to the IRQ source. Moreover, accumulated APIC timer events could cause a busy loop when a VCPU should rather be halted. The consensus is to remove coalesced tracking from the LAPIC. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-05-14 12:09:02 +03:00
Takuya Yoshikawa	e2858b4ab1	KVM: MMU: Use kvm_mmu_sync_roots() in kvm_mmu_load() No need to open-code this function. Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp> Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-05-12 14:52:55 +03:00
Linus Torvalds	607eeb0b83	Bug-fixes: - More fixes in the vCPU PVHVM hotplug path. - Add more documentation. - Fix various ARM related issues in the Xen generic drivers. - Updates in the xen-pciback driver per Bjorn's updates. - Mask the x2APIC feature for PV guests. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (GNU/Linux) iQEcBAABAgAGBQJRjPL5AAoJEFjIrFwIi8fJdlIIANXawH+B+aFbqsFSKOOh76XN smgICU78SVzKpW9WAPYK7YFqSdNN4AleGC2Mn2lSkiaqgciRyDb9Yt+OSMMts2Xn ZVbFkGhEKR+DtZfTKo9YgsGatul/McTiVEkuuli+aN5dql3WXDLAaA+/b9bO3ohh TCWtWNuSCGmlfDoJET2je+J6CgKvCErH3fvzKNxgYxytcGhAvxoVK/lC4d3pnq/m wQUAIcF8XYENqC2m1WDR0OGveAB0Me0j9g+UkQS+TzqA8GPmxC4aptjkroFYhOz6 8nZp+LanimmTI6olVioWEXkCr5+dxb058jQKwncQfonFpl58RS0qUrz5zoe3etU= =h9SX -----END PGP SIGNATURE----- Merge tag 'stable/for-linus-3.10-rc0-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen Pull Xen bug-fixes from Konrad Rzeszutek Wilk: - More fixes in the vCPU PVHVM hotplug path. - Add more documentation. - Fix various ARM related issues in the Xen generic drivers. - Updates in the xen-pciback driver per Bjorn's updates. - Mask the x2APIC feature for PV guests. * tag 'stable/for-linus-3.10-rc0-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen: xen/pci: Used cached MSI-X capability offset xen/pci: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK xen: clear IRQ_NOAUTOEN and IRQ_NOREQUEST xen: mask x2APIC feature in PV xen: SWIOTLB is only used on x86 xen/spinlock: Fix check from greater than to be also be greater or equal to. xen/smp/pvhvm: Don't point per_cpu(xen_vpcu, 33 and larger) to shared_info xen/vcpu: Document the xen_vcpu_info and xen_vcpu xen/vcpu/pvhvm: Fix vcpu hotplugging hanging.	2013-05-11 16:19:30 -07:00
Linus Torvalds	ac4e01093f	Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull idle update from Len Brown: "Add support for new Haswell-ULT CPU idle power states" * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: intel_idle: initial C8, C9, C10 support tools/power turbostat: display C8, C9, C10 residency	2013-05-11 15:23:17 -07:00
Linus Torvalds	3644bc2ec7	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull stray syscall bits from Al Viro: "Several syscall-related commits that were missing from the original" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: switch compat_sys_sysctl to COMPAT_SYSCALL_DEFINE unicore32: just use mmap_pgoff()... unify compat fanotify_mark(2), switch to COMPAT_SYSCALL_DEFINE x86, vm86: fix VM86 syscalls: use SYSCALL_DEFINEx(...)	2013-05-10 09:21:05 -07:00
Linus Torvalds	c67723ebbb	Merge tag 'kvm-3.10-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Gleb Natapov: "Most of the fixes are in the emulator since now we emulate more than we did before for correctness sake we see more bugs there, but there is also an OOPS fixed and corruption of xcr0 register." * tag 'kvm-3.10-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: emulator: emulate SALC KVM: emulator: emulate XLAT KVM: emulator: emulate AAM KVM: VMX: fix halt emulation while emulating invalid guest sate KVM: Fix kvm_irqfd_init initialization KVM: x86: fix maintenance of guest/host xcr0 state	2013-05-10 09:08:21 -07:00
Bjorn Helgaas	7c86617dde	xen/pci: Used cached MSI-X capability offset We now cache the MSI-X capability offset in the struct pci_dev, so no need to find the capability again. Acked-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-05-10 09:10:38 -04:00
Bjorn Helgaas	4be6bfe2af	xen/pci: Use PCI_MSIX_TABLE_BIR, not PCI_MSIX_FLAGS_BIRMASK PCI_MSIX_FLAGS_BIRMASK is mis-named because the BIR mask is in the Table Offset register, not the flags ("Message Control" per spec) register. Acked-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-05-10 09:10:20 -04:00
Al Viro	91c2e0bcae	unify compat fanotify_mark(2), switch to COMPAT_SYSCALL_DEFINE Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-05-09 13:46:38 -04:00
Linus Torvalds	e15e611906	PCI updates for v3.10: MSI PCI: Set ->mask_pos correctly Hotplug PCI: Delay final fixups until resources are assigned Moorestown x86/pci/mrst: Use configuration mechanism 1 for 00:00.0, 00:02.0, 00:03.0 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJRi9CqAAoJEFmIoMA60/r82WIQAMJE7kwl88TpIw4QdTum4a+z E+eBMnGFSw/y7JNP81rOuLG8xZiZNQxGUvTEVV+WY6n7xhRSwcNZIiVqZ1xSBSjM PpmLOLrXI6I9JIouBLKIRiPOzbC7iB6O7RhKZCO68guQ8Y7epYoJjORaKELGZmN+ 09fVapvHHhcOcYYYAiUWvobjk6vCx4+6dSj6tRldI8JNl+WQtqMaK2P3QjsncZek 20OXcrfv4X3ApoSwXcn1NUUDSlrAgM+VcwL6RJK9boURDnPOU8IzaD78DQVOUCNx BfLULFYkBq62ExCpTkg4Xo8aRowLAEcThZ3Z8/XtMmFlWDSdNm5BaTkYnPCf7nGW +8Oaxjm0pNVa5QnQqoK1HWpWtU1JTA0hO1tmJ9WyU+84GPuqTN3qiJTigfC9NHOi mj8O98sgbzIENKszBRpaNctjKVxKNFrBzQ3kOdFGB7NBVXN0pC2jnGBoxuxUrU3B h/yMB0Ku/GvnHRSngEhDzlkeuQpWTxvlhdjvlL63F0gEDQO0k3UPaiqD4zMpruga bHZ10v73Kdqp0FVatijmHztXO/yYB3m7tH3ZUDD3yfhKdaeOuTceDgbyo/ZwP2Wq gzOwuOVYLuJK68Qj4vkHJIKy86jHfNP4HGawhh8dECZDmsciarymQ+vAzobalYIu GSurydL4zqJJ+MIH84dW =B2U8 -----END PGP SIGNATURE----- Merge tag 'pci-v3.10-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "MSI: PCI: Set ->mask_pos correctly Hotplug: PCI: Delay final fixups until resources are assigned Moorestown: x86/pci/mrst: Use configuration mechanism 1 for 00:00.0, 00:02.0, 00:03.0" * tag 'pci-v3.10-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Delay final fixups until resources are assigned x86/pci/mrst: Use configuration mechanism 1 for 00:00.0, 00:02.0, 00:03.0 PCI: Set ->mask_pos correctly	2013-05-09 10:21:44 -07:00
Linus Torvalds	5647ac0ad4	Removal of GENERIC_GPIO for v3.10 GENERIC_GPIO now synonymous with GPIOLIB. There are no longer any valid cases for enableing GENERIC_GPIO without GPIOLIB, even though it is possible to do so which has been causing confusion and breakage. This branch does the work to completely eliminate GENERIC_GPIO. However, it is not trivial to just create a branch to remove it. Over the course of the v3.9 cycle more code referencing GENERIC_GPIO has been added to linux-next that conflicts with this branch. The following must be done to resolve the conflicts when merging this branch into mainline: * "git grep CONFIG_GENERIC_GPIO" should return 0 hits. Matches should be replaced with CONFIG_GPIOLIB * "git grep '\bGENERIC_GPIO\b'" should return 1 hit in the Chinese documentation. * Selectors of GENERIC_GPIO should be turned into selectors of GPIOLIB * definitions of the option in architecture Kconfig code should be deleted. Stephen has 3 merge fixup patches[1] that do the above. They are currently applicable on mainline as of May 2nd. [1] http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg428056.html -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIcBAABAgAGBQJRifUnAAoJEEFnBt12D9kBs2YP/0U6+ia+xYvkVaJc28PDVIzn OReZNcJOYU8D5voxz0voaRD0EdcPwjbMu9Kp9aXMHlk4VxevF+8jCc/us0bIjtO1 VcB5VmSCIhMhxdnBlum11Mk7Vr5MCweyl9NBsypnPt8cl4obMBZHf2yzoodFktNb wtyYlOb6FALtc6iDbOO6dG3w9F7FAOLvskUFzdv89m8mupTsBu9jw9NqFDbJHOex rxq0Sdd+kWF/nkJVcV5Y6jIdletRlhpipefMJ9diexreHvwqh+c4kJEYZaXgB5+m ha95cPbReK1d+RqzM3A8d4irzSVSmq4k7ijI6QkFOr48+AH7XsgKv5so885LKzMN IIXg2Phm9i0H8+ecEvhcc4oIYBHJiEKK54Y0qUD9dqbFoDGPTCSqMHdSSMbpAY+J bIIXlVzj1En3PPNUJLPt8q8Qz6WxCT9mDST3QSGYnD4o90HT+1R9j92RxGL6McOq rUOyJDwmzFvpBvKK4raGdOU435M+ps2NPKKNIRaIGQPPY9rM1kN4YqvhXukEsC9L 3a3+3cQLh7iKxBHncxeQsJfethP1CPkJnzvF9r+ZZLf2rcPH4pbQIE2uO0XnX/nd 5/DKi0nGgAJ//GMMzdo3RiOA5zGFjIZ/KMvfhQldpP6qFJRhqdGi6FPlAcwr1z1n YnCByPwwlvfC4LTXFOGL =xodc -----END PGP SIGNATURE----- Merge tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux Pull removal of GENERIC_GPIO from Grant Likely: "GENERIC_GPIO now synonymous with GPIOLIB. There are no longer any valid cases for enableing GENERIC_GPIO without GPIOLIB, even though it is possible to do so which has been causing confusion and breakage. This branch does the work to completely eliminate GENERIC_GPIO." * tag 'gpio-for-linus' of git://git.secretlab.ca/git/linux: gpio: update gpio Chinese documentation Remove GENERIC_GPIO config option Convert selectors of GENERIC_GPIO to GPIOLIB blackfin: force use of gpiolib m68k: coldfire: use gpiolib mips: pnx833x: remove requirement for GENERIC_GPIO openrisc: default GENERIC_GPIO to false avr32: default GENERIC_GPIO to false xtensa: remove explicit selection of GENERIC_GPIO sh: replace CONFIG_GENERIC_GPIO by CONFIG_GPIOLIB powerpc: remove redundant GENERIC_GPIO selection unicore32: default GENERIC_GPIO to false unicore32: remove unneeded select GENERIC_GPIO arm: plat-orion: use GPIO driver on CONFIG_GPIOLIB arm: remove redundant GENERIC_GPIO selection mips: alchemy: require gpiolib mips: txx9: change GENERIC_GPIO to GPIOLIB mips: loongson: use GPIO driver on CONFIG_GPIOLIB mips: remove redundant GENERIC_GPIO select	2013-05-09 09:59:16 -07:00
Paolo Bonzini	326f578f7e	KVM: emulator: emulate SALC This is an almost-undocumented instruction available in 32-bit mode. I say "almost" undocumented because AMD documents it in their opcode maps just to say that it is unavailable in 64-bit mode (sections "A.2.1 One-Byte Opcodes" and "B.3 Invalid and Reassigned Instructions in 64-Bit Mode"). It is roughly equivalent to "sbb %al, %al" except it does not set the flags. Use fastop to emulate it, but do not use the opcode directly because it would fail if the host is 64-bit! Reported-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: stable@vger.kernel.org # 3.9 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-05-09 13:15:08 +03:00
Paolo Bonzini	7fa57952d7	KVM: emulator: emulate XLAT This is used by SGABIOS, KVM breaks with emulate_invalid_guest_state=1. It is just a MOV in disguise, with a funny source address. Reported-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: stable@vger.kernel.org # 3.9 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-05-09 13:14:51 +03:00
Paolo Bonzini	a035d5c64d	KVM: emulator: emulate AAM This is used by SGABIOS, KVM breaks with emulate_invalid_guest_state=1. AAM needs the source operand to be unsigned; do the same in AAD as well for consistency, even though it does not affect the result. Reported-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: stable@vger.kernel.org # 3.9 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-05-09 13:12:58 +03:00
Gleb Natapov	8d76c49e9f	KVM: VMX: fix halt emulation while emulating invalid guest sate The invalid guest state emulation loop does not check halt_request which causes 100% cpu loop while guest is in halt and in invalid state, but more serious issue is that this leaves halt_request set, so random instruction emulated by vm86 #GP exit can be interpreted as halt which causes guest hang. Fix both problems by handling halt_request in emulation loop. Reported-by: Tomas Papan <tomas.papan@gmail.com> Tested-by: Tomas Papan <tomas.papan@gmail.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> CC: stable@vger.kernel.org Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-05-09 09:04:56 +03:00
Zhenzhong Duan	4ea9b9aca9	xen: mask x2APIC feature in PV On x2apic enabled pvm, doing sysrq+l, got NULL pointer dereference as below. SysRq : Show backtrace of all active CPUs BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<ffffffff8125e3cb>] memcpy+0xb/0x120 Call Trace: [<ffffffff81039633>] ? __x2apic_send_IPI_mask+0x73/0x160 [<ffffffff8103973e>] x2apic_send_IPI_all+0x1e/0x20 [<ffffffff8103498c>] arch_trigger_all_cpu_backtrace+0x6c/0xb0 [<ffffffff81501be4>] ? _raw_spin_lock_irqsave+0x34/0x50 [<ffffffff8131654e>] sysrq_handle_showallcpus+0xe/0x10 [<ffffffff8131616d>] __handle_sysrq+0x7d/0x140 [<ffffffff81316230>] ? __handle_sysrq+0x140/0x140 [<ffffffff81316287>] write_sysrq_trigger+0x57/0x60 [<ffffffff811ca996>] proc_reg_write+0x86/0xc0 [<ffffffff8116dd8e>] vfs_write+0xce/0x190 [<ffffffff8116e3e5>] sys_write+0x55/0x90 [<ffffffff8150a242>] system_call_fastpath+0x16/0x1b That's because apic points to apic_x2apic_cluster or apic_x2apic_phys but the basic element like cpumask isn't initialized. Mask x2APIC feature in pvm to avoid overwrite of apic pointer, update commit message per Konrad's suggestion. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Tested-by: Tamon Shiose <tamon.shiose@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-05-08 08:38:11 -04:00
Konrad Rzeszutek Wilk	cb91f8f44c	xen/spinlock: Fix check from greater than to be also be greater or equal to. During review of git commit `cb9c6f15f3` ("xen/spinlock: Check against default value of -1 for IRQ line.") Stefano pointed out a bug in the patch. Unfortunatly due to vacation timing the fix was not applied and this patch fixes it up. Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-05-08 08:38:09 -04:00
Konrad Rzeszutek Wilk	d5b17dbff8	xen/smp/pvhvm: Don't point per_cpu(xen_vpcu, 33 and larger) to shared_info As it will point to some data, but not event channel data (the shared_info has an array limited to 32). This means that for PVHVM guests with more than 32 VCPUs without the usage of VCPUOP_register_info any interrupts to VCPUs larger than 32 would have gone unnoticed during early bootup. That is OK, as during early bootup, in smp_init we end up calling the hotplug mechanism (xen_hvm_cpu_notify) which makes the VCPUOP_register_vcpu_info call for all VCPUs and we can receive interrupts on VCPUs 33 and further. This is just a cleanup. Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-05-08 08:38:08 -04:00
Marcelo Tosatti	42bdf991f4	KVM: x86: fix maintenance of guest/host xcr0 state Emulation of xcr0 writes zero guest_xcr0_loaded variable so that subsequent VM-entry reloads CPU's xcr0 with guests xcr0 value. However, this is incorrect because guest_xcr0_loaded variable is read to decide whether to reload hosts xcr0. In case the vcpu thread is scheduled out after the guest_xcr0_loaded = 0 assignment, and scheduler decides to preload FPU: switch_to { __switch_to __math_state_restore restore_fpu_checking fpu_restore_checking if (use_xsave()) fpu_xrstor_checking xrstor64 with CPU's xcr0 == guests xcr0 Fix by properly restoring hosts xcr0 during emulation of xcr0 writes. Analyzed-by: Ulrich Obergfell <uobergfe@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-05-08 12:47:43 +03:00
Linus Torvalds	c8de2fa4dc	Merge branch 'rwsem-optimizations' Merge rwsem optimizations from Michel Lespinasse: "These patches extend Alex Shi's work (which added write lock stealing on the rwsem slow path) in order to provide rwsem write lock stealing on the fast path (that is, without taking the rwsem's wait_lock). I have unfortunately been unable to push this through -next before due to Ingo Molnar / David Howells / Peter Zijlstra being busy with other things. However, this has gotten some attention from Rik van Riel and Davidlohr Bueso who both commented that they felt this was ready for v3.10, and Ingo Molnar has said that he was OK with me pushing directly to you. So, here goes :) Davidlohr got the following test results from pgbench running on a quad-core laptop: \| db_size \| clients \| tps-vanilla \| tps-rwsem \| +---------+----------+----------------+--------------+ \| 160 MB \| 1 \| 5803 \| 6906 \| + 19.0% \| 160 MB \| 2 \| 13092 \| 15931 \| \| 160 MB \| 4 \| 29412 \| 33021 \| \| 160 MB \| 8 \| 32448 \| 34626 \| \| 160 MB \| 16 \| 32758 \| 33098 \| \| 160 MB \| 20 \| 26940 \| 31343 \| + 16.3% \| 160 MB \| 30 \| 25147 \| 28961 \| \| 160 MB \| 40 \| 25484 \| 26902 \| \| 160 MB \| 50 \| 24528 \| 25760 \| ------------------------------------------------------ \| 1.6 GB \| 1 \| 5733 \| 7729 \| + 34.8% \| 1.6 GB \| 2 \| 9411 \| 19009 \| + 101.9% \| 1.6 GB \| 4 \| 31818 \| 33185 \| \| 1.6 GB \| 8 \| 33700 \| 34550 \| \| 1.6 GB \| 16 \| 32751 \| 33079 \| \| 1.6 GB \| 20 \| 30919 \| 31494 \| \| 1.6 GB \| 30 \| 28540 \| 28535 \| \| 1.6 GB \| 40 \| 26380 \| 27054 \| \| 1.6 GB \| 50 \| 25241 \| 25591 \| ------------------------------------------------------ \| 7.6 GB \| 1 \| 5779 \| 6224 \| \| 7.6 GB \| 2 \| 10897 \| 13611 \| + 24.9% \| 7.6 GB \| 4 \| 32683 \| 33108 \| \| 7.6 GB \| 8 \| 33968 \| 34712 \| \| 7.6 GB \| 16 \| 32287 \| 32895 \| \| 7.6 GB \| 20 \| 27770 \| 31689 \| + 14.1% \| 7.6 GB \| 30 \| 26739 \| 29003 \| \| 7.6 GB \| 40 \| 24901 \| 26683 \| \| 7.6 GB \| 50 \| 17115 \| 25925 \| + 51.5% ------------------------------------------------------ (Davidlohr also has one additional patch which further improves throughput, though I will ask him to send it directly to you as I have suggested some minor changes)." * emailed patches from Michel Lespinasse <walken@google.com>: rwsem: no need for explicit signed longs x86 rwsem: avoid taking slow path when stealing write lock rwsem: do not block readers at head of queue if other readers are active rwsem: implement support for write lock stealing on the fastpath rwsem: simplify __rwsem_do_wake rwsem: skip initial trylock in rwsem_down_write_failed rwsem: avoid taking wait_lock in rwsem_down_write_failed rwsem: use cmpxchg for trying to steal write lock rwsem: more agressive lock stealing in rwsem_down_write_failed rwsem: simplify rwsem_down_write_failed rwsem: simplify rwsem_down_read_failed rwsem: move rwsem_down_failed_common code into rwsem_down_{read,write}_failed rwsem: shorter spinlocked section in rwsem_down_failed_common() rwsem: make the waiter type an enumeration rather than a bitmask	2013-05-07 09:22:03 -07:00
Michel Lespinasse	a31a369b07	x86 rwsem: avoid taking slow path when stealing write lock modify __down_write[_nested] and __down_write_trylock to grab the write lock whenever the active count is 0, even if there are queued waiters (they must be writers pending wakeup, since the active count is 0). Note that this is an optimization only; architectures without this optimization will still work fine: - __down_write() would take the slow path which would take the wait_lock and then try stealing the lock (as in the spinlocked rwsem implementation) - __down_write_trylock() would fail, but callers must be ready to deal with that - since there are some writers pending wakeup, they could have raced with us and obtained the lock before we steal it. Signed-off-by: Michel Lespinasse <walken@google.com> Reviewed-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Davidlohr Bueso <davidlohr.bueso@hp.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-05-07 07:20:17 -07:00
Konrad Rzeszutek Wilk	a520996ae2	xen/vcpu: Document the xen_vcpu_info and xen_vcpu They are important structures and it is not clear at first look what they are for. The xen_vcpu is a pointer. By default it points to the shared_info structure (at the CPU offset location). However if the VCPUOP_register_vcpu_info hypercall is implemented we can make the xen_vcpu pointer point to a per-CPU location. Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> [v1: Added comments from Ian Campbell] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-05-07 10:05:34 -04:00
Linus Torvalds	99737982ca	IOMMU Updates for Linux v3.10 The updates are mostly about the x86 IOMMUs this time. Exceptions are the groundwork for the PAMU IOMMU from Freescale (for a PPC platform) and an extension to the IOMMU group interface. On the x86 side this includes a workaround for VT-d to disable interrupt remapping on broken chipsets. On the AMD-Vi side the most important new feature is a kernel command-line interface to override broken information in IVRS ACPI tables and get interrupt remapping working this way. Besides that there are small fixes all over the place. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJRh2vAAAoJECvwRC2XARrjbjkP/jzKzeffybUpQsIJF8rs/IEt hSwqpGLr6WR5FdneEH9fiBIp4pyMDXmuAb/2ZNgB+DgPN3xgqmWVo4WLk7pMo3BS /xIz/lu7hIX3AtKt807pL9+rPdhGYEJ43Vmr4bW9x0l1kuNXy6fmMLcN5FaPKjV4 p4hY4jOstEgtYQw4wi39/9b4FsYoipZizkOUSdtCzWwTv7jOHH7/Wra8iZyzL6Je 1VlF/efp0ytTcwLdHOfGwPCIlZrQRtQCM4SqdAUG9bOL3ARR9Yu/0iW1295nbLzo CQX5CfKePvo/fGxki1jcBi+UCyxYKPosB5kCxmh4MAxCg/VzzMsaME/A73tLJa6W Y29bbjwPoBPMq03HX8S9R5QWY8HpFujUUp+J4TXcKuTgYEV28WfLu1uaeKD716nM LoXUojov7Cj8ZQZnhyu5l+XNaephBZLfw/8bM6bAxhlKXwAjmLiS5Z+srPl1GJee 5GCV+L94JifHLZaREWh3JFsh9O3W7Wno2++c4JU32uCWJHXH7tMgs2P8n5AY9rnT Km1a9y6w2MF3Gg9j4y6u75m0XnFTNzYjeJMUtqVlwVhNHhgaXfuIWY63xOQCLJs1 ThTHOjoh0VqONGobR/ywn+0ouo9X07DnWpluyFd+zY3XK0UE0NOu9XMr4i6TWxOf mlzWoEKxtw36XGHB/FtQ =MVc/ -----END PGP SIGNATURE----- Merge tag 'iommu-updates-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU updates from Joerg Roedel: "The updates are mostly about the x86 IOMMUs this time. Exceptions are the groundwork for the PAMU IOMMU from Freescale (for a PPC platform) and an extension to the IOMMU group interface. On the x86 side this includes a workaround for VT-d to disable interrupt remapping on broken chipsets. On the AMD-Vi side the most important new feature is a kernel command-line interface to override broken information in IVRS ACPI tables and get interrupt remapping working this way. Besides that there are small fixes all over the place." * tag 'iommu-updates-v3.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (24 commits) iommu/tegra: Fix printk formats for dma_addr_t iommu: Add a function to find an iommu group by id iommu/vt-d: Remove warning for HPET scope type iommu: Move swap_pci_ref function to drivers/iommu/pci.h. iommu/vt-d: Disable translation if already enabled iommu/amd: fix error return code in early_amd_iommu_init() iommu/AMD: Per-thread IOMMU Interrupt Handling iommu: Include linux/err.h iommu/amd: Workaround for ERBT1312 iommu/amd: Document ivrs_ioapic and ivrs_hpet parameters iommu/amd: Don't report firmware bugs with cmd-line ivrs overrides iommu/amd: Add ioapic and hpet ivrs override iommu/amd: Add early maps for ioapic and hpet iommu/amd: Extend IVRS special device data structure iommu/amd: Move add_special_device() to __init iommu: Fix compile warnings with forward declarations iommu/amd: Properly initialize irq-table lock iommu/amd: Use AMD specific data structure for irq remapping iommu/amd: Remove map_sg_no_iommu() iommu/vt-d: add quirk for broken interrupt remapping on 55XX chipsets ...	2013-05-06 14:59:13 -07:00

1 2 3 4 5 ...

17465 Commits