linux

iv/linux

Author	SHA1	Message	Date
Rafael J. Wysocki	f1c7842e5f	Merge branches 'pm-cpufreq', 'intel_pstate' and 'pm-cpuidle' * pm-cpufreq: cpufreq / CPPC: Initialize policy->min to lowest nonlinear performance cpufreq: sfi: make freq_table static cpufreq: exynos5440: Fix inconsistent indenting cpufreq: imx6q: imx6ull should use the same flow as imx6ul cpufreq: dt: Add support for hi3660 * intel_pstate: cpufreq: Update scaling_cur_freq documentation cpufreq: intel_pstate: Clean up after performance governor changes intel_pstate: skip scheduler hook when in "performance" mode intel_pstate: delete scheduler hook in HWP mode x86: use common aperfmperf_khz_on_cpu() to calculate KHz using APERF/MPERF cpufreq: intel_pstate: Remove max/min fractions to limit performance x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz" * pm-cpuidle: cpuidle: menu: allow state 0 to be disabled intel_idle: Use more common logging style x86/ACPI/cstate: Allow ACPI C1 FFH MWAIT use on AMD systems ARM: cpuidle: Support asymmetric idle definition	2017-07-03 14:21:18 +02:00
Rafael J. Wysocki	16b5b09240	Merge branch 'pm-tools' * pm-tools: cpupower: Add support for new AMD family 0x17 cpupower: Fix bug where return value was not used tools/power turbostat: update version number tools/power turbostat: decode MSR_IA32_MISC_ENABLE only on Intel tools/power turbostat: stop migrating, unless '-m' tools/power turbostat: if --debug, print sampling overhead tools/power turbostat: hide SKL counters, when not requested intel_pstate: use updated msr-index.h HWP.EPP values tools/power x86_energy_perf_policy: support HWP.EPP x86: msr-index.h: fix shifts to ULL results in HWP macros. x86: msr-index.h: define HWP.EPP values x86: msr-index.h: define EPB mid-points	2017-07-03 14:17:16 +02:00
Marek Marczykowski-Górecki	c54590cac5	x86/xen: allow userspace access during hypercalls Userspace application can do a hypercall through /dev/xen/privcmd, and some for some hypercalls argument is a pointers to user-provided structure. When SMAP is supported and enabled, hypervisor can't access. So, lets allow it. The same applies to HYPERVISOR_dm_op, where additionally privcmd driver carefully verify buffer addresses. Cc: stable@vger.kernel.org Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>	2017-07-03 13:26:17 +02:00
Gustavo A. R. Silva	bf1b9ddf18	x86: xen: remove unnecessary variable in xen_foreach_remap_area() Remove unnecessary variable mfn in function xen_foreach_remap_area() and, refactor the code. Variable mfn at line 518:mfn = xen_remap_buf.mfns[i]; is only being used to store a value to be passed as an argument to the xen_update_mem_tables() function. This value can be passed directly, which makes variable mfn unnecessary. Also, value assigned to variable mfn at line 534:mfn = xen_remap_mfn; is never used. Addresses-Coverity-ID: 1260110 Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com>	2017-07-03 13:24:17 +02:00
Peter Feiner	ac8d57e573	kvm: x86: mmu: allow A/D bits to be disabled in an mmu Adds the plumbing to disable A/D bits in the MMU based on a new role bit, ad_disabled. When A/D is disabled, the MMU operates as though A/D aren't available (i.e., using access tracking faults instead). To avoid SP -> kvm_mmu_page.role.ad_disabled lookups all over the place, A/D disablement is now stored in the SPTE. This state is stored in the SPTE by tweaking the use of SPTE_SPECIAL_MASK for access tracking. Rather than just setting SPTE_SPECIAL_MASK when an access-tracking SPTE is non-present, we now always set SPTE_SPECIAL_MASK for access-tracking SPTEs. Signed-off-by: Peter Feiner <pfeiner@google.com> [Use role.ad_disabled even for direct (non-shadow) EPT page tables. Add documentation and a few MMU_WARN_ONs. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-03 11:19:54 +02:00
Peter Feiner	dcdca5fed5	x86: kvm: mmu: make spte mmio mask more explicit Specify both a mask (i.e., bits to consider) and a value (i.e., pattern of bits that indicates a special PTE) for mmio SPTEs. On Intel, this lets us pack even more information into the (SPTE_SPECIAL_MASK \| EPT_VMX_RWX_MASK) mask we use for access tracking liberating all (SPTE_SPECIAL_MASK \| (non-misconfigured-RWX)) values. Signed-off-by: Peter Feiner <pfeiner@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-03 10:43:31 +02:00
Peter Feiner	ce00053b1c	x86: kvm: mmu: dead code thanks to access tracking The MMU always has hardware A bits or access tracking support, thus it's unnecessary to handle the scenario where we have neither. Signed-off-by: Peter Feiner <pfeiner@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-07-03 10:43:23 +02:00
Bjorn Helgaas	8cd9385034	Merge branch 'pci/resource' into next * pci/resource: PCI: Work around poweroff & suspend-to-RAM issue on Macbook Pro 11 PCI: Do not disregard parent resources starting at 0x0 Conflicts: arch/x86/pci/fixup.c	2017-07-02 18:49:49 -05:00
Bjorn Helgaas	2cf816a947	Merge branch 'pci/pm' into next * pci/pm: PCI/PM: Avoid using device_may_wakeup() for runtime PM x86/PCI: Avoid AMD SB7xx EHCI USB wakeup defect PCI/PM: Restore the status of PCI devices across hibernation drm/radeon: make MacBook Pro d3_delay quirk more generic drm/amdgpu: remove unnecessary save/restore of pdev->d3_delay PCI/PM: Add needs_resume flag to avoid suspend complete optimization PCI: imx6: Fix config read timeout handling switchtec: Fix minor bug with partition ID register switchtec: Use new cdev_device_add() helper function PCI: endpoint: Make PCI_ENDPOINT depend on HAS_DMA	2017-07-02 18:48:49 -05:00
Jork Loeser	7dcf90e9e0	PCI: hv: Use vPCI protocol version 1.2 Update the Hyper-V vPCI driver to use the Server-2016 version of the vPCI protocol, fixing MSI creation and retargeting issues. Signed-off-by: Jork Loeser <jloeser@microsoft.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Acked-by: K. Y. Srinivasan <kys@microsoft.com>	2017-07-02 18:43:09 -05:00
Oliver O'Halloran	65f7d04978	mm, x86: Add ARCH_HAS_ZONE_DEVICE to Kconfig Currently ZONE_DEVICE depends on X86_64 and this will get unwieldly as new architectures (and platforms) get ZONE_DEVICE support. Move to an arch selected Kconfig option to save us the trouble. Cc: linux-mm@kvack.org Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2017-07-02 20:40:26 +10:00
Linus Torvalds	e18aca0236	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "Fixlets for x86: - Prevent kexec crash when KASLR is enabled, which was caused by an address calculation bug - Restore the freeing of PUDs on memory hot remove - Correct a negated pointer check in the intel uncore performance monitoring driver - Plug a memory leak in an error exit path in the RDT code" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/intel_rdt: Fix memory leak on mount failure x86/boot/KASLR: Fix kexec crash due to 'virt_addr' calculation bug x86/boot/KASLR: Add checking for the offset of kernel virtual address randomization perf/x86/intel/uncore: Fix wrong box pointer check x86/mm/hotplug: Fix BUG_ON() after hot-remove by not freeing PUD	2017-07-01 09:10:17 -07:00
Ingo Molnar	23acd3e1a0	perf/core improvements and fixes: Intel PT: - Support "ptwrite" instructio, a way to stuff 32 or 64 bit values into the Intel PT trace (Adrian Hunter) - Support power events in Intel PT to report changes to C-state (Adrian Hunter) - Synthesize Intel PT events as PERF_RECORD_SAMPLE records with a perf_event_attr.type (PERF_TYPE_SYNTH) just after the range used by the kernel, i.e. right after what is allocated for PMUs, at INT_MAX + 1U, attr.config will have the identification for the synthesized event and the PERF_SAMPLE_RAW payload will have its fields (Adrian Hunter) Infrastructure: - Remove warning() and error(), using instead pr_warning() and pr_error(), consolidating error reporting (Arnaldo Carvalho de Melo) - Add platform dependency to 'perf test 15' (Thomas Richter) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJZVsurAAoJENZQFvNTUqpAnYYP/i44/Y99vfN751fuTlJYci2g u1VVRsd0GC8OnFIZKRzFumAd+IXRUXiLp25nP36yvsXNOMHGU1O/SQmRRHOC6zTY ffPmnlHeUT8LOVX82GiiG6E6rzE2KHuAbgILvzswelPoyT6/91mysoZMu2xHpy3f sLUtjN7gAZqy6nMNTiGgItUDyFIAl4c2iQf5v8YkxfM0UxekXt/XIj2Zn5uUXTIW q9B0po9/MneI+7Fqtj3YTN7owY0YhXmynKHzE7CseNyGFFbtIzoTLW3qgtz+Ld3M ip0QcsRiV6hbgEkPsi6nwOAF1EABlsHb4QHwFifVqzWCPwqeLmI3rd7FsONDNcCZ TVoHfm1wlgqtQw6KVQodIrTKCq7DOpjTIzk6AX980vJ8yp2KtWf2DB0AqwpJ/7R2 2nqTsLm9iWbPOTA0mp/7au/WbNDcgL9jv2yqU8/UGBg92tVlVN5IiAVVpnsdBJgi VjEeUdqbvs9aw//+L1uN0N7Y22zqpQAm/eomd9wwXzDHCeWjIcrIR4tDA5i22waH 4XFJLgJhfbTZsSGonpQ+7GVPzFru3rz56wAM4UbD3BRtVCj+EMPu0/mb9u3URgjp 1iJdOm7WY/XH7AYV5dXnZyR+o4VDHwuziw5yxvoR3RNpARxAjVFGzXfq6Q5DbHPS mycD8rcoQp+3IeyA/IEN =tvJF -----END PGP SIGNATURE----- Merge tag 'perf-core-for-mingo-4.13-20170630' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: Intel PT enhancements: - Support "ptwrite" instruction, a way to stuff 32 or 64 bit values into the Intel PT trace (Adrian Hunter) - Support power events in Intel PT to report changes to C-state (Adrian Hunter) - Synthesize Intel PT events as PERF_RECORD_SAMPLE records with a perf_event_attr.type (PERF_TYPE_SYNTH) just after the range used by the kernel, i.e. right after what is allocated for PMUs, at INT_MAX + 1U, attr.config will have the identification for the synthesized event and the PERF_SAMPLE_RAW payload will have its fields (Adrian Hunter) Infrastructure changes: - Remove warning() and error(), using instead pr_warning() and pr_error(), consolidating error reporting (Arnaldo Carvalho de Melo) - Add platform dependency to 'perf test 15' (Thomas Richter) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-07-01 10:39:25 +02:00
Vikas Shivappa	79298acc4b	x86/intel_rdt: Fix memory leak on mount failure If mount fails, the kn_info directory is not freed causing memory leak. Add the missing error handling path. Fixes: 4e978d06dedb ("x86/intel_rdt: Add "info" files to resctrl file system") Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: vikas.shivappa@intel.com Cc: andi.kleen@intel.com Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1498503368-20173-3-git-send-email-vikas.shivappa@linux.intel.com	2017-06-30 21:20:00 +02:00
Linus Torvalds	27ab862a3a	IOMMU Fixes for Linux 4.12-rc7 Two fixes: * A fix for AMD IOMMU interrupt remapping code when IRQs are forwarded directly to KVM guests * Fixed check in the recently merged code to allow tboot with Intel VT-d disabled -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABAgAGBQJZVoIAAAoJECvwRC2XARrjpYoP/0JGpnrtGsw2GnXbe4jCXwWg hWpJze0QyHi7oJrNHRK1W72rwQxp53NUXS942bosXj18U+KU3x3UnLQfU5u1RfEi Z44riGpurZ2+Ied+X01eS5ALeno9/jj7DOnSDsrZS2vI56aJTGISNO/8xvE5M5Mj xwvOQW+K1Wgqac2Sdi+ckt8tUhik1MKbkL8k04/27Yzq6FxvkEyiHXDQcXA+eGId ewW034Z2ueriZ6fzuanQW3j57albeIN6XsTaOHbwD2edQwiyZ6yoMakKgMuHnpgx F4BtPtcbgSHSQ48ZZb9bdQpvEO3lY0jmSlMI4S2Fu5DmSIKC/KOy/4yYUhlQtrHT UUbIvq/pC+SMPxZiZAJhyIFcV6YkTelArPFx+QxsWvMMiXeGnezgeFsAOzwZuptF FFm9ItgfPkGxkiECFwJwSAHsTiiFocRfYHHv/ace/6X4ZB+nZrl3mSfX7+EtT2LG Kje2XtUzoGR/8LSBTMlQQeurhBZwbnoaFEtiVMrLODhvFT3IU7B00wgQHWNpjyRj Rqe/ScHRdRF1NQtW6QDTpNU4rZGB4lt1WxpMlONVVpO4LI/fXZq8Nq4lT+FzUHV1 xNQEh9xV2DK7bjT3HDd/OKSusaLzImNFYmi6+t/gROX8z/PqISb5IOmjaiO9pYYM coSJHmYL1hc0yFsNp6eB =bw0V -----END PGP SIGNATURE----- Merge tag 'iommu-fixes-v4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull IOMMU fixes from Joerg Roedel: "Two fixes: - A fix for AMD IOMMU interrupt remapping code when IRQs are forwarded directly to KVM guests - Fixed check in the recently merged code to allow tboot with Intel VT-d disabled" * tag 'iommu-fixes-v4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu/amd: Fix interrupt remapping when disable guest_mode iommu/vt-d: Correctly disable Intel IOMMU force on	2017-06-30 10:37:48 -07:00
David S. Miller	b079115937	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net A set of overlapping changes in macvlan and the rocker driver, nothing serious. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-30 12:43:08 -04:00
Kai-Heng Feng	0bf3730bbc	x86/PCI: Avoid AMD SB7xx EHCI USB wakeup defect On an AMD Carrizo laptop, when EHCI runtime PM is enabled, EHCI ports do not assert PME# for device plug/unplug events while in D3. As Alan Stern points out [1], the PME signal is not enabled when controller is in D3, therefore it's not being woken up when new devices get plugged in. Testing shows PME signal works when the EHCI power state is D2. Clear the PCI_PM_CAP_PME_D3 and PCI_PM_CAP_PME_D3cold bits in dev->pme_support to indicate the device will not assert PME# from those states. [1] http://lkml.kernel.org/r/Pine.LNX.4.44L0.1706121010010.2092-100000@iolanthe.rowland.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=196091 Link: https://support.amd.com/TechDocs/46837.pdf (Section 23) Link: https://support.amd.com/TechDocs/42413.pdf (Appendix A2) Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> [bhelgaas: changelog, add parens in quirk] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2017-06-30 11:15:08 -05:00
Nick Desaulniers	8616abc253	KVM: x86: remove ignored type attribute The macro insn_fetch marks the 'type' argument as having a specified alignment. Type attributes can only be applied to structs, unions, or enums, but insn_fetch is only ever invoked with integral types, so Clang produces 19 -Wignored-attributes warnings for this source file. Signed-off-by: Nick Desaulniers <nick.desaulniers@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-30 12:45:55 +02:00
Paolo Bonzini	04a7ea04d5	KVM/ARM updates for 4.13 - vcpu request overhaul - allow timer and PMU to have their interrupt number selected from userspace - workaround for Cavium erratum 30115 - handling of memory poisonning - the usual crop of fixes and cleanups -----BEGIN PGP SIGNATURE----- iQJJBAABCAAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAllWCM0VHG1hcmMuenlu Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDjJ0QAI16x6+trKhH31lTSYekYfqm4hZ2 Fp7IbALW9KNCaY35tZov2Zuh99qGRduxTh7ewqhKpON8kkU+UKj0F7zH22+vfN4m yas/+uNr8R9VLyvea4ysPsgx8Q8v1Ix9setohHYNZIL9/klVqtaHpYvArHVF/mzq p2j/NxRS2dlp9r2TtoMRMhA05u6r0wolhUuh+z9v2ipib0gfOBIG24jsqCTEcD9n 5A/cVd+ztYshkrV95h3y9peahwt3zOA4QBGzrQ2K25jp0s54nqhmC7JTNSa8dtar YGW2MuAMoIFTwCFAlpwCzrwpOJFzF3Q6A8bOxei2fjclzjPMgT1xQxuhOoe4ntFa lTPxSHalm5W6dFTW90YSo2DBcPe+N7sQkhjR0cCeY3GYsOFhXMLTlOl5Pt1YK1or +3FAI74tFRKvVmb9mhZeGTvuzhDgRvtf3Qq5rjwlGzKc2BBOEgtMyj/Wgwo4N6Dz IjOnoRaUGELoBCWoTorMxLpsPBdPVSUxNyJTdAhqZ/ZtT1xqjhFNLZcrVWmOTzDM 1cav+jZkla4sLmJSNDD54aCSvvtPHis0nZn9PRlh12xgOyYiAVx4K++MNuWP0P37 hbh1gbPT+FcoVxPurUsX/pjNlTucPZcBwFytZDQlpwtPBpEFzJiImLYe/PldRb0f 9WQOH1Y1+q14MF+N =6hNK -----END PGP SIGNATURE----- Merge tag 'kvmarm-for-4.13' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/ARM updates for 4.13 - vcpu request overhaul - allow timer and PMU to have their interrupt number selected from userspace - workaround for Cavium erratum 30115 - handling of memory poisonning - the usual crop of fixes and cleanups Conflicts: arch/s390/include/asm/kvm_host.h	2017-06-30 12:38:26 +02:00
Josh Poimboeuf	c207aee480	objtool, x86: Add several functions and files to the objtool whitelist In preparation for an objtool rewrite which will have broader checks, whitelist functions and files which cause problems because they do unusual things with the stack. These whitelists serve as a TODO list for which functions and files don't yet have undwarf unwinder coverage. Eventually most of the whitelists can be removed in favor of manual CFI hint annotations or objtool improvements. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: live-patching@vger.kernel.org Link: http://lkml.kernel.org/r/7f934a5d707a574bda33ea282e9478e627fb1829.1498659915.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-06-30 10:19:19 +02:00
Andy Lutomirski	8781fb7e97	x86/mm: Delete a big outdated comment about TLB flushing The comment describes the old explicit IPI-based flush logic, which is long gone. Signed-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Nadav Amit <nadav.amit@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/55e44997e56086528140c5180f8337dc53fb7ffc.1498751203.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-06-30 10:12:35 +02:00
Andy Lutomirski	bc0d5a89fb	x86/mm: Don't reenter flush_tlb_func_common() It was historically possible to have two concurrent TLB flushes targetting the same CPU: one initiated locally and one initiated remotely. This can now cause an OOPS in leave_mm() at arch/x86/mm/tlb.c:47: if (this_cpu_read(cpu_tlbstate.state) == TLBSTATE_OK) BUG(); with this call trace: flush_tlb_func_local arch/x86/mm/tlb.c:239 [inline] flush_tlb_mm_range+0x26d/0x370 arch/x86/mm/tlb.c:317 Without reentrancy, this OOPS is impossible: leave_mm() is only called if we're not in TLBSTATE_OK, but then we're unexpectedly in TLBSTATE_OK in leave_mm(). This can be caused by flush_tlb_func_remote() happening between the two checks and calling leave_mm(), resulting in two consecutive leave_mm() calls on the same CPU with no intervening switch_mm() calls. We never saw this OOPS before because the old leave_mm() implementation didn't put us back in TLBSTATE_OK, so the assertion didn't fire. Nadav noticed the reentrancy issue in a different context, but neither of us realized that it caused a problem yet. Reported-by: Levin, Alexander (Sasha Levin) <alexander.levin@verizon.com> Signed-off-by: Andy Lutomirski <luto@kernel.org> Reviewed-by: Nadav Amit <nadav.amit@gmail.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@redhat.com> Cc: linux-mm@kvack.org Fixes: 3d28ebceaffa ("x86/mm: Rework lazy TLB to track the actual loaded mm") Link: http://lkml.kernel.org/r/855acf733268d521c9f2e191faee2dcc23a29729.1498751203.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-06-30 10:12:35 +02:00
Paolo Abeni	236222d393	x86/uaccess: Optimize copy_user_enhanced_fast_string() for short strings According to the Intel datasheet, the REP MOVSB instruction exposes a pretty heavy setup cost (50 ticks), which hurts short string copy operations. This change tries to avoid this cost by calling the explicit loop available in the unrolled code for strings shorter than 64 bytes. The 64 bytes cutoff value is arbitrary from the code logic point of view - it has been selected based on measurements, as the largest value that still ensures a measurable gain. Micro benchmarks of the __copy_from_user() function with lengths in the [0-63] range show this performance gain (shorter the string, larger the gain): - in the [55%-4%] range on Intel Xeon(R) CPU E5-2690 v4 - in the [72%-9%] range on Intel Core i7-4810MQ Other tested CPUs - namely Intel Atom S1260 and AMD Opteron 8216 - show no difference, because they do not expose the ERMS feature bit. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Alan Cox <gnomes@lxorguk.ukuu.org.uk> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/4533a1d101fd460f80e21329a34928fad521c1d4.1498744345.git.pabeni@redhat.com [ Clarified the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-06-30 09:52:51 +02:00
Colin Ian King	e91c8d97ea	perf/x86/intel: Constify the 'lbr_desc[]' array and make a function static A few minor clean-ups: constify the lbr_desc[] array and make local function lbr_from_signext_quirk_rd() static to fix a sparse warning: "symbol 'lbr_from_signext_quirk_rd' was not declared. Should it be static?" Signed-off-by: Colin Ian King <colin.king@canonical.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: kernel-janitors@vger.kernel.org Link: http://lkml.kernel.org/r/20170629091406.9870-1-colin.king@canonical.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-06-30 09:00:56 +02:00
Kirill A. Shutemov	a24261d70e	x86/KASLR: Fix detection 32/64 bit bootloaders for 5-level paging KASLR uses hack to detect whether we booted via startup_32() or startup_64(): it checks what is loaded into cr3 and compares it to _pgtables. _pgtables is the array of page tables where early code allocates page table from. KASLR expects cr3 to point to _pgtables if we booted via startup_32(), but that's not true if we booted with 5-level paging enabled. In this case top level page table is allocated separately and only the first p4d page table is allocated from the array. Let's modify the check to cover both 4- and 5-level paging cases. The patch also renames 'level4p' to 'top_level_pgt' as it now can hold page table for 4th or 5th level, depending on configuration. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arch@vger.kernel.org Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/20170628121730.43079-1-kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-06-30 08:56:53 +02:00
Baoquan He	8eabf42ae5	x86/boot/KASLR: Fix kexec crash due to 'virt_addr' calculation bug Kernel text KASLR is separated into physical address and virtual address randomization. And for virtual address randomization, we only randomiza to get an offset between 16M and KERNEL_IMAGE_SIZE. So the initial value of 'virt_addr' should be LOAD_PHYSICAL_ADDR, but not the original kernel loading address 'output'. The bug will cause kernel boot failure if kernel is loaded at a different position than the address, 16M, which is decided at compiled time. Kexec/kdump is such practical case. To fix it, just assign LOAD_PHYSICAL_ADDR to virt_addr as initial value. Tested-by: Dave Young <dyoung@redhat.com> Signed-off-by: Baoquan He <bhe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 8391c73 ("x86/KASLR: Randomize virtual address separately") Link: http://lkml.kernel.org/r/1498567146-11990-3-git-send-email-bhe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-06-30 08:53:14 +02:00
Baoquan He	b892cb873c	x86/boot/KASLR: Add checking for the offset of kernel virtual address randomization For kernel text KASLR, the virtual address is confined to area of 1G, [0xffffffff80000000, 0xffffffffc0000000). For the implemenataion of virtual address randomization, we only randomize to get an offset between 16M and 1G, then add this offset to the starting address, 0xffffffff80000000. Here 16M is the offset which is decided at linking stage. So the amount of the local variable 'virt_addr' which respresents the offset plus the kernel output size can not exceed KERNEL_IMAGE_SIZE. Add a debug check for the offset. If out of bounds, print error message and hang there. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Baoquan He <bhe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1498567146-11990-2-git-send-email-bhe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-06-30 08:53:14 +02:00
Nicholas Piggin	827880ec26	x86/um: thin archives build fix The linker does not like vdso-syms.lds in input archive files. Make it an extra-y instead. Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: user-mode-linux-devel@lists.sourceforge.net Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2017-06-30 09:03:05 +09:00
Kirill A. Shutemov	bb43dbc5e0	x86/ftrace: Exclude functions in head64.c from function-tracing A recent commit moved most logic of early boot up from startup_64() written in assembly to __startup_64() written in C. Fengguang reported breakage due to the change. It was tracked down to CONFIG_FUNCTION_TRACER being enabled. Tracing this function is not possible because it's invoked from the earliest boot stage before the relocation fixups have been done. It is the function doing the relocation. Exclude it from being built with tracer stubs. Fixes: c88d71508e36 ("x86/boot/64: Rewrite startup_64() in C") Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: lkp@01.org Link: http://lkml.kernel.org/r/20170627115948.17938-1-kirill.shutemov@linux.intel.com	2017-06-29 22:33:27 +02:00
Kan Liang	80c65fdb4c	perf/x86/intel/uncore: Fix wrong box pointer check Should not init a NULL box. It will cause system crash. The issue looks like caused by a typo. This was not noticed because there is no NULL box. Also, for most boxes, they are enabled by default. The init code is not critical. Fixes: fff4b87e594a ("perf/x86/intel/uncore: Make package handling more robust") Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20170629190926.2456-1-kan.liang@intel.com	2017-06-29 21:28:13 +02:00
Wanpeng Li	c853354429	KVM: LAPIC: Fix lapic timer injection delay If the TSC deadline timer is programmed really close to the deadline or even in the past, the computation in vmx_set_hv_timer will program the absolute target tsc value to vmcs preemption timer field w/ delta == 0, then plays a vmentry and an upcoming vmx preemption timer fire vmexit dance, the lapic timer injection is delayed due to this duration. Actually the lapic timer which is emulated by hrtimer can handle this correctly. This patch fixes it by firing the lapic timer and injecting a timer interrupt immediately during the next vmentry if the TSC deadline timer is programmed really close to the deadline or even in the past. This saves ~300 cycles on the tsc_deadline_timer test of apic.flat. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-29 18:21:13 +02:00
Paolo Bonzini	a749e247f7	KVM: lapic: reorganize restart_apic_timer Move the code to cancel the hv timer into the caller, just before it starts the hrtimer. Check availability of the hv timer in start_hv_timer. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-29 18:18:52 +02:00
Paolo Bonzini	35ee9e48b9	KVM: lapic: reorganize start_hv_timer There are many cases in which the hv timer must be canceled. Split out a new function to avoid duplication. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-29 18:10:35 +02:00
Tobias Klauser	6474924e2b	arch: remove unused macro/function thread_saved_pc() The only user of thread_saved_pc() in non-arch-specific code was removed in commit 8243d5597793 ("sched/core: Remove pointless printout in sched_show_task()"). Remove the implementations as well. Some architectures use thread_saved_pc() in their arch-specific code. Leave their thread_saved_pc() intact. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-06-28 16:13:57 -07:00
Bjorn Helgaas	13cfc73216	PCI: Work around poweroff & suspend-to-RAM issue on Macbook Pro 11 Neither soft poweroff (transition to ACPI power state S5) nor suspend-to-RAM (transition to state S3) works on the Macbook Pro 11,4 and 11,5. The problem is related to the [mem 0x7fa00000-0x7fbfffff] space. When we use that space, e.g., by assigning it to the 00:1c.0 Root Port, the ACPI Power Management 1 Control Register (PM1_CNT) at [io 0x1804] doesn't work anymore. Linux does a soft poweroff (transition to S5) by writing to PM1_CNT. The theory about why this doesn't work is: - The write to PM1_CNT causes an SMI - The BIOS SMI handler depends on something in [mem 0x7fa00000-0x7fbfffff] - When Linux assigns [mem 0x7fa00000-0x7fbfffff] to the 00:1c.0 Port, it covers up whatever the SMI handler uses, so the SMI handler no longer works correctly Reserve the [mem 0x7fa00000-0x7fbfffff] space so we don't assign it to anything. This is voodoo programming, since we don't know what the real conflict is, but we've failed to find the root cause. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=103211 Tested-by: thejoe@gmail.com Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Lukas Wunner <lukas@wunner.de> Cc: Chen Yu <yu.c.chen@intel.com>	2017-06-28 16:03:38 -05:00
Jim Mattson	403526054a	kvm: nVMX: Check memory operand to INVVPID The memory operand fetched for INVVPID is 128 bits. Bits 63:16 are reserved and must be zero. Otherwise, the instruction fails with VMfail(Invalid operand to INVEPT/INVVPID). If the INVVPID_TYPE is 0 (individual address invalidation), then bits 127:64 must be in canonical form, or the instruction fails with VMfail(Invalid operand to INVEPT/INVVPID). Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-28 22:38:37 +02:00
Thomas Gleixner	df65c1bcd9	x86/PCI: Select CONFIG_PCI_LOCKLESS_CONFIG All x86 PCI configuration space accessors have either their own serialization or can operate completely lockless (ECAM). Disable the global lock in the generic PCI configuration space accessors. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Bjorn Helgaas <helgaas@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: linux-pci@vger.kernel.org Link: http://lkml.kernel.org/r/20170316215057.295079391@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2017-06-28 22:32:56 +02:00
Thomas Gleixner	bb290fda87	x86/PCI/ce4100: Properly lock accessor functions x86 wants to get rid of the global pci_lock protecting the config space accessors so ECAM mode can operate completely lockless, but the CE4100 PCI code relies on that to protect the simulation registers. Restructure the code so it uses the x86 specific pci_config_lock to serialize the inner workings of the CE4100 PCI magic. That allows to remove the global locking via pci_lock later. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Bjorn Helgaas <helgaas@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: linux-pci@vger.kernel.org Link: http://lkml.kernel.org/r/20170316215057.126873574@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2017-06-28 22:32:55 +02:00
Thomas Gleixner	aae3e318d0	x86/PCI: Abort if legacy init fails If the legacy PCI init fails, then there are no PCI config space accesors available, but the code continues and tries to scan the busses, which fails due to the lack of config space accessors. Return right away, if the last init fallback fails. Switch the few printks to pr_info while at it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Bjorn Helgaas <helgaas@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: linux-pci@vger.kernel.org Link: http://lkml.kernel.org/r/20170316215057.047576516@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2017-06-28 22:32:55 +02:00
Thomas Gleixner	9304d1621e	x86/PCI: Remove duplicate defines For some historic reason these defines are duplicated and also available in arch/x86/include/asm/pci_x86.h, Remove them. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Bjorn Helgaas <helgaas@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Borislav Petkov <bp@alien8.de> Cc: linux-pci@vger.kernel.org Link: http://lkml.kernel.org/r/20170316215056.967808646@linutronix.de Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2017-06-28 22:32:55 +02:00
Christoph Hellwig	5860acc1a9	x86: remove arch specific dma_supported implementation And instead wire it up as method for all the dma_map_ops instances. Note that this also means the arch specific check will be fully instead of partially applied in the AMD iommu driver. Signed-off-by: Christoph Hellwig <hch@lst.de>	2017-06-28 06:54:46 -07:00
Christoph Hellwig	a760088b45	x86: remove DMA_ERROR_CODE All dma_map_ops instances now handle their errors through ->mapping_error. Signed-off-by: Christoph Hellwig <hch@lst.de>	2017-06-28 06:54:36 -07:00
Christoph Hellwig	8bd17c6670	x86/calgary: implement ->mapping_error DMA_ERROR_CODE is going to go away, so don't rely on it. Signed-off-by: Christoph Hellwig <hch@lst.de>	2017-06-28 06:54:35 -07:00
Christoph Hellwig	14a9aad7f0	x86/pci-nommu: implement ->mapping_error DMA_ERROR_CODE is going to go away, so don't rely on it. Signed-off-by: Christoph Hellwig <hch@lst.de>	2017-06-28 06:54:34 -07:00
Dan Williams	ca6a4657e5	x86, libnvdimm, pmem: remove global pmem api Now that all callers of the pmem api have been converted to dax helpers that call back to the pmem driver, we can remove include/linux/pmem.h and asm/pmem.h. Cc: <x86@kernel.org> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Toshi Kani <toshi.kani@hpe.com> Cc: Oliver O'Halloran <oohall@gmail.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2017-06-27 16:29:54 -07:00
Dan Williams	f2b612578e	x86, libnvdimm, pmem: move arch_invalidate_pmem() to libnvdimm Kill this globally defined wrapper and move to libnvdimm so that we can ultimately remove include/linux/pmem.h and asm/pmem.h. Cc: <x86@kernel.org> Cc: Jeff Moyer <jmoyer@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2017-06-27 16:29:00 -07:00
Adrian Hunter	d5b1a5f660	x86/insn: perf tools: Add new ptwrite instruction Add ptwrite to the op code map and the perf tools new instructions test. To run the test: $ tools/perf/perf test "x86 ins" 39: Test x86 instruction decoder - new instructions : Ok Or to see the details: $ tools/perf/perf test -v "x86 ins" 2>&1 \| grep ptwrite For information about ptwrite, refer the Intel SDM. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Link: http://lkml.kernel.org/r/1495180230-19367-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2017-06-27 11:58:04 -03:00
Ladi Prosek	1a5e185294	KVM: SVM: suppress unnecessary NMI singlestep on GIF=0 and nested exit enable_nmi_window is supposed to be a no-op if we know that we'll see a VM exit by the time the NMI window opens. This commit adds two more cases: * We intercept stgi so we don't need to singlestep on GIF=0. * We emulate nested vmexit so we don't need to singlestep when nested VM exit is required. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-27 16:35:43 +02:00
Ladi Prosek	a12713c25b	KVM: SVM: don't NMI singlestep over event injection Singlestepping is enabled by setting the TF flag and care must be taken to not let the guest see (and reuse at an inconvenient time) the modified rflag value. One such case is event injection, as part of which flags are pushed on the stack and restored later on iret. This commit disables singlestepping when we're about to inject an event and forces an immediate exit for us to re-evaluate the NMI related state. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-27 16:35:25 +02:00
Ladi Prosek	9b61174793	KVM: SVM: hide TF/RF flags used by NMI singlestep These flags are used internally by SVM so it's cleaner to not leak them to callers of svm_get_rflags. This is similar to how the TF flag is handled on KVM_GUESTDBG_SINGLESTEP by kvm_get_rflags and kvm_set_rflags. Without this change, the flags may propagate from host VMCB to nested VMCB or vice versa while singlestepping over a nested VM enter/exit, and then get stuck in inappropriate places. Example: NMI singlestepping is enabled while running L1 guest. The instruction to step over is VMRUN and nested vmrun emulation stashes rflags to hsave->save.rflags. Then if singlestepping is disabled while still in L2, TF/RF will be cleared from the nested VMCB but the next nested VM exit will restore them from hsave->save.rflags and cause an unexpected DB exception. Signed-off-by: Ladi Prosek <lprosek@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2017-06-27 16:34:58 +02:00

1 2 3 4 5 ...

27189 Commits