IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Radim's kernel.org email is bouncing, which I take as a signal that
he is not really able to deal with KVM at this time. Make MAINTAINERS
match the effective value of KVM's bus factor.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
I haven't been active for 18 months, and don't have the hardware set up
to test KVM for MIPS, so mark it as orphaned and remove myself as
maintainer. Hopefully somebody from MIPS can pick this up.
Signed-off-by: James Hogan <jhogan@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: kvm@vger.kernel.org
Cc: linux-mips@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The host reports support for the synthetic feature X86_FEATURE_SSBD
when any of the three following hardware features are set:
CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31]
CPUID.80000008H:EBX.AMD_SSBD[bit 24]
CPUID.80000008H:EBX.VIRT_SSBD[bit 25]
Either of the first two hardware features implies the existence of the
IA32_SPEC_CTRL MSR, but CPUID.80000008H:EBX.VIRT_SSBD[bit 25] does
not. Therefore, CPUID.80000008H:EBX.AMD_SSBD[bit 24] should only be
set in the guest if CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31] or
CPUID.80000008H:EBX.AMD_SSBD[bit 24] is set on the host.
Fixes: 4c6903a0f9d76 ("KVM: x86: fix reporting of AMD speculation bug CPUID leaf")
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Jacob Xu <jacobhxu@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Reported-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The host reports support for the synthetic feature X86_FEATURE_SSBD
when any of the three following hardware features are set:
CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31]
CPUID.80000008H:EBX.AMD_SSBD[bit 24]
CPUID.80000008H:EBX.VIRT_SSBD[bit 25]
Either of the first two hardware features implies the existence of the
IA32_SPEC_CTRL MSR, but CPUID.80000008H:EBX.VIRT_SSBD[bit 25] does
not. Therefore, CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31] should only be
set in the guest if CPUID.(EAX=7,ECX=0):EDX.SSBD[bit 31] or
CPUID.80000008H:EBX.AMD_SSBD[bit 24] is set on the host.
Fixes: 0c54914d0c52a ("KVM: x86: use Intel speculation bugs and features as derived in generic x86 code")
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Jacob Xu <jacobhxu@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Reported-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A device mapping is normally always mapped at Stage-2, since there
is very little gain in having it faulted in.
Nonetheless, it is possible to end-up in a situation where the device
mapping has been removed from Stage-2 (userspace munmaped the VFIO
region, and the MMU notifier did its job), but present in a userspace
mapping (userpace has mapped it back at the same address). In such
a situation, the device mapping will be demand-paged as the guest
performs memory accesses.
This requires to be careful when dealing with mapping size, cache
management, and to handle potential execution of a device mapping.
Reported-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Tested-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20191211165651.7889-2-maz@kernel.org
Commit 4b927b94d5df ("KVM: arm/arm64: vgic: Introduce find_reg_by_id()")
introduced 'find_reg_by_id()', which looks up a system register only if
the 'id' index parameter identifies a valid system register. As part of
the patch, existing callers of 'find_reg()' were ported over to the new
interface, but this breaks 'index_to_sys_reg_desc()' in the case that the
initial lookup in the vCPU target table fails because we will then call
into 'find_reg()' for the system register table with an uninitialised
'param' as the key to the lookup.
GCC 10 is bright enough to spot this (amongst a tonne of false positives,
but hey!):
| arch/arm64/kvm/sys_regs.c: In function ‘index_to_sys_reg_desc.part.0.isra’:
| arch/arm64/kvm/sys_regs.c:983:33: warning: ‘params.Op2’ may be used uninitialized in this function [-Wmaybe-uninitialized]
| 983 | (u32)(x)->CRn, (u32)(x)->CRm, (u32)(x)->Op2);
| [...]
Revert the hunk of 4b927b94d5df which breaks 'index_to_sys_reg_desc()' so
that the old behaviour of checking the index upfront is restored.
Fixes: 4b927b94d5df ("KVM: arm/arm64: vgic: Introduce find_reg_by_id()")
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20191212094049.12437-1-will@kernel.org
In kvm_arch_prepare_memory_region, arm kvm regards the memory region as
writable if the flag has no KVM_MEM_READONLY, and the vm is readonly if
!VM_WRITE.
But there is common usage for setting kvm memory region as follows:
e.g. qemu side (see the PROT_NONE flag)
1. mmap(NULL, size, PROT_NONE, MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
memory_region_init_ram_ptr()
2. re mmap the above area with read/write authority.
Such example is used in virtio-fs qemu codes which hasn't been upstreamed
[1]. But seems we can't forbid this example.
Without this patch, it will cause an EPERM during kvm_set_memory_region()
and cause qemu boot crash.
As told by Ard, "the underlying assumption is incorrect, i.e., that the
value of vm_flags at this point in time defines how the VMA is used
during its lifetime. There may be other cases where a VMA is created
with VM_READ vm_flags that are changed to VM_READ|VM_WRITE later, and
we are currently rejecting this use case as well."
[1] https://gitlab.com/virtio-fs/qemu/blob/5a356e/hw/virtio/vhost-user-fs.c#L488
Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Christoffer Dall <christoffer.dall@arm.com>
Link: https://lore.kernel.org/r/20191206020802.196108-1-justin.he@arm.com
We don't intend to support IMPLEMENATION DEFINED system registers, but
have to trap them (and emulate them as UNDEFINED). These traps aren't
interesting to the system administrator or to the KVM developers, so
let's not bother logging when we do so.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20191205180652.18671-3-mark.rutland@arm.com
Currently kvm_pr_unimpl() is ratelimited, so print_sys_reg_instr() won't
spam the console. However, someof its callers try to print some
contextual information with kvm_err(), which is not ratelimited. This
means that in some cases the context may be printed without the sysreg
encoding, which isn't all that useful.
Let's ensure that both are consistently printed together and
ratelimited, by refactoring print_sys_reg_instr() so that some callers
can provide it with an arbitrary format string.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20191205180652.18671-2-mark.rutland@arm.com
Use wrapper function lock_all_vcpus()/unlock_all_vcpus()
in kvm_vgic_create() to remove duplicated code dealing
with locking and unlocking all vcpus in a vm.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/1575081918-11401-1-git-send-email-linmiaohe@huawei.com
In kvm_vgic_dist_init() called from kvm_vgic_map_resources(), if
dist->vgic_model is invalid, dist->spis will be freed without set
dist->spis = NULL. And in vgicv2 resources clean up path,
__kvm_vgic_destroy() will be called to free allocated resources.
And dist->spis will be freed again in clean up chain because we
forget to set dist->spis = NULL in kvm_vgic_dist_init() failed
path. So double free would happen.
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Link: https://lore.kernel.org/r/1574923128-19956-1-git-send-email-linmiaohe@huawei.com
As arg dummy is not really needed, there's no need to pass
NULL when calling cpu_init_hyp_mode(). So clean it up.
Fixes: 67f691976662 ("arm64: kvm: allows kvm cpu hotplug")
Reviewed-by: Steven Price <steven.price@arm.com>
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/1574320559-5662-1-git-send-email-linmiaohe@huawei.com
If X86_FEATURE_RTM is disabled, the guest should not be able to access
MSR_IA32_TSX_CTRL. We can therefore use it in KVM to force all
transactions from the guest to abort.
Tested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The current guest mitigation of TAA is both too heavy and not really
sufficient. It is too heavy because it will cause some affected CPUs
(those that have MDS_NO but lack TAA_NO) to fall back to VERW and
get the corresponding slowdown. It is not really sufficient because
it will cause the MDS_NO bit to disappear upon microcode update, so
that VMs started before the microcode update will not be runnable
anymore afterwards, even with tsx=on.
Instead, if tsx=on on the host, we can emulate MSR_IA32_TSX_CTRL for
the guest and let it run without the VERW mitigation. Even though
MSR_IA32_TSX_CTRL is quite heavyweight, and we do not want to write
it on every vmentry, we can use the shared MSR functionality because
the host kernel need not protect itself from TSX-based side-channels.
Tested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Because KVM always emulates CPUID, the CPUID clear bit
(bit 1) of MSR_IA32_TSX_CTRL must be emulated "manually"
by the hypervisor when performing said emulation.
Right now neither kvm-intel.ko nor kvm-amd.ko implement
MSR_IA32_TSX_CTRL but this will change in the next patch.
Reviewed-by: Jim Mattson <jmattson@google.com>
Tested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
"Shared MSRs" are guest MSRs that are written to the host MSRs but
keep their value until the next return to userspace. They support
a mask, so that some bits keep the host value, but this mask is
only used to skip an unnecessary MSR write and the value written
to the MSR is always the guest MSR.
Fix this and, while at it, do not update smsr->values[slot].curr if
for whatever reason the wrmsr fails. This should only happen due to
reserved bits, so the value written to smsr->values[slot].curr
will not match when the user-return notifier and the host value will
always be restored. However, it is untidy and in rare cases this
can actually avoid spurious WRMSRs on return to userspace.
Cc: stable@vger.kernel.org
Reviewed-by: Jim Mattson <jmattson@google.com>
Tested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM does not implement MSR_IA32_TSX_CTRL, so it must not be presented
to the guests. It is also confusing to have !ARCH_CAP_TSX_CTRL_MSR &&
!RTM && ARCH_CAP_TAA_NO: lack of MSR_IA32_TSX_CTRL suggests TSX was not
hidden (it actually was), yet the value says that TSX is not vulnerable
to microarchitectural data sampling. Fix both.
Cc: stable@vger.kernel.org
Tested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a comment explaining the rational behind having both
no_compat open and ioctl callbacks to fend off compat tasks.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acquire the per-VM slots_lock when zapping all shadow pages as part of
toggling nx_huge_pages. The fast zap algorithm relies on exclusivity
(via slots_lock) to identify obsolete vs. valid shadow pages, because it
uses a single bit for its generation number. Holding slots_lock also
obviates the need to acquire a read lock on the VM's srcu.
Failing to take slots_lock when toggling nx_huge_pages allows multiple
instances of kvm_mmu_zap_all_fast() to run concurrently, as the other
user, KVM_SET_USER_MEMORY_REGION, does not take the global kvm_lock.
(kvm_mmu_zap_all_fast() does take kvm->mmu_lock, but it can be
temporarily dropped by kvm_zap_obsolete_pages(), so it is not enough
to enforce exclusivity).
Concurrent fast zap instances causes obsolete shadow pages to be
incorrectly identified as valid due to the single bit generation number
wrapping, which results in stale shadow pages being left in KVM's MMU
and leads to all sorts of undesirable behavior.
The bug is easily confirmed by running with CONFIG_PROVE_LOCKING and
toggling nx_huge_pages via its module param.
Note, until commit 4ae5acbc4936 ("KVM: x86/mmu: Take slots_lock when
using kvm_mmu_zap_all_fast()", 2019-11-13) the fast zap algorithm used
an ulong-sized generation instead of relying on exclusivity for
correctness, but all callers except the recently added set_nx_huge_pages()
needed to hold slots_lock anyways. Therefore, this patch does not have
to be backported to stable kernels.
Given that toggling nx_huge_pages is by no means a fast path, force it
to conform to the current approach instead of reintroducing the previous
generation count.
Fixes: b8e8c8303ff28 ("kvm: mmu: ITLB_MULTIHIT mitigation", but NOT FOR STABLE)
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
On a system without KVM_COMPAT, we prevent IOCTLs from being issued
by a compat task. Although this prevents most silly things from
happening, it can still confuse a 32bit userspace that is able
to open the kvm device (the qemu test suite seems to be pretty
mad with this behaviour).
Take a more radical approach and return a -ENODEV to the compat
task.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When applying commit 7a5ee6edb42e ("KVM: X86: Fix initialization of MSR
lists"), it forgot to reset the three MSR lists number varialbes to 0
while removing the useless conditionals.
Fixes: 7a5ee6edb42e (KVM: X86: Fix initialization of MSR lists)
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If a huge page is recovered (and becomes no executable) while another
thread is executing it, the resulting contention on mmu_lock can cause
latency spikes. Disabling recovery for PREEMPT_RT kernels fixes this
issue.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
VT-d posted interrupts, DAX/ZONE_DEVICE,
module unload/reload.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJdyrEsAAoJEL/70l94x66DIOkH/Asqrh4o4pwfRHWE+9rnM6PI
j8oFi7Q4eRXJnP4zEMnMbb6xD/BfSH1tWEcPcYgIxD/t0DFx8F92/xsETAJ/Qc5n
CWpmnhMkJqERlV+GSRuBqnheMo0CEH1Ab1QZKhh5U3//pK3OtGY9WyydJHWcquTh
bGh2pnxwVZOtIIEmclUUfKjyR2Fu8hJLnQwzWgYZ27UK7J2pLmiiTX0vwQG359Iq
sDn9ND33pCBW5e/D2mzccRjOJEvzwrumewM1sRDsoAYLJzUjg9+xD83vZDa1d7R6
gajCDFWVJbPoLvUY+DgsZBwMMlogElimJMT/Zft3ERbCsYJbFvcmwp4JzyxDxQ4=
=J6KN
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"Fix unwinding of KVM_CREATE_VM failure, VT-d posted interrupts,
DAX/ZONE_DEVICE, and module unload/reload"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: MMU: Do not treat ZONE_DEVICE pages as being reserved
KVM: VMX: Introduce pi_is_pir_empty() helper
KVM: VMX: Do not change PID.NDST when loading a blocked vCPU
KVM: VMX: Consider PID.PIR to determine if vCPU has pending interrupts
KVM: VMX: Fix comment to specify PID.ON instead of PIR.ON
KVM: X86: Fix initialization of MSR lists
KVM: fix placement of refcount initialization
KVM: Fix NULL-ptr deref after kvm_create_vm fails
Pull x86 TSX Async Abort and iTLB Multihit mitigations from Thomas Gleixner:
"The performance deterioration departement is not proud at all of
presenting the seventh installment of speculation mitigations and
hardware misfeature workarounds:
1) TSX Async Abort (TAA) - 'The Annoying Affair'
TAA is a hardware vulnerability that allows unprivileged
speculative access to data which is available in various CPU
internal buffers by using asynchronous aborts within an Intel TSX
transactional region.
The mitigation depends on a microcode update providing a new MSR
which allows to disable TSX in the CPU. CPUs which have no
microcode update can be mitigated by disabling TSX in the BIOS if
the BIOS provides a tunable.
Newer CPUs will have a bit set which indicates that the CPU is not
vulnerable, but the MSR to disable TSX will be available
nevertheless as it is an architected MSR. That means the kernel
provides the ability to disable TSX on the kernel command line,
which is useful as TSX is a truly useful mechanism to accelerate
side channel attacks of all sorts.
2) iITLB Multihit (NX) - 'No eXcuses'
iTLB Multihit is an erratum where some Intel processors may incur
a machine check error, possibly resulting in an unrecoverable CPU
lockup, when an instruction fetch hits multiple entries in the
instruction TLB. This can occur when the page size is changed
along with either the physical address or cache type. A malicious
guest running on a virtualized system can exploit this erratum to
perform a denial of service attack.
The workaround is that KVM marks huge pages in the extended page
tables as not executable (NX). If the guest attempts to execute in
such a page, the page is broken down into 4k pages which are
marked executable. The workaround comes with a mechanism to
recover these shattered huge pages over time.
Both issues come with full documentation in the hardware
vulnerabilities section of the Linux kernel user's and administrator's
guide.
Thanks to all patch authors and reviewers who had the extraordinary
priviledge to be exposed to this nuisance.
Special thanks to Borislav Petkov for polishing the final TAA patch
set and to Paolo Bonzini for shepherding the KVM iTLB workarounds and
providing also the backports to stable kernels for those!"
* 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/speculation/taa: Fix printing of TAA_MSG_SMT on IBRS_ALL CPUs
Documentation: Add ITLB_MULTIHIT documentation
kvm: x86: mmu: Recovery of shattered NX large pages
kvm: Add helper function for creating VM worker threads
kvm: mmu: ITLB_MULTIHIT mitigation
cpu/speculation: Uninline and export CPU mitigations helpers
x86/cpu: Add Tremont to the cpu vulnerability whitelist
x86/bugs: Add ITLB_MULTIHIT bug infrastructure
x86/tsx: Add config options to set tsx=on|off|auto
x86/speculation/taa: Add documentation for TSX Async Abort
x86/tsx: Add "auto" option to the tsx= cmdline parameter
kvm/x86: Export MDS_NO=0 to guests when TSX is enabled
x86/speculation/taa: Add sysfs reporting for TSX Async Abort
x86/speculation/taa: Add mitigation for TSX Async Abort
x86/cpu: Add a "tsx=" cmdline option with TSX disabled by default
x86/cpu: Add a helper function x86_read_arch_cap_msr()
x86/msr: Add the IA32_TSX_CTRL MSR
Explicitly exempt ZONE_DEVICE pages from kvm_is_reserved_pfn() and
instead manually handle ZONE_DEVICE on a case-by-case basis. For things
like page refcounts, KVM needs to treat ZONE_DEVICE pages like normal
pages, e.g. put pages grabbed via gup(). But for flows such as setting
A/D bits or shifting refcounts for transparent huge pages, KVM needs to
to avoid processing ZONE_DEVICE pages as the flows in question lack the
underlying machinery for proper handling of ZONE_DEVICE pages.
This fixes a hang reported by Adam Borowski[*] in dev_pagemap_cleanup()
when running a KVM guest backed with /dev/dax memory, as KVM straight up
doesn't put any references to ZONE_DEVICE pages acquired by gup().
Note, Dan Williams proposed an alternative solution of doing put_page()
on ZONE_DEVICE pages immediately after gup() in order to simplify the
auditing needed to ensure is_zone_device_page() is called if and only if
the backing device is pinned (via gup()). But that approach would break
kvm_vcpu_{un}map() as KVM requires the page to be pinned from map() 'til
unmap() when accessing guest memory, unlike KVM's secondary MMU, which
coordinates with mmu_notifier invalidations to avoid creating stale
page references, i.e. doesn't rely on pages being pinned.
[*] http://lkml.kernel.org/r/20190919115547.GA17963@angband.pl
Reported-by: Adam Borowski <kilobyte@angband.pl>
Analyzed-by: David Hildenbrand <david@redhat.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: stable@vger.kernel.org
Fixes: 3565fce3a659 ("mm, x86: get_user_pages() for dax mappings")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Streamline the PID.PIR check and change its call sites to use
the newly added helper.
Suggested-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When vCPU enters block phase, pi_pre_block() inserts vCPU to a per pCPU
linked list of all vCPUs that are blocked on this pCPU. Afterwards, it
changes PID.NV to POSTED_INTR_WAKEUP_VECTOR which its handler
(wakeup_handler()) is responsible to kick (unblock) any vCPU on that
linked list that now has pending posted interrupts.
While vCPU is blocked (in kvm_vcpu_block()), it may be preempted which
will cause vmx_vcpu_pi_put() to set PID.SN. If later the vCPU will be
scheduled to run on a different pCPU, vmx_vcpu_pi_load() will clear
PID.SN but will also *overwrite PID.NDST to this different pCPU*.
Instead of keeping it with original pCPU which vCPU had entered block
phase on.
This results in an issue because when a posted interrupt is delivered, as
the wakeup_handler() will be executed and fail to find blocked vCPU on
its per pCPU linked list of all vCPUs that are blocked on this pCPU.
Which is due to the vCPU being placed on a *different* per pCPU
linked list i.e. the original pCPU in which it entered block phase.
The regression is introduced by commit c112b5f50232 ("KVM: x86:
Recompute PID.ON when clearing PID.SN"). Therefore, partially revert
it and reintroduce the condition in vmx_vcpu_pi_load() responsible for
avoiding changing PID.NDST when loading a blocked vCPU.
Fixes: c112b5f50232 ("KVM: x86: Recompute PID.ON when clearing PID.SN")
Tested-by: Nathan Ni <nathan.ni@oracle.com>
Co-developed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 17e433b54393 ("KVM: Fix leak vCPU's VMCS value into other pCPU")
introduced vmx_dy_apicv_has_pending_interrupt() in order to determine
if a vCPU have a pending posted interrupt. This routine is used by
kvm_vcpu_on_spin() when searching for a a new runnable vCPU to schedule
on pCPU instead of a vCPU doing busy loop.
vmx_dy_apicv_has_pending_interrupt() determines if a
vCPU has a pending posted interrupt solely based on PID.ON. However,
when a vCPU is preempted, vmx_vcpu_pi_put() sets PID.SN which cause
raised posted interrupts to only set bit in PID.PIR without setting
PID.ON (and without sending notification vector), as depicted in VT-d
manual section 5.2.3 "Interrupt-Posting Hardware Operation".
Therefore, checking PID.ON is insufficient to determine if a vCPU has
pending posted interrupts and instead we should also check if there is
some bit set on PID.PIR if PID.SN=1.
Fixes: 17e433b54393 ("KVM: Fix leak vCPU's VMCS value into other pCPU")
Reviewed-by: Jagannathan Raman <jag.raman@oracle.com>
Co-developed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The Outstanding Notification (ON) bit is part of the Posted Interrupt
Descriptor (PID) as opposed to the Posted Interrupts Register (PIR).
The latter is a bitmap for pending vectors.
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The three MSR lists(msrs_to_save[], emulated_msrs[] and
msr_based_features[]) are global arrays of kvm.ko, which are
adjusted (copy supported MSRs forward to override the unsupported MSRs)
when insmod kvm-{intel,amd}.ko, but it doesn't reset these three arrays
to their initial value when rmmod kvm-{intel,amd}.ko. Thus, at the next
installation, kvm-{intel,amd}.ko will do operations on the modified
arrays with some MSRs lost and some MSRs duplicated.
So define three constant arrays to hold the initial MSR lists and
initialize msrs_to_save[], emulated_msrs[] and msr_based_features[]
based on the constant arrays.
Cc: stable@vger.kernel.org
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
[Remove now useless conditionals. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This fixes two different classes of bugs in the Intel graphics hardware:
MMIO register read hang:
"On Intels Gen8 and Gen9 Graphics hardware, a read of specific graphics
MMIO registers when the product is in certain low power states causes
a system hang.
There are two potential triggers for DoS:
a) H/W corruption of the RC6 save/restore vector
b) Hard hang within the MIPI hardware
This prevents the DoS in two areas of the hardware:
1) Detect corruption of RC6 address on exit from low-power state,
and if we find it corrupted, disable RC6 and RPM
2) Permanently lower the MIPI MMIO timeout"
Blitter command streamer unrestricted memory accesses:
"On Intels Gen9 Graphics hardware the Blitter Command Streamer (BCS)
allows writing to Memory Mapped Input Output (MMIO) that should be
blocked. With modifications of page tables, this can lead to privilege
escalation. This exposure is limited to the Guest Physical Address
space and does not allow for access outside of the graphics virtual
machine.
This series establishes a software parser into the Blitter command
stream to scan for, and prevent, reads or writes to MMIO's that should
not be accessible to non-privileged contexts.
Much of the command parser infrastructure has existed for some time,
and is used on Ivybridge/Haswell/Valleyview derived products to allow
the use of features normally blocked by hardware. In this legacy
context, the command parser is employed to allow normally unprivileged
submissions to be run with elevated privileges in order to grant
access to a limited set of extra capabilities. In this mode the parser
is optional; In the event that the parser finds any construct that it
cannot properly validate (e.g. nested command buffers), it simply
aborts the scan and submits the buffer in non-privileged mode.
For Gen9 Graphics, this series makes the parser mandatory for all
Blitter submissions. The incoming user buffer is first copied to a
kernel owned buffer, and parsed. If all checks are successful the
kernel owned buffer is mapped READ-ONLY and submitted on behalf of the
user. If any checks fail, or the parser is unable to complete the scan
(nested buffers), it is forcibly rejected. The successfully scanned
buffer is executed with NORMAL user privileges (key difference from
legacy usage).
Modern usermode does not use the Blitter on later hardware, having
switched over to using the 3D engine instead for performance reasons.
There are however some legacy usermode apps that rely on Blitter,
notably the SNA X-Server. There are no known usermode applications
that require nested command buffers on the Blitter, so the forcible
rejection of such buffers in this patch series is considered an
acceptable limitation"
* Intel graphics fixes in emailed bundle from Jon Bloomfield <jon.bloomfield@intel.com>:
drm/i915/cmdparser: Fix jump whitelist clearing
drm/i915/gen8+: Add RC6 CTX corruption WA
drm/i915: Lower RM timeout to avoid DSI hard hangs
drm/i915/cmdparser: Ignore Length operands during command matching
drm/i915/cmdparser: Add support for backward jumps
drm/i915/cmdparser: Use explicit goto for error paths
drm/i915: Add gen9 BCS cmdparsing
drm/i915: Allow parsing of unsized batches
drm/i915: Support ro ppgtt mapped cmdparser shadow buffers
drm/i915: Add support for mandatory cmdparsing
drm/i915: Remove Master tables from cmdparser
drm/i915: Disable Secure Batches for gen6+
drm/i915: Rename gen7 cmdparser tables
Pull cgroup fix from Tejun Heo:
"There's an inadvertent preemption point in ptrace_stop() which was
reliably triggering for a test scenario significantly slowing it down.
This contains Oleg's fix to remove the unwanted preemption point"
* 'for-5.4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup: freezer: call cgroup_enter_frozen() with preemption disabled in ptrace_stop()
Three small changes: two in the core and one in the qla2xxx
driver. The sg_tablesize fix affects a thinko in the migration to
blk-mq of certain legacy drivers which could cause an oops and the sd
core change should only affect zoned block devices which were wrongly
suppressing error messages for reset all zones.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXcmURyYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishV63APoCnP9P
kJP1Bp1fd7f91FrFaxY7sKH9VZqbioUtwUhE9AD/f0o5/gg/5jSIM90GKXdZYVpt
KIsIaQzVOPL3K7EaFDs=
=77IM
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Three small changes: two in the core and one in the qla2xxx driver.
The sg_tablesize fix affects a thinko in the migration to blk-mq of
certain legacy drivers which could cause an oops and the sd core
change should only affect zoned block devices which were wrongly
suppressing error messages for reset all zones"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: core: Handle drivers which set sg_tablesize to zero
scsi: qla2xxx: fix NPIV tear down process
scsi: sd_zbc: Fix sd_zbc_complete()
When a jump_whitelist bitmap is reused, it needs to be cleared.
Currently this is done with memset() and the size calculation assumes
bitmaps are made of 32-bit words, not longs. So on 64-bit
architectures, only the first half of the bitmap is cleared.
If some whitelist bits are carried over between successive batches
submitted on the same context, this will presumably allow embedding
the rogue instructions that we're trying to reject.
Use bitmap_zero() instead, which gets the calculation right.
Fixes: f8c08d8faee5 ("drm/i915/cmdparser: Add support for backward jumps")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
Reported by syzkaller:
=============================
WARNING: suspicious RCU usage
-----------------------------
./include/linux/kvm_host.h:536 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
no locks held by repro_11/12688.
stack backtrace:
Call Trace:
dump_stack+0x7d/0xc5
lockdep_rcu_suspicious+0x123/0x170
kvm_dev_ioctl+0x9a9/0x1260 [kvm]
do_vfs_ioctl+0x1a1/0xfb0
ksys_ioctl+0x6d/0x80
__x64_sys_ioctl+0x73/0xb0
do_syscall_64+0x108/0xaa0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Commit a97b0e773e4 (kvm: call kvm_arch_destroy_vm if vm creation fails)
sets users_count to 1 before kvm_arch_init_vm(), however, if kvm_arch_init_vm()
fails, we need to decrease this count. By moving it earlier, we can push
the decrease to out_err_no_arch_destroy_vm without introducing yet another
error label.
syzkaller source: https://syzkaller.appspot.com/x/repro.c?x=15209b84e00000
Reported-by: syzbot+75475908cd0910f141ee@syzkaller.appspotmail.com
Fixes: a97b0e773e49 ("kvm: call kvm_arch_destroy_vm if vm creation fails")
Cc: Jim Mattson <jmattson@google.com>
Analyzed-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reported by syzkaller:
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 14727 Comm: syz-executor.3 Not tainted 5.4.0-rc4+ #0
RIP: 0010:kvm_coalesced_mmio_init+0x5d/0x110 arch/x86/kvm/../../../virt/kvm/coalesced_mmio.c:121
Call Trace:
kvm_dev_ioctl_create_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:3446 [inline]
kvm_dev_ioctl+0x781/0x1490 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3494
vfs_ioctl fs/ioctl.c:46 [inline]
file_ioctl fs/ioctl.c:509 [inline]
do_vfs_ioctl+0x196/0x1150 fs/ioctl.c:696
ksys_ioctl+0x62/0x90 fs/ioctl.c:713
__do_sys_ioctl fs/ioctl.c:720 [inline]
__se_sys_ioctl fs/ioctl.c:718 [inline]
__x64_sys_ioctl+0x6e/0xb0 fs/ioctl.c:718
do_syscall_64+0xca/0x5d0 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Commit 9121923c457d ("kvm: Allocate memslots and buses before calling kvm_arch_init_vm")
moves memslots and buses allocations around, however, if kvm->srcu/irq_srcu fails
initialization, NULL will be returned instead of error code, NULL will not be intercepted
in kvm_dev_ioctl_create_vm() and be dereferenced by kvm_coalesced_mmio_init(), this patch
fixes it.
Moving the initialization is required anyway to avoid an incorrect synchronize_srcu that
was also reported by syzkaller:
wait_for_completion+0x29c/0x440 kernel/sched/completion.c:136
__synchronize_srcu+0x197/0x250 kernel/rcu/srcutree.c:921
synchronize_srcu_expedited kernel/rcu/srcutree.c:946 [inline]
synchronize_srcu+0x239/0x3e8 kernel/rcu/srcutree.c:997
kvm_page_track_unregister_notifier+0xe7/0x130 arch/x86/kvm/page_track.c:212
kvm_mmu_uninit_vm+0x1e/0x30 arch/x86/kvm/mmu.c:5828
kvm_arch_destroy_vm+0x4a2/0x5f0 arch/x86/kvm/x86.c:9579
kvm_create_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:702 [inline]
so do it.
Reported-by: syzbot+89a8060879fa0bd2db4f@syzkaller.appspotmail.com
Reported-by: syzbot+e27e7027eb2b80e44225@syzkaller.appspotmail.com
Fixes: 9121923c457d ("kvm: Allocate memslots and buses before calling kvm_arch_init_vm")
Cc: Jim Mattson <jmattson@google.com>
Cc: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A set of fixes that have trickled in over the last couple of weeks:
- MAINTAINER update for Cavium/Marvell ThunderX2
- stm32 tweaks to pinmux for Joystick/Camera, and RAM allocation for CAN
interfaces
- i.MX fixes for voltage regulator GPIO mappings, fixes voltage scaling
issues
- More i.MX fixes for various issues on i.MX eval boards: interrupt
storm due to u-boot leaving pins in new states, fixing power button
config, a couple of compatible-string corrections.
- Powerdown and Suspend/Resume fixes for Allwinner A83-based tablets
- A few documentation tweaks and a fix of a memory leak in the reset
subsystem
-----BEGIN PGP SIGNATURE-----
iQJDBAABCAAtFiEElf+HevZ4QCAJmMQ+jBrnPN6EHHcFAl3IVbUPHG9sb2ZAbGl4
b20ubmV0AAoJEIwa5zzehBx3xTQQAKJcHO1Qy7+qk3w74ko3d2n9jnNAuFqma8om
zhx+zyVrf28HI90rmJWx+mA+rVnKeNwqf7k6qeoukwxn4zVtZTx4+A6HMFOQ1cDP
zEdVLbCp+99I3itBITMo5NjF3FsgRp8l5UHUmFBU8uPcjotPIVigVIum9KJTK1ZM
3xcCOtOnydGagjHKM/QljSBxcg3ii+9cDUpJPwxYPCtv9kpCWiC/+mHg5bHD/kI2
Hr6XqIV4gepc0LsV9OJthMgSzCyFYBNckh2EfAiI3sEb06ifJgrXZJT3GvG0BnRh
DzN6KaxjILAlZmijRwKXmEDmSpyPaEaqlnPT4XdF7e0yVIa6ekgyS7oMdg6iQd2U
Vbvq8k+NRWIg/MEvJ9lwuBW0luwZ3BNuPrSzIK4VG5d47qb3kosTe7KsZ4VYYEYd
vkmNNaRlk+RFVOtWUsoNo18GjheEiWvW3ZRr8MjYwDKYbryXEFmNPbM4xr57e7LX
QTtNumrWvS/xm1TGgPDBOUZzGh9UZVonlQVHf5Ix8c4sLR6wkRWPni4N4kJNfcD6
pPwTQIpwxvCwpyuqtc6UFungBT3aj0FNMNNg06KfpDMXwyo8AFjPSbr7Fe8e5wjm
vC5+VhB04l1DlX8ThwPvnKaIBtYG26AdB7ffhjQqlU5s4XnpdMXmfWlZtB8hp/oI
VCtWgvsx
=Ei7j
-----END PGP SIGNATURE-----
Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Olof Johansson:
"A set of fixes that have trickled in over the last couple of weeks:
- MAINTAINER update for Cavium/Marvell ThunderX2
- stm32 tweaks to pinmux for Joystick/Camera, and RAM allocation for
CAN interfaces
- i.MX fixes for voltage regulator GPIO mappings, fixes voltage
scaling issues
- More i.MX fixes for various issues on i.MX eval boards: interrupt
storm due to u-boot leaving pins in new states, fixing power button
config, a couple of compatible-string corrections.
- Powerdown and Suspend/Resume fixes for Allwinner A83-based tablets
- A few documentation tweaks and a fix of a memory leak in the reset
subsystem"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
MAINTAINERS: update Cavium ThunderX2 maintainers
ARM: dts: stm32: change joystick pinctrl definition on stm32mp157c-ev1
ARM: dts: stm32: remove OV5640 pinctrl definition on stm32mp157c-ev1
ARM: dts: stm32: Fix CAN RAM mapping on stm32mp157c
ARM: dts: stm32: relax qspi pins slew-rate for stm32mp157
arm64: dts: zii-ultra: fix ARM regulator GPIO handle
ARM: sunxi: Fix CPU powerdown on A83T
ARM: dts: sun8i-a83t-tbs-a711: Fix WiFi resume from suspend
arm64: dts: imx8mn: fix compatible string for sdma
arm64: dts: imx8mm: fix compatible string for sdma
reset: fix reset_control_ops kerneldoc comment
ARM: dts: imx6-logicpd: Re-enable SNVS power key
soc: imx: gpc: fix initialiser format
ARM: dts: imx6qdl-sabreauto: Fix storm of accelerometer interrupts
arm64: dts: ls1028a: fix a compatible issue
reset: fix reset_control_get_exclusive kerneldoc comment
reset: fix reset_control_lookup kerneldoc comment
reset: fix of_reset_control_get_count kerneldoc comment
reset: fix of_reset_simple_xlate kerneldoc comment
reset: Fix memory leak in reset_control_array_put()
Here is a mix of a number of IIO driver fixes for 5.4-rc7, and a whole
new staging driver.
The IIO fixes resolve some reported issues, all are tiny.
The staging driver addition is the vboxsf filesystem, which is the
VirtualBox guest shared folder code. Hans has been trying to get
filesystem reviewers to review the code for many months now, and
Christoph finally said to just merge it in staging now as it is
stand-alone and the filesystem people can review it easier over time
that way.
I know it's late for this big of an addition, but it is stand-alone.
The code has been in linux-next for a while, long enough to pick up a
few tiny fixes for it already so people are looking at it.
All of these have been in linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXcgvVg8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ylt6gCdG1hmZiOXhmoO1HBBFILqJkrzEVEAn3amZIJJ
n0gz5/FDfQVFGl/PpKCE
=u/mY
-----END PGP SIGNATURE-----
Merge tag 'staging-5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull IIO fixes and staging driver from Greg KH:
"Here is a mix of a number of IIO driver fixes for 5.4-rc7, and a whole
new staging driver.
The IIO fixes resolve some reported issues, all are tiny.
The staging driver addition is the vboxsf filesystem, which is the
VirtualBox guest shared folder code. Hans has been trying to get
filesystem reviewers to review the code for many months now, and
Christoph finally said to just merge it in staging now as it is
stand-alone and the filesystem people can review it easier over time
that way.
I know it's late for this big of an addition, but it is stand-alone.
The code has been in linux-next for a while, long enough to pick up a
few tiny fixes for it already so people are looking at it.
All of these have been in linux-next with no reported issues"
* tag 'staging-5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
staging: Fix error return code in vboxsf_fill_super()
staging: vboxsf: fix dereference of pointer dentry before it is null checked
staging: vboxsf: Remove unused including <linux/version.h>
staging: Add VirtualBox guest shared folder (vboxsf) support
iio: adc: stm32-adc: fix stopping dma
iio: imu: inv_mpu6050: fix no data on MPU6050
iio: srf04: fix wrong limitation in distance measuring
iio: imu: adis16480: make sure provided frequency is positive
Here are a number of late-arrival driver fixes for issues reported for
some char/misc drivers for 5.4-rc7
These all come from the different subsystem/driver maintainers as things
that they had reports for and wanted to see fixed.
All of these have been in linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCXcguaA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yk9agCdHawCyOWaCfWrUF+66DfI0ql5HhQAoKI0n4yT
7N8GuJ5KsVKmtkkg9Oww
=CZY+
-----END PGP SIGNATURE-----
Merge tag 'char-misc-5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are a number of late-arrival driver fixes for issues reported for
some char/misc drivers for 5.4-rc7
These all come from the different subsystem/driver maintainers as
things that they had reports for and wanted to see fixed.
All of these have been in linux-next with no reported issues"
* tag 'char-misc-5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
intel_th: pci: Add Jasper Lake PCH support
intel_th: pci: Add Comet Lake PCH support
intel_th: msu: Fix possible memory leak in mode_store()
intel_th: msu: Fix overflow in shift of an unsigned int
intel_th: msu: Fix missing allocation failure check on a kstrndup
intel_th: msu: Fix an uninitialized mutex
intel_th: gth: Fix the window switching sequence
soundwire: slave: fix scanf format
soundwire: intel: fix intel_register_dai PDI offsets and numbers
interconnect: Add locking in icc_set_tag()
interconnect: qcom: Fix icc_onecell_data allocation
soundwire: depend on ACPI || OF
soundwire: depend on ACPI
thunderbolt: Drop unnecessary read when writing LC command in Ice Lake
thunderbolt: Fix lockdep circular locking depedency warning
thunderbolt: Read DP IN adapter first two dwords in one go
- fix a regression from this merge window in the configfs
symlink handling (Honggang Li)
-----BEGIN PGP SIGNATURE-----
iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAl3IG00LHGhjaEBsc3Qu
ZGUACgkQD55TZVIEUYPJbBAAw57uaM2mKcUe1syqwsaliqYVGqZ7X+4sMdtwNPAI
W01aolnmmukcWInlZBqiHFu3+iAVYJND9NvP7ZA+p6/meJfWiIrgWMEVS4X6HFpg
qsBCP1qh8BvLxCzHC+i9tkgPMpicYjImE/pza99ZtrMHTRC7ao5I3GKbWBIdEpDv
5NZDbt2g79Y1YfqGRo0xcY0etwpau+iN4rXivjj4qMO1o+Vt4rWlGZehhD2/r2M0
NGeR3JpYe1MdxvyECMoDI+aWuOiDFrJ/ZfWfTuCskqwTyZ1BtKElBnqS6VFn7yxL
XqPNwe6Sr9fz7RRZXQ1iH8O/SYct25xVyc6JQlAvcpp0emAUsj2bNAQ3Hj4W3Krw
h1D5HKvte+CjIqZEnFlqI6GeHawdbSpJoE1VOECS1rXpYhvs5V+2e5RyT2gmj1Pp
X58Q0xF5Ver2sF0NkhAMTxXL77L8cpRDVljveBXgUh/MzTdPcq1h/RsKXwwVBD23
Vsmg4qKlBAWJLK1TJ4wvvmmp0N/XzcaNWm7cKCxJDBlzY8LpcVTmUJfGty96mHqb
cRZLMNcWbtPFSkHfzkYeuWWnhhtgXBmEcTawOd3y0s+jd8wjhRr2wuhL/MSCicJG
t0A+4/5p4s8H94OAiq9tF8yOAQpvzmrrunW+lyyOQchK4kBBM8rD+cMIRziL+u/2
B84=
=kgn+
-----END PGP SIGNATURE-----
Merge tag 'configfs-for-5.4-2' of git://git.infradead.org/users/hch/configfs
Pull configfs regression fix from Christoph Hellwig:
"Fix a regression from this merge window in the configfs symlink
handling (Honggang Li)"
* tag 'configfs-for-5.4-2' of git://git.infradead.org/users/hch/configfs:
configfs: calculate the depth of parent item
Pull x86 fixes from Thomas Gleixner:
"A small set of fixes for x86:
- Make the tsc=reliable/nowatchdog command line parameter work again.
It was broken with the introduction of the early TSC clocksource.
- Prevent the evaluation of exception stacks before they are set up.
This causes a crash in dumpstack because the stack walk termination
gets screwed up.
- Prevent a NULL pointer dereference in the rescource control file
system.
- Avoid bogus warnings about APIC id mismatch related to the LDR
which can happen when the LDR is not in use and therefore not
initialized. Only evaluate that when the APIC is in logical
destination mode"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/tsc: Respect tsc command line paraemeter for clocksource_tsc_early
x86/dumpstack/64: Don't evaluate exception stacks before setup
x86/apic/32: Avoid bogus LDR warnings
x86/resctrl: Prevent NULL pointer dereference when reading mondata
Pull timer fixes from Thomas Gleixner:
"A small set of fixes for timekeepoing and clocksource drivers:
- VDSO data was updated conditional on the availability of a VDSO
capable clocksource. This causes the VDSO functions which do not
depend on a VDSO capable clocksource to operate on stale data.
Always update unconditionally.
- Prevent a double free in the mediatek driver
- Use the proper helper in the sh_mtu2 driver so it won't attempt to
initialize non-existing interrupts"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timekeeping/vsyscall: Update VDSO data unconditionally
clocksource/drivers/sh_mtu2: Do not loop using platform_get_irq_by_name()
clocksource/drivers/mediatek: Fix error handling
Pull scheduler fixes from Thomas Gleixner:
"Two fixes for scheduler regressions:
- Plug a subtle race condition which was introduced with the rework
of the next task selection functionality. The change of task
properties became unprotected which can be observed inconsistently
causing state corruption.
- A trivial compile fix for CONFIG_CGROUPS=n"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Fix pick_next_task() vs 'change' pattern race
sched/core: Fix compilation error when cgroup not selected
Pull perf tooling fixes from Thomas Gleixner:
- Fix the time sorting algorithm which was broken due to truncation of
big numbers
- Fix the python script generator fail caused by a broken tracepoint
array iterator
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tools: Fix time sorting
perf tools: Remove unused trace_find_next_event()
perf scripting engines: Iterate on tep event arrays directly
Pull irq fixlet from Thomas Gleixner:
"A trivial fix for a kernel doc regression where an argument change was
not reflected in the documentation"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irq/irqdomain: Update __irq_domain_alloc_fwnode() function documentation
Pull stacktrace fix from Thomas Gleixner:
"A small fix for a stacktrace regression.
Saving a stacktrace for a foreign task skipped an extra entry which
makes e.g. the output of /proc/$PID/stack incomplete"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
stacktrace: Don't skip first entry on noncurrent tasks