IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
toolchains which do not generate section symbols anymore. And a cleanup
ontop.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmAWi8EACgkQEsHwGGHe
VUrZhBAAmbaEBdU05+ah24r+XHLsCJBQwlwHAr71YfNnEpq/KRHXLtc3QJLAdOVf
ku0536gDJvMUX7d7ap2ooSVAD9Ed1h4N7BvOn8eLFiaPc6NG9Tw6FZc/X6OKtyLd
DyFOsNAa9JnjjeqT7TTYdqbcJUzPSqd3Ufg5V4UZcVwvGTkbc+k1TktnteTMXWUI
t99wXCOfw2accdUrr3MIkdvSGNo099VZa/DBZQVmpjCcMSOfe/0KQIoeVagEpAew
T0WxONdM62Nz4Tv03N6m6EqVpIOc8BueRuOWlX/c5XVCmYx8BDSdFb6EY9sEh10i
hLU1U36BCUT1uAA/ZAuw/I22fy5MXqbrGvWJrcW8Wav1fQfaDYkDyGNE+aBjXysQ
uZGTzbfAdAS2B8XTElzYJZwh1WW7Je7b2pZhL5/6kwoa8E82NsR7a2inl6pdkKin
LcrLlxrSZYbAjhYuA3Da4iErvtu/UloQwfDhga7NasGdVQzlwUQBX67Tgt1PA9B2
JWoeY1NKBGboNEQa3NWq37yCtfcpx2hL4wWgyUbj0TMOXO06V/ZhrPzIQDrMmVGx
g52NrYnH/CujrKgWH3+Q+kBWA/BSVP5p3UnhLCDM1X7dyZiimuLOJNDUQ9WldENV
rsGgKyW3/6F4UzmqLr0oOB6X9/2v15LSktN9BJtv3UWUCl/PfXU=
=VV7d
-----END PGP SIGNATURE-----
Merge tag 'x86_entry_for_v5.11_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Borislav Petkov:
"A single fix for objtool to generate proper unwind info for newer
toolchains which do not generate section symbols anymore. And a
cleanup ontop.
This was originally going to go during the next merge window but
people can already trigger a build error with binutils-2.36 which
doesn't emit section symbols - something which objtool relies on - so
let's expedite it"
* tag 'x86_entry_for_v5.11_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/entry: Remove put_ret_addr_in_rdi THUNK macro argument
x86/entry: Emit a symbol for register restoring thunk
drivers/net/can/dev.c
b552766c872f ("can: dev: prevent potential information leak in can_fill_info()")
3e77f70e7345 ("can: dev: move driver related infrastructure into separate subdir")
0a042c6ec991 ("can: dev: move netlink related code into seperate file")
Code move.
drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
57ac4a31c483 ("net/mlx5e: Correctly handle changing the number of queues when the interface is down")
214baf22870c ("net/mlx5e: Support HTB offload")
Adjacent code changes
net/switchdev/switchdev.c
20776b465c0c ("net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP")
ffb68fc58e96 ("net: switchdev: remove the transaction structure from port object notifiers")
bae33f2b5afe ("net: switchdev: remove the transaction structure from port attributes")
Transaction parameter gets dropped otherwise keep the fix.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYBLX8QAKCRCAXGG7T9hj
vr2TAP4ylwxPVbf1l1V8zYCtCoNg087+Ubolr5kXXJkesG/nkgD6A2ix2oN1sC0Z
kbFBeZHqgP4AbVl7IhBALVFa1GPxWQg=
=NPGM
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.11-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- A fix for a regression introduced in 5.11 resulting in Xen dom0
having problems to correctly initialize Xenstore.
- A fix for avoiding WARN splats when booting as Xen dom0 with
CONFIG_AMD_MEM_ENCRYPT enabled due to a missing trap handler for the
#VC exception (even if the handler should never be called).
- A fix for the Xen bklfront driver adapting to the correct but
unexpected behavior of new qemu.
* tag 'for-linus-5.11-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: avoid warning in Xen pv guest with CONFIG_AMD_MEM_ENCRYPT enabled
xen: Fix XenStore initialisation for XS_LOCAL
xen-blkfront: allow discard-* nodes to be optional
* Documentation fixes
* Avoid performance regression due to SEV-ES patches
ARM:
- Don't allow tagged pointers to point to memslots
- Filter out ARMv8.1+ PMU events on v8.0 hardware
- Hide PMU registers from userspace when no PMU is configured
- More PMU cleanups
- Don't try to handle broken PSCI firmware
- More sys_reg() to reg_to_encoding() conversions
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmAQSWEUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroMgBQgAlKke5b6SljZs19uYbzwpOClI5V9M
x5Sl9doPI/8TgM1/Oe/T3THb4iulVAFOFulFt+gAGeEJPtQN42bQA0iSqqKOXsox
ordveCZ90hgo8SE+F3vcL4k7DAZw7eoHyQxJJ7L5UmMKosQy+ujouVYyxMeBsmvK
mZWtqPrAK+1Uot6YQu52SPCqqUuIAVd/loGR6s4TZgnaiOykIKWsLOBHLx0mfAI1
gCXdbJ6kMSH6+y+WCmETnqoo6n86a9bcbXEfs36vDpguolw+6I9pqfFUC01StjYa
bm+Ffxf9QYtuxXV51Z2VWHWtbCmYdSDxx4oeXKduLDiw65vWovcY+5KzbA==
=HJUK
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
- x86 bugfixes
- Documentation fixes
- Avoid performance regression due to SEV-ES patches
- ARM:
- Don't allow tagged pointers to point to memslots
- Filter out ARMv8.1+ PMU events on v8.0 hardware
- Hide PMU registers from userspace when no PMU is configured
- More PMU cleanups
- Don't try to handle broken PSCI firmware
- More sys_reg() to reg_to_encoding() conversions
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: allow KVM_REQ_GET_NESTED_STATE_PAGES outside guest mode for VMX
KVM: x86: Revert "KVM: x86: Mark GPRs dirty when written"
KVM: SVM: Unconditionally sync GPRs to GHCB on VMRUN of SEV-ES guest
KVM: nVMX: Sync unsync'd vmcs02 state to vmcs12 on migration
kvm: tracing: Fix unmatched kvm_entry and kvm_exit events
KVM: Documentation: Update description of KVM_{GET,CLEAR}_DIRTY_LOG
KVM: x86: get smi pending status correctly
KVM: x86/pmu: Fix HW_REF_CPU_CYCLES event pseudo-encoding in intel_arch_events[]
KVM: x86/pmu: Fix UBSAN shift-out-of-bounds warning in intel_pmu_refresh()
KVM: x86: Add more protection against undefined behavior in rsvd_bits()
KVM: Documentation: Fix spec for KVM_CAP_ENABLE_CAP_VM
KVM: Forbid the use of tagged userspace addresses for memslots
KVM: arm64: Filter out v8.1+ events on v8.0 HW
KVM: arm64: Compute TPIDR_EL2 ignoring MTE tag
KVM: arm64: Use the reg_to_encoding() macro instead of sys_reg()
KVM: arm64: Allow PSCI SYSTEM_OFF/RESET to return
KVM: arm64: Simplify handling of absent PMU system registers
KVM: arm64: Hide PMU registers from userspace when not available
VMX also uses KVM_REQ_GET_NESTED_STATE_PAGES for the Hyper-V eVMCS,
which may need to be loaded outside guest mode. Therefore we cannot
WARN in that case.
However, that part of nested_get_vmcs12_pages is _not_ needed at
vmentry time. Split it out of KVM_REQ_GET_NESTED_STATE_PAGES handling,
so that both vmentry and migration (and in the latter case, independent
of is_guest_mode) do the parts that are needed.
Cc: <stable@vger.kernel.org> # 5.10.x: f2c7ef3ba: KVM: nSVM: cancel KVM_REQ_GET_NESTED_STATE_PAGES
Cc: <stable@vger.kernel.org> # 5.10.x
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Revert the dirty/available tracking of GPRs now that KVM copies the GPRs
to the GHCB on any post-VMGEXIT VMRUN, even if a GPR is not dirty. Per
commit de3cd117ed2f ("KVM: x86: Omit caching logic for always-available
GPRs"), tracking for GPRs noticeably impacts KVM's code footprint.
This reverts commit 1c04d8c986567c27c56c05205dceadc92efb14ff.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210122235049.3107620-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Drop the per-GPR dirty checks when synchronizing GPRs to the GHCB, the
GRPs' dirty bits are set from time zero and never cleared, i.e. will
always be seen as dirty. The obvious alternative would be to clear
the dirty bits when appropriate, but removing the dirty checks is
desirable as it allows reverting GPR dirty+available tracking, which
adds overhead to all flavors of x86 VMs.
Note, unconditionally writing the GPRs in the GHCB is tacitly allowed
by the GHCB spec, which allows the hypervisor (or guest) to provide
unnecessary info; it's the guest's responsibility to consume only what
it needs (the hypervisor is untrusted after all).
The guest and hypervisor can supply additional state if desired but
must not rely on that additional state being provided.
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Fixes: 291bd20d5d88 ("KVM: SVM: Add initial support for a VMGEXIT VMEXIT")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210122235049.3107620-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Even when we are outside the nested guest, some vmcs02 fields
may not be in sync vs vmcs12. This is intentional, even across
nested VM-exit, because the sync can be delayed until the nested
hypervisor performs a VMCLEAR or a VMREAD/VMWRITE that affects those
rarely accessed fields.
However, during KVM_GET_NESTED_STATE, the vmcs12 has to be up to date to
be able to restore it. To fix that, call copy_vmcs02_to_vmcs12_rare()
before the vmcs12 contents are copied to userspace.
Fixes: 7952d769c29ca ("KVM: nVMX: Sync rarely accessed guest fields only when needed")
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210114205449.8715-2-mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
On VMX, if we exit and then re-enter immediately without leaving
the vmx_vcpu_run() function, the kvm_entry event is not logged.
That means we will see one (or more) kvm_exit, without its (their)
corresponding kvm_entry, as shown here:
CPU-1979 [002] 89.871187: kvm_entry: vcpu 1
CPU-1979 [002] 89.871218: kvm_exit: reason MSR_WRITE
CPU-1979 [002] 89.871259: kvm_exit: reason MSR_WRITE
It also seems possible for a kvm_entry event to be logged, but then
we leave vmx_vcpu_run() right away (if vmx->emulation_required is
true). In this case, we will have a spurious kvm_entry event in the
trace.
Fix these situations by moving trace_kvm_entry() inside vmx_vcpu_run()
(where trace_kvm_exit() already is).
A trace obtained with this patch applied looks like this:
CPU-14295 [000] 8388.395387: kvm_entry: vcpu 0
CPU-14295 [000] 8388.395392: kvm_exit: reason MSR_WRITE
CPU-14295 [000] 8388.395393: kvm_entry: vcpu 0
CPU-14295 [000] 8388.395503: kvm_exit: reason EXTERNAL_INTERRUPT
Of course, not calling trace_kvm_entry() in common x86 code any
longer means that we need to adjust the SVM side of things too.
Signed-off-by: Lorenzo Brescia <lorenzo.brescia@edu.unito.it>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Message-Id: <160873470698.11652.13483635328769030605.stgit@Wayrath>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The injection process of smi has two steps:
Qemu KVM
Step1:
cpu->interrupt_request &= \
~CPU_INTERRUPT_SMI;
kvm_vcpu_ioctl(cpu, KVM_SMI)
call kvm_vcpu_ioctl_smi() and
kvm_make_request(KVM_REQ_SMI, vcpu);
Step2:
kvm_vcpu_ioctl(cpu, KVM_RUN, 0)
call process_smi() if
kvm_check_request(KVM_REQ_SMI, vcpu) is
true, mark vcpu->arch.smi_pending = true;
The vcpu->arch.smi_pending will be set true in step2, unfortunately if
vcpu paused between step1 and step2, the kvm_run->immediate_exit will be
set and vcpu has to exit to Qemu immediately during step2 before mark
vcpu->arch.smi_pending true.
During VM migration, Qemu will get the smi pending status from KVM using
KVM_GET_VCPU_EVENTS ioctl at the downtime, then the smi pending status
will be lost.
Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
Signed-off-by: Shengen Zhuang <zhuangshengen@huawei.com>
Message-Id: <20210118084720.1585-1-jianjay.zhou@huawei.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The HW_REF_CPU_CYCLES event on the fixed counter 2 is pseudo-encoded as
0x0300 in the intel_perfmon_event_map[]. Correct its usage.
Fixes: 62079d8a4312 ("KVM: PMU: add proper support for fixed counter 2")
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20201230081916.63417-1-like.xu@linux.intel.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since we know vPMU will not work properly when (1) the guest bit_width(s)
of the [gp|fixed] counters are greater than the host ones, or (2) guest
requested architectural events exceeds the range supported by the host, so
we can setup a smaller left shift value and refresh the guest cpuid entry,
thus fixing the following UBSAN shift-out-of-bounds warning:
shift exponent 197 is too large for 64-bit type 'long long unsigned int'
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x107/0x163 lib/dump_stack.c:120
ubsan_epilogue+0xb/0x5a lib/ubsan.c:148
__ubsan_handle_shift_out_of_bounds.cold+0xb1/0x181 lib/ubsan.c:395
intel_pmu_refresh.cold+0x75/0x99 arch/x86/kvm/vmx/pmu_intel.c:348
kvm_vcpu_after_set_cpuid+0x65a/0xf80 arch/x86/kvm/cpuid.c:177
kvm_vcpu_ioctl_set_cpuid2+0x160/0x440 arch/x86/kvm/cpuid.c:308
kvm_arch_vcpu_ioctl+0x11b6/0x2d70 arch/x86/kvm/x86.c:4709
kvm_vcpu_ioctl+0x7b9/0xdb0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3386
vfs_ioctl fs/ioctl.c:48 [inline]
__do_sys_ioctl fs/ioctl.c:753 [inline]
__se_sys_ioctl fs/ioctl.c:739 [inline]
__x64_sys_ioctl+0x193/0x200 fs/ioctl.c:739
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
Reported-by: syzbot+ae488dc136a4cc6ba32b@syzkaller.appspotmail.com
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20210118025800.34620-1-like.xu@linux.intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add compile-time asserts in rsvd_bits() to guard against KVM passing in
garbage hardcoded values, and cap the upper bound at '63' for dynamic
values to prevent generating a mask that would overflow a u64.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210113204515.3473079-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Merge misc fixes from Andrew Morton:
"18 patches.
Subsystems affected by this patch series: mm (pagealloc, memcg, kasan,
memory-failure, and highmem), ubsan, proc, and MAINTAINERS"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
MAINTAINERS: add a couple more files to the Clang/LLVM section
proc_sysctl: fix oops caused by incorrect command parameters
powerpc/mm/highmem: use __set_pte_at() for kmap_local()
mips/mm/highmem: use set_pte() for kmap_local()
mm/highmem: prepare for overriding set_pte_at()
sparc/mm/highmem: flush cache and TLB
mm: fix page reference leak in soft_offline_page()
ubsan: disable unsigned-overflow check for i386
kasan, mm: fix resetting page_alloc tags for HW_TAGS
kasan, mm: fix conflicts with init_on_alloc/free
kasan: fix HW_TAGS boot parameters
kasan: fix incorrect arguments passing in kasan_add_zero_shadow
kasan: fix unaligned address is unhandled in kasan_remove_zero_shadow
mm: fix numa stats for thp migration
mm: memcg: fix memcg file_dirty numa stat
mm: memcg/slab: optimize objcg stock draining
mm: fix initialization of struct page for holes in memory layout
x86/setup: don't remove E820_TYPE_RAM for pfn 0
single CPU vs such which are affine to only one CPU, mark per-cpu workqueue
threads as such and make sure that marking "survives" CPU hotplug. Fix CPU
hotplug issues with such kthreads.
- A fix to not push away tasks on CPUs coming online.
- Have workqueue CPU hotplug code use cpu_possible_mask when breaking affinity
on CPU offlining so that pending workers can finish on newly arrived onlined
CPUs too.
- Dump tasks which haven't vacated a CPU which is currently being unplugged.
- Register a special scale invariance callback which gets called on resume
from RAM to read out APERF/MPERF after resume and thus make the schedutil
scaling governor more precise.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmANYCAACgkQEsHwGGHe
VUo+OBAAjfqkijDlXiGX6lrT5gRx5NZICpeMgbWa7J13XHT1ysD/b0fMGFIUyF6k
aszDLTl8U/S1/qGAYlzTSPAFcdZ+ENiFqQ48ozMk4jZC3p0quHTjs/PdiSG6kYBi
+e4smht+bSyLKxsG8hN0kJ+mLEd+uIQ13kP4YkxPgWbJ9WNP/U6HHGBo0rBchtSe
Kn6bdd8CfwmC6rSazp7kdQoFoWeQaoMI1ODX3VphK1GtL1wq8WSICzRhpg3caeyG
3lCIddoNW9mCA9Nkc6R6HeV3uW9JGkPAjnmtTIEHDbg9pib7xNT978ieTQuqNDCi
DlAHDGumzoaiVJZhD/1fj/RXMJr2YUHxtrXWNsXpiKJ9g8Tn+WC0UW/4+Mx2L/km
0RSoXJlMs1fGopS2I/fObZ6RPhmg4D+gJsMCdaHQzX4NgxZAGhNNPxMckZ0IM8A0
2NNXSHUZHVTHeJEW0E/glOcpWb5hG+vDwiBMNEWfTwYpTfrw2EEOZaKniZE7WlSL
4ItM9rkLGl1KToJzAH4A0oUtSy3vtSCo8B1noGlc09Lj+oCIBlr81z9+C79a2oxG
qE7Xd4X7y7Qs3JeCbRZWQa7/2Kf1v4XnjELrJJeCZC85r0ZqJDwRX8w7lkmW2XPU
m4J2prr/DDZSqrRh23/xC1fsU+vcBKSfKUFKAH4Lg2VIaUfSUEk=
=2DAF
-----END PGP SIGNATURE-----
Merge tag 'sched_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Borislav Petkov:
- Correct the marking of kthreads which are supposed to run on a
specific, single CPU vs such which are affine to only one CPU, mark
per-cpu workqueue threads as such and make sure that marking
"survives" CPU hotplug. Fix CPU hotplug issues with such kthreads.
- A fix to not push away tasks on CPUs coming online.
- Have workqueue CPU hotplug code use cpu_possible_mask when breaking
affinity on CPU offlining so that pending workers can finish on newly
arrived onlined CPUs too.
- Dump tasks which haven't vacated a CPU which is currently being
unplugged.
- Register a special scale invariance callback which gets called on
resume from RAM to read out APERF/MPERF after resume and thus make
the schedutil scaling governor more precise.
* tag 'sched_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Relax the set_cpus_allowed_ptr() semantics
sched: Fix CPU hotplug / tighten is_per_cpu_kthread()
sched: Prepare to use balance_push in ttwu()
workqueue: Restrict affinity change to rescuer
workqueue: Tag bound workers with KTHREAD_IS_PER_CPU
kthread: Extract KTHREAD_IS_PER_CPU
sched: Don't run cpu-online with balance_push() enabled
workqueue: Use cpu_possible_mask instead of cpu_active_mask to break affinity
sched/core: Print out straggler tasks in sched_cpu_dying()
x86: PM: Register syscore_ops for scale invariance
- Differentiate which aspects of the FPU state get saved/restored when the FPU
is used in-kernel and fix a boot crash on K7 due to early MXCSR access before
CR4.OSFXSR is even set.
- A couple of noinstr annotation fixes
- Correct die ID setting on AMD for users of topology information which need
the correct die ID
- A SEV-ES fix to handle string port IO to/from kernel memory properly
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmANUr0ACgkQEsHwGGHe
VUos4hAAlBik/z+y+DaZGJyxtpST2YQaEbwbW3UMqyLsdVnLTTRnKzC1T+fEfD2Q
SxtCPYH5iuPbCgOOoQboWt6Aa53JlX9bRBZ/87Ub/ELJ9NgMxMQFXAiaDZAAY6Zy
L2B13KpoGOifPjrGDgksnafyqYv1CYesiArfOffHgvC3/0j7ONdda2SRDQ697TBw
FSV/WfUjCo0+JdXRRaP6YH5t9MxFerHxVH38xTDFwXikS9CVyddosLo5EP2wAQvi
5+160i2jB25vyMEsFBr5wE0xDpWLUdClVpzHXXPG2i0P+NHATiBcreTMPzeYOUXu
Hfc/y4ukOVDoMGlHLNKHq89alI87soMJIEjm2sAG1ZIypKyMJw7YUXQNRR3TcP0U
c7/C3W1mCWD1+8nLtlIMM0Z20DacQOf9YWko95+uh08+S52KpTOgnx+mpoZjK1PQ
Wv9HxPJKycrgRNhfverN5FSiOEW/DdvqNfVHTjuuzNLyKdM1NoZ/YTIyABk4RfFq
USUnC5rk4GqvCYdaLTEKkAJvLCmRKgVYd75Rc4/pPKILS6kv82vpj3BjClBaH0h1
yrvpafvXzOhwKP/J5q0vm57NJdqPZwuW4Ah+74tptmQL4rga84U4FOs3JpNJq0uu
1mj6xSFD8ZyI11BSkYbZAHTy1eNERze+azftCSPq/6EifYvqnsE=
=3rZM
-----END PGP SIGNATURE-----
Merge tag 'x86_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Add a new Intel model number for Alder Lake
- Differentiate which aspects of the FPU state get saved/restored when
the FPU is used in-kernel and fix a boot crash on K7 due to early
MXCSR access before CR4.OSFXSR is even set.
- A couple of noinstr annotation fixes
- Correct die ID setting on AMD for users of topology information which
need the correct die ID
- A SEV-ES fix to handle string port IO to/from kernel memory properly
* tag 'x86_urgent_for_v5.11_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu: Add another Alder Lake CPU to the Intel family
x86/mmx: Use KFPU_387 for MMX string operations
x86/fpu: Add kernel_fpu_begin_mask() to selectively initialize state
x86/topology: Make __max_die_per_package available unconditionally
x86: __always_inline __{rd,wr}msr()
x86/mce: Remove explicit/superfluous tracing
locking/lockdep: Avoid noinstr warning for DEBUG_LOCKDEP
locking/lockdep: Cure noinstr fail
x86/sev: Fix nonistr violation
x86/entry: Fix noinstr fail
x86/cpu/amd: Set __max_die_per_package on AMD
x86/sev-es: Handle string port IO to kernel memory properly
Patch series "mm: fix initialization of struct page for holes in memory layout", v3.
Commit 73a6e474cb37 ("mm: memmap_init: iterate over memblock regions
rather that check each PFN") exposed several issues with the memory map
initialization and these patches fix those issues.
Initially there were crashes during compaction that Qian Cai reported
back in April [1]. It seemed back then that the problem was fixed, but
a few weeks ago Andrea Arcangeli hit the same bug [2] and there was an
additional discussion at [3].
[1] https://lore.kernel.org/lkml/8C537EB7-85EE-4DCF-943E-3CC0ED0DF56D@lca.pw
[2] https://lore.kernel.org/lkml/20201121194506.13464-1-aarcange@redhat.com
[3] https://lore.kernel.org/mm-commits/20201206005401.qKuAVgOXr%akpm@linux-foundation.org
This patch (of 2):
The first 4Kb of memory is a BIOS owned area and to avoid its allocation
for the kernel it was not listed in e820 tables as memory. As the result,
pfn 0 was never recognised by the generic memory management and it is not
a part of neither node 0 nor ZONE_DMA.
If set_pfnblock_flags_mask() would be ever called for the pageblock
corresponding to the first 2Mbytes of memory, having pfn 0 outside of
ZONE_DMA would trigger
VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page), pfn), page);
Along with reserving the first 4Kb in e820 tables, several first pages are
reserved with memblock in several places during setup_arch(). These
reservations are enough to ensure the kernel does not touch the BIOS area
and it is not necessary to remove E820_TYPE_RAM for pfn 0.
Remove the update of e820 table that changes the type of pfn 0 and move
the comment describing why it was done to trim_low_memory_range() that
reserves the beginning of the memory.
Link: https://lkml.kernel.org/r/20210111194017.22696-2-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Qian Cai <cai@lca.pw>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add Alder Lake mobile CPU model number to Intel family.
Signed-off-by: Gayatri Kammela <gayatri.kammela@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210121215004.11618-1-tony.luck@intel.com
The default kernel_fpu_begin() doesn't work on systems that support XMM but
haven't yet enabled CR4.OSFXSR. This causes crashes when _mmx_memcpy() is
called too early because LDMXCSR generates #UD when the aforementioned bit
is clear.
Fix it by using kernel_fpu_begin_mask(KFPU_387) explicitly.
Fixes: 7ad816762f9b ("x86/fpu: Reset MXCSR to default in kernel_fpu_begin()")
Reported-by: Krzysztof Mazur <krzysiek@podlesie.net>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Krzysztof Piotr Olędzki <ole@ans.pl>
Tested-by: Krzysztof Mazur <krzysiek@podlesie.net>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/e7bf21855fe99e5f3baa27446e32623358f69e8d.1611205691.git.luto@kernel.org
Currently, requesting kernel FPU access doesn't distinguish which parts of
the extended ("FPU") state are needed. This is nice for simplicity, but
there are a few cases in which it's suboptimal:
- The vast majority of in-kernel FPU users want XMM/YMM/ZMM state but do
not use legacy 387 state. These users want MXCSR initialized but don't
care about the FPU control word. Skipping FNINIT would save time.
(Empirically, FNINIT is several times slower than LDMXCSR.)
- Code that wants MMX doesn't want or need MXCSR initialized.
_mmx_memcpy(), for example, can run before CR4.OSFXSR gets set, and
initializing MXCSR will fail because LDMXCSR generates an #UD when the
aforementioned CR4 bit is not set.
- Any future in-kernel users of XFD (eXtended Feature Disable)-capable
dynamic states will need special handling.
Add a more specific API that allows callers to specify exactly what they
want.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Krzysztof Piotr Olędzki <ole@ans.pl>
Link: https://lkml.kernel.org/r/aff1cac8b8fc7ee900cf73e8f2369966621b053f.1611205691.git.luto@kernel.org
Conflicts:
drivers/net/can/dev.c
commit 03f16c5075b2 ("can: dev: can_restart: fix use after free bug")
commit 3e77f70e7345 ("can: dev: move driver related infrastructure into separate subdir")
Code move.
drivers/net/dsa/b53/b53_common.c
commit 8e4052c32d6b ("net: dsa: b53: fix an off by one in checking "vlan->vid"")
commit b7a9e0da2d1c ("net: switchdev: remove vid_begin -> vid_end range from VLAN objects")
Field rename.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYAhFXwAKCRCAXGG7T9hj
vjCeAQCR74dTRa34VJNDWdDME1NMSlFln8bIL+lvrC3/2p2E5QD/eLcgVFdDpeZ9
dMjKzcddPL1mI99SurwNmyai4U9d6A4=
=PbJO
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.11-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fix from Juergen Gross:
"A fix for build failure showing up in some configurations"
* tag 'for-linus-5.11-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: fix 'nopvspin' build error
-----BEGIN PGP SIGNATURE-----
iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmAGxt0THHdlaS5saXVA
a2VybmVsLm9yZwAKCRB2FHBfkEGgXjMhCACtIFiJERxNFqKLVn1s+2IyA3+Lv0UI
Suyl+tUTgiN3WzPNAbYfd6cYKk/JvAF0KvCtvuFfRS+oRJrT8+ONBBRxSzpDixGO
aZfvzObOstWUQ0XsP2Q+S4YZlsLHtG76sUSsWCB8wdsnG0hujbrQ43M1Vj15EQYj
EKFmsJfxeMSjekaI7QRkU7LR9gZgacCWthCbn4KFgtXCREA1sI5OUnW9NAei1MSG
M/adfAP4xqoBpNjXq+6M2Ha4H5SmtZRlOQ4bk6ZgebhG0T9LT7UBPqaHHed8CcOj
PzreKfYYYGExhJwLKuCji7r6kSGZcZ6kiIMdFm+vliSK/QJeFCFclhdX
=c3XM
-----END PGP SIGNATURE-----
Merge tag 'hyperv-fixes-signed-20210119' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv fix from Wei Liu:
"One patch from Dexuan to fix clockevent initialization"
* tag 'hyperv-fixes-signed-20210119' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
x86/hyperv: Initialize clockevents after LAPIC is initialized
On x86 scale invariace tends to be disabled during resume from
suspend-to-RAM, because the MPERF or APERF MSR values are not as
expected then due to updates taking place after the platform
firmware has been invoked to complete the suspend transition.
That, of course, is not desirable, especially if the schedutil
scaling governor is in use, because the lack of scale invariance
causes it to be less reliable.
To counter that effect, modify init_freq_invariance() to register
a syscore_ops object for scale invariance with the ->resume callback
pointing to init_counter_refs() which will run on the CPU starting
the resume transition (the other CPUs will be taken care of the
"online" operations taking place later).
Fixes: e2b0d619b400 ("x86, sched: check for counters overflow in frequency invariant accounting")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Giovanni Gherdovich <ggherdovich@suse.cz>
Link: https://lkml.kernel.org/r/1803209.Mvru99baaF@kreacher
That logic is unused since
320100a5ffe5 ("x86/entry: Remove the TRACE_IRQS cruft")
Remove it.
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/YAAszZJ2GcIYZmB5@hirez.programming.kicks-ass.net
Fix build error in x86/xen/ when PARAVIRT_SPINLOCKS is not enabled.
Fixes this build error:
../arch/x86/xen/smp_hvm.c: In function ‘xen_hvm_smp_init’:
../arch/x86/xen/smp_hvm.c:77:3: error: ‘nopvspin’ undeclared (first use in this function)
nopvspin = true;
Fixes: 3d7746bea925 ("x86/xen: Fix xen_hvm_smp_init() when vector callback not available")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/20210115191123.27572-1-rdunlap@infradead.org
Signed-off-by: Juergen Gross <jgross@suse.com>
With commit 4df4cb9e99f8, the Hyper-V direct-mode STIMER is actually
initialized before LAPIC is initialized: see
apic_intr_mode_init()
x86_platform.apic_post_init()
hyperv_init()
hv_stimer_alloc()
apic_bsp_setup()
setup_local_APIC()
setup_local_APIC() temporarily disables LAPIC, initializes it and
re-eanble it. The direct-mode STIMER depends on LAPIC, and when it's
registered, it can be programmed immediately and the timer can fire
very soon:
hv_stimer_init
clockevents_config_and_register
clockevents_register_device
tick_check_new_device
tick_setup_device
tick_setup_periodic(), tick_setup_oneshot()
clockevents_program_event
When the timer fires in the hypervisor, if the LAPIC is in the
disabled state, new versions of Hyper-V ignore the event and don't inject
the timer interrupt into the VM, and hence the VM hangs when it boots.
Note: when the VM starts/reboots, the LAPIC is pre-enabled by the
firmware, so the window of LAPIC being temporarily disabled is pretty
small, and the issue can only happen once out of 100~200 reboots for
a 40-vCPU VM on one dev host, and on another host the issue doesn't
reproduce after 2000 reboots.
The issue is more noticeable for kdump/kexec, because the LAPIC is
disabled by the first kernel, and stays disabled until the kdump/kexec
kernel enables it. This is especially an issue to a Generation-2 VM
(for which Hyper-V doesn't emulate the PIT timer) when CONFIG_HZ=1000
(rather than CONFIG_HZ=250) is used.
Fix the issue by moving hv_stimer_alloc() to a later place where the
LAPIC timer is initialized.
Fixes: 4df4cb9e99f8 ("x86/hyperv: Initialize clockevents earlier in CPU onlining")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20210116223136.13892-1-decui@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Daniel Borkmann says:
====================
pull-request: bpf-next 2021-01-16
1) Extend atomic operations to the BPF instruction set along with x86-64 JIT support,
that is, atomic{,64}_{xchg,cmpxchg,fetch_{add,and,or,xor}}, from Brendan Jackman.
2) Add support for using kernel module global variables (__ksym externs in BPF
programs) retrieved via module's BTF, from Andrii Nakryiko.
3) Generalize BPF stackmap's buildid retrieval and add support to have buildid
stored in mmap2 event for perf, from Jiri Olsa.
4) Various fixes for cross-building BPF sefltests out-of-tree which then will
unblock wider automated testing on ARM hardware, from Jean-Philippe Brucker.
5) Allow to retrieve SOL_SOCKET opts from sock_addr progs, from Daniel Borkmann.
6) Clean up driver's XDP buffer init and split into two helpers to init per-
descriptor and non-changing fields during processing, from Lorenzo Bianconi.
7) Minor misc improvements to libbpf & bpftool, from Ian Rogers.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (41 commits)
perf: Add build id data in mmap2 event
bpf: Add size arg to build_id_parse function
bpf: Move stack_map_get_build_id into lib
bpf: Document new atomic instructions
bpf: Add tests for new BPF atomic operations
bpf: Add bitwise atomic instructions
bpf: Pull out a macro for interpreting atomic ALU operations
bpf: Add instructions for atomic_[cmp]xchg
bpf: Add BPF_FETCH field / create atomic_fetch_add instruction
bpf: Move BPF_STX reserved field check into BPF_STX verifier code
bpf: Rename BPF_XADD and prepare to encode other atomics in .imm
bpf: x86: Factor out a lookup table for some ALU opcodes
bpf: x86: Factor out emission of REX byte
bpf: x86: Factor out emission of ModR/M for *(reg + off)
tools/bpftool: Add -Wall when building BPF programs
bpf, libbpf: Avoid unused function warning on bpf_tail_call_static
selftests/bpf: Install btf_dump test cases
selftests/bpf: Fix installation of urandom_read
selftests/bpf: Move generated test files to $(TEST_GEN_FILES)
selftests/bpf: Fix out-of-tree build
...
====================
Link: https://lore.kernel.org/r/20210116012922.17823-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYAGllQAKCRCAXGG7T9hj
vqEtAP9uws/W/JPcnsohK76hMcFAVxZCVdX7C3HvfW5tp6hqMgEAg9ic8sYiuHhn
6FouRu/ZXHJEg3PpS5W66yKNIYPvGgw=
=rR+L
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.11-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- A series to fix a regression when running as a fully virtualized
guest on an old Xen hypervisor not supporting PV interrupt callbacks
for HVM guests.
- A patch to add support to query Xen resource sizes (setting was
possible already) from user mode.
* tag 'for-linus-5.11-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
x86/xen: Fix xen_hvm_smp_init() when vector callback not available
x86/xen: Don't register Xen IPIs when they aren't going to be used
x86/xen: Add xen_no_vector_callback option to test PCI INTX delivery
xen: Set platform PCI device INTX affinity to CPU0
xen: Fix event channel callback via INTX/GSI
xen/privcmd: allow fetching resource sizes
This adds instructions for
atomic[64]_[fetch_]and
atomic[64]_[fetch_]or
atomic[64]_[fetch_]xor
All these operations are isomorphic enough to implement with the same
verifier, interpreter, and x86 JIT code, hence being a single commit.
The main interesting thing here is that x86 doesn't directly support
the fetch_ version these operations, so we need to generate a CMPXCHG
loop in the JIT. This requires the use of two temporary registers,
IIUC it's safe to use BPF_REG_AX and x86's AUX_REG for this purpose.
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210114181751.768687-10-jackmanb@google.com
This adds two atomic opcodes, both of which include the BPF_FETCH
flag. XCHG without the BPF_FETCH flag would naturally encode
atomic_set. This is not supported because it would be of limited
value to userspace (it doesn't imply any barriers). CMPXCHG without
BPF_FETCH woulud be an atomic compare-and-write. We don't have such
an operation in the kernel so it isn't provided to BPF either.
There are two significant design decisions made for the CMPXCHG
instruction:
- To solve the issue that this operation fundamentally has 3
operands, but we only have two register fields. Therefore the
operand we compare against (the kernel's API calls it 'old') is
hard-coded to be R0. x86 has similar design (and A64 doesn't
have this problem).
A potential alternative might be to encode the other operand's
register number in the immediate field.
- The kernel's atomic_cmpxchg returns the old value, while the C11
userspace APIs return a boolean indicating the comparison
result. Which should BPF do? A64 returns the old value. x86 returns
the old value in the hard-coded register (and also sets a
flag). That means return-old-value is easier to JIT, so that's
what we use.
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20210114181751.768687-8-jackmanb@google.com
The BPF_FETCH field can be set in bpf_insn.imm, for BPF_ATOMIC
instructions, in order to have the previous value of the
atomically-modified memory location loaded into the src register
after an atomic op is carried out.
Suggested-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210114181751.768687-7-jackmanb@google.com
A subsequent patch will add additional atomic operations. These new
operations will use the same opcode field as the existing XADD, with
the immediate discriminating different operations.
In preparation, rename the instruction mode BPF_ATOMIC and start
calling the zero immediate BPF_ADD.
This is possible (doesn't break existing valid BPF progs) because the
immediate field is currently reserved MBZ and BPF_ADD is zero.
All uses are removed from the tree but the BPF_XADD definition is
kept around to avoid breaking builds for people including kernel
headers.
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Björn Töpel <bjorn.topel@gmail.com>
Link: https://lore.kernel.org/bpf/20210114181751.768687-5-jackmanb@google.com
A later commit will need to lookup a subset of these opcodes. To
avoid duplicating code, pull out a table.
The shift opcodes won't be needed by that later commit, but they're
already duplicated, so fold them into the table anyway.
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210114181751.768687-4-jackmanb@google.com
The JIT case for encoding atomic ops is about to get more
complicated. In order to make the review & resulting code easier,
let's factor out some shared helpers.
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210114181751.768687-3-jackmanb@google.com
The case for JITing atomics is about to get more complicated. Let's
factor out some common code to make the review and result more
readable.
NB the atomics code doesn't yet use the new helper - a subsequent
patch will add its use as a side-effect of other changes.
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210114181751.768687-2-jackmanb@google.com
Arnd found a randconfig that produces the warning:
arch/x86/entry/thunk_64.o: warning: objtool: missing symbol for insn at
offset 0x3e
when building with LLVM_IAS=1 (Clang's integrated assembler). Josh
notes:
With the LLVM assembler not generating section symbols, objtool has no
way to reference this code when it generates ORC unwinder entries,
because this code is outside of any ELF function.
The limitation now being imposed by objtool is that all code must be
contained in an ELF symbol. And .L symbols don't create such symbols.
So basically, you can use an .L symbol *inside* a function or a code
segment, you just can't use the .L symbol to contain the code using a
SYM_*_START/END annotation pair.
Fangrui notes that this optimization is helpful for reducing image size
when compiling with -ffunction-sections and -fdata-sections. I have
observed on the order of tens of thousands of symbols for the kernel
images built with those flags.
A patch has been authored against GNU binutils to match this behavior
of not generating unused section symbols ([1]), so this will
also become a problem for users of GNU binutils once they upgrade to 2.36.
Omit the .L prefix on a label so that the assembler will emit an entry
into the symbol table for the label, with STB_LOCAL binding. This
enables objtool to generate proper unwind info here with LLVM_IAS=1 or
GNU binutils 2.36+.
[ bp: Massage commit message. ]
Reported-by: Arnd Bergmann <arnd@arndb.de>
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Suggested-by: Borislav Petkov <bp@alien8.de>
Suggested-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20210112194625.4181814-1-ndesaulniers@google.com
Link: https://github.com/ClangBuiltLinux/linux/issues/1209
Link: https://reviews.llvm.org/D93783
Link: https://sourceware.org/binutils/docs/as/Symbol-Names.html
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d1bcae833b32f1408485ce69f844dcd7ded093a8 [1]
Move it outside of CONFIG_SMP in order to avoid ifdeffery at the usage
sites.
Fixes: 76e2fc63ca40 ("x86/cpu/amd: Set __max_die_per_package on AMD")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210114111814.5346-1-bp@alien8.de
Only the IPI-related functions in the smp_ops should be conditional
on the vector callback being available. The rest should still happen:
• xen_hvm_smp_prepare_boot_cpu()
This function does two things, both of which should still happen if
there is no vector callback support.
The call to xen_vcpu_setup() for vCPU0 should still happen as it just
sets up the vcpu_info for CPU0. That does happen for the secondary
vCPUs too, from xen_cpu_up_prepare_hvm().
The second thing it does is call xen_init_spinlocks(), which perhaps
counter-intuitively should *also* still be happening in the case
without vector callbacks, so that it can clear its local xen_pvspin
flag and disable the virt_spin_lock_key accordingly.
Checking xen_have_vector_callback in xen_init_spinlocks() itself
would affect PV guests, so set the global nopvspin flag in
xen_hvm_smp_init() instead, when vector callbacks aren't available.
• xen_hvm_smp_prepare_cpus()
This does some IPI-related setup by calling xen_smp_intr_init() and
xen_init_lock_cpu(), which can be made conditional. And it sets the
xen_vcpu_id to XEN_VCPU_ID_INVALID for all possible CPUS, which does
need to happen.
• xen_smp_cpus_done()
This offlines any vCPUs which doesn't fit in the global shared_info
page, if separate vcpu_info placement isn't available. That part also
needs to happen regardless of vector callback support.
• xen_hvm_cpu_die()
This doesn't actually do anything other than commin_cpu_die() right
right now in the !vector_callback case; all three teardown functions
it calls should be no-ops. But to guard against future regressions
it's useful to call it anyway, and for it to explicitly check for
xen_have_vector_callback before calling those additional functions.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210106153958.584169-6-dwmw2@infradead.org
Signed-off-by: Juergen Gross <jgross@suse.com>
In the case where xen_have_vector_callback is false, we still register
the IPI vectors in xen_smp_intr_init() for the secondary CPUs even
though they aren't going to be used. Stop doing that.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210106153958.584169-5-dwmw2@infradead.org
Signed-off-by: Juergen Gross <jgross@suse.com>
It's useful to be able to test non-vector event channel delivery, to make
sure Linux will work properly on older Xen which doesn't have it.
It's also useful for those working on Xen and Xen-compatible hypervisors,
because there are guest kernels still in active use which use PCI INTX
even when vector delivery is available.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: https://lore.kernel.org/r/20210106153958.584169-4-dwmw2@infradead.org
Signed-off-by: Juergen Gross <jgross@suse.com>
When the compiler choses to not inline the trivial MSR helpers:
vmlinux.o: warning: objtool: __sev_es_nmi_complete()+0xce: call to __wrmsr.constprop.14() leaves .noinstr.text section
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested
Link: https://lore.kernel.org/r/X/bf3gV+BW7kGEsB@hirez.programming.kicks-ass.net
There's some explicit tracing left in exc_machine_check_kernel(),
remove it, as it's already implied by irqentry_nmi_enter().
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210106144017.719310466@infradead.org
When the compiler fails to inline, it violates nonisntr:
vmlinux.o: warning: objtool: __sev_es_nmi_complete()+0xc7: call to sev_es_wr_ghcb_msr() leaves .noinstr.text section
Fixes: 4ca68e023b11 ("x86/sev-es: Handle NMI State")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210106144017.532902065@infradead.org
Set the maximum DIE per package variable on AMD using the
NodesPerProcessor topology value. This will be used by RAPL, among
others, to determine the maximum number of DIEs on the system in order
to do per-DIE manipulations.
[ bp: Productize into a proper patch. ]
Fixes: 028c221ed190 ("x86/CPU/AMD: Save AMD NodeId as cpu_die_id")
Reported-by: Johnathan Smithinovic <johnathan.smithinovic@gmx.at>
Reported-by: Rafael Kitover <rkitover@gmail.com>
Signed-off-by: Yazen Ghannam <Yazen.Ghannam@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Johnathan Smithinovic <johnathan.smithinovic@gmx.at>
Tested-by: Rafael Kitover <rkitover@gmail.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=210939
Link: https://lkml.kernel.org/r/20210106112106.GE5729@zn.tnic
Link: https://lkml.kernel.org/r/20210111101455.1194-1-bp@alien8.de
Don't assume dest/source buffers are userspace addresses when manually
copying data for string I/O or MOVS MMIO, as {get,put}_user() will fail
if handed a kernel address and ultimately lead to a kernel panic.
When invoking INSB/OUTSB instructions in kernel space in a
SEV-ES-enabled VM, the kernel crashes with the following message:
"SEV-ES: Unsupported exception in #VC instruction emulation - can't continue"
Handle that case properly.
[ bp: Massage commit message. ]
Fixes: f980f9c31a92 ("x86/sev-es: Compile early handler code into kernel image")
Signed-off-by: Hyunwook (Wooky) Baek <baekhw@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: David Rientjes <rientjes@google.com>
Link: https://lkml.kernel.org/r/20210110071102.2576186-1-baekhw@google.com