-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmQXB9kACgkQEsHwGGHe
VUqqjA//YbcRx2PFcZT5nnuQlb6bptsluCUrHOcJVT/1fe0ayrlvahuw/QtSXRH4
Vwukc3+1cehp3CcSbHKAKOArTL7NV2tbk+EZQk+Ae+7QdRz/9TuEenL6ipCC1cr4
Z3Bo3KZmHlBcoJaQDcQWWIL8TiYAPXdqXWksh8q+0pxDI2wuFguFBJI84j+AUZH+
I4EDXLfzQn8RQZgiggEIez0aOIig74eaPfhHsNlqJJYG4x/EVgmRn9qJpYBGAeq6
xQR6NvHUTjCCZAASI1QJ/IT5rXD17iey3J/gIw3QZEhotBCCDdk5vh8S8zqDStRF
x3Za7qeC5m4HMfB/09v8HGeTitlaT0BYmM2CFOsru7I/qI+dJccDTwLmF8UY5Nj2
G6454A7ZEQ13lhfAoDIeVFfoSkqyXNz+McTtOQ8/xDJ5hnuNJ4WtT7sWemWZlV5S
l14xVFbojtGNmQygUGeL7cxl6h12Y9zFNwh1A5HzwH4EvywQJW7/35pxXEZIO3tl
EioXKe1eSLcKoD9VAv8icmstpwJl1Gm5Xge1oyw8cyTW6d3hM8ZOEqdTAJvRkfG7
LwPl3qC6Hrqhjc26WZ9pxmvR1hSYLWIidy6MlNeO9mf6wZR/ub+SmuHzy6n7TZl4
pTsVver93ZgS1J8CJ0ohCK1jHs+2aLvh/6qiJvIw9lbgAZ2HKPo=
=Q6Dq
-----END PGP SIGNATURE-----
Merge tag 'ras_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull RAS fix from Borislav Petkov:
- Flush out logged errors immediately after MCA banks configuration
changes over sysfs have been done instead of waiting until something
else triggers the workqueue later - another error or the polling
interval cycle is reached
* tag 'ras_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce: Make sure logged MCEs are processed after sysfs update
- Clear temporary storage in the resctrl code to prevent access to a
nonexistent MSR
- Add a simple throttling mechanism to protect the hypervisor from potentially
malicious SEV guests issuing requests in rapid succession.
In order to not jeopardize the sanity of everyone involved in
maintaining this code, the request issuing side has received
a cleanup, split in more or less trivial, small and digestible pieces.
Otherwise, the code was threatening to become an unmaintainable mess.
Therefore, that cleanup is marked indirectly also for stable so that
there are no differences between the upstream code and the stable
variant when it comes down to backporting more there.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmQW/64ACgkQEsHwGGHe
VUoWzBAAl1KD4RR5EhrppOCl5mWtZmKUf+COag7RiqggXJyhCTXO+5N24dHcgoJB
h60gY7Nxg0CpZVbkMDSpJIuclmlMkiCLgUeuvN6E5ofgb/ZSv9nDuCXPUtLQ962d
T6071/v48G+2PVGm+PD1xAwP3065i3itVV/k6Xn8fxeXf/fq8L5eU5tADuFICI0b
dKbd7U+TEQAh5E6BUwms2G1P0glJqqL37H22fTcyxI6D2T/UJLlc4+or5JmTofDa
XJE/UHn+ZaGZYjhdr/BrlcxnY1jUTQH2K3wciADmNolkuCpDQJs6GgN98lXdhT34
vyWQVokHGEKE8Va6m5wZX90eKraSc27/0d5ZlHz/rIJgVBxp/VvCzqLUZRvkRwwk
k7bVOeZHe6P+b0QQl7uL9U2ff0sV/4PX0NLr+jzQdlA2ZYuTV6YgBDl7nAe1Tw/J
gJViAvDbm26mlTG1wQrvw9M2P4AQIYpEmD4KPs7j2aQafUgtGqfTBwyeKHXdtMLJ
TrkEISZZ8BVVvYghctN4R21IryUSnfq2eXxPwxUMh78SrO8sC23QJ5PVqKM/enF8
azf/ZBgANidzqJ44k2Ow2bnO0ZTYZblvl3NUCMNa5SjQmzAEUzupHEKUgV10MFMR
J3lspGU47BVeirFPWlCYKr+3Buwzur5xo5wCmrezxbN0fAo9k5M=
=6UGz
-----END PGP SIGNATURE-----
Merge tag 'x86_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
"There's a little bit more 'movement' in there for my taste but it
needs to happen and should make the code better after it.
- Check cmdline_find_option()'s return value before further
processing
- Clear temporary storage in the resctrl code to prevent access to a
nonexistent MSR
- Add a simple throttling mechanism to protect the hypervisor from
potentially malicious SEV guests issuing requests in rapid
succession.
In order to not jeopardize the sanity of everyone involved in
maintaining this code, the request issuing side has received a
cleanup, split in more or less trivial, small and digestible
pieces. Otherwise, the code was threatening to become an
unmaintainable mess.
Therefore, that cleanup is marked indirectly also for stable so
that there are no differences between the upstream code and the
stable variant when it comes down to backporting more there"
* tag 'x86_urgent_for_v6.3_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Fix use of uninitialized buffer in sme_enable()
x86/resctrl: Clear staged_config[] before and after it is used
virt/coco/sev-guest: Add throttling awareness
virt/coco/sev-guest: Convert the sw_exit_info_2 checking to a switch-case
virt/coco/sev-guest: Do some code style cleanups
virt/coco/sev-guest: Carve out the request issuing logic into a helper
virt/coco/sev-guest: Remove the disable_vmpck label in handle_guest_request()
virt/coco/sev-guest: Simplify extended guest request handling
virt/coco/sev-guest: Check SEV_SNP attribute at probe time
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCZBQKJwAKCRCAXGG7T9hj
vuVgAQDhvr5mBFNqFxIfTnE8+oEsnYb0OgmR+9U3h+ECDB0P0gEAmR1fAee441YE
2DWOAlvjmqoI2K8DTTabizXvm7x3bQk=
=jcYl
-----END PGP SIGNATURE-----
Merge tag 'for-linus-6.3-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
- cleanup for xen time handling
- enable the VGA console in a Xen PVH dom0
- cleanup in the xenfs driver
* tag 'for-linus-6.3-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen: remove unnecessary (void*) conversions
x86/PVH: obtain VGA console info in Dom0
x86/xen/time: cleanup xen_tsc_safe_clocksource
xen: update arch/x86/include/asm/xen/cpuid.h
* A pair of fixes to the ASID allocator to avoid leaking stale mappings
between tasks.
* A fix to the vmalloc fault handler to tolerate huge pages.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmQUf1cTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYiafvD/4ixaHUMYFBBsw0Vo2kXaILmBYNZOmz
KoAHykqlg4TRZ0xtOK/iLcSsiDXVVbI91iBeKjrwOiJ2+Sk4gDm01JMhOK6eJh4I
boQAoRNgUBJLiKp7ZlybJ3R8yXw4VkKK0lJKNd9zOko+76Z8cQitsiwliWQwnpJw
jtKpzYZ8Plxki+0jUt7/21FUF0sy1UspgFTQdV6XfBGtIqVuVNgRLK4emjrKxl7s
fpkvQfD9ZPCuCNqg42o9VULK8fQfQSi5jt9POrGVKg7EaEHb7NfxttWxu/VkMBoI
cTa9zNSM4DYfmubOTqPoE4MxxmY294vii2JnoimQPDWlT9gGRD5Puf/rmm420cUE
yhsl4HdurDBRw3608pIfXWl9pTBo/doFImrQfY/IuGlR6Jy632NFFdPXa0vA/RoM
JBpAVJrUGRRo6w5B+GM5XVpxQNiBtMtGSVYNG2185Gtszlw6CebG31Da39kBPr2O
G/QFTVaZJnlHVqEJwOm/7TuYM/8u+uT6eiuYiRBcHImOIleUJPGYnDfG+dav3nln
E4DXBref4ikAZX794rEQnB6Ayt3Hl1E5lZ9HA+sezMNwv2zhT9rYAgF+oM8/A6FV
3JxcBmkNj3lqKzwNK85YOHE7us/5u+PY7HPrUngC7iORvh2wSh+AVfiu7mXdhrWD
e6NwgE4EoZOgqw==
=A1sl
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- fixes to the ASID allocator to avoid leaking stale mappings between
tasks
- fix the vmalloc fault handler to tolerate huge pages
* tag 'riscv-for-linus-6.3-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: mm: Support huge page in vmalloc_fault()
riscv: asid: Fixup stale TLB entry cause application crash
Revert "riscv: mm: notify remote harts about mmu cache updates"
- Update defconfigs.
- Fix early boot code by adding missing intersection check to prevent
potential overwriting of the ipl report.
- Fix a use-after-free issue in s390-specific code related to PCI
resources being retained after hot-unplugging individual functions,
by removing the resources from the PCI bus's resource list and using
the zpci_bar_struct's resource pointer directly.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAmQULf8ACgkQjYWKoQLX
FBh2Zwf/QP0r5FHhU9MO5z00DOfXunP4jJcBW18i4owbjDvEGEJxuixE6KklHfPI
j918vduGI6YuVlhAfAQPPbFH4GWPc8HMlv/HSifWXq+VeDZToSKv9l0rZbE86blC
qNJs+MHWc9KSbEr2KUfI4/im9ENb5dGO00JLK0sueZKztY9wdVRIU3JVJGQgQSDU
BUuUMdiEu6ZZI4fatRumZCKO3V6B47sSc0erxDJ8K9xy8zJdSJ4YJR+WsQHoTFE2
Ap1q8TZz2PQ7hR3qUZy+iGuLWJX7TzyCsEBceRBnm9DGPp3gKfep0u9Dw83oHe5v
lmo9e34wS62cEGl8Pia+lwLUWYzXZg==
=hUmN
-----END PGP SIGNATURE-----
Merge tag 's390-6.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:
- Update defconfigs
- Fix early boot code by adding missing intersection check to prevent
potential overwriting of the ipl report
- Fix a use-after-free issue in s390-specific code related to PCI
resources being retained after hot-unplugging individual functions,
by removing the resources from the PCI bus's resource list and using
the zpci_bar_struct's resource pointer directly
* tag 's390-6.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: update defconfigs
PCI: s390: Fix use-after-free of PCI resources with per-function hotplug
s390/ipl: add missing intersection check to ipl_report handling
- Fix false detection of read faults, introduced by execute-only support.
- Fix a build failure when GENERIC_ALLOCATOR is not selected.
Thanks to: Russell Currey, Randy Dunlap, Michal Suchánek, Nathan Lynch,
Benjamin Gray.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmQT3MgTHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgIvBD/4n/LI1Q1B4K5yl84DeSybcqVcK+C/h
p+Js7zBRRfGH7O1Orum3WSbHguBwa0iAJvCKJTcR/1FIyljUU9NtAbo2jc+JYRJc
Td/PMdG3XvG8eg87uELcE7VPixVskRAIolU/GRNZ+fJ8hSJDmJiomKl3IzTmAWsn
NF+RXKKvC3q7f/iVxBBSI4w+ZWC8geunLQIBhVQVjjC94sHDG/ONijBQjr7RUYBX
MjophiqzFFKX7y7v0Cd0mRwLpZTKb5DK0iQ/H2mc9S9ktI7HBhH/fvsOlSw9GjS0
41n3yvm2/RXVe2F2oZpYyU5aALOV9wpIqPb90LUE2p2+qpKZaW4AYN6Ts51iZahk
Wgi0Hm9YIDrdzU39PjrrdGT9u4Qc6kCfF18NKThN2K4RzeaGAVIw/UGEe2z2qhlx
RhGGtaC0D0/WbVYQHGf1X8Hi8vVyZ4XlAmR2mBIR1wqkox8GTA+r0AjaJ4o4Mscs
kRbcLoDR6iXbMNqN7zrPtepI8l6H9zdWrazQcH4X3Zck5nJYCC+Z5imJC+uay9XB
7rCKDZ+itcVFhYgJf0M5TwURaULAPb2pEVuu9hqIZENiit8EQmpU89w04tvmTqSS
NPkVDa8rFtfa+R9qvJOpyOMKY8mzIMmdY25YvdXP5l8NYZoLG0T4IixTpWFcGoLg
+mKla4fp2e4wxw==
=BXsq
-----END PGP SIGNATURE-----
Merge tag 'powerpc-6.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fix false detection of read faults, introduced by execute-only
support
- Fix a build failure when GENERIC_ALLOCATOR is not selected
Thanks to Russell Currey, Randy Dunlap, Michal Suchánek, Nathan Lynch,
and Benjamin Gray.
* tag 'powerpc-6.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/mm: Fix false detection of read faults
powerpc/pseries: RTAS work area requires GENERIC_ALLOCATOR
* Address a rather annoying bug w.r.t. guest timer offsetting. The
synchronization of timer offsets between vCPUs was broken, leading to
inconsistent timer reads within the VM.
x86:
* New tests for the slow path of the EVTCHNOP_send Xen hypercall
* Add missing nVMX consistency checks for CR0 and CR4
* Fix bug that broke AMD GATag on 512 vCPU machines
Selftests:
* Skip hugetlb tests if huge pages are not available
* Sync KVM exit reasons
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmQQhBMUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroPJsAf/aqKQtRJH2YDHuS/OvlH546lgrPTY
zc2S187N4OofqKvm8HWAJOPravGI4Lkc3Jvlq2jPnlwl66musfako5YGXyyJesIP
9pc32jxwbhpHyp39tSTxlNbjE68E4Tau2iFa5n6fq/2BOEkZNGRhTDWPfbJV4yZO
JpkaguNm1nuZfKnRNxaaYhJwbqPIBc8l+Y3Q3nw6QLZHaNoupsd2pY3c4SuTYFcW
UxUaFtNkpXQxbwve0MWFLh/JztOzFhQcdMi3OSTBYZz32T0vncjXFDuARfKLNKyw
FgwkHgs2/d35AgE0JEwz1u6+/RMHvUheG08zkp8//lINfNgF/Cka7Dz2uA==
=B1LI
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"ARM64:
- Address a rather annoying bug w.r.t. guest timer offsetting. The
synchronization of timer offsets between vCPUs was broken, leading
to inconsistent timer reads within the VM.
x86:
- New tests for the slow path of the EVTCHNOP_send Xen hypercall
- Add missing nVMX consistency checks for CR0 and CR4
- Fix bug that broke AMD GATag on 512 vCPU machines
Selftests:
- Skip hugetlb tests if huge pages are not available
- Sync KVM exit reasons"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: selftests: Sync KVM exit reasons in selftests
KVM: selftests: Add macro to generate KVM exit reason strings
KVM: selftests: Print expected and actual exit reason in KVM exit reason assert
KVM: selftests: Make vCPU exit reason test assertion common
KVM: selftests: Add EVTCHNOP_send slow path test to xen_shinfo_test
KVM: selftests: Use enum for test numbers in xen_shinfo_test
KVM: selftests: Add helpers to make Xen-style VMCALL/VMMCALL hypercalls
KVM: selftests: Move the guts of kvm_hypercall() to a separate macro
KVM: SVM: WARN if GATag generation drops VM or vCPU ID information
KVM: SVM: Modify AVIC GATag to support max number of 512 vCPUs
KVM: SVM: Fix a benign off-by-one bug in AVIC physical table mask
selftests: KVM: skip hugetlb tests if huge pages are not available
KVM: VMX: Use tabs instead of spaces for indentation
KVM: VMX: Fix indentation coding style issue
KVM: nVMX: remove unnecessary #ifdef
KVM: nVMX: add missing consistency checks for CR0 and CR4
KVM: arm64: timers: Convert per-vcpu virtual offset to a global value
cmdline_find_option() may fail before doing any initialization of
the buffer array. This may lead to unpredictable results when the same
buffer is used later in calls to strncmp() function. Fix the issue by
returning early if cmdline_find_option() returns an error.
Found by Linux Verification Center (linuxtesting.org) with static
analysis tool SVACE.
Fixes: aca20d546214 ("x86/mm: Add support to make use of Secure Memory Encryption")
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/20230306160656.14844-1-n.zhandarovich@fintech.ru
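The early-return pattern described above can be sketched in user space as
follows. This is only an illustration, not the kernel code: find_option()
is a hypothetical stand-in for cmdline_find_option() that, like the
behaviour described above, leaves the buffer untouched when it fails, and
mem_encrypt_setting() is an invented caller.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-in for cmdline_find_option(): copies the value of
 * "option=value" into buf and returns its length, or returns -1 WITHOUT
 * touching buf when the option is absent.
 */
static int find_option(const char *cmdline, const char *option,
		       char *buf, size_t bufsize)
{
	size_t optlen = strlen(option);
	const char *p = cmdline;

	assert(bufsize > 0);

	while ((p = strstr(p, option)) != NULL) {
		/* Match only at a word boundary, followed by '='. */
		if ((p == cmdline || p[-1] == ' ') && p[optlen] == '=') {
			const char *val = p + optlen + 1;
			size_t len = strcspn(val, " ");

			if (len >= bufsize)
				len = bufsize - 1;
			memcpy(buf, val, len);
			buf[len] = '\0';
			return (int)len;
		}
		p += optlen;
	}
	return -1;
}

/* Returns 1 for "mem_encrypt=on", 0 for "mem_encrypt=off", -1 otherwise. */
static int mem_encrypt_setting(const char *cmdline)
{
	char buffer[16];	/* deliberately left uninitialized */

	/* The fix: bail out before buffer is ever read. */
	if (find_option(cmdline, "mem_encrypt", buffer, sizeof(buffer)) < 0)
		return -1;

	if (!strncmp(buffer, "on", sizeof(buffer)))
		return 1;
	if (!strncmp(buffer, "off", sizeof(buffer)))
		return 0;
	return -1;
}
```

Without the early return, the strncmp() calls would compare against
whatever happened to be on the stack whenever the option is missing.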
As a temporary storage, staged_config[] in rdt_domain should be cleared
before and after it is used. The stale value in staged_config[] could
cause an MSR access error.
Here is a reproducer on a system with 16 usable CLOSIDs for a 15-way L3
cache (MBA should be disabled if the number of CLOSIDs for MB is less than
16):
mount -t resctrl resctrl -o cdp /sys/fs/resctrl
mkdir /sys/fs/resctrl/p{1..7}
umount /sys/fs/resctrl/
mount -t resctrl resctrl /sys/fs/resctrl
mkdir /sys/fs/resctrl/p{1..8}
An error occurs when creating resource group named p8:
unchecked MSR access error: WRMSR to 0xca0 (tried to write 0x00000000000007ff) at rIP: 0xffffffff82249142 (cat_wrmsr+0x32/0x60)
Call Trace:
<IRQ>
__flush_smp_call_function_queue+0x11d/0x170
__sysvec_call_function+0x24/0xd0
sysvec_call_function+0x89/0xc0
</IRQ>
<TASK>
asm_sysvec_call_function+0x16/0x20
When creating a new resource control group, hardware will be configured
by the following process:
rdtgroup_mkdir()
rdtgroup_mkdir_ctrl_mon()
rdtgroup_init_alloc()
resctrl_arch_update_domains()
resctrl_arch_update_domains() iterates and updates all resctrl_conf_type
whose have_new_ctrl is true. Since staged_config[] holds the same values as
when CDP was enabled, it will continue to update the CDP_CODE and CDP_DATA
configurations. When group p8 is created, get_config_index() called in
resctrl_arch_update_domains() will return 16 and 17 as the CLOSIDs for
CDP_CODE and CDP_DATA, which will be translated to an invalid register -
0xca0 in this scenario.
Fix it by clearing staged_config[] before and after it is used.
[reinette: re-order commit tags]
Fixes: 75408e43509e ("x86/resctrl: Allow different CODE/DATA configurations to be staged")
Suggested-by: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Shawn Wang <shawnwang@linux.alibaba.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/2fad13f49fbe89687fc40e9a5a61f23a28d1507a.1673988935.git.reinette.chatre%40intel.com
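The arithmetic behind the invalid register can be sketched as below. The
names are illustrative, and two things are assumptions rather than facts
from this log: the MSR base of 0xc90 (chosen because base + 16 gives the
0xca0 seen in the trace) and which of CODE/DATA takes the even index. Only
the resulting index pair 16 and 17 for the group p8 comes from the
description above.

```c
#include <assert.h>
#include <stdint.h>

enum conf_type { CDP_NONE, CDP_CODE, CDP_DATA };

/* Assumed base of the L3 bitmask MSRs; consistent with 0xca0 in the log. */
#define L3_CBM_BASE	0xc90u
/* 16 usable CLOSIDs, as in the reproducer. */
#define NUM_CLOSIDS	16u

/* With CDP, each CLOSID occupies two consecutive hardware indices. */
static uint32_t config_index(uint32_t closid, enum conf_type type)
{
	switch (type) {
	case CDP_CODE:
		return closid * 2 + 1;	/* assumption: CODE is the odd slot */
	case CDP_DATA:
		return closid * 2;	/* assumption: DATA is the even slot */
	default:
		return closid;
	}
}

static uint32_t cbm_msr(uint32_t closid, enum conf_type type)
{
	return L3_CBM_BASE + config_index(closid, type);
}

/* Only NUM_CLOSIDS registers exist above the base. */
static int msr_is_valid(uint32_t msr)
{
	return msr >= L3_CBM_BASE && msr < L3_CBM_BASE + NUM_CLOSIDS;
}
```

With CDP disabled, CLOSID 8 maps to index 8, which is in range. If stale
CDP entries are still staged, the same CLOSID maps to indices 16 and 17,
both past the last valid register.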
To support detection of read faults with Radix execute-only memory, the
vma_is_accessible() check in access_error() (which checks for PROT_NONE)
was replaced with a check to see if VM_READ was missing, and if so,
returns true to assert the fault was caused by a bad read.
This is incorrect, as it ignores that both VM_WRITE and VM_EXEC imply
read on powerpc, as defined in protection_map[]. This causes mappings
containing VM_WRITE or VM_EXEC without VM_READ to misreport the cause of
page faults, since the MMU is still allowing reads.
Correct this by restoring the original vma_is_accessible() check for
PROT_NONE mappings, and adding a separate check for Radix PROT_EXEC-only
mappings.
Fixes: 395cac7752b9 ("powerpc/mm: Support execute-only memory on the Radix MMU")
Reported-by: Michal Suchánek <msuchanek@suse.de>
Link: https://lore.kernel.org/r/20230308152702.GR19419@kitsune.suse.cz
Tested-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230310050834.63105-1-ruscur@russell.cc
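A simplified model of the two checks, for illustration only. The flag
values and function names here are made up for the sketch and are not the
kernel's definitions; the logic follows the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag values, not the kernel's definitions. */
#define VM_READ		0x1u
#define VM_WRITE	0x2u
#define VM_EXEC		0x4u
#define VM_ACCESS_FLAGS	(VM_READ | VM_WRITE | VM_EXEC)

/* The regressed check: any mapping without VM_READ is a bad read. */
static bool read_fault_is_bad_buggy(unsigned int vm_flags)
{
	return !(vm_flags & VM_READ);
}

/* The restored logic: PROT_NONE, or exec-only when Radix is in use. */
static bool read_fault_is_bad_fixed(unsigned int vm_flags, bool radix)
{
	if (!(vm_flags & VM_ACCESS_FLAGS))
		return true;	/* PROT_NONE: nothing is accessible */
	if (radix && (vm_flags & VM_ACCESS_FLAGS) == VM_EXEC)
		return true;	/* exec-only mapping on Radix */
	return false;		/* VM_WRITE or VM_EXEC still imply read */
}
```

A write-only mapping shows the difference: the regressed check reports a
bad read even though the MMU permits it, while the restored check does not.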
RISC-V supports ioremap() with huge page (pud/pmd) mappings. However,
vmalloc_fault() assumes that the vmalloc range is limited to pte
mappings. Complete the vmalloc_fault() function by adding huge page
support.
Fixes: 310f541a027b ("riscv: Enable HAVE_ARCH_HUGE_VMAP for 64BIT")
Cc: stable@vger.kernel.org
Signed-off-by: Dylan Jhong <dylan@andestech.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20230310075021.3919290-1-dylan@andestech.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
- Do not allow histogram values to have modifiers.
Can cause a NULL pointer dereference if they do.
- Warn if hist_field_name() is passed a NULL.
Prevent the NULL pointer dereference mentioned above.
- Fix invalid address look up race in lookup_rec()
- Define ftrace_stub_graph conditionally to prevent linker errors
- Always check if RCU is watching at all tracepoint locations
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZBDuTBQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qsboAP4yfrFYvIIKM5EkzkEiPI+V2hdlA12x
bt839jO5AWCmhAEAiY8FmKatpBJQKsiGqSOab8aHOMnhGFZwltCHAPa9PAI=
=vtA2
-----END PGP SIGNATURE-----
Merge tag 'trace-v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fixes from Steven Rostedt:
- Do not allow histogram values to have modifiers. They can cause a NULL
pointer dereference if they do.
- Warn if hist_field_name() is passed a NULL. Prevent the NULL pointer
dereference mentioned above.
- Fix invalid address look up race in lookup_rec()
- Define ftrace_stub_graph conditionally to prevent linker errors
- Always check if RCU is watching at all tracepoint locations
* tag 'trace-v6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Make tracepoint lockdep check actually test something
ftrace,kcfi: Define ftrace_stub_graph conditionally
ftrace: Fix invalid address access in lookup_rec() when index is 0
tracing: Check field value in hist_field_name()
tracing: Do not let histogram values have some modifiers
A new platform-op was added to Xen to allow obtaining the same VGA
console information PV Dom0 is handed. Invoke the new function and have
the output data processed by xen_init_vga().
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/8f315e92-7bda-c124-71cc-478ab9c5e610@suse.com
Signed-off-by: Juergen Gross <jgross@suse.com>
WARN if generating a GATag given a VM ID and vCPU ID doesn't yield the
same IDs when pulling the IDs back out of the tag. Don't bother adding
error handling to callers, this is very much a paranoid sanity check as
KVM fully controls the VM ID and is supposed to reject too-big vCPU IDs.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Message-Id: <20230207002156.521736-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Define AVIC_VCPU_ID_MASK based on AVIC_PHYSICAL_MAX_INDEX, i.e. the mask
that effectively controls the largest guest physical APIC ID supported by
x2AVIC, instead of hardcoding the number of bits to 8 (and the number of
VM bits to 24).
The AVIC GATag is programmed into the AMD IOMMU IRTE to provide a
reference back to KVM in case the IOMMU cannot inject an interrupt into a
non-running vCPU. In such a case, the IOMMU notifies software by creating
a GALog entry with the corresponding GATag, and KVM then uses the GATag to
find the correct VM+vCPU to kick. Dropping bit 8 from the GATag results
in kicking the wrong vCPU when targeting vCPUs with x2APIC ID > 255.
Fixes: 4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
Cc: stable@vger.kernel.org
Reported-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Message-Id: <20230207002156.521736-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
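The effect of the vCPU ID width on the tag can be sketched as follows. The
helper names and the parameterized bit width are inventions for this
example and do not mirror the kernel macros; the round-trip helper is in
the spirit of the paranoid sanity check described in the preceding commit.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a VM ID and a vCPU ID into a 32-bit tag; vcpu_bits go to the vCPU. */
static uint32_t gatag_make(uint32_t vm_id, uint32_t vcpu_id,
			   unsigned int vcpu_bits)
{
	uint32_t vcpu_mask, vm_mask;

	assert(vcpu_bits > 0 && vcpu_bits < 32);
	vcpu_mask = (1u << vcpu_bits) - 1;
	vm_mask = (1u << (32 - vcpu_bits)) - 1;

	return ((vm_id & vm_mask) << vcpu_bits) | (vcpu_id & vcpu_mask);
}

static uint32_t gatag_vcpu(uint32_t tag, unsigned int vcpu_bits)
{
	return tag & ((1u << vcpu_bits) - 1);
}

static uint32_t gatag_vm(uint32_t tag, unsigned int vcpu_bits)
{
	return tag >> vcpu_bits;
}

/* True if both IDs survive packing and unpacking unchanged. */
static int gatag_roundtrips(uint32_t vm_id, uint32_t vcpu_id,
			    unsigned int vcpu_bits)
{
	uint32_t tag = gatag_make(vm_id, vcpu_id, vcpu_bits);

	return gatag_vm(tag, vcpu_bits) == vm_id &&
	       gatag_vcpu(tag, vcpu_bits) == vcpu_id;
}
```

With 8 vCPU bits, vCPU ID 256 is silently truncated to 0, so the tag points
at the wrong vCPU. With 9 bits, IDs up to 511 round-trip intact.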
Define the "physical table max index mask" as bits 8:0, not 9:0. x2AVIC
currently supports a max of 512 entries, i.e. the max index is 511, and
the inputs to GENMASK_ULL() are inclusive. The bug is benign as bit 9 is
reserved and never set by KVM, i.e. KVM is just clearing bits that are
guaranteed to be zero.
Note, as of this writing, APM "Rev. 3.39-October 2022" incorrectly states
that bits 11:8 are reserved in Table B-1. VMCB Layout, Control Area. I.e.
that table wasn't updated when x2AVIC support was added.
Opportunistically fix the comment for the max AVIC ID to align with the
code, and clean up comment formatting too.
Fixes: 4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
Cc: stable@vger.kernel.org
Cc: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Message-Id: <20230207002156.521736-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
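Because both arguments are inclusive, a mask for 512 entries needs a high
bit of 8, not 9. A user-space equivalent of such an inclusive mask helper
(genmask_ull() is a name chosen for this sketch, not the kernel macro
itself) makes the off-by-one visible:

```c
#include <assert.h>
#include <stdint.h>

/* Inclusive mask covering bits high..low of a 64-bit value. */
static uint64_t genmask_ull(unsigned int high, unsigned int low)
{
	assert(high < 64 && low <= high);
	return (~0ULL >> (63 - high)) & (~0ULL << low);
}
```

Bits 8:0 give 0x1ff, whose largest value is the max index 511. Bits 9:0
give 0x3ff, which additionally covers the reserved bit 9.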
Code indentation should use tabs where possible. Also add a missing '*'.
Signed-off-by: Rong Tao <rongtao@cestc.cn>
Message-Id: <tencent_A492CB3F9592578451154442830EA1B02C07@qq.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Code indentation should use tabs where possible.
Signed-off-by: Rong Tao <rongtao@cestc.cn>
Message-Id: <tencent_31E6ACADCB6915E157CF5113C41803212107@qq.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
nested_vmx_check_controls() has already run by the time KVM checks host state,
so the "host address space size" exit control can only be set on x86-64 hosts.
Simplify the condition at the cost of adding some dead code to 32-bit kernels.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The effective values of the guest CR0 and CR4 registers may differ from
those included in the VMCS12. In particular, disabling EPT forces
CR4.PAE=1 and disabling unrestricted guest mode forces CR0.PG=CR0.PE=1.
Therefore, checks on these bits cannot be delegated to the processor
and must be performed by KVM.
Reported-by: Reima ISHII <ishiir@g.ecc.u-tokyo.ac.jp>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A single patch to address a rather annoying bug w.r.t. guest timer
offsetting. Effectively the synchronization of timer offsets between
vCPUs was broken, leading to inconsistent timer reads within the VM.
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQSNXHjWXuzMZutrKNKivnWIJHzdFgUCZAzwRwAKCRCivnWIJHzd
Fh0nAP4seI9aMrv0EnCHS9nufCSYQZYGBxOe+8EyUOIERxyCPgEAspn6fNJWnc6o
RWbFGMyNHPgeQgGjH+g4ehqh5LSeMww=
=uFU2
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-fixes-6.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 6.3, part #1
A single patch to address a rather annoying bug w.r.t. guest timer
offsetting. Effectively the synchronization of timer offsets between
vCPUs was broken, leading to inconsistent timer reads within the VM.
The RTAS work area allocator uses code that is built by
GENERIC_ALLOCATOR, so the PSERIES Kconfig should select the
required Kconfig symbol to fix multiple build errors.
powerpc64-linux-ld: arch/powerpc/platforms/pseries/rtas-work-area.o: in function `.rtas_work_area_allocator_init':
rtas-work-area.c:(.init.text+0x288): undefined reference to `.gen_pool_create'
powerpc64-linux-ld: rtas-work-area.c:(.init.text+0x2dc): undefined reference to `.gen_pool_set_algo'
powerpc64-linux-ld: rtas-work-area.c:(.init.text+0x310): undefined reference to `.gen_pool_add_owner'
powerpc64-linux-ld: rtas-work-area.c:(.init.text+0x43c): undefined reference to `.gen_pool_destroy'
powerpc64-linux-ld: arch/powerpc/platforms/pseries/rtas-work-area.o:(.toc+0x0): undefined reference to `gen_pool_first_fit_order_align'
powerpc64-linux-ld: arch/powerpc/platforms/pseries/rtas-work-area.o: in function `.__rtas_work_area_alloc':
rtas-work-area.c:(.ref.text+0x14c): undefined reference to `.gen_pool_alloc_algo_owner'
powerpc64-linux-ld: rtas-work-area.c:(.ref.text+0x238): undefined reference to `.gen_pool_alloc_algo_owner'
powerpc64-linux-ld: arch/powerpc/platforms/pseries/rtas-work-area.o: in function `.rtas_work_area_free':
rtas-work-area.c:(.ref.text+0x44c): undefined reference to `.gen_pool_free_owner'
Fixes: 43033bc62d34 ("powerpc/pseries: add RTAS work area allocator")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Nathan Lynch <nathanl@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230223070116.660-2-rdunlap@infradead.org
A potentially malicious SEV guest can constantly hammer the hypervisor
using this driver to send down requests and thus prevent or at least
considerably hinder other guests from issuing requests to the secure
processor which is a shared platform resource.
Therefore, the host is permitted and encouraged to throttle such guest
requests.
Add the capability to handle the case when the hypervisor throttles
excessive numbers of requests issued by the guest. Otherwise, the VM
platform communication key will be disabled, preventing the guest from
attesting itself.
Realistically speaking, a well-behaved guest should not even care about
throttling. During its lifetime, it would end up issuing a handful of
requests which the hardware can easily handle.
This is more to address the case of a malicious guest. Such a guest should
get throttled and if its VMPCK gets disabled, then that's its own
wrongdoing and perhaps that guest even deserves it.
To the implementation: the hypervisor signals with SNP_GUEST_REQ_ERR_BUSY
that the guest requests should be throttled. That error code is returned
in the upper 32-bit half of exitinfo2 and this is part of the GHCB spec
v2.
So the guest is given a throttling period of 1 minute in which it
retries the request every 2 seconds. This is a good default but if it
turns out to not pan out in practice, it can be tweaked later.
For safety, since the encryption algorithm in GHCBv2 is AES_GCM, control
must remain in the kernel to complete the request with the current
sequence number. Returning without finishing the request allows the
guest to make another request but with different message contents. This
is IV reuse, and breaks cryptographic protections.
[ bp:
- Rewrite commit message and do a simplified version.
- The stable tags are supposed to denote that a cleanup should go
upfront before backporting this so that any future fixes to this
can preserve the sanity of the backporter(s). ]
Fixes: d5af44dde546 ("x86/sev: Provide support for SNP guest request NAEs")
Signed-off-by: Dionna Glaze <dionnaglaze@google.com>
Co-developed-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: <stable@kernel.org> # d6fd48eff750 ("virt/coco/sev-guest: Check SEV_SNP attribute at probe time")
Cc: <stable@kernel.org> # 970ab823743f ("virt/coco/sev-guest: Simplify extended guest request handling")
Cc: <stable@kernel.org> # c5a338274bdb ("virt/coco/sev-guest: Remove the disable_vmpck label in handle_guest_request()")
Cc: <stable@kernel.org> # 0fdb6cc7c89c ("virt/coco/sev-guest: Carve out the request issuing logic into a helper")
Cc: <stable@kernel.org> # d25bae7dc7b0 ("virt/coco/sev-guest: Do some code style cleanups")
Cc: <stable@kernel.org> # fa4ae42cc60a ("virt/coco/sev-guest: Convert the sw_exit_info_2 checking to a switch-case")
Link: https://lore.kernel.org/r/20230214164638.1189804-2-dionnaglaze@google.com
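The retry policy described above (a 1 minute window, one attempt every 2
seconds, always re-issuing the same request) can be sketched as below. All
names are hypothetical: REQ_BUSY stands in for SNP_GUEST_REQ_ERR_BUSY, the
callbacks abstract the hypervisor call and the sleep, and struct fake_hv is
a test double. The driver's actual return codes are not modelled.

```c
#include <assert.h>

#define REQ_OK		0
#define REQ_BUSY	1	/* stands in for SNP_GUEST_REQ_ERR_BUSY */

#define RETRY_PERIOD_S	60	/* total throttling window */
#define RETRY_DELAY_S	2	/* pause between attempts */

/* Hypothetical callbacks: issue one request, and sleep for N seconds. */
typedef int (*issue_fn)(void *ctx);
typedef void (*sleep_fn)(void *ctx, unsigned int seconds);

/*
 * Keep re-issuing the SAME request until it succeeds or the window
 * expires. Returns 0 on success, -1 on timeout or any non-busy error.
 */
static int issue_with_throttling(issue_fn issue, sleep_fn do_sleep, void *ctx)
{
	unsigned int waited = 0;

	for (;;) {
		int rc = issue(ctx);

		if (rc != REQ_BUSY)
			return rc == REQ_OK ? 0 : -1;
		if (waited >= RETRY_PERIOD_S)
			return -1;	/* still throttled: give up */
		do_sleep(ctx, RETRY_DELAY_S);
		waited += RETRY_DELAY_S;
	}
}

/* Test double: answers "busy" for the first busy_replies calls. */
struct fake_hv {
	unsigned int busy_replies;
	unsigned int calls;
	unsigned int slept;
};

static int fake_issue(void *ctx)
{
	struct fake_hv *hv = ctx;

	return hv->calls++ < hv->busy_replies ? REQ_BUSY : REQ_OK;
}

static void fake_sleep(void *ctx, unsigned int seconds)
{
	struct fake_hv *hv = ctx;

	hv->slept += seconds;	/* simulated time, no real delay */
}
```

The loop never returns to the caller between attempts. That mirrors the
requirement that the request be completed with the current sequence number
rather than letting a different message reuse it.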
snp_issue_guest_request() checks the value returned by the hypervisor in
sw_exit_info_2 and returns a different error depending on it.
Convert those checks into a switch-case to make it more readable when
more error values are going to be checked in the future.
No functional changes.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/20230307192449.24732-8-bp@alien8.de
Return a specific error code - -ENOSPC - to signal the too small cert
data buffer instead of checking exit code and exitinfo2.
While at it, hoist the *fw_err assignment in snp_issue_guest_request()
so that a proper error value is returned to the callers.
[ Tom: check override_err instead of err. ]
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230307192449.24732-4-bp@alien8.de
No need to check it on every ioctl. And yes, this is a common SEV driver
but it does only SNP-specific operations currently. This can be
revisited later, when more use cases appear.
No functional changes.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/20230307192449.24732-3-bp@alien8.de
On s390 PCI functions may be hotplugged individually even when they
belong to a multi-function device. In particular on an SR-IOV device VFs
may be removed and later re-added.
In commit a50297cf8235 ("s390/pci: separate zbus creation from
scanning") it was missed however that struct pci_bus and struct
zpci_bus's resource list retained a reference to the PCI function's MMIO
resources even though those resources are released and freed on
hot-unplug. These stale resources may subsequently be claimed when the
PCI function re-appears resulting in use-after-free.
One idea of fixing this use-after-free in s390 specific code that was
investigated was to simply keep resources around from the moment a PCI
function first appeared until the whole virtual PCI bus created for
a multi-function device disappears. The problem with this however is
that due to the requirement of artificial MMIO addresses (address
cookies) extra logic is then needed to keep the address cookies
compatible on re-plug. At the same time the MMIO resources semantically
belong to the PCI function so tying their lifecycle to the function
seems more logical.
Instead a simpler approach is to remove the resources of an individually
hot-unplugged PCI function from the PCI bus's resource list while
keeping the resources of other PCI functions on the PCI bus untouched.
This is done by introducing pci_bus_remove_resource() to remove an
individual resource. Similarly the resource also needs to be removed
from the struct zpci_bus's resource list. It turns out however, that
there is really no need to add the MMIO resources to the struct
zpci_bus's resource list at all and instead we can simply use the
zpci_bar_struct's resource pointer directly.
Fixes: a50297cf8235 ("s390/pci: separate zbus creation from scanning")
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20230306151014.60913-2-schnelle@linux.ibm.com
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
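The idea of removing one resource from a bus's resource list while
leaving the others untouched can be sketched in plain userspace C. This
is only an illustration: the types and the names bus_add_resource() and
bus_remove_resource() are hypothetical stand-ins, not the kernel's
struct pci_bus or the real pci_bus_remove_resource().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct resource { unsigned long start, end; };
struct bus_resource { struct resource *res; struct bus_resource *next; };
struct bus { struct bus_resource *resources; };

/* Link a resource into the bus's list (newest entry first). */
static bool bus_add_resource(struct bus *bus, struct resource *res)
{
    struct bus_resource *node = malloc(sizeof(*node));

    if (!node)
        return false;
    node->res = res;
    node->next = bus->resources;
    bus->resources = node;
    return true;
}

/* Unlink exactly one resource; entries of other functions stay in place. */
static bool bus_remove_resource(struct bus *bus, struct resource *res)
{
    for (struct bus_resource **pp = &bus->resources; *pp; pp = &(*pp)->next) {
        if ((*pp)->res == res) {
            struct bus_resource *victim = *pp;

            *pp = victim->next;
            free(victim);
            return true;
        }
    }
    return false;
}

/* Count the entries still on the list. */
static size_t bus_count_resources(const struct bus *bus)
{
    size_t n = 0;

    for (const struct bus_resource *p = bus->resources; p; p = p->next)
        n++;
    return n;
}
```

Once the entry is unlinked, nothing on the bus still points at the
resource of the unplugged function, so freeing that resource no longer
leaves a stale reference behind.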
The code which handles the ipl report is searching for a free location
in memory where it could copy the component and certificate entries to.
It checks for intersection between the sections required for the kernel
and the component/certificate data area, but fails to check whether
the data structures linking these data areas together intersect.
This might cause the iplreport copy code to overwrite the iplreport
itself. Fix this by adding two additional intersection checks.
Cc: <stable@vger.kernel.org>
Fixes: 9641b8cc733f ("s390/ipl: read IPL report at early boot")
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
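The kind of search described above can be sketched as follows. This is a
minimal userspace model, not the boot code itself: struct region and
find_free_space() are hypothetical names, and the list of busy regions
stands for everything that must not be overwritten (kernel sections,
component and certificate data, and the linking data structures).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of a memory area that must stay intact. */
struct region { unsigned long addr; unsigned long size; };

/* True if [addr0, addr0 + size0) and [addr1, addr1 + size1) overlap. */
static bool intersects(unsigned long addr0, unsigned long size0,
                       unsigned long addr1, unsigned long size1)
{
    return addr0 + size0 > addr1 && addr1 + size1 > addr0;
}

/*
 * Return the lowest address >= start where len bytes fit without
 * touching any busy region. A region left out of 'busy' is exactly the
 * kind of area that could be overwritten by the copy.
 */
static unsigned long find_free_space(unsigned long start, unsigned long len,
                                     const struct region *busy, size_t n)
{
    unsigned long addr = start;
    bool moved;

    do {
        moved = false;
        for (size_t i = 0; i < n; i++) {
            if (intersects(addr, len, busy[i].addr, busy[i].size)) {
                /* Skip past the conflicting region and rescan. */
                addr = busy[i].addr + busy[i].size;
                moved = true;
            }
        }
    } while (moved);

    return addr;
}
```

Each additional intersection check corresponds to one more entry in the
busy list: the candidate address is only accepted once it conflicts with
none of them.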
A recent change introduced a flag to queue up errors found during
boot-time polling. These errors will be processed during late init once
the MCE subsystem is fully set up.
A number of sysfs updates call mce_restart() which goes through a subset
of the CPU init flow. This includes polling MCA banks and logging any
errors found. Since the same function is used for boot-time polling,
errors will be queued. However, the system is now past late init, so the
errors will remain queued until another error is found and the workqueue
is triggered.
Call mce_schedule_work() at the end of mce_restart() so that queued
errors are processed.
Fixes: 3bff147b187d ("x86/mce: Defer processing of early errors")
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230301221420.2203184-1-yazen.ghannam@amd.com
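The effect of the missing call can be modelled with a small userspace
sketch. Everything here is a hypothetical stand-in (struct mce_sim,
sim_restart(), sim_schedule_work()); it only illustrates why errors that
are queued during a restart stay unprocessed until something kicks the
worker.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_QUEUED 8

/* Hypothetical model: logged errors wait in a queue until work runs. */
struct mce_sim {
    int queued[MAX_QUEUED];   /* bank numbers of queued errors */
    size_t nr_queued;
    size_t nr_processed;
};

/* Polling finds an error and queues it (dropped if the queue is full). */
static void sim_log_error(struct mce_sim *m, int bank)
{
    if (m->nr_queued < MAX_QUEUED)
        m->queued[m->nr_queued++] = bank;
}

/* Stand-in for kicking the worker: drain everything that is queued. */
static void sim_schedule_work(struct mce_sim *m)
{
    m->nr_processed += m->nr_queued;
    m->nr_queued = 0;
}

/*
 * Stand-in for the restart path after a sysfs update: poll the banks,
 * queue what is found, and (with the fix) schedule the work at the end.
 */
static void sim_restart(struct mce_sim *m, const int *banks, size_t n,
                        bool schedule_at_end)
{
    for (size_t i = 0; i < n; i++)
        sim_log_error(m, banks[i]);
    if (schedule_at_end)
        sim_schedule_work(m);
}
```

Without the final call the two errors in the test below remain queued;
with it they are processed right away.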
Merge tag 'x86_urgent_for_v6.3_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fix from Borislav Petkov:
"A single erratum fix for AMD machines:
- Disable XSAVES on AMD Zen1 and Zen2 machines due to an erratum. No
impact to anything as those machines will fall back to XSAVEC which
is equivalent there"
* tag 'x86_urgent_for_v6.3_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/CPU/AMD: Disable XSAVES on AMD family 0x17
Having a per-vcpu virtual offset is a pain. It needs to be synchronized
on each update, and expands badly to a setup where different timers can
have different offsets, or have composite offsets (as with NV).
So let's start by replacing the use of the CNTVOFF_EL2 shadow register
(which we want to reclaim for NV anyway), and make the virtual timer
carry a pointer to a VM-wide offset.
This simplifies the code significantly. It also addresses two terrible bugs:
- The use of CNTVOFF_EL2 leads to some nice offset corruption
when the sysreg gets reset, as reported by Joey.
- The kvm mutex is taken from a vcpu ioctl, which goes against
the locking rules...
Reported-by: Joey Gouly <joey.gouly@arm.com>
Reviewed-by: Reiji Watanabe <reijiw@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230224173915.GA17407@e124191.cambridge.arm.com
Tested-by: Joey Gouly <joey.gouly@arm.com>
Link: https://lore.kernel.org/r/20230224191640.3396734-1-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
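A minimal sketch of the "pointer to a VM-wide offset" idea, assuming the
usual relation that the virtual count is the physical count minus the
offset. The struct and function names are hypothetical and much simpler
than the real KVM timer code.

```c
#include <stdint.h>

/* Hypothetical VM-wide state: a single virtual counter offset. */
struct vm { uint64_t voffset; };

/* Hypothetical per-vCPU timer context: it points at the VM's offset. */
struct vtimer { const uint64_t *vm_offset; };

static void vtimer_init(struct vtimer *t, const struct vm *vm)
{
    t->vm_offset = &vm->voffset;
}

/* Virtual count = physical count - offset (assumed relation). */
static uint64_t vtimer_read(const struct vtimer *t, uint64_t phys_count)
{
    return phys_count - *t->vm_offset;
}

/* One store updates what every vCPU timer of this VM observes. */
static void vm_set_voffset(struct vm *vm, uint64_t offset)
{
    vm->voffset = offset;
}
```

Because each timer only holds a pointer, there is no per-vCPU copy that
has to be synchronized on every update.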
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc fixes from Al Viro:
"pick_file() speculation fix + fix for alpha mis(merge,cherry-pick)
The fs/file.c one is a genuine missing speculation barrier in
pick_file() (reachable e.g. via close(2)). The alpha one is strictly
speaking not a bug fix, but only because confusion between
preempt_enable() and preempt_disable() is harmless on an architecture
without CONFIG_PREEMPT.
Looks like alpha.git picked the wrong version of patch - that braino
used to be there in early versions, but it had been fixed quite a
while ago..."
* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: prevent out-of-bounds array speculation when closing a file descriptor
alpha: fix lazy-FPU mis(merged/applied/whatnot)
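For the fs/file.c fix, the general technique of clamping an array index
so that a mispredicted bounds check cannot be used for an out-of-bounds
speculative load can be sketched like this. It is a userspace
approximation of a branch-free index mask; index_mask_nospec() and
lookup_fd() are hypothetical names, and the sketch assumes that size
does not exceed LONG_MAX and that right-shifting a negative long is an
arithmetic shift (true for gcc and clang).

```c
#include <limits.h>
#include <stddef.h>

/*
 * All ones when index < size, zero otherwise, computed without a
 * conditional branch so there is nothing for the CPU to mispredict.
 */
static unsigned long index_mask_nospec(unsigned long index, unsigned long size)
{
    return ~(long)(index | (size - 1UL - index)) >>
           (sizeof(long) * CHAR_BIT - 1);
}

/* Hypothetical descriptor-table lookup with the index clamped. */
static void *lookup_fd(void *const *table, unsigned long max_fds,
                       unsigned long fd)
{
    if (fd >= max_fds)
        return NULL;

    /* Even if the check above is speculated past, fd becomes 0. */
    fd &= index_mask_nospec(fd, max_fds);
    return table[fd];
}
```

The architectural result is unchanged for valid descriptors; the mask
only matters on the speculative path where the bounds check is bypassed.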
Merge tag 'riscv-for-linus-6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- RISC-V architecture-specific ELF attributes have been disabled in the
kernel builds
- A fix for a locking failure during errata patching that
manifests on SiFive-based systems
- A fix for a KASAN failure during stack unwinding
- A fix for some lockdep failures during text patching
* tag 'riscv-for-linus-6.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: Don't check text_mutex during stop_machine
riscv: Use READ_ONCE_NOCHECK in imprecise unwinding stack mode
RISC-V: fix taking the text_mutex twice during sifive errata patching
RISC-V: Stop emitting attributes
When CONFIG_FUNCTION_GRAPH_TRACER is disabled, __kcfi_typeid_ftrace_stub_graph
is missing, causing a link failure:
ld.lld: error: undefined symbol: __kcfi_typeid_ftrace_stub_graph
referenced by arch/x86/kernel/ftrace_64.o:(__cfi_ftrace_stub_graph) in archive vmlinux.a
Mark the reference to it as conditional on the same symbol, as
is done on arm64.
Link: https://lore.kernel.org/linux-trace-kernel/20230131093643.3850272-1-arnd@kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Fixes: 883bbbffa5a4 ("ftrace,kcfi: Separate ftrace_stub() and ftrace_stub_graph()")
See-also: 2598ac6ec493 ("arm64: ftrace: Define ftrace_stub_graph only with FUNCTION_GRAPH_TRACER")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Sergey Matyukevich <geomatsi@gmail.com> says:
Some time ago two different patches were posted to fix stale TLB
entries that caused application crashes.
The patch [0] suggested 'aggregating' mm_cpumask, i.e. current cpu is not
cleared for the switched-out task in switch_mm function. For additional
explanations see the commit message by Guo Ren. The same approach is
used by arc architecture, so another good comment is for switch_mm
in arch/arc/include/asm/mmu_context.h.
The patch [1] attempted to reduce the number of TLB flushes by deferring
(and possibly avoiding) them for CPUs not running the task.
Patch [1] has been merged. However we already have two bug reports from
different vendors. So apparently something is missing in the approach
suggested in [1]. In both cases the patch [0] fixed the issue.
This patch series reverts [1] and replaces it with [0].
[0] https://lore.kernel.org/linux-riscv/20221111075902.798571-1-guoren@kernel.org/
[1] https://lore.kernel.org/linux-riscv/20220829205219.283543-1-geomatsi@gmail.com/
* b4-shazam-merge:
riscv: asid: Fixup stale TLB entry cause application crash
Revert "riscv: mm: notify remote harts about mmu cache updates"
Link: https://lore.kernel.org/r/20230226150137.1919750-1-geomatsi@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
After use_asid_allocator is enabled, userspace applications can crash
because of stale TLB entries: using only cpumask_clear_cpu without
local_flush_tlb_all cannot guarantee that a CPU's TLB entries are fresh.
set_mm_asid can then cause a userspace application to read a stale
value through a stale TLB entry, whereas set_mm_noasid is not affected.
Here is the symptom of the bug:
 unhandled signal 11 code 0x1 (coredump)
    0x0000003fd6d22524 <+4>:     auipc   s0,0x70
    0x0000003fd6d22528 <+8>:     ld      s0,-148(s0) # 0x3fd6d92490
 => 0x0000003fd6d2252c <+12>:    ld      a5,0(s0)
 (gdb) i r s0
 s0          0x8082ed1cc3198b21       0x8082ed1cc3198b21
 (gdb) x /2x 0x3fd6d92490
 0x3fd6d92490:   0xd80ac8a8      0x0000003f
The core dump file shows that register s0 is wrong, but the value in
memory is correct: 'ld s0, -148(s0)' used a stale mapping entry in the
TLB and got a wrong result from an incorrect physical address.
When the task ran on CPU0, CPU0 loaded (or speculatively loaded) the
value at address 0x3fd6d92490, and the first version of the mapping
entry was filled into CPU0's TLB by a page table walk.
When the task switched from CPU0 to CPU1 (with no local_flush_tlb_all
here because of ASIDs), it happened to write a value to the address
0x3fd6d92490. This caused do_page_fault -> wp_page_copy ->
ptep_clear_flush -> ptep_get_and_clear & flush_tlb_page.
flush_tlb_page used mm_cpumask(mm) to determine which CPUs need a TLB
flush, but CPU0 had cleared its bit in mm_cpumask in the previous
switch_mm. So only the CPU1 TLB was flushed before the second version
of the PTE mapping was set. When the task switched from CPU1 back to
CPU0, CPU0 still used a stale TLB mapping entry which contained a wrong
target physical address. The bug showed up when the task happened to
read that value.
   CPU0                               CPU1
   - switch 'task' in
   - read addr (Fill stale mapping
     entry into TLB)
   - switch 'task' out (no tlb_flush)
                                      - switch 'task' in (no tlb_flush)
                                      - write addr cause pagefault
                                        do_page_fault() (change to
                                        new addr mapping)
                                          wp_page_copy()
                                            ptep_clear_flush()
                                              ptep_get_and_clear()
                                              & flush_tlb_page()
                                        write new value into addr
                                      - switch 'task' out (no tlb_flush)
   - switch 'task' in (no tlb_flush)
   - read addr again (Use stale
     mapping entry in TLB)
     get wrong value from old physical
     addr, BUG!
The solution is to keep the footprint of every CPU the task has run on
in mm_cpumask(mm) across switch_mm, which guarantees that a TLB flush
invalidates all stale TLB entries.
Fixes: 65d4b9c53017 ("RISC-V: Implement ASID allocator")
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Tested-by: Zong Li <zong.li@sifive.com>
Tested-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com>
Cc: Anup Patel <apatel@ventanamicro.com>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Cc: stable@vger.kernel.org
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20230226150137.1919750-3-geomatsi@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
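The sequence described in this commit can be replayed with a toy model.
This is only a sketch under simplifying assumptions: one virtual
address, a one-entry TLB per CPU, and hypothetical names throughout
(struct sim, run_scenario() and so on). It is not the RISC-V mm code.

```c
#include <stdbool.h>

#define NR_CPUS 2

/* One cached translation per CPU for the single address we model. */
struct tlb_entry { bool valid; int phys; };

struct sim {
    int pte_phys;                 /* page table: current physical page */
    unsigned int cpumask;         /* stand-in for mm_cpumask(mm) */
    struct tlb_entry tlb[NR_CPUS];
    bool clear_on_switch_out;     /* true models the buggy behaviour */
};

static void switch_in(struct sim *s, int cpu)
{
    s->cpumask |= 1u << cpu;
}

static void switch_out(struct sim *s, int cpu)
{
    /* The bug: forgetting that this CPU may still cache translations. */
    if (s->clear_on_switch_out)
        s->cpumask &= ~(1u << cpu);
}

/* Read through the TLB, walking the page table on a miss. */
static int read_addr(struct sim *s, int cpu)
{
    if (!s->tlb[cpu].valid) {
        s->tlb[cpu].phys = s->pte_phys;
        s->tlb[cpu].valid = true;
    }
    return s->tlb[cpu].phys;
}

/* Copy-on-write fault: remap, then flush only the CPUs in the mask. */
static void cow_write(struct sim *s, int new_phys)
{
    s->pte_phys = new_phys;
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (s->cpumask & (1u << cpu))
            s->tlb[cpu].valid = false;
}

/* Replay the CPU0 -> CPU1 -> CPU0 sequence; return what CPU0 reads. */
static int run_scenario(bool clear_on_switch_out)
{
    struct sim s = { .pte_phys = 1, .clear_on_switch_out = clear_on_switch_out };

    switch_in(&s, 0); (void)read_addr(&s, 0); switch_out(&s, 0);
    switch_in(&s, 1); cow_write(&s, 2);       switch_out(&s, 1);
    switch_in(&s, 0);
    return read_addr(&s, 0);
}
```

With the mask cleared on switch-out, CPU0 ends up reading through the
old physical page; when the mask keeps every CPU the task has run on,
the flush reaches CPU0 and the read sees the new page.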
This reverts the remaining bits of commit 4bd1d80efb5a ("riscv: mm:
notify remote harts about mmu cache updates").
According to bug reports, the suggested approach to fix stale TLB entries
is not sufficient. It needs to be replaced by a more robust solution.
Fixes: 4bd1d80efb5a ("riscv: mm: notify remote harts about mmu cache updates")
Reported-by: Zong Li <zong.li@sifive.com>
Reported-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com>
Cc: stable@vger.kernel.org
Reviewed-by: Guo Ren <guoren@kernel.org>
Link: https://lore.kernel.org/r/20230226150137.1919750-2-geomatsi@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
We're currently using stop_machine() to update ftrace & kprobes, which
means that the thread that takes text_mutex may not be the same
as the thread that eventually patches the code. This isn't actually a
race because the lock is still held (preventing any other concurrent
accesses) and there is only one thread running during stop_machine(),
but it does trigger a lockdep failure.
This patch just elides the lockdep check during stop_machine.
Fixes: c15ac4fd60d5 ("riscv/ftrace: Add dynamic function tracer support")
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Reported-by: Changbin Du <changbin.du@gmail.com>
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230303143754.4005217-1-conor.dooley@microchip.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Merge tag 'm68k-for-v6.3-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k
Pull m68k fixes from Geert Uytterhoeven:
- Fix systems with memory at end of 32-bit address space
- Fix initrd on systems where memory does not start at address zero
- Fix 68030 handling of bus errors for addresses in exception tables
* tag 'm68k-for-v6.3-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: Only force 030 bus error if PC not in exception table
m68k: mm: Move initrd phys_to_virt handling after paging_init()
m68k: mm: Fix systems with memory at end of 32-bit address space
We fetch %SR value from sigframe; it might have been modified by signal
handler, so we can't trust it with any bits that are not modifiable in
user mode.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Rich Felker <dalias@libc.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
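The general shape of such a sanitization is to take only the
user-modifiable bits from the value found in the signal frame and keep
everything else from the kernel's own copy. The sketch below is an
assumption-laden illustration: USER_SR_MASK and its value are
hypothetical and do not describe the bit layout of any real status
register.

```c
#include <stdint.h>

/* Hypothetical mask of the bits that user mode is allowed to change. */
#define USER_SR_MASK 0x0000000fu

/*
 * Merge a status-register value restored from a signal frame:
 * privileged bits come from kernel_sr, user bits from frame_sr.
 */
static uint32_t restore_sr(uint32_t kernel_sr, uint32_t frame_sr)
{
    return (kernel_sr & ~USER_SR_MASK) | (frame_sr & USER_SR_MASK);
}
```

A signal handler that writes arbitrary bits into the saved value can
then only influence the bits covered by the mask.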
The implementation of 'current' on x86 is very intentionally special: it
is a very common thing to look up, and it uses 'this_cpu_read_stable()'
to get the current thread pointer efficiently from per-cpu storage.
And the keyword in there is 'stable': the current thread pointer never
changes as far as a single thread is concerned. Even when a thread
is preempted, or moved to another CPU, or even across an explicit call
to 'schedule()', that thread will still have the same value for 'current'.
It is, after all, the kernel base pointer to thread-local storage.
That's why it's stable to begin with, but it's also why it's important
enough that we have that special 'this_cpu_read_stable()' access for it.
So this is all done very intentionally to allow the compiler to treat
'current' as a value that never visibly changes, so that the compiler
can do CSE and combine multiple different 'current' accesses into one.
However, there is obviously one very special situation when the
currently running thread does actually change: inside the scheduler
itself.
So the scheduler code paths are special, and do not have a 'current'
thread at all. Instead there are _two_ threads: the previous and the
next thread - typically called 'prev' and 'next' (or prev_p/next_p)
internally.
So this is all actually quite straightforward and simple, and not all
that complicated.
Except for when you then have special code that is run in scheduler
context, that code then has to be aware that 'current' isn't really a
valid thing. Did you mean 'prev'? Did you mean 'next'?
In fact, even if you then look at the code, and you use 'current' after the
new value has been assigned to the percpu variable, we have explicitly
told the compiler that 'current' is magical and always stable. So the
compiler is quite free to use an older (or newer) value of 'current',
and the actual assignment to the percpu storage is not relevant even if
it might look that way.
Which is exactly what happened in the resctrl code, that blithely used
'current' in '__resctrl_sched_in()' when it really wanted the new
process state (as implied by the name: we're scheduling 'into' that new
resctrl state). And clang would end up just using the old thread pointer
value at least in some configurations.
This could have happened with gcc too, and purely depends on random
compiler details. Clang just seems to have been more aggressive about
moving the read of the per-cpu current_task pointer around.
The fix is trivial: just make the resctrl code adhere to the scheduler
rules of using the prev/next thread pointer explicitly, instead of using
'current' in a situation where it just wasn't valid.
That same code is then also used outside of the scheduler context (when
a thread's resctrl state is explicitly changed), and then we will just pass
in 'current' as that pointer, of course. There is no ambiguity in that
case.
The fix may be trivial, but noticing and figuring out what went wrong
was not. The credit for that goes to Stephane Eranian.
Reported-by: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/lkml/20230303231133.1486085-1-eranian@google.com/
Link: https://lore.kernel.org/lkml/alpine.LFD.2.01.0908011214330.3304@localhost.localdomain/
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Tony Luck <tony.luck@intel.com>
Tested-by: Stephane Eranian <eranian@google.com>
Tested-by: Babu Moger <babu.moger@amd.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
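The calling convention described above, where the helper takes the task
pointer as an argument instead of reading 'current' itself, can be
sketched in a few lines. All names are hypothetical stand-ins
(current_task, hw_closid, sched_in(), switch_to()); the sketch shows the
shape of the interface, not the compiler behaviour that exposed the bug.

```c
/* Hypothetical task with one piece of per-task resource-control state. */
struct task { int closid; };

static struct task *current_task;   /* stand-in for the per-CPU 'current' */
static int hw_closid;               /* stand-in for the programmed hw state */

/* The helper never looks at 'current'; the caller says which task. */
static void sched_in(const struct task *tsk)
{
    hw_closid = tsk->closid;
}

/* Scheduler context: only 'prev' and 'next' are meaningful here. */
static void switch_to(struct task *prev, struct task *next)
{
    (void)prev;
    sched_in(next);          /* explicit, independent of current_task */
    current_task = next;
}

/* Outside the scheduler, passing 'current' is unambiguous. */
static void update_own_closid(int closid)
{
    current_task->closid = closid;
    sched_in(current_task);
}
```

In this sketch the helper is deliberately called before current_task is
updated, which only works because it does not depend on that variable.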
AMD Erratum 1386 is summarised as:
XSAVES Instruction May Fail to Save XMM Registers to the Provided
State Save Area
This piece of accidental chronomancy causes the %xmm registers to
occasionally reset back to an older value.
Ignore the XSAVES feature on all AMD Zen1/2 hardware. The XSAVEC
instruction (which works fine) is equivalent on affected parts.
[ bp: Typos, move it into the F17h-specific function. ]
Reported-by: Tavis Ormandy <taviso@gmail.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/20230307174643.1240184-1-andrew.cooper3@citrix.com
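The workaround amounts to masking out one feature bit on the affected
family so that later code picks the next-best instruction. A minimal
sketch, with hypothetical feature bits and structure names rather than
the kernel's cpufeature machinery:

```c
#include <stdint.h>

/* Hypothetical feature bits for illustration only. */
#define FEAT_XSAVEC (1u << 0)
#define FEAT_XSAVES (1u << 1)

struct cpu_info {
    unsigned int family;
    uint32_t features;
};

enum save_insn { SAVE_XSAVE, SAVE_XSAVEC, SAVE_XSAVES };

/* Ignore XSAVES on family 0x17 (Zen1/2), as the commit describes. */
static void apply_erratum_1386(struct cpu_info *c)
{
    if (c->family == 0x17)
        c->features &= ~FEAT_XSAVES;
}

/* Prefer XSAVES, then XSAVEC, then plain XSAVE. */
static enum save_insn pick_save_insn(const struct cpu_info *c)
{
    if (c->features & FEAT_XSAVES)
        return SAVE_XSAVES;
    if (c->features & FEAT_XSAVEC)
        return SAVE_XSAVEC;
    return SAVE_XSAVE;
}
```

Parts from other families keep the feature bit and are unaffected.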
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2023-03-06
We've added 8 non-merge commits during the last 7 day(s) which contain
a total of 9 files changed, 64 insertions(+), 18 deletions(-).
The main changes are:
1) Fix BTF resolver for DATASEC sections when a VAR points at a modifier,
that is, keep resolving such instances instead of bailing out,
from Lorenz Bauer.
2) Fix BPF test framework with regards to xdp_frame info misplacement
in the "live packet" code, from Alexander Lobakin.
3) Fix an infinite loop in BPF sockmap code for TCP/UDP/AF_UNIX,
from Liu Jian.
4) Fix a build error for riscv BPF JIT under PERF_EVENTS=n,
from Randy Dunlap.
5) Several BPF doc fixes with either broken links or external instead
of internal doc links, from Bagas Sanjaya.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
selftests/bpf: check that modifier resolves after pointer
btf: fix resolving BTF_KIND_VAR after ARRAY, STRUCT, UNION, PTR
bpf, test_run: fix &xdp_frame misplacement for LIVE_FRAMES
bpf, doc: Link to submitting-patches.rst for general patch submission info
bpf, doc: Do not link to docs.kernel.org for kselftest link
bpf, sockmap: Fix an infinite loop error when len is 0 in tcp_bpf_recvmsg_parser()
riscv, bpf: Fix patch_text implicit declaration
bpf, docs: Fix link to BTF doc
====================
Link: https://lore.kernel.org/r/20230306215944.11981-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Looks like a braino that had already been fixed in e.g. #next.alpha
got into the alpha.git cherry-picked version of that patch.
Sure, alpha has no preempt, but preempt_enable() in place of
preempt_disable() is actively confusing the readers...
Other than that, the cherry-picked variant matches what I have.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The RISC-V ELF attributes don't contain any useful information. New
toolchains ignore them, but they frequently trip up various older/mixed
toolchains. So just turn them off.
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20230223224605.6995-1-palmer@rivosinc.com
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>