IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
- Fix numerous issues with instrumentation and exception entry
- Fix hideous typo in unused register field definition
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl/HvbwQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNFTMB/oD0ucfP6CH65w+7Sbsv2L8FABfYzSrA9gP
f1cmeh1+MyRN4Nbx2ves5wcRGoX1CgZ8KFAmLXG6yyn7UDA/q27CTELknwobhOft
tQIPB2hFDW9qq3VBXFReL3aoXLnWUiRL3nBxQFt7LG1Xor/ivEb1ZFht351UklDh
u1P6NVptpjXFuGPvdqxkHo2WzT0QHI57MRuc1l7I1FRo4dV1nKSlwohu0Ydii4q9
8oLhx77Ga1SWK80IztNmpo7CSMP/FLGDwbUE3vAaftUJx5CBt+lYR1CeWNACSEvy
22y7CkJWKGQccG62oHI7zQaZm1+fum70ndP5dDlfQW/BcCaz8vRH
=KSDc
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"I'm sad to say that we've got an unusually large arm64 fixes pull for
rc7 which addresses numerous significant instrumentation issues with
our entry code.
Without these patches, lockdep is hopelessly unreliable in some
configurations [1,2] and syzkaller is therefore not a lot of use
because it's so noisy.
Although much of this has always been broken, it appears to have been
exposed more readily by other changes such as 044d0d6de9f5 ("lockdep:
Only trace IRQ edges") and general lockdep improvements around IRQ
tracing and NMIs.
Fixing this properly required moving much of the instrumentation hooks
from our entry assembly into C, which Mark has been working on for the
last few weeks. We're not quite ready to move to the recently added
generic functions yet, but the code here has been deliberately written
to mimic that closely so we can look at cleaning things up once we
have a bit more breathing room.
Having said all that, the second version of these patches was posted
last week and I pushed it into our CI (kernelci and cki) along with a
commit which forced on PROVE_LOCKING, NOHZ_FULL and
CONTEXT_TRACKING_FORCE. The result? We found a real bug in the
md/raid10 code [3].
Oh, and there's also a really silly typo patch that's unrelated.
Summary:
- Fix numerous issues with instrumentation and exception entry
- Fix hideous typo in unused register field definition"
[1] https://lore.kernel.org/r/CACT4Y+aAzoJ48Mh1wNYD17pJqyEcDnrxGfApir=-j171TnQXhw@mail.gmail.com
[2] https://lore.kernel.org/r/20201119193819.GA2601289@elver.google.com
[3] https://lore.kernel.org/r/94c76d5e-466a-bc5f-e6c2-a11b65c39f83@redhat.com
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mte: Fix typo in macro definition
arm64: entry: fix EL1 debug transitions
arm64: entry: fix NMI {user, kernel}->kernel transitions
arm64: entry: fix non-NMI kernel<->kernel transitions
arm64: ptrace: prepare for EL1 irq/rcu tracking
arm64: entry: fix non-NMI user<->kernel transitions
arm64: entry: move el1 irq/nmi logic to C
arm64: entry: prepare ret_to_user for function call
arm64: entry: move enter_from_user_mode to entry-common.c
arm64: entry: mark entry code as noinstr
arm64: mark idle code as noinstr
arm64: syscall: exit userspace before unmasking exceptions
In debug_exception_enter() and debug_exception_exit() we trace hardirqs
on/off while RCU isn't guaranteed to be watching, and we don't save and
restore the hardirq state, and so may return with this having changed.
Handle this appropriately with new entry/exit helpers which do the bare
minimum to ensure this is appropriately maintained, without marking
debug exceptions as NMIs. These are placed in entry-common.c with the
other entry/exit helpers.
In future we'll want to reconsider whether some debug exceptions should
be NMIs, but this will require a significant refactoring, and for now
this should prevent issues with lockdep and RCU.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marins <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201130115950.22492-12-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Exceptions which can be taken at (almost) any time are consdiered to be
NMIs. On arm64 that includes:
* SDEI events
* GICv3 Pseudo-NMIs
* Kernel stack overflows
* Unexpected/unhandled exceptions
... but currently debug exceptions (BRKs, breakpoints, watchpoints,
single-step) are not considered NMIs.
As these can be taken at any time, kernel features (lockdep, RCU,
ftrace) may not be in a consistent kernel state. For example, we may
take an NMI from the idle code or partway through an entry/exit path.
While nmi_enter() and nmi_exit() handle most of this state, notably they
don't save/restore the lockdep state across an NMI being taken and
handled. When interrupts are enabled and an NMI is taken, lockdep may
see interrupts become disabled within the NMI code, but not see
interrupts become enabled when returning from the NMI, leaving lockdep
believing interrupts are disabled when they are actually disabled.
The x86 code handles this in idtentry_{enter,exit}_nmi(), which will
shortly be moved to the generic entry code. As we can't use either yet,
we copy the x86 approach in arm64-specific helpers. All the NMI
entrypoints are marked as noinstr to prevent any instrumentation
handling code being invoked before the state has been corrected.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201130115950.22492-11-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
There are periods in kernel mode when RCU is not watching and/or the
scheduler tick is disabled, but we can still take exceptions such as
interrupts. The arm64 exception handlers do not account for this, and
it's possible that RCU is not watching while an exception handler runs.
The x86/generic entry code handles this by ensuring that all (non-NMI)
kernel exception handlers call irqentry_enter() and irqentry_exit(),
which handle RCU, lockdep, and IRQ flag tracing. We can't yet move to
the generic entry code, and already hadnle the user<->kernel transitions
elsewhere, so we add new kernel<->kernel transition helpers alog the
lines of the generic entry code.
Since we now track interrupts becoming masked when an exception is
taken, local_daif_inherit() is modified to track interrupts becoming
re-enabled when the original context is inherited. To balance the
entry/exit paths, each handler masks all DAIF exceptions before
exit_to_kernel_mode().
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201130115950.22492-10-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
When built with PROVE_LOCKING, NO_HZ_FULL, and CONTEXT_TRACKING_FORCE
will WARN() at boot time that interrupts are enabled when we call
context_tracking_user_enter(), despite the DAIF flags indicating that
IRQs are masked.
The problem is that we're not tracking IRQ flag changes accurately, and
so lockdep believes interrupts are enabled when they are not (and
vice-versa). We can shuffle things so to make this more accurate. For
kernel->user transitions there are a number of constraints we need to
consider:
1) When we call __context_tracking_user_enter() HW IRQs must be disabled
and lockdep must be up-to-date with this.
2) Userspace should be treated as having IRQs enabled from the PoV of
both lockdep and tracing.
3) As context_tracking_user_enter() stops RCU from watching, we cannot
use RCU after calling it.
4) IRQ flag tracing and lockdep have state that must be manipulated
before RCU is disabled.
... with similar constraints applying for user->kernel transitions, with
the ordering reversed.
The generic entry code has enter_from_user_mode() and
exit_to_user_mode() helpers to handle this. We can't use those directly,
so we add arm64 copies for now (without the instrumentation markers
which aren't used on arm64). These replace the existing user_exit() and
user_exit_irqoff() calls spread throughout handlers, and the exception
unmasking is left as-is.
Note that:
* The accounting for debug exceptions from userspace now happens in
el0_dbg() and ret_to_user(), so this is removed from
debug_exception_enter() and debug_exception_exit(). As
user_exit_irqoff() wakes RCU, the userspace-specific check is removed.
* The accounting for syscalls now happens in el0_svc(),
el0_svc_compat(), and ret_to_user(), so this is removed from
el0_svc_common(). This does not adversely affect the workaround for
erratum 1463225, as this does not depend on any of the state tracking.
* In ret_to_user() we mask interrupts with local_daif_mask(), and so we
need to inform lockdep and tracing. Here a trace_hardirqs_off() is
sufficient and safe as we have not yet exited kernel context and RCU
is usable.
* As PROVE_LOCKING selects TRACE_IRQFLAGS, the ifdeferry in entry.S only
needs to check for the latter.
* EL0 SError handling will be dealt with in a subsequent patch, as this
needs to be treated as an NMI.
Prior to this patch, booting an appropriately-configured kernel would
result in spats as below:
| DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled())
| WARNING: CPU: 2 PID: 1 at kernel/locking/lockdep.c:5280 check_flags.part.54+0x1dc/0x1f0
| Modules linked in:
| CPU: 2 PID: 1 Comm: init Not tainted 5.10.0-rc3 #3
| Hardware name: linux,dummy-virt (DT)
| pstate: 804003c5 (Nzcv DAIF +PAN -UAO -TCO BTYPE=--)
| pc : check_flags.part.54+0x1dc/0x1f0
| lr : check_flags.part.54+0x1dc/0x1f0
| sp : ffff80001003bd80
| x29: ffff80001003bd80 x28: ffff66ce801e0000
| x27: 00000000ffffffff x26: 00000000000003c0
| x25: 0000000000000000 x24: ffffc31842527258
| x23: ffffc31842491368 x22: ffffc3184282d000
| x21: 0000000000000000 x20: 0000000000000001
| x19: ffffc318432ce000 x18: 0080000000000000
| x17: 0000000000000000 x16: ffffc31840f18a78
| x15: 0000000000000001 x14: ffffc3184285c810
| x13: 0000000000000001 x12: 0000000000000000
| x11: ffffc318415857a0 x10: ffffc318406614c0
| x9 : ffffc318415857a0 x8 : ffffc31841f1d000
| x7 : 647261685f706564 x6 : ffffc3183ff7c66c
| x5 : ffff66ce801e0000 x4 : 0000000000000000
| x3 : ffffc3183fe00000 x2 : ffffc31841500000
| x1 : e956dc24146b3500 x0 : 0000000000000000
| Call trace:
| check_flags.part.54+0x1dc/0x1f0
| lock_is_held_type+0x10c/0x188
| rcu_read_lock_sched_held+0x70/0x98
| __context_tracking_enter+0x310/0x350
| context_tracking_enter.part.3+0x5c/0xc8
| context_tracking_user_enter+0x6c/0x80
| finish_ret_to_user+0x2c/0x13cr
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201130115950.22492-8-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
In preparation for reworking the EL1 irq/nmi entry code, move the
existing logic to C. We no longer need the asm_nmi_enter() and
asm_nmi_exit() wrappers, so these are removed. The new C functions are
marked noinstr, which prevents compiler instrumentation and runtime
probing.
In subsequent patches we'll want the new C helpers to be called in all
cases, so we don't bother wrapping the calls with ifdeferry. Even when
the new C functions are stubs the trivial calls are unlikely to have a
measurable impact on the IRQ or NMI paths anyway.
Prototypes are added to <asm/exception.h> as otherwise (in some
configurations) GCC will complain about the lack of a forward
declaration. We already do this for existing function, e.g.
enter_from_user_mode().
The new helpers are marked as noinstr (which prevents all
instrumentation, tracing, and kprobes). Otherwise, there should be no
functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201130115950.22492-7-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
In a subsequent patch ret_to_user will need to make a C function call
(in some configurations) which may clobber x0-x18 at the start of the
finish_ret_to_user block, before enable_step_tsk consumes the flags
loaded into x1.
In preparation for this, let's load the flags into x19, which is
preserved across C function calls. This avoids a redundant reload of the
flags and ensures we operate on a consistent shapshot regardless.
There should be no functional change as a result of this patch. At this
point of the entry/exit paths we only need to preserve x28 (tsk) and the
sp, and x19 is free for this use.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201130115950.22492-6-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
In later patches we'll want to extend enter_from_user_mode() and add a
corresponding exit_to_user_mode(). As these will be common for all
entries/exits from userspace, it'd be better for these to live in
entry-common.c with the rest of the entry logic.
This patch moves enter_from_user_mode() into entry-common.c. As with
other functions in entry-common.c it is marked as noinstr (which
prevents all instrumentation, tracing, and kprobes) but there are no
other functional changes.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201130115950.22492-5-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Functions in entry-common.c are marked as notrace and NOKPROBE_SYMBOL(),
but they're still subject to other instrumentation which may rely on
lockdep/rcu/context-tracking being up-to-date, and may cause nested
exceptions (e.g. for WARN/BUG or KASAN's use of BRK) which will corrupt
exceptions registers which have not yet been read.
Prevent this by marking all functions in entry-common.c as noinstr to
prevent compiler instrumentation. This also blacklists the functions for
tracing and kprobes, so we don't need to handle that separately.
Functions elsewhere will be dealt with in subsequent patches.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201130115950.22492-4-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Core code disables RCU when calling arch_cpu_idle(), so it's not safe
for arch_cpu_idle() or its calees to be instrumented, as the
instrumentation callbacks may attempt to use RCU or other features which
are unsafe to use in this context.
Mark them noinstr to prevent issues.
The use of local_irq_enable() in arch_cpu_idle() is similarly
problematic, and the "sched/idle: Fix arch_cpu_idle() vs tracing" patch
queued in the tip tree addresses that case.
Reported-by: Marco Elver <elver@google.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201130115950.22492-3-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
In el0_svc_common() we unmask exceptions before we call user_exit(), and
so there's a window where an IRQ or debug exception can be taken while
RCU is not watching. In do_debug_exception() we account for this in via
debug_exception_{enter,exit}(), but in the el1_irq asm we do not and we
call trace functions which rely on RCU before we have a guarantee that
RCU is watching.
Let's avoid this by having el0_svc_common() exit userspace before
unmasking exceptions, matching what we do for all other EL0 entry paths.
We can use user_exit_irqoff() to avoid the pointless save/restore of IRQ
flags while we're sure exceptions are masked in DAIF.
The workaround for Cortex-A76 erratum 1463225 may trigger a debug
exception before this point, but the debug code invoked in this case is
safe even when RCU is not watching.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201130115950.22492-2-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
idle path. Similar to the entry path the low level idle functions have to
be non-instrumentable.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/DpAUTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoXSLD/9klc0YimnEnROW6Q5Svb2IcyIutmXF
bOIY1bYYoKILOBj3wyvDUhmdMuq5zh7H9yG11hO8MaVVWVQcLcOMLdHTYm9dcdmF
xQk33+xqjuhRShB+nEmC9ayYtWogtH6W6uZ6WDtF9ZltMKU85n5ddGJ/Fvo+HoCb
NbOdHGJdJ3/3ZCeHnxOnxM+5/GwjkBuccTV/tXmb3yXrfU9DBySyQ4/UchcpF43w
LcEb0kiQbpZsBTByKJOQV8+RR654S0sILlvRwVXpmj94vrgGwhlVk1/9rz7tkOhF
ksoo1mTVu75LMt22G/hXxE63787yRvFdHjapf0+kCOAuhl992NK+xlGDH8o9DXcu
9y73D4bI0HnDFs20w6vs20iLvxECJiYHJqlgR5ZwFUToceaNgtiYr8kzuD7Zbae1
KG2E7BuNSwHWMtf97fGn44GZknPEOaKdDn4Wv6/bvKHxLm77qe11RKF70Stcz2AI
am13KmQzzsHGF5qNWwpElRUxSdxfJMR66RnOdTQULGrRedaZTFol/y2pnVzTSe3k
SZnlpL5kE7y92UYDogPb5wWA7b+YkJN0OdSkRFy1FH26ZG8E4M7ZJ2tql5Sw7pGM
lsTjXpAUphnK5rz7QcYE8KAZWj//fIAcElIrvdklVcBnS3IqjfksYW27B64133vx
cT1B/lA1PHXj6Q==
=raED
-----END PGP SIGNATURE-----
Merge tag 'locking-urgent-2020-11-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:
"Two more places which invoke tracing from RCU disabled regions in the
idle path.
Similar to the entry path the low level idle functions have to be
non-instrumentable"
* tag 'locking-urgent-2020-11-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
intel_idle: Fix intel_idle() vs tracing
sched/idle: Fix arch_cpu_idle() vs tracing
We call arch_cpu_idle() with RCU disabled, but then use
local_irq_{en,dis}able(), which invokes tracing, which relies on RCU.
Switch all arch_cpu_idle() implementations to use
raw_local_irq_{en,dis}able() and carefully manage the
lockdep,rcu,tracing state like we do in entry.
(XXX: we really should change arch_cpu_idle() to not return with
interrupts enabled)
Reported-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lkml.kernel.org/r/20201120114925.594122626@infradead.org
- A set of commits which reduce the stack usage of various perf event
handling functions which allocated large data structs on stack causing
stack overflows in the worst case.
- Use the proper mechanism for detecting soft interrupts in the recursion
protection.
- Make the resursion protection simpler and more robust.
- Simplify the scheduling of event groups to make the code more robust and
prepare for fixing the issues vs. scheduling of exclusive event groups.
- Prevent event multiplexing and rotation for exclusive event groups
- Correct the perf event attribute exclusive semantics to take pinned
events, e.g. the PMU watchdog, into account
- Make the anythread filtering conditional for Intel's generic PMU
counters as it is not longer guaranteed to be supported on newer
CPUs. Check the corresponding CPUID leaf to make sure.
- Fixup a duplicate initialization in an array which was probably cause by
the usual copy & paste - forgot to edit mishap.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl+xIi0THHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYofixD/4+4gc8DhOmAkMrN0Z9tiW8ebgMKmb9
wZRkMr5Osi0GzLJOPZ6SdY6jd0A3rMN/sW6P1DT6pDtcty4bKFoW5VZBuUDIAhel
BC4C93L3y1En/GEZu1GTy3LvsBwLBQTOoY4goDjbdAbk60S/0RTHOGyQsRsOQFe6
fVs3iXozAFuaR6I6N3dlxuJAE51zvr8MyBWaUoByNDB//1+lLNW+JfClaAOG1oXx
qZIg/niatBVGzSGgKNRUyh3g8G1HJtabsA/NZ4PH8ZHuYABfmj4lmmUPR77ICLfV
wMITEBG7eaktB8EqM9hvaoOZLA5kpXHO2JbCFSs4c4x11mlC8g7QMV3poCw33YoN
a5TmT1A3muri1riy1/Ee9lXACOq7/tf2+Xfn9o6dvDdBwd6s5pzlhLGR8gILp2lF
2bcg3IwYvHT/Kiurb/WGNpbCqQIPJpcUcfs3tNBCCtKegahUQNnGjxN3NVo9RCit
zfL6xIJ8eZiYnsxXx4NKm744AukWiql3aRNgRkOdBP5WC68xt6VLcxG1YZKUoDhy
jRSOCD/DuPSMSvAAgN7S8OWlPsKWBxVxxWYV+K8FpwhgzbQ3WbS3UDiYkhgjeOxu
OlM692oWpllKvQWlvYthr2Be6oPCRRi1vvADNNbTKzgHk5i61bwympsGl1EZx3Pz
2ROp7NJFRESnqw==
=FzCf
-----END PGP SIGNATURE-----
Merge tag 'perf-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
"A set of fixes for perf:
- A set of commits which reduce the stack usage of various perf
event handling functions which allocated large data structs on
stack causing stack overflows in the worst case
- Use the proper mechanism for detecting soft interrupts in the
recursion protection
- Make the resursion protection simpler and more robust
- Simplify the scheduling of event groups to make the code more
robust and prepare for fixing the issues vs. scheduling of
exclusive event groups
- Prevent event multiplexing and rotation for exclusive event groups
- Correct the perf event attribute exclusive semantics to take
pinned events, e.g. the PMU watchdog, into account
- Make the anythread filtering conditional for Intel's generic PMU
counters as it is not longer guaranteed to be supported on newer
CPUs. Check the corresponding CPUID leaf to make sure
- Fixup a duplicate initialization in an array which was probably
caused by the usual 'copy & paste - forgot to edit' mishap"
* tag 'perf-urgent-2020-11-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/intel/uncore: Fix Add BW copypasta
perf/x86/intel: Make anythread filter support conditional
perf: Tweak perf_event_attr::exclusive semantics
perf: Fix event multiplexing for exclusive groups
perf: Simplify group_sched_in()
perf: Simplify group_sched_out()
perf/x86: Make dummy_iregs static
perf/arch: Remove perf_sample_data::regs_user_copy
perf: Optimize get_recursion_context()
perf: Fix get_recursion_context()
perf/x86: Reduce stack usage for x86_pmu::drain_pebs()
perf: Reduce stack usage of perf_output_begin()
- Spectre/Meltdown safelisting for some Qualcomm KRYO cores
- Fix RCU splat when failing to online a CPU due to a feature mismatch
- Fix a recently introduced sparse warning in kexec()
- Fix handling of CPU erratum 1418040 for late CPUs
- Ensure hot-added memory falls within linear-mapped region
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl+ubogQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNPD7B/9i5ao44AEJwjz0a68S/jD7kUD7i3xVkCNN
Y8i/i9mx44IAcf8pmyQh3ngaFlJuF2C6oC/SQFiDbmVeGeZXLXvXV7uGAqXG5Xjm
O2Svgr1ry176JWpsB7MNnZwzAatQffdkDjbjQCcUnUIKYcLvge8H2fICljujGcfQ
094vNmT9VerTWRbWDti3Ck/ug+sanVHuzk5BWdKx3jamjeTqo+sBZK/wgBr6UoYQ
mT3BFX42kLIGg+AzwXRDPlzkJymjYgQDbSwGsvny8qKdOEJbAUwWXYZ5sTs9J/gU
E9PT3VJI7BYtTd1uPEWkD645U3arfx3Pf2JcJlbkEp86qx4CUF9s
=T6k4
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
- Spectre/Meltdown safelisting for some Qualcomm KRYO cores
- Fix RCU splat when failing to online a CPU due to a feature mismatch
- Fix a recently introduced sparse warning in kexec()
- Fix handling of CPU erratum 1418040 for late CPUs
- Ensure hot-added memory falls within linear-mapped region
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: cpu_errata: Apply Erratum 845719 to KRYO2XX Silver
arm64: proton-pack: Add KRYO2XX silver CPUs to spectre-v2 safe-list
arm64: kpti: Add KRYO2XX gold/silver CPU cores to kpti safelist
arm64: Add MIDR value for KRYO2XX gold/silver CPU cores
arm64/mm: Validate hotplug range before creating linear mapping
arm64: smp: Tell RCU about CPUs that fail to come online
arm64: psci: Avoid printing in cpu_psci_cpu_die()
arm64: kexec_file: Fix sparse warning
arm64: errata: Fix handling of 1418040 with late CPU onlining
QCOM KRYO2XX Silver cores are Cortex-A53 based and are
susceptible to the 845719 erratum. Add them to the lookup
list to apply the erratum.
Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Link: https://lore.kernel.org/r/20201104232218.198800-5-konrad.dybcio@somainline.org
Signed-off-by: Will Deacon <will@kernel.org>
QCOM KRYO2XX gold (big) silver (LITTLE) CPU cores are based on
Cortex-A73 and Cortex-A53 respectively and are meltdown safe,
hence add them to kpti_safe_list[].
Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org>
Link: https://lore.kernel.org/r/20201104232218.198800-3-konrad.dybcio@somainline.org
Signed-off-by: Will Deacon <will@kernel.org>
Commit ce3d31ad3cac ("arm64/smp: Move rcu_cpu_starting() earlier") ensured
that RCU is informed early about incoming CPUs that might end up calling
into printk() before they are online. However, if such a CPU fails the
early CPU feature compatibility checks in check_local_cpu_capabilities(),
then it will be powered off or parked without informing RCU, leading to
an endless stream of stalls:
| rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
| rcu: 2-O...: (0 ticks this GP) idle=002/1/0x4000000000000000 softirq=0/0 fqs=2593
| (detected by 0, t=5252 jiffies, g=9317, q=136)
| Task dump for CPU 2:
| task:swapper/2 state:R running task stack: 0 pid: 0 ppid: 1 flags:0x00000028
| Call trace:
| ret_from_fork+0x0/0x30
Ensure that the dying CPU invokes rcu_report_dead() prior to being powered
off or parked.
Cc: Qian Cai <cai@redhat.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Suggested-by: Qian Cai <cai@redhat.com>
Link: https://lore.kernel.org/r/20201105222242.GA8842@willie-the-truck
Link: https://lore.kernel.org/r/20201106103602.9849-3-will@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
cpu_psci_cpu_die() is called in the context of the dying CPU, which
will no longer be online or tracked by RCU. It is therefore not generally
safe to call printk() if the PSCI "cpu off" request fails, so remove the
pr_crit() invocation.
Cc: Qian Cai <cai@redhat.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/20201106103602.9849-2-will@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Sparse gets cross about us returning 0 from image_load(), which has a
return type of 'void *':
>> arch/arm64/kernel/kexec_image.c:130:16: sparse: sparse: Using plain integer as NULL pointer
Return NULL instead, as we don't use the return value for anything if it
does not indicate an error.
Cc: Benjamin Gwin <bgwin@google.com>
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 108aa503657e ("arm64: kexec_file: try more regions if loading segments fails")
Link: https://lore.kernel.org/r/202011091736.T0zH8kaC-lkp@intel.com
Signed-off-by: Will Deacon <will@kernel.org>
In a surprising turn of events, it transpires that CPU capabilities
configured as ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE are never set as the
result of late-onlining. Therefore our handling of erratum 1418040 does
not get activated if it is not required by any of the boot CPUs, even
though we allow late-onlining of an affected CPU.
In order to get things working again, replace the cpus_have_const_cap()
invocation with an explicit check for the current CPU using
this_cpu_has_cap().
Cc: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
Cc: Stephen Boyd <swboyd@chromium.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201106114952.10032-1-will@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
struct perf_sample_data lives on-stack, we should be careful about it's
size. Furthermore, the pt_regs copy in there is only because x86_64 is a
trainwreck, solve it differently.
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
Link: https://lkml.kernel.org/r/20201030151955.258178461@infradead.org
- Fix early use of kprobes
- Fix kernel placement in kexec_file_load()
- Bump maximum number of NUMA nodes
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl+lLeQQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNHgjB/9RJMWwEo6TJ0JyJdBmgEy9k+F5k7zUEtNO
dmZBXt1V8Gvw2MLRKAayWLumoJCUf0ZTICJ9+wnYAKkGtKvDfuEofrEOe/W/jB8m
V2Nm7Y+UWL/D0E5+jdyGIqsPiljaZg8GCyOxN6BDuqgl/T8/3YlpSudMvlr7xm8s
F71k2u2EvSybcRFmtp9A5x0eUeWRSQtLa1+fWmpyAPAX64YJ9bh2w3/g5SecocUK
Ra8H91XO5BT2sHsDDQe67iUfZz9Y1N1UbNiuzCZIL7+xTcQ6DKw4JJ/2Z5BfkH0D
04THZZqYt5AjYQmUULMmPcbSzMp4E30s5dmckevq8E+LG0imLDYp
=w7Ip
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"Here's the weekly batch of fixes for arm64. Not an awful lot here, but
there are still a few unresolved issues relating to CPU hotplug, RCU
and IRQ tracing that I hope to queue fixes for next week.
Summary:
- Fix early use of kprobes
- Fix kernel placement in kexec_file_load()
- Bump maximum number of NUMA nodes"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: kexec_file: try more regions if loading segments fails
arm64: kprobes: Use BRK instead of single-step when executing instructions out-of-line
arm64: NUMA: Kconfig: Increase NODES_SHIFT to 4
It's possible that the first region picked for the new kernel will make
it impossible to fit the other segments in the required 32GB window,
especially if we have a very large initrd.
Instead of giving up, we can keep testing other regions for the kernel
until we find one that works.
Suggested-by: Ryan O'Leary <ryanoleary@google.com>
Signed-off-by: Benjamin Gwin <bgwin@google.com>
Link: https://lore.kernel.org/r/20201103201106.2397844-1-bgwin@google.com
Signed-off-by: Will Deacon <will@kernel.org>
Commit 36dadef23fcc ("kprobes: Init kprobes in early_initcall") enabled
using kprobes from early_initcall. Unfortunately at this point the
hardware debug infrastructure is not operational. The OS lock may still
be locked, and the hardware watchpoints may have unknown values when
kprobe enables debug monitors to single-step instructions.
Rather than using hardware single-step, append a BRK instruction after
the instruction to be executed out-of-line.
Fixes: 36dadef23fcc ("kprobes: Init kprobes in early_initcall")
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20201103134900.337243-1-jean-philippe@linaro.org
Signed-off-by: Will Deacon <will@kernel.org>
* selftest fix
* Force PTE mapping on device pages provided via VFIO
* Fix detection of cacheable mapping at S2
* Fallback to PMD/PTE mappings for composite huge pages
* Fix accounting of Stage-2 PGD allocation
* Fix AArch32 handling of some of the debug registers
* Simplify host HYP entry
* Fix stray pointer conversion on nVHE TLB invalidation
* Fix initialization of the nVHE code
* Simplify handling of capabilities exposed to HYP
* Nuke VCPUs caught using a forbidden AArch32 EL0
x86:
* new nested virtualization selftest
* Miscellaneous fixes
* make W=1 fixes
* Reserve new CPUID bit in the KVM leaves
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl+dhRAUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroPWCgf/U997UW/11IdNtkehQO/DFdx7lHev
+IahN1Pnbt92ZoR5nGhK9pgvDahIVhqTmUvgV+3fD24OnqXTpYTu1fliBvL6ynbN
J9Ycf0zFAgwfgTTD5UexTlEovnhX4xz7NDmd6rpxGDZdMaBHQFPkCXBFK45pf4nd
O349aHV0X1AA7Tt/sLhpXpi74Vake1xErLHKhIVLHKyo/zDm+Q0UZry068NNBzTr
St3+QSGlFXhuekVrZLh+DShh6rZGLyY9tcySt6o0Jk7fSs1lmEnPbBgeeqYmyHMd
Yn+ybhthmNkkpI8so70TA9roiVar4UmjnMBOiav62bo7ue26pKE5cWQyXw==
=mvBr
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"ARM:
- selftest fix
- force PTE mapping on device pages provided via VFIO
- fix detection of cacheable mapping at S2
- fallback to PMD/PTE mappings for composite huge pages
- fix accounting of Stage-2 PGD allocation
- fix AArch32 handling of some of the debug registers
- simplify host HYP entry
- fix stray pointer conversion on nVHE TLB invalidation
- fix initialization of the nVHE code
- simplify handling of capabilities exposed to HYP
- nuke VCPUs caught using a forbidden AArch32 EL0
x86:
- new nested virtualization selftest
- miscellaneous fixes
- make W=1 fixes
- reserve new CPUID bit in the KVM leaves"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: vmx: remove unused variable
KVM: selftests: Don't require THP to run tests
KVM: VMX: eVMCS: make evmcs_sanitize_exec_ctrls() work again
KVM: selftests: test behavior of unmapped L2 APIC-access address
KVM: x86: Fix NULL dereference at kvm_msr_ignored_check()
KVM: x86: replace static const variables with macros
KVM: arm64: Handle Asymmetric AArch32 systems
arm64: cpufeature: upgrade hyp caps to final
arm64: cpufeature: reorder cpus_have_{const, final}_cap()
KVM: arm64: Factor out is_{vhe,nvhe}_hyp_code()
KVM: arm64: Force PTE mapping on fault resulting in a device mapping
KVM: arm64: Use fallback mapping sizes for contiguous huge page sizes
KVM: arm64: Fix masks in stage2_pte_cacheable()
KVM: arm64: Fix AArch32 handling of DBGD{CCINT,SCRext} and DBGVCR
KVM: arm64: Allocate stage-2 pgd pages with GFP_KERNEL_ACCOUNT
KVM: arm64: Drop useless PAN setting on host EL1 to EL2 transition
KVM: arm64: Remove leftover kern_hyp_va() in nVHE TLB invalidation
KVM: arm64: Don't corrupt tpidr_el2 on failed HVC call
x86/kvm: Reserve KVM_FEATURE_MSI_EXT_DEST_ID
- Force PTE mapping on device pages provided via VFIO
- Fix detection of cacheable mapping at S2
- Fallback to PMD/PTE mappings for composite huge pages
- Fix accounting of Stage-2 PGD allocation
- Fix AArch32 handling of some of the debug registers
- Simplify host HYP entry
- Fix stray pointer conversion on nVHE TLB invalidation
- Fix initialization of the nVHE code
- Simplify handling of capabilities exposed to HYP
- Nuke VCPUs caught using a forbidden AArch32 EL0
-----BEGIN PGP SIGNATURE-----
iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAl+cO3oPHG1hekBrZXJu
ZWwub3JnAAoJECPQ0LrRPXpDJdoP/jiKYR8iVkq/RmIsQl383KwQiJGTMi0iL2Zw
/tHnf8bKowAPyG8bqyXMJqlWOb7tcp6U3m+WhENAZHWH02r2M921q0DGVW5p48ou
Ek4zJnFF1iL5ryOBgROKK1nymUZOi3W1a1SsD6ZPImQsKsjNGbqKgWsGs8i9ft0P
vkNZwlqebzJp+OR3agJemc8dkXcGlcRHk7fffdMcU8jsF5RJ9zC0XU0+scKryxhV
o8PzKSlwCeisyL+Vz+s7POzoD3Rt+P+qjblz5NWqy/NHuLh+V9hzUSDOjWbZb70f
Er29vGv7Yjb4nKK2KUzNqirSfXsRylfsjGr+YibP6uKEUMuUm/V41DqzT7nMalIm
cOBGtPk6W9wOL8JNDmlyVGCfATI+5RrErQ8nFClrPu3qw4Hv4pb1Ad5OgAhNE0u1
PfUyBBtQKNAjTdVCRfSuFL4d2yegy1rrpCmYWrvdQjLlXemwgYgKnSQN98cZHgjA
foCAP5gJpAWGualyhKJx2CkY/5deeWKS39ISiNgHo5eRvKsGEnMN7j9UX77VbhRr
PkwCmeUJ3kjzaAfmtcBN/iLjwQbWypidjX2Vbfl5WoVdLuiYXFZIvsdaqRHGl56F
5zhYxM8DKODNEJKMl7a89oEFGKy8x1PQ0kqer9a6GBWkNDrQMOSL4+FkxCyM2m9g
RoHtmdy0
=gVaX
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-fixes-5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 5.10, take #1
- Force PTE mapping on device pages provided via VFIO
- Fix detection of cacheable mapping at S2
- Fallback to PMD/PTE mappings for composite huge pages
- Fix accounting of Stage-2 PGD allocation
- Fix AArch32 handling of some of the debug registers
- Simplify host HYP entry
- Fix stray pointer conversion on nVHE TLB invalidation
- Fix initialization of the nVHE code
- Simplify handling of capabilities exposed to HYP
- Nuke VCPUs caught using a forbidden AArch32 EL0
We finalize caps before initializing kvm hyp code, and any use of
cpus_have_const_cap() in kvm hyp code generates redundant and
potentially unsound code to read the cpu_hwcaps array.
A number of helper functions used in both hyp context and regular kernel
context use cpus_have_const_cap(), as some regular kernel code runs
before the capabilities are finalized. It's tedious and error-prone to
write separate copies of these for hyp and non-hyp code.
So that we can avoid the redundant code, let's automatically upgrade
cpus_have_const_cap() to cpus_have_final_cap() when used in hyp context.
With this change, there's never a reason to access to cpu_hwcaps array
from hyp code, and we don't need to create an NVHE alias for this.
This should have no effect on non-hyp code.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Cc: David Brazdil <dbrazdil@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20201026134931.28246-4-mark.rutland@arm.com
The call to rcu_cpu_starting() in secondary_start_kernel() is not early
enough in the CPU-hotplug onlining process, which results in lockdep
splats as follows:
WARNING: suspicious RCU usage
-----------------------------
kernel/locking/lockdep.c:3497 RCU-list traversed in non-reader section!!
other info that might help us debug this:
RCU used illegally from offline CPU!
rcu_scheduler_active = 1, debug_locks = 1
no locks held by swapper/1/0.
Call trace:
dump_backtrace+0x0/0x3c8
show_stack+0x14/0x60
dump_stack+0x14c/0x1c4
lockdep_rcu_suspicious+0x134/0x14c
__lock_acquire+0x1c30/0x2600
lock_acquire+0x274/0xc48
_raw_spin_lock+0xc8/0x140
vprintk_emit+0x90/0x3d0
vprintk_default+0x34/0x40
vprintk_func+0x378/0x590
printk+0xa8/0xd4
__cpuinfo_store_cpu+0x71c/0x868
cpuinfo_store_cpu+0x2c/0xc8
secondary_start_kernel+0x244/0x318
This is avoided by moving the call to rcu_cpu_starting up near the
beginning of the secondary_start_kernel() function.
Signed-off-by: Qian Cai <cai@redhat.com>
Acked-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/lkml/160223032121.7002.1269740091547117869.tip-bot2@tip-bot2/
Link: https://lore.kernel.org/r/20201028182614.13655-1-cai@redhat.com
Signed-off-by: Will Deacon <will@kernel.org>
On Cortex-A77 r0p0 and r1p0, a sequence of a non-cacheable or device load
and a store exclusive or PAR_EL1 read can cause a deadlock.
The workaround requires a DMB SY before and after a PAR_EL1 register
read. In addition, it's possible an interrupt (doing a device read) or
KVM guest exit could be taken between the DMB and PAR read, so we
also need a DMB before returning from interrupt and before returning to
a guest.
A deadlock is still possible with the workaround as KVM guests must also
have the workaround. IOW, a malicious guest can deadlock an affected
systems.
This workaround also depends on a firmware counterpart to enable the h/w
to insert DMB SY after load and store exclusive instructions. See the
errata document SDEN-1152370 v10 [1] for more information.
[1] https://static.docs.arm.com/101992/0010/Arm_Cortex_A77_MP074_Software_Developer_Errata_Notice_v10.pdf
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: kvmarm@lists.cs.columbia.edu
Link: https://lore.kernel.org/r/20201028182839.166037-2-robh@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Commit 76085aff29f5 ("efi/libstub/arm64: align PE/COFF sections to segment
alignment") increased the PE/COFF section alignment to match the minimum
segment alignment of the kernel image, which ensures that the kernel does
not need to be moved around in memory by the EFI stub if it was built as
relocatable.
However, the first PE/COFF section starts at _stext, which is only 4 KB
aligned, and so the section layout is inconsistent. Existing EFI loaders
seem to care little about this, but it is better to clean this up.
So let's pad the header to 64 KB to match the PE/COFF section alignment.
Fixes: 76085aff29f5 ("efi/libstub/arm64: align PE/COFF sections to segment alignment")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20201027073209.2897-2-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
Now that we started making the linker warn about orphan sections
(input sections that are not explicitly consumed by an output section),
some configurations produce the following warning:
aarch64-linux-gnu-ld: warning: orphan section `.igot.plt' from
`arch/arm64/kernel/head.o' being placed in section `.igot.plt'
It could be any file that triggers this - head.o is simply the first
input file in the link - and the resulting .igot.plt section never
actually appears in vmlinux as it turns out to be empty.
So let's add .igot.plt to our collection of input sections to disregard
unless they are empty.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20201028133332.5571-1-ardb@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
The icache_policy_str[] definition causes a warning when extra
warning flags are enabled:
arch/arm64/kernel/cpuinfo.c:38:26: warning: initialized field overwritten [-Woverride-init]
38 | [ICACHE_POLICY_VIPT] = "VIPT",
| ^~~~~~
arch/arm64/kernel/cpuinfo.c:38:26: note: (near initialization for 'icache_policy_str[2]')
arch/arm64/kernel/cpuinfo.c:39:26: warning: initialized field overwritten [-Woverride-init]
39 | [ICACHE_POLICY_PIPT] = "PIPT",
| ^~~~~~
arch/arm64/kernel/cpuinfo.c:39:26: note: (near initialization for 'icache_policy_str[3]')
arch/arm64/kernel/cpuinfo.c:40:27: warning: initialized field overwritten [-Woverride-init]
40 | [ICACHE_POLICY_VPIPT] = "VPIPT",
| ^~~~~~~
arch/arm64/kernel/cpuinfo.c:40:27: note: (near initialization for 'icache_policy_str[0]')
There is no real need for the default initializer here, as printing a
NULL string is harmless. Rewrite the logic to have an explicit
reserved value for the only one that uses the default value.
This partially reverts the commit that removed ICACHE_POLICY_AIVIVT.
Fixes: 155433cb365e ("arm64: cache: Remove support for ASID-tagged VIVT I-caches")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20201026193807.3816388-1-arnd@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
According to the SMCCC spec[1](7.5.2 Discovery) the
ARM_SMCCC_ARCH_WORKAROUND_1 function id only returns 0, 1, and
SMCCC_RET_NOT_SUPPORTED.
0 is "workaround required and safe to call this function"
1 is "workaround not required but safe to call this function"
SMCCC_RET_NOT_SUPPORTED is "might be vulnerable or might not be, who knows, I give up!"
SMCCC_RET_NOT_SUPPORTED might as well mean "workaround required, except
calling this function may not work because it isn't implemented in some
cases". Wonderful. We map this SMC call to
0 is SPECTRE_MITIGATED
1 is SPECTRE_UNAFFECTED
SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE
For KVM hypercalls (hvc), we've implemented this function id to return
SMCCC_RET_NOT_SUPPORTED, 0, and SMCCC_RET_NOT_REQUIRED. One of those
isn't supposed to be there. Per the code we call
arm64_get_spectre_v2_state() to figure out what to return for this
feature discovery call.
0 is SPECTRE_MITIGATED
SMCCC_RET_NOT_REQUIRED is SPECTRE_UNAFFECTED
SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE
Let's clean this up so that KVM tells the guest this mapping:
0 is SPECTRE_MITIGATED
1 is SPECTRE_UNAFFECTED
SMCCC_RET_NOT_SUPPORTED is SPECTRE_VULNERABLE
Note: SMCCC_RET_NOT_AFFECTED is 1 but isn't part of the SMCCC spec
Fixes: c118bbb52743 ("arm64: KVM: Propagate full Spectre v2 workaround state to KVM guests")
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Cc: Andre Przywara <andre.przywara@arm.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://developer.arm.com/documentation/den0028/latest [1]
Link: https://lore.kernel.org/r/20201023154751.1973872-1-swboyd@chromium.org
Signed-off-by: Will Deacon <will@kernel.org>
As it stands now, the vdso32 Makefile hardcodes the linker to ld.bfd
using -fuse-ld=bfd with $(CC). This was taken from the arm vDSO
Makefile, as the comment notes, done in commit d2b30cd4b722 ("ARM:
8384/1: VDSO: force use of BFD linker").
Commit fe00e50b2db8 ("ARM: 8858/1: vdso: use $(LD) instead of $(CC) to
link VDSO") changed that Makefile to use $(LD) directly instead of
through $(CC), which matches how the rest of the kernel operates. Since
then, LD=ld.lld means that the arm vDSO will be linked with ld.lld,
which has shown no problems so far.
Allow ld.lld to link this vDSO as we do the regular arm vDSO. To do
this, we need to do a few things:
* Add a LD_COMPAT variable, which defaults to $(CROSS_COMPILE_COMPAT)ld
with gcc and $(LD) if LLVM is 1, which will be ld.lld, or
$(CROSS_COMPILE_COMPAT)ld if not, which matches the logic of the main
Makefile. It is overrideable for further customization and avoiding
breakage.
* Eliminate cc32-ldoption, which matches commit 055efab3120b ("kbuild:
drop support for cc-ldoption").
With those, we can use $(LD_COMPAT) in cmd_ldvdso and change the flags
from compiler linker flags to linker flags directly. We eliminate
-mfloat-abi=soft because it is not handled by the linker.
Reported-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1033
Link: https://lore.kernel.org/r/20201020011406.1818918-1-natechancellor@gmail.com
Signed-off-by: Will Deacon <will@kernel.org>
Use a more generic form for __section that requires quotes to avoid
complications with clang and gcc differences.
Remove the quote operator # from compiler_attributes.h __section macro.
Convert all unquoted __section(foo) uses to quoted __section("foo").
Also convert __attribute__((section("foo"))) uses to __section("foo")
even if the __attribute__ has multiple list entry forms.
Conversion done using the script at:
https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.pl
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@gooogle.com>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- New page table code for both hypervisor and guest stage-2
- Introduction of a new EL2-private host context
- Allow EL2 to have its own private per-CPU variables
- Support of PMU event filtering
- Complete rework of the Spectre mitigation
PPC:
- Fix for running nested guests with in-kernel IRQ chip
- Fix race condition causing occasional host hard lockup
- Minor cleanups and bugfixes
x86:
- allow trapping unknown MSRs to userspace
- allow userspace to force #GP on specific MSRs
- INVPCID support on AMD
- nested AMD cleanup, on demand allocation of nested SVM state
- hide PV MSRs and hypercalls for features not enabled in CPUID
- new test for MSR_IA32_TSC writes from host and guest
- cleanups: MMU, CPUID, shared MSRs
- LAPIC latency optimizations ad bugfixes
For x86, also included in this pull request is a new alternative and
(in the future) more scalable implementation of extended page tables
that does not need a reverse map from guest physical addresses to
host physical addresses. For now it is disabled by default because
it is still lacking a few of the existing MMU's bells and whistles.
However it is a very solid piece of work and it is already available
for people to hammer on it.
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl+S8dsUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroM40Af+M46NJmuS5rcwFfybvK/c42KT6svX
Co1NrZDwzSQ2mMy3WQzH9qeLvb+nbY4sT3n5BPNPNsT+aIDPOTDt//qJ2/Ip9UUs
tRNea0MAR96JWLE7MSeeRxnTaQIrw/AAZC0RXFzZvxcgytXwdqBExugw4im+b+dn
Dcz8QxX1EkwT+4lTm5HC0hKZAuo4apnK1QkqCq4SdD2QVJ1YE6+z7pgj4wX7xitr
STKD6q/Yt/0ndwqS0GSGbyg0jy6mE620SN6isFRkJYwqfwLJci6KnqvEK67EcNMu
qeE017K+d93yIVC46/6TfVHzLR/D1FpQ8LZ16Yl6S13OuGIfAWBkQZtPRg==
=AD6a
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
"For x86, there is a new alternative and (in the future) more scalable
implementation of extended page tables that does not need a reverse
map from guest physical addresses to host physical addresses.
For now it is disabled by default because it is still lacking a few of
the existing MMU's bells and whistles. However it is a very solid
piece of work and it is already available for people to hammer on it.
Other updates:
ARM:
- New page table code for both hypervisor and guest stage-2
- Introduction of a new EL2-private host context
- Allow EL2 to have its own private per-CPU variables
- Support of PMU event filtering
- Complete rework of the Spectre mitigation
PPC:
- Fix for running nested guests with in-kernel IRQ chip
- Fix race condition causing occasional host hard lockup
- Minor cleanups and bugfixes
x86:
- allow trapping unknown MSRs to userspace
- allow userspace to force #GP on specific MSRs
- INVPCID support on AMD
- nested AMD cleanup, on demand allocation of nested SVM state
- hide PV MSRs and hypercalls for features not enabled in CPUID
- new test for MSR_IA32_TSC writes from host and guest
- cleanups: MMU, CPUID, shared MSRs
- LAPIC latency optimizations ad bugfixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (232 commits)
kvm: x86/mmu: NX largepage recovery for TDP MMU
kvm: x86/mmu: Don't clear write flooding count for direct roots
kvm: x86/mmu: Support MMIO in the TDP MMU
kvm: x86/mmu: Support write protection for nesting in tdp MMU
kvm: x86/mmu: Support disabling dirty logging for the tdp MMU
kvm: x86/mmu: Support dirty logging for the TDP MMU
kvm: x86/mmu: Support changed pte notifier in tdp MMU
kvm: x86/mmu: Add access tracking for tdp_mmu
kvm: x86/mmu: Support invalidate range MMU notifier for TDP MMU
kvm: x86/mmu: Allocate struct kvm_mmu_pages for all pages in TDP MMU
kvm: x86/mmu: Add TDP MMU PF handler
kvm: x86/mmu: Remove disallowed_hugepage_adjust shadow_walk_iterator arg
kvm: x86/mmu: Support zapping SPTEs in the TDP MMU
KVM: Cache as_id in kvm_memory_slot
kvm: x86/mmu: Add functions to handle changed TDP SPTEs
kvm: x86/mmu: Allocate and free TDP MMU roots
kvm: x86/mmu: Init / Uninit the TDP MMU
kvm: x86/mmu: Introduce tdp_iter
KVM: mmu: extract spte.h and spte.c
KVM: mmu: Separate updating a PTE from kvm_set_pte_rmapp
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl+SOXIQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgptrcD/93VUDmRAn73ChKNd0TtXUicJlAlNLVjvfs
VFTXWBDnlJnGkZT7ElkDD9b8dsz8l4xGf/QZ5dzhC/th2OsfObQkSTfe0lv5cCQO
mX7CRSrDpjaHtW+WGPDa0oQsGgIfpqUz2IOg9NKbZZ1LJ2uzYfdOcf3oyRgwZJ9B
I3sh1vP6OzjZVVCMmtMTM+sYZEsDoNwhZwpkpiwMmj8tYtOPgKCYKpqCiXrGU0x2
ML5FtDIwiwU+O3zYYdCBWqvCb2Db0iA9Aov2whEBz/V2jnmrN5RMA/90UOh1E2zG
br4wM1Wt3hNrtj5qSxZGlF/HEMYJVB8Z2SgMjYu4vQz09qRVVqpGdT/dNvLAHQWg
w4xNCj071kVZDQdfwnqeWSKYUau9Xskvi8xhTT+WX8a5CsbVrM9vGslnS5XNeZ6p
h2D3Q+TAYTvT756icTl0qsYVP7PrPY7DdmQYu0q+Lc3jdGI+jyxO2h9OFBRLZ3p6
zFX2N8wkvvCCzP2DwVnnhIi/GovpSh7ksHnb039F36Y/IhZPqV1bGqdNQVdanv6I
8fcIDM6ltRQ7dO2Br5f1tKUZE9Pm6x60b/uRVjhfVh65uTEKyGRhcm5j9ztzvQfI
cCBg4rbVRNKolxuDEkjsAFXVoiiEEsb7pLf4pMO+Dr62wxFG589tQNySySneUIVZ
J9ILnGAAeQ==
=aVWo
-----END PGP SIGNATURE-----
Merge tag 'arch-cleanup-2020-10-22' of git://git.kernel.dk/linux-block
Pull arch task_work cleanups from Jens Axboe:
"Two cleanups that don't fit other categories:
- Finally get the task_work_add() cleanup done properly, so we don't
have random 0/1/false/true/TWA_SIGNAL confusing use cases. Updates
all callers, and also fixes up the documentation for
task_work_add().
- While working on some TIF related changes for 5.11, this
TIF_NOTIFY_RESUME cleanup fell out of that. Remove some arch
duplication for how that is handled"
* tag 'arch-cleanup-2020-10-22' of git://git.kernel.dk/linux-block:
task_work: cleanup notification modes
tracehook: clear TIF_NOTIFY_RESUME in tracehook_notify_resume()
- Improve performance of Spectre-v2 mitigation on Falkor CPUs (if you're lucky
enough to have one)
- Select HAVE_MOVE_PMD. This has been shown to improve mremap() performance,
which is used heavily by the Android runtime GC, and it seems we forgot to
enable this upstream back in 2018.
- Ensure linker flags are consistent between LLVM and BFD
- Fix stale comment in Spectre mitigation rework
- Fix broken copyright header
- Fix KASLR randomisation of the linear map
- Prevent arm64-specific prctl()s from compat tasks (return -EINVAL)
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAl+QEPAQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNE8jB/0YNYKO9mis/Xn5KcOCwlg4dbc2uVBknZXD
f7otEJ6SOax2HcWz8qJlrJ+qbGFawPIqFBUAM0vU1VmoyctIoKRFTA8ACfWfWtnK
QBfHrcxtJCh/GGq+E1IyuqWzCjppeY/7gYVdgi1xDEZRSaLz53MC1GVBwKBtu5cf
X2Bfm8d9+PSSnmKfpO65wSCTvN3PQX1SNEHwwTWFZQx0p7GcQK1DdwoobM6dRnVy
+e984ske+2a+nTrkhLSyQIgsfHuLB4pD6XdM/UOThnfdNxdQ0dUGn375sXP+b4dW
7MTH9HP/dXIymTcuErMXOHJXLk/zUiUBaOxkmOxdvrhQd0uFNFIc
=e9p9
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull more arm64 updates from Will Deacon:
"A small selection of further arm64 fixes and updates. Most of these
are fixes that came in during the merge window, with the exception of
the HAVE_MOVE_PMD mremap() speed-up which we discussed back in 2018
and somehow forgot to enable upstream.
- Improve performance of Spectre-v2 mitigation on Falkor CPUs (if
you're lucky enough to have one)
- Select HAVE_MOVE_PMD. This has been shown to improve mremap()
performance, which is used heavily by the Android runtime GC, and
it seems we forgot to enable this upstream back in 2018.
- Ensure linker flags are consistent between LLVM and BFD
- Fix stale comment in Spectre mitigation rework
- Fix broken copyright header
- Fix KASLR randomisation of the linear map
- Prevent arm64-specific prctl()s from compat tasks (return -EINVAL)"
Link: https://lore.kernel.org/kvmarm/20181108181201.88826-3-joelaf@google.com/
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: proton-pack: Update comment to reflect new function name
arm64: spectre-v2: Favour CPU-specific mitigation at EL2
arm64: link with -z norelro regardless of CONFIG_RELOCATABLE
arm64: Fix a broken copyright header in gen_vdso_offsets.sh
arm64: mremap speedup - Enable HAVE_MOVE_PMD
arm64: mm: use single quantity to represent the PA to VA translation
arm64: reject prctl(PR_PAC_RESET_KEYS) on compat tasks
- Support 'make compile_commands.json' to generate the compilation
database more easily, avoiding stale entries
- Support 'make clang-analyzer' and 'make clang-tidy' for static checks
using clang-tidy
- Preprocess scripts/modules.lds.S to allow CONFIG options in the module
linker script
- Drop cc-option tests from compiler flags supported by our minimal
GCC/Clang versions
- Use always 12-digits commit hash for CONFIG_LOCALVERSION_AUTO=y
- Use sha1 build id for both BFD linker and LLD
- Improve deb-pkg for reproducible builds and rootless builds
- Remove stale, useless scripts/namespace.pl
- Turn -Wreturn-type warning into error
- Fix build error of deb-pkg when CONFIG_MODULES=n
- Replace 'hostname' command with more portable 'uname -n'
- Various Makefile cleanups
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAl+RfS0VHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGG1QP/2hzoMzK1YXErPUhGrhYU1rxz7Nu
HkLTIkyKF1HPwSJf5XyNW/FTBI4SDlkNoVg/weEDCS1yFxxpvQLIck8ChzA1kIIM
P+1IfBWOTzqn91XsapU2zwSno3gylphVchVIvYAB3oLUotGeMSluy1cQtBRzyA5D
rj2Q7H8fzkzk3YoBcBC/BOKDlfo/usqQ1X/gsfRFwN/BJxeZSYoujNBE7KtHaDsd
8K/ggBIqmST4NBn+M8c11d8CxzvWbtG1gq3EkUL5nG8T13DsGn1EFC0SPt85bkvv
f9YywfJi37HixhZzK6tXYjN/PWoiEY6z90mhd0NtZghQT7kQMiTQ3sWrM8dX3ssf
phBzO94uFQDjhyxOaSSsCoI/TIciAPo4+G8PNjcaEtj63IEfhEz/dnlstYwY5Y9P
Pp3aZtVjSGJwGW2u2EUYj6paFVqjf6DXQjQKPNHnsYCEidIvFTjjguRGvx9gl6mx
yd8oseOsAtOEf0alRe9MMdvN17O3UrRAxgBdap7fktg02TLVRGxZIbuwKmBf29ho
ORl9zeFkYBn6XQFyuItJoXy/kYFyHDaBEPYCRQcY4dwqcjZIiAc/FhYbqYthJ59L
5vLN2etmDIVSuUv1J5nBqHHGCqJChykbqg7riQ651dCNKw4gZB8ctCay2lXhBXMg
1mqOcoG5WWL7//F+
=tZRN
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Support 'make compile_commands.json' to generate the compilation
database more easily, avoiding stale entries
- Support 'make clang-analyzer' and 'make clang-tidy' for static checks
using clang-tidy
- Preprocess scripts/modules.lds.S to allow CONFIG options in the
module linker script
- Drop cc-option tests from compiler flags supported by our minimal
GCC/Clang versions
- Use always 12-digits commit hash for CONFIG_LOCALVERSION_AUTO=y
- Use sha1 build id for both BFD linker and LLD
- Improve deb-pkg for reproducible builds and rootless builds
- Remove stale, useless scripts/namespace.pl
- Turn -Wreturn-type warning into error
- Fix build error of deb-pkg when CONFIG_MODULES=n
- Replace 'hostname' command with more portable 'uname -n'
- Various Makefile cleanups
* tag 'kbuild-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (34 commits)
kbuild: Use uname for LINUX_COMPILE_HOST detection
kbuild: Only add -fno-var-tracking-assignments for old GCC versions
kbuild: remove leftover comment for filechk utility
treewide: remove DISABLE_LTO
kbuild: deb-pkg: clean up package name variables
kbuild: deb-pkg: do not build linux-headers package if CONFIG_MODULES=n
kbuild: enforce -Werror=return-type
scripts: remove namespace.pl
builddeb: Add support for all required debian/rules targets
builddeb: Enable rootless builds
builddeb: Pass -n to gzip for reproducible packages
kbuild: split the build log of kallsyms
kbuild: explicitly specify the build id style
scripts/setlocalversion: make git describe output more reliable
kbuild: remove cc-option test of -Werror=date-time
kbuild: remove cc-option test of -fno-stack-check
kbuild: remove cc-option test of -fno-strict-overflow
kbuild: move CFLAGS_{KASAN,UBSAN,KCSAN} exports to relevant Makefiles
kbuild: remove redundant CONFIG_KASAN check from scripts/Makefile.kasan
kbuild: do not create built-in objects for external module builds
...
The function detect_harden_bp_fw() is gone after commit d4647f0a2ad7
("arm64: Rewrite Spectre-v2 mitigation code"). Update this comment to
reflect the new state of affairs.
Fixes: d4647f0a2ad7 ("arm64: Rewrite Spectre-v2 mitigation code")
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Cc: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201020214544.3206838-3-swboyd@chromium.org
Signed-off-by: Will Deacon <will@kernel.org>
This change removes all instances of DISABLE_LTO from
Makefiles, as they are currently unused, and the preferred
method of disabling LTO is to filter out the flags instead.
Note added by Masahiro Yamada:
DISABLE_LTO was added as preparation for GCC LTO, but GCC LTO was
not pulled into the mainline. (https://lkml.org/lkml/2014/4/8/272)
Suggested-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Spectre-v2 can be mitigated on Falkor CPUs either by calling into
firmware or by issuing a magic, CPU-specific sequence of branches.
Although the latter is faster, the size of the code sequence means that
it cannot be used in the EL2 vectors, and so there is a need for both
mitigations to co-exist in order to achieve optimal performance.
Change the mitigation selection logic for Spectre-v2 so that the
CPU-specific mitigation is used only when the firmware mitigation is
also available, rather than when a firmware mitigation is unavailable.
Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
I was going to copy this but I didn't want to chase around the build
system stuff so I did it a different way.
Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
Link: https://lore.kernel.org/r/20201017002637.503579-1-palmer@dabbelt.com
Signed-off-by: Will Deacon <will@kernel.org>
All the callers currently do this, clean it up and move the clearing
into tracehook_notify_resume() instead.
Reviewed-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
It doesn't make sense to issue prctl(PR_PAC_RESET_KEYS) on a
compat task because the 32-bit instruction set does not offer PAuth
instructions. For consistency with other 64-bit only prctls such as
{SET,GET}_TAGGED_ADDR_CTRL, reject the prctl on compat tasks.
Although this is a userspace-visible change, maybe it isn't too late
to make this change given that the hardware isn't available yet and
it's very unlikely that anyone has 32-bit software that actually
depends on this succeeding.
Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/Ie885a1ff84ab498cc9f62d6451e9f2cfd4b1d06a
Link: https://lore.kernel.org/r/20201014052430.11630-1-pcc@google.com
[will: Do the same for the SVE prctl()s]
Signed-off-by: Will Deacon <will@kernel.org>
- Rework cpufreq statistics collection to allow it to take place
when fast frequency switching is enabled in the governor (Viresh
Kumar).
- Make the cpufreq core set the frequency scale on behalf of the
driver and update several cpufreq drivers accordingly (Ionela
Voinescu, Valentin Schneider).
- Add new hardware support to the STI and qcom cpufreq drivers and
improve them (Alain Volmat, Manivannan Sadhasivam).
- Fix multiple assorted issues in cpufreq drivers (Jon Hunter,
Krzysztof Kozlowski, Matthias Kaehlcke, Pali Rohár, Stephan
Gerhold, Viresh Kumar).
- Fix several assorted issues in the operating performance points
(OPP) framework (Stephan Gerhold, Viresh Kumar).
- Allow devfreq drivers to fetch devfreq instances by DT enumeration
instead of using explicit phandles and modify the devfreq core
code to support driver-specific devfreq DT bindings (Leonard
Crestez, Chanwoo Choi).
- Improve initial hardware resetting in the tegra30 devfreq driver
and clean up the tegra cpuidle driver (Dmitry Osipenko).
- Update the cpuidle core to collect state entry rejection
statistics and expose them via sysfs (Lina Iyer).
- Improve the ACPI _CST code handling diagnostics (Chen Yu).
- Update the PSCI cpuidle driver to allow the PM domain
initialization to occur in the OSI mode as well as in the PC
mode (Ulf Hansson).
- Rework the generic power domains (genpd) core code to allow
domain power off transition to be aborted in the absence of the
"power off" domain callback (Ulf Hansson).
- Fix two suspend-to-idle issues in the ACPI EC driver (Rafael
Wysocki).
- Fix the handling of timer_expires in the PM-runtime framework on
32-bit systems and the handling of device links in it (Grygorii
Strashko, Xiang Chen).
- Add IO requests batching support to the hibernate image saving and
reading code and drop a bogus get_gendisk() from there (Xiaoyi
Chen, Christoph Hellwig).
- Allow PCIe ports to be put into the D3cold power state if they
are power-manageable via ACPI (Lukas Wunner).
- Add missing header file include to a power capping driver (Pujin
Shi).
- Clean up the qcom-cpr AVS driver a bit (Liu Shixin).
- Kevin Hilman steps down as designated reviwer of adaptive voltage
scaling (AVS) driverrs (Kevin Hilman).
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl+F4A4SHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxX6QP/iELq9/OsH0aJdDQlY9tnh2Oa13+HB/Y
w1e6W+ZR/YjPgUpMVARwRLKf/gn7dUEwRDHVpGvDOyun+HACCPHB2hg8iktbxdVl
NFAVGZCCRezXqz3opL1hl8C3Dh0CqUPUjWXGMr+Lw2TZQKT+hx9K1dm9Epe3ivyT
RlVH/wifei80cFRcUUj7DI5KLCAyk+uKkZIFnZHAGKK6qOHMqRL5sDZsMUwWpd2i
AdghABjePbaiLTAoZuUsJINAGY4DnIt6ASRdMJ4iksiD6pFITwFs0HSOPe7hZLlv
zbwDPI5+TIkrOy9/aWoMaEIH1OQiFN/O++Slvdjn7gMsRgoW4d300ru4Jo1pOHxb
5twxagCCqlOf4YAaSrMCH4HT+c6fOWoGj2AKzX3DMJyO3/WN+8XNvUxKtC5Px1u+
pWRASjfQMO2j6nNjTCTwDJdYzggiKa54rYH2k7svX7XnTIAf+2E1gv8b4rMTgQrZ
0rq9kULYlhgk3EYjd/DndkvxunRlmiqhzrYB4jc9eDSPNzB8FZEbw1ZMRQTFfjK0
kp0vaEpTJ7JfKSCfluB4UmTuQoGogLl0xbzc+2NNIpwdNmrH2Srvq6wbj35jEDTU
tqsTsBP+XZFOWyFOw/L2J47LTOp0TJnz8z4aycLfrmdNUVnXJoU1sXgFlDzETMgT
0E6cTVwLF7Zi
=rGhy
-----END PGP SIGNATURE-----
Merge tag 'pm-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These rework the collection of cpufreq statistics to allow it to take
place if fast frequency switching is enabled in the governor, rework
the frequency invariance handling in the cpufreq core and drivers, add
new hardware support to a couple of cpufreq drivers, fix a number of
assorted issues and clean up the code all over.
Specifics:
- Rework cpufreq statistics collection to allow it to take place when
fast frequency switching is enabled in the governor (Viresh Kumar).
- Make the cpufreq core set the frequency scale on behalf of the
driver and update several cpufreq drivers accordingly (Ionela
Voinescu, Valentin Schneider).
- Add new hardware support to the STI and qcom cpufreq drivers and
improve them (Alain Volmat, Manivannan Sadhasivam).
- Fix multiple assorted issues in cpufreq drivers (Jon Hunter,
Krzysztof Kozlowski, Matthias Kaehlcke, Pali Rohár, Stephan
Gerhold, Viresh Kumar).
- Fix several assorted issues in the operating performance points
(OPP) framework (Stephan Gerhold, Viresh Kumar).
- Allow devfreq drivers to fetch devfreq instances by DT enumeration
instead of using explicit phandles and modify the devfreq core code
to support driver-specific devfreq DT bindings (Leonard Crestez,
Chanwoo Choi).
- Improve initial hardware resetting in the tegra30 devfreq driver
and clean up the tegra cpuidle driver (Dmitry Osipenko).
- Update the cpuidle core to collect state entry rejection statistics
and expose them via sysfs (Lina Iyer).
- Improve the ACPI _CST code handling diagnostics (Chen Yu).
- Update the PSCI cpuidle driver to allow the PM domain
initialization to occur in the OSI mode as well as in the PC mode
(Ulf Hansson).
- Rework the generic power domains (genpd) core code to allow domain
power off transition to be aborted in the absence of the "power
off" domain callback (Ulf Hansson).
- Fix two suspend-to-idle issues in the ACPI EC driver (Rafael
Wysocki).
- Fix the handling of timer_expires in the PM-runtime framework on
32-bit systems and the handling of device links in it (Grygorii
Strashko, Xiang Chen).
- Add IO requests batching support to the hibernate image saving and
reading code and drop a bogus get_gendisk() from there (Xiaoyi
Chen, Christoph Hellwig).
- Allow PCIe ports to be put into the D3cold power state if they are
power-manageable via ACPI (Lukas Wunner).
- Add missing header file include to a power capping driver (Pujin
Shi).
- Clean up the qcom-cpr AVS driver a bit (Liu Shixin).
- Kevin Hilman steps down as designated reviwer of adaptive voltage
scaling (AVS) drivers (Kevin Hilman)"
* tag 'pm-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (65 commits)
cpufreq: stats: Fix string format specifier mismatch
arm: disable frequency invariance for CONFIG_BL_SWITCHER
cpufreq,arm,arm64: restructure definitions of arch_set_freq_scale()
cpufreq: stats: Add memory barrier to store_reset()
cpufreq: schedutil: Simplify sugov_fast_switch()
ACPI: EC: PM: Drop ec_no_wakeup check from acpi_ec_dispatch_gpe()
ACPI: EC: PM: Flush EC work unconditionally after wakeup
PCI/ACPI: Whitelist hotplug ports for D3 if power managed by ACPI
PM: hibernate: remove the bogus call to get_gendisk() in software_resume()
cpufreq: Move traces and update to policy->cur to cpufreq core
cpufreq: stats: Enable stats for fast-switch as well
cpufreq: stats: Mark few conditionals with unlikely()
cpufreq: stats: Remove locking
cpufreq: stats: Defer stats update to cpufreq_stats_record_transition()
PM: domains: Allow to abort power off when no ->power_off() callback
PM: domains: Rename power state enums for genpd
PM / devfreq: tegra30: Improve initial hardware resetting
PM / devfreq: event: Change prototype of devfreq_event_get_edev_by_phandle function
PM / devfreq: Change prototype of devfreq_get_devfreq_by_phandle function
PM / devfreq: Add devfreq_get_devfreq_by_node function
...
for_each_memblock() is used to iterate over memblock.memory in a few
places that use data from memblock_region rather than the memory ranges.
Introduce separate for_each_mem_region() and
for_each_reserved_mem_region() to improve encapsulation of memblock
internals from its users.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Baoquan He <bhe@redhat.com>
Acked-by: Ingo Molnar <mingo@kernel.org> [x86]
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> [MIPS]
Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> [.clang-format]
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Emil Renner Berthing <kernel@esmil.dk>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: https://lkml.kernel.org/r/20200818151634.14343-18-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Iteration over memblock.reserved with for_each_reserved_mem_region() used
__next_reserved_mem_region() that implemented a subset of
__next_mem_region().
Use __for_each_mem_range() and, essentially, __next_mem_region() with
appropriate parameters to reduce code duplication.
While on it, rename for_each_reserved_mem_region() to
for_each_reserved_mem_range() for consistency.
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> [.clang-format]
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Daniel Axtens <dja@axtens.net>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Emil Renner Berthing <kernel@esmil.dk>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: https://lkml.kernel.org/r/20200818151634.14343-17-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>