docs: Update RCU's hotplug requirements with a bit about design
The rcu_barrier() section of the "Hotplug CPU" section discusses deadlocks, however the description of deadlocks other than those involving rcu_barrier() is rather incomplete. This commit therefore continues the section by describing how RCU's design handles CPU hotplug in a deadlock-free way.

Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
parent 86b5a7381b
commit a043260740
@@ -1929,16 +1929,45 @@ The Linux-kernel CPU-hotplug implementation has notifiers that are used
 to allow the various kernel subsystems (including RCU) to respond
 appropriately to a given CPU-hotplug operation. Most RCU operations may
 be invoked from CPU-hotplug notifiers, including even synchronous
-grace-period operations such as ``synchronize_rcu()`` and
-``synchronize_rcu_expedited()``.
+grace-period operations such as (``synchronize_rcu()`` and
+``synchronize_rcu_expedited()``). However, these synchronous operations
+do block and therefore cannot be invoked from notifiers that execute via
+``stop_machine()``, specifically those between the ``CPUHP_AP_OFFLINE``
+and ``CPUHP_AP_ONLINE`` states.

-However, all-callback-wait operations such as ``rcu_barrier()`` are also
-not supported, due to the fact that there are phases of CPU-hotplug
-operations where the outgoing CPU's callbacks will not be invoked until
-after the CPU-hotplug operation ends, which could also result in
-deadlock. Furthermore, ``rcu_barrier()`` blocks CPU-hotplug operations
-during its execution, which results in another type of deadlock when
-invoked from a CPU-hotplug notifier.
+In addition, all-callback-wait operations such as ``rcu_barrier()`` may
+not be invoked from any CPU-hotplug notifier. This restriction is due
+to the fact that there are phases of CPU-hotplug operations where the
+outgoing CPU's callbacks will not be invoked until after the CPU-hotplug
+operation ends, which could also result in deadlock. Furthermore,
+``rcu_barrier()`` blocks CPU-hotplug operations during its execution,
+which results in another type of deadlock when invoked from a CPU-hotplug
+notifier.

+Finally, RCU must avoid deadlocks due to interaction between hotplug,
+timers and grace period processing. It does so by maintaining its own set
+of books that duplicate the centrally maintained ``cpu_online_mask``,
+and also by reporting quiescent states explicitly when a CPU goes
+offline. This explicit reporting of quiescent states avoids any need
+for the force-quiescent-state loop (FQS) to report quiescent states for
+offline CPUs. However, as a debugging measure, the FQS loop does splat
+if offline CPUs block an RCU grace period for too long.
+
+An offline CPU's quiescent state will be reported either:
+1. As the CPU goes offline using RCU's hotplug notifier (``rcu_report_dead()``).
+2. When grace period initialization (``rcu_gp_init()``) detects a
+   race either with CPU offlining or with a task unblocking on a leaf
+   ``rcu_node`` structure whose CPUs are all offline.
+
+The CPU-online path (``rcu_cpu_starting()``) should never need to report
+a quiescent state for an offline CPU. However, as a debugging measure,
+it does emit a warning if a quiescent state was not already reported
+for that CPU.
+
+During the checking/modification of RCU's hotplug bookkeeping, the
+corresponding CPU's leaf node lock is held. This avoids race conditions
+between RCU's hotplug notifier hooks, the grace period initialization
+code, and the FQS loop, all of which refer to or modify this bookkeeping.
+
 Scheduler and RCU
 ~~~~~~~~~~~~~~~~~
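The bookkeeping scheme described by the added text can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: ``struct leaf_model`` and the ``model_*()`` functions are hypothetical names invented for this illustration, a pthread mutex stands in for the leaf ``rcu_node`` lock, and two bitmasks stand in for RCU's private copy of the online mask and for the set of CPUs the current grace period is still waiting on. The real ``rcu_report_dead()``, ``rcu_gp_init()`` and ``rcu_cpu_starting()`` do considerably more than this.

```c
#include <assert.h>
#include <limits.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model of one leaf node; NOT the kernel's struct rcu_node. */
struct leaf_model {
	pthread_mutex_t lock;       /* stands in for the leaf node lock */
	unsigned long online_mask;  /* private copy of "which CPUs are online" */
	unsigned long qs_pending;   /* CPUs the current grace period waits on */
};

#define MODEL_MAX_CPUS ((int)(sizeof(unsigned long) * CHAR_BIT))

static void model_init(struct leaf_model *lp)
{
	pthread_mutex_init(&lp->lock, NULL);
	lp->online_mask = 0;
	lp->qs_pending = 0;
}

/*
 * Online path: only updates the private online mask.  Returns false if a
 * quiescent state was still owed for this CPU, which models the debugging
 * warning mentioned in the text.
 */
static bool model_cpu_starting(struct leaf_model *lp, int cpu)
{
	unsigned long bit;
	bool clean;

	assert(cpu >= 0 && cpu < MODEL_MAX_CPUS);
	bit = 1UL << cpu;
	pthread_mutex_lock(&lp->lock);
	clean = !(lp->qs_pending & bit);
	lp->online_mask |= bit;
	pthread_mutex_unlock(&lp->lock);
	return clean;
}

/* Grace-period initialization: wait only on CPUs the books say are online. */
static void model_gp_init(struct leaf_model *lp)
{
	pthread_mutex_lock(&lp->lock);
	lp->qs_pending = lp->online_mask;
	pthread_mutex_unlock(&lp->lock);
}

/* Ordinary quiescent-state report; returns true once nothing is pending. */
static bool model_report_qs(struct leaf_model *lp, int cpu)
{
	bool done;

	assert(cpu >= 0 && cpu < MODEL_MAX_CPUS);
	pthread_mutex_lock(&lp->lock);
	lp->qs_pending &= ~(1UL << cpu);
	done = (lp->qs_pending == 0);
	pthread_mutex_unlock(&lp->lock);
	return done;
}

/*
 * Offline path: remove the CPU from the books and report its quiescent
 * state explicitly, in one critical section, so that no later scan has
 * to do so on behalf of an offline CPU.
 */
static bool model_report_dead(struct leaf_model *lp, int cpu)
{
	unsigned long bit;
	bool done;

	assert(cpu >= 0 && cpu < MODEL_MAX_CPUS);
	bit = 1UL << cpu;
	pthread_mutex_lock(&lp->lock);
	lp->online_mask &= ~bit;
	lp->qs_pending &= ~bit;
	done = (lp->qs_pending == 0);
	pthread_mutex_unlock(&lp->lock);
	return done;
}
```

In this model, bringing CPUs 0 and 1 online and then calling ``model_gp_init()`` leaves both bits pending. A subsequent ``model_report_dead()`` for CPU 1 clears its pending bit and its online bit together, so the next ``model_gp_init()`` waits only on CPU 0. Because every function takes the same lock around both masks, the online path, the offline path and grace-period initialization never observe a half-updated state, which is the property the last added paragraph of the diff attributes to the leaf node lock.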