IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Currently, an RCU-preempt CPU stall warning simply lists the PIDs of
those tasks holding up the current grace period. This can be helpful,
but more can be even more helpful.
To this end, this commit adds the nesting level, whether the task
thinks it was preempted in its current RCU read-side critical section,
whether RCU core has asked this task for a quiescent state, whether the
expedited-grace-period hint is set, and whether the task believes that
it is on the blocked-tasks list (it must be, or it would not be printed,
but if things are broken, best not to take too much for granted).
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, the PREEMPT=y version of rcu_note_context_switch() does not
invoke rcu_tasks_qs(), and we need it to in order to keep RCU Tasks
Trace's IPIs down to a dull roar. This commit therefore enables this
hook.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
It is not as clear as it might be just where in RCU's idle entry/exit
code RCU stops and starts watching the current CPU. This commit therefore
adds comments calling out the transitions.
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Now that it should be safe to hold scheduler locks across
rcu_read_unlock(), even in cases where the corresponding RCU read-side
critical section might have been preempted and boosted, the commit adds
a test of this capability to rcutorture. This has been tested on current
mainline (which can deadlock in this situation), and lockdep duly reported
the expected deadlock. On -rcu, lockdep is silent, thus far, anyway.
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Now that RCU flavors have been consolidated, an RCU-preempt
rcu_read_unlock() in an interrupt or softirq handler cannot possibly
end the RCU read-side critical section. Consider the old vulnerability
involving rcu_read_unlock() being invoked within such a handler that
interrupted an __rcu_read_unlock_special(), in which a wakeup might be
invoked with a scheduler lock held. Because rcu_read_unlock_special()
no longer does wakeups in such situations, it is no longer necessary
for __rcu_read_unlock() to set the nesting level negative.
This commit therefore removes this recursion-protection code from
__rcu_read_unlock().
[ paulmck: Let rcu_exp_handler() continue to call rcu_report_exp_rdp(). ]
[ paulmck: Adjust other checks given no more negative nesting. ]
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The ->rcu_read_unlock_special.b.deferred_qs field is set to true in
rcu_read_unlock_special() but never set to false. This is not
particularly useful, so this commit removes this field.
The only possible justification for this field is to ease debugging
of RCU deferred quiscent states, but the combination of the other
->rcu_read_unlock_special fields plus ->rcu_blocked_node and of course
->rcu_read_lock_nesting should cover debugging needs. And if this last
proves incorrect, this patch can always be reverted, along with the
required setting of ->rcu_read_unlock_special.b.deferred_qs to false
in rcu_preempt_deferred_qs_irqrestore().
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Now that RCU flavors have been consolidated, an RCU-preempt
rcu_read_unlock() in an interrupt or softirq handler cannot possibly
end the RCU read-side critical section. Consider the old vulnerability
involving rcu_preempt_deferred_qs() being invoked within such a handler
that interrupted an extended RCU read-side critical section, in which
a wakeup might be invoked with a scheduler lock held. Because
rcu_read_unlock_special() no longer does wakeups in such situations,
it is no longer necessary for rcu_preempt_deferred_qs() to set the
nesting level negative.
This commit therefore removes this recursion-protection code from
rcu_preempt_deferred_qs().
[ paulmck: Fix typo in commit log per Steve Rostedt. ]
Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The scheduler is currently required to hold rq/pi locks across the entire
RCU read-side critical section or not at all. This is inconvenient and
leaves traps for the unwary, including the author of this commit.
But now that excessively long grace periods enable scheduling-clock
interrupts for holdout nohz_full CPUs, the nohz_full rescue logic in
rcu_read_unlock_special() can be dispensed with. In other words, the
rcu_read_unlock_special() function can refrain from doing wakeups unless
such wakeups are guaranteed safe.
This commit therefore avoids unsafe wakeups, freeing the scheduler to
hold rq/pi locks across rcu_read_unlock() even if the corresponding RCU
read-side critical section might have been preempted. This commit also
updates RCU's requirements documentation.
This commit is inspired by a patch from Lai Jiangshan:
https://lore.kernel.org/lkml/20191102124559.1135-2-laijs@linux.alibaba.com
This commit is further intended to be a step towards his goal of permitting
the inlining of RCU-preempt's rcu_read_lock() and rcu_read_unlock().
Cc: Lai Jiangshan <laijs@linux.alibaba.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit adds stubs for KCSAN's data_race(), ASSERT_EXCLUSIVE_WRITER(),
and ASSERT_EXCLUSIVE_ACCESS() macros to allow code using these macros
to move ahead.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit adds rcu_gp_might_be_stalled(), which returns true if there
is some reason to believe that the RCU grace period is stalled. The use
case is where an RCU free-memory path needs to allocate memory in order
to free it, a situation that should be avoided where possible.
But where it is necessary, there is always the alternative of using
synchronize_rcu() to wait for a grace period in order to avoid the
allocation. And if the grace period is stalled, allocating memory to
asynchronously wait for it is a bad idea of epic proportions: Far better
to let others use the memory, because these others might actually be
able to free that memory before the grace period ends.
Thus, rcu_gp_might_be_stalled() can be used to help decide whether
allocating memory on an RCU free path is a semi-reasonable course
of action.
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Uladzislau Rezki <urezki@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
We can relax the correctness of counting of number of queued objects in
favor of not hurting performance, by locklessly sampling per-cpu
counters. This should be Ok since under high memory pressure, it should not
matter if we are off by a few objects while counting. The shrinker will
still do the reclaim.
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
[ paulmck: Remove unused "flags" variable. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
To reduce grace periods and improve kfree() performance, we have done
batching recently dramatically bringing down the number of grace periods
while giving us the ability to use kfree_bulk() for efficient kfree'ing.
However, this has increased the likelihood of OOM condition under heavy
kfree_rcu() flood on small memory systems. This patch introduces a
shrinker which starts grace periods right away if the system is under
memory pressure due to existence of objects that have still not started
a grace period.
With this patch, I do not observe an OOM anymore on a system with 512MB
RAM and 8 CPUs, with the following rcuperf options:
rcuperf.kfree_loops=20000 rcuperf.kfree_alloc_num=8000
rcuperf.kfree_rcu_test=1 rcuperf.kfree_mult=2
Otherwise it easily OOMs with the above parameters.
NOTE:
1. On systems with no memory pressure, the patch has no effect as intended.
2. In the future, we can use this same mechanism to prevent grace periods
from happening even more, by relying on shrinkers carefully.
Cc: urezki@gmail.com
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This allows us to increase memory pressure dynamically using a new
rcuperf boot command line parameter called 'rcumult'.
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit converts the ULONG_CMP_LT() in rcu_nohz_full_cpu() to
time_before() to reflect the fact that it is comparing a timestamp to
the jiffies counter.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit converts the ULONG_CMP_GE() in rcu_initiate_boost() to
time_after() to reflect the fact that it is comparing a timestamp to
the jiffies counter.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit converts the ULONG_CMP_GE() in rcu_gp_fqs_loop() to
time_after() to reflect the fact that it is comparing a timestamp to
the jiffies counter.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Coccinelle reports a warning at use_softirq declaration
WARNING: Assignment of 0/1 to bool variable
The root cause is
use_softirq a variable of bool type is initialised with the integer 1
Replacing 1 with value true solve the issue.
Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Coccinelle reports warnings at rcu_read_lock_held_common()
WARNING: Assignment of 0/1 to bool variable
To fix this,
the assigned pointer ret values are replaced by corresponding boolean value.
Given that ret is a pointer of bool type
Signed-off-by: Jules Irenge <jbi.octave@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The rcu_state structure's gp_seq field is only to be modified by the RCU
grace-period kthread, which is single-threaded. This commit therefore
enlists KCSAN's help in enforcing this restriction. This commit applies
KCSAN-specific primitives, so cannot go upstream until KCSAN does.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit escapes *ret, because otherwise the documentation system
thinks that this is an incomplete emphasis block:
./kernel/rcu/update.c:65: WARNING: Inline emphasis start-string without end-string.
./kernel/rcu/update.c:65: WARNING: Inline emphasis start-string without end-string.
./kernel/rcu/update.c:70: WARNING: Inline emphasis start-string without end-string.
./kernel/rcu/update.c:82: WARNING: Inline emphasis start-string without end-string.
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
It is possible that an over-long grace period will end while the RCU
CPU stall warning message is printing. In this case, the estimate of
the offending grace period's duration can be erroneous due to refetching
of rcu_state.gp_start, which will now be the time of the newly started
grace period. Computation of this duration clearly needs to use the
start time for the old over-long grace period, not the fresh new one.
This commit avoids such errors by causing both print_other_cpu_stall() and
print_cpu_stall() to reuse the value previously fetched by their caller.
Signed-off-by: Zhaolong Zhang <zhangzl2013@126.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Even if some CPUs have excessive numbers of callbacks, RCU's grace-period
kthread will still wait normally between successive force-quiescent-state
scans. The first two are the most important, as they are the ones that
enlist aid from the scheduler when overloaded. This commit therefore
omits the wait before the first and the second force-quiescent-state
scan under callback-overload conditions.
This approach was inspired by a discussion with Jeff Roberson.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Although the accesses used to determine whether or not a stall should
be printed are an integral part of the concurrency algorithm governing
use of the corresponding variables, the values that are simply printed
are ancillary. As such, it is best to use data_race() for these accesses
in order to provide the greatest latitude in the use of KCSAN for the
other accesses that are an integral part of the algorithm. This commit
therefore changes the relevant uses of READ_ONCE() to data_race().
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The rcu_node structure's ->boost_tasks field is read locklessly, so this
commit adds the WRITE_ONCE() to an update in order to provide proper
documentation and READ_ONCE()/WRITE_ONCE() pairing.
This data race was reported by KCSAN. Not appropriate for backporting
due to failure being unlikely.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The srcu_data structure's ->srcu_lock_count and ->srcu_unlock_count arrays
are read and written locklessly, so this commit adds the data_race()
to the diagnostic-print loads from these arrays in order mark them as
known and approved data-racy accesses.
This data race was reported by KCSAN. Not appropriate for backporting due
to failure being unlikely and due to this being used only by rcutorture.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The rcu_node structure's ->boost_tasks field is read locklessly, so this
commit adds the READ_ONCE() to one load in order to avoid destructive
compiler optimizations. The other load is from a diagnostic print,
so data_race() suffices.
This data race was reported by KCSAN. Not appropriate for backporting
due to failure being unlikely.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
There are lockless loads from the rcu_node structure's ->exp_tasks field,
so this commit causes all stores to use WRITE_ONCE() and all lockless
loads to use READ_ONCE() or data_race(), with the latter for debug
prints. This code also did a unprotected traversal of the linked list
pointed into by ->exp_tasks, so this commit also acquires the rcu_node
structure's ->lock to properly protect this traversal. This list was
traversed unprotected only when printing an RCU CPU stall warning for
an expedited grace period, so the odds of seeing this in production are
not all that high.
This data race was reported by KCSAN.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The rcu_state structure's ncpus field is only to be modified by the
CPU-hotplug CPU-online code path, which is single-threaded. This
commit therefore enlists KCSAN's help in enforcing this restriction.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit adds stubs for KCSAN's data_race(), ASSERT_EXCLUSIVE_WRITER(),
and ASSERT_EXCLUSIVE_ACCESS() macros to allow code using these macros to
move ahead.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
This commit adds stubs for KCSAN's data_race(), ASSERT_EXCLUSIVE_WRITER(),
and ASSERT_EXCLUSIVE_ACCESS() macros to allow code using these macros to
move ahead.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The rcu_nmi_enter_common() function can be invoked both in interrupt
and NMI handlers. If it is invoked from process context (as opposed
to userspace or idle context) on a nohz_full CPU, it might acquire the
CPU's leaf rcu_node structure's ->lock. Because this lock is held only
with interrupts disabled, this is safe from an interrupt handler, but
doing so from an NMI handler can result in self-deadlock.
This commit therefore adds "irq" to the "if" condition so as to only
acquire the ->lock from irq handlers or process context, never from
an NMI handler.
Fixes: 5b14557b073c ("rcu: Avoid tick_dep_set_cpu() misordering")
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Cc: <stable@vger.kernel.org> # 5.5.x
Pull locking updates from Ingo Molnar:
"The main changes in this cycle were:
- Continued user-access cleanups in the futex code.
- percpu-rwsem rewrite that uses its own waitqueue and atomic_t
instead of an embedded rwsem. This addresses a couple of
weaknesses, but the primary motivation was complications on the -rt
kernel.
- Introduce raw lock nesting detection on lockdep
(CONFIG_PROVE_RAW_LOCK_NESTING=y), document the raw_lock vs. normal
lock differences. This too originates from -rt.
- Reuse lockdep zapped chain_hlocks entries, to conserve RAM
footprint on distro-ish kernels running into the "BUG:
MAX_LOCKDEP_CHAIN_HLOCKS too low!" depletion of the lockdep
chain-entries pool.
- Misc cleanups, smaller fixes and enhancements - see the changelog
for details"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (55 commits)
fs/buffer: Make BH_Uptodate_Lock bit_spin_lock a regular spinlock_t
thermal/x86_pkg_temp: Make pkg_temp_lock a raw_spinlock_t
Documentation/locking/locktypes: Minor copy editor fixes
Documentation/locking/locktypes: Further clarifications and wordsmithing
m68knommu: Remove mm.h include from uaccess_no.h
x86: get rid of user_atomic_cmpxchg_inatomic()
generic arch_futex_atomic_op_inuser() doesn't need access_ok()
x86: don't reload after cmpxchg in unsafe_atomic_op2() loop
x86: convert arch_futex_atomic_op_inuser() to user_access_begin/user_access_end()
objtool: whitelist __sanitizer_cov_trace_switch()
[parisc, s390, sparc64] no need for access_ok() in futex handling
sh: no need of access_ok() in arch_futex_atomic_op_inuser()
futex: arch_futex_atomic_op_inuser() calling conventions change
completion: Use lockdep_assert_RT_in_threaded_ctx() in complete_all()
lockdep: Add posixtimer context tracing bits
lockdep: Annotate irq_work
lockdep: Add hrtimer context tracing bits
lockdep: Introduce wait-type checks
completion: Use simple wait queues
sched/swait: Prepare usage in completions
...
Currently, rcu_barrier() ignores offline CPUs, However, it is possible
for an offline no-CBs CPU to have callbacks queued, and rcu_barrier()
must wait for those callbacks. This commit therefore makes rcu_barrier()
directly invoke the rcu_barrier_func() with interrupts disabled for such
CPUs. This requires passing the CPU number into this function so that
it can entrain the rcu_barrier() callback onto the correct CPU's callback
list, given that the code must instead execute on the current CPU.
While in the area, this commit fixes a bug where the first CPU's callback
might have been invoked before rcu_segcblist_entrain() returned, which
would also result in an early wakeup.
Fixes: 5d6742b37727 ("rcu/nocb: Use rcu_segcblist for no-CBs CPUs")
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
[ paulmck: Apply optimization feedback from Boqun Feng. ]
Cc: <stable@vger.kernel.org> # 5.5.x
The rcu_state structure's gp_seq field is only to be modified by the RCU
grace-period kthread, which is single-threaded. This commit therefore
enlists KCSAN's help in enforcing this restriction.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Mark irq_work items with IRQ_WORK_HARD_IRQ which should be invoked in
hardirq context even on PREEMPT_RT. IRQ_WORK without this flag will be
invoked in softirq context on PREEMPT_RT.
Set ->irq_config to 1 for the IRQ_WORK items which are invoked in softirq
context so lockdep knows that these can safely acquire a spinlock_t.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200321113242.643576700@linutronix.de
Extend lockdep to validate lock wait-type context.
The current wait-types are:
LD_WAIT_FREE, /* wait free, rcu etc.. */
LD_WAIT_SPIN, /* spin loops, raw_spinlock_t etc.. */
LD_WAIT_CONFIG, /* CONFIG_PREEMPT_LOCK, spinlock_t etc.. */
LD_WAIT_SLEEP, /* sleeping locks, mutex_t etc.. */
Where lockdep validates that the current lock (the one being acquired)
fits in the current wait-context (as generated by the held stack).
This ensures that there is no attempt to acquire mutexes while holding
spinlocks, to acquire spinlocks while holding raw_spinlocks and so on. In
other words, its a more fancy might_sleep().
Obviously RCU made the entire ordeal more complex than a simple single
value test because RCU can be acquired in (pretty much) any context and
while it presents a context to nested locks it is not the same as it
got acquired in.
Therefore its necessary to split the wait_type into two values, one
representing the acquire (outer) and one representing the nested context
(inner). For most 'normal' locks these two are the same.
[ To make static initialization easier we have the rule that:
.outer == INV means .outer == .inner; because INV == 0. ]
It further means that its required to find the minimal .inner of the held
stack to compare against the outer of the new lock; because while 'normal'
RCU presents a CONFIG type to nested locks, if it is taken while already
holding a SPIN type it obviously doesn't relax the rules.
Below is an example output generated by the trivial test code:
raw_spin_lock(&foo);
spin_lock(&bar);
spin_unlock(&bar);
raw_spin_unlock(&foo);
[ BUG: Invalid wait context ]
-----------------------------
swapper/0/1 is trying to lock:
ffffc90000013f20 (&bar){....}-{3:3}, at: kernel_init+0xdb/0x187
other info that might help us debug this:
1 lock held by swapper/0/1:
#0: ffffc90000013ee0 (&foo){+.+.}-{2:2}, at: kernel_init+0xd1/0x187
The way to read it is to look at the new -{n,m} part in the lock
description; -{3:3} for the attempted lock, and try and match that up to
the held locks, which in this case is the one: -{2,2}.
This tells that the acquiring lock requires a more relaxed environment than
presented by the lock stack.
Currently only the normal locks and RCU are converted, the rest of the
lockdep users defaults to .inner = INV which is ignored. More conversions
can be done when desired.
The check for spinlock_t nesting is not enabled by default. It's a separate
config option for now as there are known problems which are currently
addressed. The config option allows to identify these problems and to
verify that the solutions found are indeed solving them.
The config switch will be removed and the checks will permanently enabled
once the vast majority of issues has been addressed.
[ bigeasy: Move LD_WAIT_FREE,… out of CONFIG_LOCKDEP to avoid compile
failure with CONFIG_DEBUG_SPINLOCK + !CONFIG_LOCKDEP]
[ tglx: Add the config option ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200321113242.427089655@linutronix.de
Currently, if rcu_barrier() returns too soon, the test waits 100ms and
then does another instance of the test. However, if rcu_barrier() were
to have waited for more than 100ms too short a time, this could cause
the test's rcu_head structures to be reused while they were still on
RCU's callback lists. This can result in knock-on errors that obscure
the original rcu_barrier() test failure.
This commit therefore adds code that attempts to wait until all of
the test's callbacks have been invoked. Of course, if RCU completely
lost track of the corresponding rcu_head structures, this wait could be
forever. This commit therefore also complains if this attempted recovery
takes more than one second, and it also gives up when the test ends.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Currently, rcu_torture_barrier_cbs() posts callbacks from whatever CPU
it is running on, which means that all these kthreads might well be
posting from the same CPU, which would drastically reduce the effectiveness
of this test. This commit therefore uses IPIs to make the callbacks be
posted from the corresponding CPU (given by local variable myid).
If the IPI fails (which can happen if the target CPU is offline or does
not exist at all), the callback is posted on whatever CPU is currently
running.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
During changes to kfree_rcu() code, we often check the amount of free
memory. As an alternative to checking this manually, this commit adds a
measurement in the test itself. It measures four times during the test
for available memory, digitally filters these measurements to produce a
running average with a weight of 0.5, and compares this digitally filtered
value with the amount of available memory at the beginning of the test.
Something like the following is printed at the end of the run:
Total time taken by all kfree'ers: 6369738407 ns, loops: 10000, batches: 764, memory footprint: 216MB
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The rcutorture global variable rcu_torture_current is accessed locklessly,
so it must use the RCU pointer load/store primitives. This commit
therefore adds several that were missed.
This data race was reported by KCSAN. Not appropriate for backporting due
to failure being unlikely and due to this being used only by rcutorture.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The rcutorture rcu_torture_count and rcu_torture_batch per-CPU variables
are read locklessly, so this commit adds the READ_ONCE() to a load in
order to avoid various types of compiler vandalism^Woptimization.
This data race was reported by KCSAN. Not appropriate for backporting
due to failure being unlikely and due to this being rcutorture.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The rcu_fwd_cb_nodelay variable suppresses excessively long read-side
delays while carrying out an rcutorture forward-progress test. As such,
it is accessed both by readers and updaters, and most of the accesses
therefore use *_ONCE(). Except for one in rcu_read_delay(), which this
commit fixes.
This data race was reported by KCSAN. Not appropriate for backporting
due to this being rcutorture.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
The ->rtort_pipe_count field in the rcu_torture structure checks for
too-short grace periods, and is therefore read by rcutorture's readers
while being updated by rcutorture's writers. This commit therefore
adds the needed READ_ONCE() and WRITE_ONCE() invocations.
This data race was reported by KCSAN. Not appropriate for backporting
due to failure being unlikely and due to this being rcutorture.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
In normal production, an excessively long wait on a grace period
(synchronize_rcu(), for example) at boottime is often just as bad
as at any other time. In fact, given the desire for fast boot, any
sort of long wait at boot is a bad idea. However, heavy rcutorture
testing on large hyperthreaded systems can generate such long waits
during boot as a matter of course. This commit therefore causes
the rcupdate.rcu_cpu_stall_suppress_at_boot kernel boot parameter to
suppress reporting of bootime bad-sequence warning due to excessively
long grace-period waits.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
In normal production, an RCU CPU stall warning at boottime is often
just as bad as at any other time. In fact, given the desire for fast
boot, any sort of long-term stall at boot is a bad idea. However,
heavy rcutorture testing on large hyperthreaded systems can generate
boottime RCU CPU stalls as a matter of course. This commit therefore
provides a kernel boot parameter that suppresses reporting of boottime
RCU CPU stall warnings and similarly of rcutorture writer stalls.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Additional rcutorture aggression can result in, believe it or not,
boot times in excess of three minutes on large hyperthreaded systems.
This is long enough for rcutorture to decide to do some callback flooding,
which seems a bit excessive given that userspace cannot have started
until long after boot, and it is userspace that does the real-world
callback flooding. Worse yet, because Tiny RCU lacks forward-progress
functionality, the looping-in-the-kernel tests can also be problematic
during early boot.
This commit therefore causes rcutorture to hold off on callback
flooding until about the time that init is spawned, and the same
for looping-in-the-kernel tests for Tiny RCU.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Some larger systems can take in excess of 50 seconds to complete their
early boot initcalls prior to spawing init. This does not in any way
help the forward-progress judgments of built-in rcutorture (when
rcutorture is built as a module, the insmod or modprobe command normally
cannot happen until some time after boot completes). This commit
therefore suppresses such complaints until about the time that init
is spawned.
This also includes a fix to a stupid error located by kbuild test robot.
[ paulmck: Apply kbuild test robot feedback. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
[ paulmck: Fix to nohz_full slow-expediting recovery logic, per bpetkov. ]
[ paulmck: Restrict splat to CONFIG_PREEMPT_RT=y kernels and simplify. ]
Tested-by: Borislav Petkov <bp@alien8.de>
A read of the srcu_struct structure's ->srcu_gp_seq field should not
need READ_ONCE() when that structure's ->lock is held. Except that this
lock is not always held when updating this field. This commit therefore
acquires the lock around updates and removes a now-unneeded READ_ONCE().
This data race was reported by KCSAN.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
[ paulmck: Switch from READ_ONCE() to lock per Peter Zilstra question. ]
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>