linux

iv/linux

Author	SHA1	Message	Date
Paul E. McKenney	ab6ad3dbdd	Merge branches 'bitmaprange.2021.03.08a', 'fixes.2021.03.15a', 'kvfree_rcu.2021.03.08a', 'mmdumpobj.2021.03.08a', 'nocb.2021.03.15a', 'poll.2021.03.24a', 'rt.2021.03.08a', 'tasks.2021.03.08a', 'torture.2021.03.08a' and 'torturescript.2021.03.22a' into HEAD bitmaprange.2021.03.08a: Allow 3-N for bitmap ranges. fixes.2021.03.15a: Miscellaneous fixes. kvfree_rcu.2021.03.08a: kvfree_rcu() updates. mmdumpobj.2021.03.08a: mem_dump_obj() updates. nocb.2021.03.15a: RCU NOCB CPU updates, including limited deoffloading. poll.2021.03.24a: Polling grace-period interfaces for RCU. rt.2021.03.08a: Realtime-related RCU changes. tasks.2021.03.08a: Tasks-RCU updates. torture.2021.03.08a: Torture-test updates. torturescript.2021.03.22a: Torture-test scripting updates.	2021-03-24 17:20:18 -07:00
Paul E. McKenney	7ac3fdf099	rcutorture: Test start_poll_synchronize_rcu() and poll_state_synchronize_rcu() This commit causes rcutorture to test the new start_poll_synchronize_rcu() and poll_state_synchronize_rcu() functions. Because of the difficulty of determining the nature of a synchronous RCU grace (expedited or not), the test that insisted that poll_state_synchronize_rcu() detect an intervening synchronize_rcu() had to be dropped. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-24 17:17:38 -07:00
Paul E. McKenney	0909fc2b2c	rcu: Provide polling interfaces for Tiny RCU grace periods There is a need for a non-blocking polling interface for RCU grace periods, so this commit supplies start_poll_synchronize_rcu() and poll_state_synchronize_rcu() for this purpose. Note that the existing get_state_synchronize_rcu() may be used if future grace periods are inevitable (perhaps due to a later call_rcu() invocation). The new start_poll_synchronize_rcu() is to be used if future grace periods might not otherwise happen. Finally, poll_state_synchronize_rcu() provides a lockless check for a grace period having elapsed since the corresponding call to either of the get_state_synchronize_rcu() or start_poll_synchronize_rcu(). As with get_state_synchronize_rcu(), the return value from either get_state_synchronize_rcu() or start_poll_synchronize_rcu() is passed in to a later call to either poll_state_synchronize_rcu() or the existing (might_sleep) cond_synchronize_rcu(). [ paulmck: Revert cond_synchronize_rcu() to might_sleep() per Frederic Weisbecker feedback. ] Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-24 17:16:15 -07:00
Paul E. McKenney	114e4a4b48	torture: Fix kvm.sh --datestamp regex check Some versions of grep are happy to interpret a nonsensically placed "-" within a "[]" pattern as a dash, while others give an error message. This commit therefore places the "-" at the end of the expression where it was supposed to be in the first place. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-22 08:29:21 -07:00
Paul E. McKenney	a1ab2e89f3	torture: Consolidate qemu-cmd duration editing into kvm-transform.sh Currently, kvm-again.sh updates the duration in the "seconds=" comment in the qemu-cmd file, but kvm-transform.sh updates the duration in the actual qemu command arguments. This is an accident waiting to happen. This commit therefore consolidates these updates into kvm-transform.sh. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-22 08:29:21 -07:00
Paul E. McKenney	03edf700db	torture: Print proper vmlinux path for kvm-again.sh runs The kvm-again.sh script does not copy over the vmlinux files due to their large size. This means that a gdb run must use the vmlinux file from the original "res" directory. This commit therefore finds that directory and prints it out so that the user can copy and pasted the gdb command just as for the initial run. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-22 08:29:20 -07:00
Paul E. McKenney	a5dbe2524f	torture: Make TORTURE_TRUST_MAKE available in kvm-again.sh environment Because the TORTURE_TRUST_MAKE environment variable is not recorded, kvm-again.sh runs can result in the parse-build.sh script emitting false-positive "BUG: TREE03 no build" messages. These messages are intended to complain about any lack of compiler invocations when the --trust-make flag is not given to kvm.sh. However, when this flag is given to kvm.sh (and thus when TORTURE_TRUST_MAKE=y), lack of compiler invocations is expected behavior when rebuilding from identical source code. This commit therefore makes kvm-test-1-run.sh record the value of the TORTURE_TRUST_MAKE environment variable as an additional comment in the qemu-cmd file, and also makes kvm-again.sh reconstitute that variable from that comment. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-22 08:29:20 -07:00
Paul E. McKenney	018629e909	torture: Make kvm-transform.sh update jitter commands When rerunning an old run using kvm-again.sh, the jitter commands will re-use the original "res" directory. This works, but is clearly an accident waiting to happen. And this accident will happen with remote runs, where the original directory lives on some other system. This commit therefore updates the qemu-cmd commands to use the new res directory created for this specific run. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-22 08:29:20 -07:00
Paul E. McKenney	00505165cf	torture: Add --duration argument to kvm-again.sh This commit adds a --duration argument to kvm-again.sh to allow the user to override the --duration specified for the original kvm.sh run. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-22 08:29:19 -07:00
Paul E. McKenney	7cf86c0b62	torture: Add kvm-again.sh to rerun a previous torture-test This commit adds a kvm-again.sh script that, given the results directory of a torture-test run, re-runs that test. This means that the kernels need not be rebuilt, but it also is a step towards running torture tests on remote systems. This commit also adds a kvm-test-1-run-batch.sh script that runs one batch out of the torture test. The idea is to copy a results directory tree to remote systems, then use kvm-test-1-run-batch.sh to run batches on these systems. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-22 08:29:19 -07:00
Paul E. McKenney	d6100d764c	torture: Create a "batches" file for build reuse This commit creates a "batches" file in the res/$ds directory, where $ds is the datestamp. This file contains the batches and the number of CPUs, for example: 1 TREE03 16 1 SRCU-P 8 2 TREE07 16 2 TREE01 8 3 TREE02 8 3 TREE04 8 3 TREE05 8 4 SRCU-N 4 4 TRACE01 4 4 TRACE02 4 4 RUDE01 2 4 RUDE01.2 2 4 TASKS01 2 4 TASKS03 2 4 SRCU-t 1 4 SRCU-u 1 4 TASKS02 1 4 TINY01 1 5 TINY02 1 5 TREE09 1 The first column is the batch number, the second the scenario number (possibly suffixed by a repetition number, as in "RUDE01.2"), and the third is the number of CPUs required by that scenario. The last line shows the number of CPUs expected by this batch file, which allows the run to be re-batched if a different number of CPUs is available. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-22 08:29:18 -07:00
Paul E. McKenney	7ef0d5a33c	torture: De-capitalize TORTURE_SUITE Although it might be unlikely that someone would name a scenario "TORTURE_SUITE", they are within their rights to do so. This script therefore renames the "TORTURE_SUITE" file in the top-level date-stamped directory within "res" to "torture_suite" to avoid this name collision. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-22 08:29:18 -07:00
Paul E. McKenney	e633e63aa9	torture: Make upper-case-only no-dot no-slash scenario names official This commit enforces the defacto restriction on scenario names, which is that they contain neither "/", ".", nor lowercase alphabetic characters. This restriction avoids collisions between scenario names and the torture scripting's files and directories. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-22 08:29:18 -07:00
Paul E. McKenney	00a447fabb	torture: Rename SRCU-t and SRCU-u to avoid lowercase characters The convention that scenario names are all uppercase has two exceptions, SRCU-t and SRCU-u. This commit therefore renames them to SRCU-T and SRCU-U, respectively, to bring them in line with this convention. This in turn permits tighter argument checking in the torture-test scripting. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-22 08:29:17 -07:00
Paul E. McKenney	996a042e0a	torture: Remove no-mpstat error message The cpus2use.sh script complains if the mpstat command is not available, and instead uses all available CPUs. Unfortunately, this complaint goes to stdout, where it confuses invokers who expect a single number. This commit removes this error message in order to avoid this confusion. The tendency of late has been to give rcutorture a full system, so this should not cause issues. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-22 08:29:17 -07:00
Paul E. McKenney	cb1fa863a0	torture: Record kvm-test-1-run.sh and kvm-test-1-run-qemu.sh PIDs This commit records the process IDs of the kvm-test-1-run.sh and kvm-test-1-run-qemu.sh scripts to ease monitoring of remotely running instances of these scripts. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-22 08:29:17 -07:00
Paul E. McKenney	7831b391fb	torture: Record jitter start/stop commands Distributed runs of rcutorture will need to start and stop jittering on the remote hosts, which means that the commands must be communicated to those hosts. The commit therefore causes kvm.sh to place these commands in new TORTURE_JITTER_START and TORTURE_JITTER_STOP environment variables to communicate them to the scripts that will set this up. In addition, this commit causes kvm-test-1-run.sh to append these commands to each generated qemu-cmd file, which allows any remotely executing script to extract the needed commands from this file. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-22 08:29:16 -07:00
Paul E. McKenney	d53f52d6fc	torture: Extract kvm-test-1-run-qemu.sh from kvm-test-1-run.sh Currently, kvm-test-1-run.sh both builds and runs an rcutorture kernel, which is inconvenient when it is necessary to re-run an old run or to carry out a run on a remote system. This commit therefore extracts the portion of kvm-test-1-run.sh that invoke qemu to actually run rcutorture and places it in kvm-test-1-run-qemu.sh. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-22 08:29:16 -07:00
Paul E. McKenney	cc45716e07	torture: Record TORTURE_KCONFIG_GDB_ARG in qemu-cmd When re-running old rcutorture builds, if the original run involved gdb, the re-run also needs to do so. This commit therefore records the TORTURE_KCONFIG_GDB_ARG environment variable into the qemu-cmd file so that the re-run can access it. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-22 08:29:15 -07:00
Paul E. McKenney	040accb3cd	torture: Abstract jitter.sh start/stop into scripts This commit creates jitterstart.sh and jitterstop.sh scripts that handle the starting and stopping of the jitter.sh scripts. These must be sourced using the bash "." command to allow the generated script to wait on the backgrounded jitter.sh scripts. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-22 08:28:34 -07:00
Paul E. McKenney	7abb18bd75	rcu: Provide polling interfaces for Tree RCU grace periods There is a need for a non-blocking polling interface for RCU grace periods, so this commit supplies start_poll_synchronize_rcu() and poll_state_synchronize_rcu() for this purpose. Note that the existing get_state_synchronize_rcu() may be used if future grace periods are inevitable (perhaps due to a later call_rcu() invocation). The new start_poll_synchronize_rcu() is to be used if future grace periods might not otherwise happen. Finally, poll_state_synchronize_rcu() provides a lockless check for a grace period having elapsed since the corresponding call to either of the get_state_synchronize_rcu() or start_poll_synchronize_rcu(). As with get_state_synchronize_rcu(), the return value from either get_state_synchronize_rcu() or start_poll_synchronize_rcu() is passed in to a later call to either poll_state_synchronize_rcu() or the existing (might_sleep) cond_synchronize_rcu(). [ paulmck: Remove redundant smp_mb() per Frederic Weisbecker feedback. ] [ Update poll_state_synchronize_rcu() docbook per Frederic Weisbecker feedback. ] Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-22 08:23:48 -07:00
Frederic Weisbecker	e02691b7ef	rcu/nocb: Move trace_rcu_nocb_wake() calls outside nocb_lock when possible Those tracing calls don't need to be under ->nocb_lock. This commit therefore moves them outside of that lock. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Neeraj Upadhyay <neeraju@codeaurora.org> Cc: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-15 13:54:55 -07:00
Frederic Weisbecker	0efdf14a9f	rcu/nocb: Remove stale comment above rcu_segcblist_offload() This commit removes a stale comment claiming that the cblist must be empty before changing the offloading state. This claim was correct back when the offloaded state was defined exclusively at boot. Reported-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Neeraj Upadhyay <neeraju@codeaurora.org> Cc: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-15 13:54:54 -07:00
Frederic Weisbecker	76d00b494d	rcu/nocb: Disable bypass when CPU isn't completely offloaded Currently, the bypass is flushed at the very last moment in the deoffloading procedure. However, this approach leads to a larger state space than would be preferred. This commit therefore disables the bypass at soon as the deoffloading procedure begins, then flushes it. This guarantees that the bypass remains empty and thus out of the way of the deoffloading procedure. Symmetrically, this commit waits to enable the bypass until the offloading procedure has completed. Reported-by: Paul E. McKenney <paulmck@kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Neeraj Upadhyay <neeraju@codeaurora.org> Cc: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-15 13:54:54 -07:00
Frederic Weisbecker	b2fcf21020	rcu/nocb: Fix missed nocb_timer requeue This sequence of events can lead to a failure to requeue a CPU's ->nocb_timer: 1. There are no callbacks queued for any CPU covered by CPU 0-2's ->nocb_gp_kthread. Note that ->nocb_gp_kthread is associated with CPU 0. 2. CPU 1 enqueues its first callback with interrupts disabled, and thus must defer awakening its ->nocb_gp_kthread. It therefore queues its rcu_data structure's ->nocb_timer. At this point, CPU 1's rdp->nocb_defer_wakeup is RCU_NOCB_WAKE. 3. CPU 2, which shares the same ->nocb_gp_kthread, also enqueues a callback, but with interrupts enabled, allowing it to directly awaken the ->nocb_gp_kthread. 4. The newly awakened ->nocb_gp_kthread associates both CPU 1's and CPU 2's callbacks with a future grace period and arranges for that grace period to be started. 5. This ->nocb_gp_kthread goes to sleep waiting for the end of this future grace period. 6. This grace period elapses before the CPU 1's timer fires. This is normally improbably given that the timer is set for only one jiffy, but timers can be delayed. Besides, it is possible that kernel was built with CONFIG_RCU_STRICT_GRACE_PERIOD=y. 7. The grace period ends, so rcu_gp_kthread awakens the ->nocb_gp_kthread, which in turn awakens both CPU 1's and CPU 2's ->nocb_cb_kthread. Then ->nocb_gb_kthread sleeps waiting for more newly queued callbacks. 8. CPU 1's ->nocb_cb_kthread invokes its callback, then sleeps waiting for more invocable callbacks. 9. Note that neither kthread updated any ->nocb_timer state, so CPU 1's ->nocb_defer_wakeup is still set to RCU_NOCB_WAKE. 10. CPU 1 enqueues its second callback, this time with interrupts enabled so it can wake directly ->nocb_gp_kthread. It does so with calling wake_nocb_gp() which also cancels the pending timer that got queued in step 2. But that doesn't reset CPU 1's ->nocb_defer_wakeup which is still set to RCU_NOCB_WAKE. So CPU 1's ->nocb_defer_wakeup and its ->nocb_timer are now desynchronized. 11. ->nocb_gp_kthread associates the callback queued in 10 with a new grace period, arranges for that grace period to start and sleeps waiting for it to complete. 12. The grace period ends, rcu_gp_kthread awakens ->nocb_gp_kthread, which in turn wakes up CPU 1's ->nocb_cb_kthread which then invokes the callback queued in 10. 13. CPU 1 enqueues its third callback, this time with interrupts disabled so it must queue a timer for a deferred wakeup. However the value of its ->nocb_defer_wakeup is RCU_NOCB_WAKE which incorrectly indicates that a timer is already queued. Instead, CPU 1's ->nocb_timer was cancelled in 10. CPU 1 therefore fails to queue the ->nocb_timer. 14. CPU 1 has its pending callback and it may go unnoticed until some other CPU ever wakes up ->nocb_gp_kthread or CPU 1 ever calls an explicit deferred wakeup, for example, during idle entry. This commit fixes this bug by resetting rdp->nocb_defer_wakeup everytime we delete the ->nocb_timer. It is quite possible that there is a similar scenario involving ->nocb_bypass_timer and ->nocb_defer_wakeup. However, despite some effort from several people, a failure scenario has not yet been located. However, that by no means guarantees that no such scenario exists. Finding a failure scenario is left as an exercise for the reader, and the "Fixes:" tag below relates to ->nocb_bypass_timer instead of ->nocb_timer. Fixes: d1b222c6be1f (rcu/nocb: Add bypass callback queueing) Cc: <stable@vger.kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Joel Fernandes <joel@joelfernandes.org> Cc: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-15 13:54:54 -07:00
Jiapeng Chong	9640dcab97	rcu: Make nocb_nobypass_lim_per_jiffy static RCU triggerse the following sparse warning: kernel/rcu/tree_plugin.h:1497:5: warning: symbol 'nocb_nobypass_lim_per_jiffy' was not declared. Should it be static? This commit therefore makes this variable static. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Reported-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-15 13:54:42 -07:00
Sangmoon Kim	565cfb9e64	rcu/tree: Add a trace event for RCU CPU stall warnings This commit adds a trace event which allows tracing the beginnings of RCU CPU stall warnings on systems where sysctl_panic_on_rcu_stall is disabled. The first parameter is the name of RCU flavor like other trace events. The second parameter indicates whether this is a stall of an expedited grace period, a self-detected stall of a normal grace period, or a stall of a normal grace period detected by some CPU other than the one that is stalled. RCU CPU stall warnings are often caused by external-to-RCU issues, for example, in interrupt handling or task scheduling. Therefore, this event uses TRACE_EVENT, not TRACE_EVENT_RCU, to avoid requiring those interested in tracing RCU CPU stalls to rebuild their kernels with CONFIG_RCU_TRACE=y. Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Sangmoon Kim <sangmoon.kim@samsung.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-15 13:53:24 -07:00
Paul E. McKenney	7e937220af	rcu: Add explicit barrier() to __rcu_read_unlock() Because preemptible RCU's __rcu_read_unlock() is an external function, the rough equivalent of an implicit barrier() is inserted by the compiler. Except that there is a direct call to __rcu_read_unlock() in that same file, and compilers are getting to the point where they might choose to inline the fastpath of the __rcu_read_unlock() function. This commit therefore adds an explicit barrier() to the very beginning of __rcu_read_unlock(). Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-15 13:53:24 -07:00
Paul E. McKenney	e589c7c723	docs: Correctly spell Stephen Hemminger's name This commit replaces "Steve" with the his real name, which is "Stephen". Reported-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-15 13:53:24 -07:00
Paul E. McKenney	1c0c4bc1ce	softirq: Don't try waking ksoftirqd before it has been spawned If there is heavy softirq activity, the softirq system will attempt to awaken ksoftirqd and will stop the traditional back-of-interrupt softirq processing. This is all well and good, but only if the ksoftirqd kthreads already exist, which is not the case during early boot, in which case the system hangs. One reproducer is as follows: tools/testing/selftests/rcutorture/bin/kvm.sh --allcpus --duration 2 --configs "TREE03" --kconfig "CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y CONFIG_NO_HZ_IDLE=y CONFIG_HZ_PERIODIC=n" --bootargs "threadirqs=1" --trust-make This commit therefore adds a couple of existence checks for ksoftirqd and forces back-of-interrupt softirq processing when ksoftirqd does not yet exist. With this change, the above test passes. Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reported-by: Uladzislau Rezki <urezki@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> [ paulmck: Remove unneeded check per Sebastian Siewior feedback. ] Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-15 13:51:48 -07:00
Paul E. McKenney	4cd54518c3	torture: Reverse jittering and duration parameters for jitter.sh Remote rcutorture testing requires that jitter.sh continue to be invoked from the generated script for local runs, but that it instead be invoked on the remote system for distributed runs. This argues for common jitterstart and jitterstop scripts. But it would be good for jitterstart and jitterstop to control the name and location of the "jittering" file, while continuing to have the duration controlled by the caller of these new scripts. This commit therefore reverses the order of the jittering and duration parameters for jitter.sh, so that the jittering parameter precedes the duration parameter. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-08 14:23:01 -08:00
Paul E. McKenney	1f922db8ee	torture: Eliminate jitter_pids file Now that there is a reliable way to convince the jitter.sh scripts to stop, the jitter_pids file is not needed, nor is the code that kills all the PIDs contained in this file. This commit therefore eliminates this file and the code using it. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-08 14:23:01 -08:00
Paul E. McKenney	37812c9429	torture: Use "jittering" file to control jitter.sh execution Currently, jitter.sh execution is controlled by a time limit and by the "kill" command. The former allowed jitter.sh to run uselessly past the end of a set of runs that panicked during boot, and the latter is vulnerable to PID reuse. This commit therefore introduces a "jittering" file in the date-stamp directory within "res" that must be present for the jitter.sh scripts to continue executing. The time limit is still in place in order to avoid disturbing runs featuring large trace dumps, but the removal of the "jittering" file handles the panic-during-boot scenario without relying on PIDs. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-08 14:23:01 -08:00
Paul E. McKenney	b674100e63	torture: Use file-based protocol to mark batch's runs complete Currently, the script generated by kvm.sh does a "wait" to wait on both the current batch's guest OSes and any jitter.sh scripts. This works, but makes it hard to abstract the jittering so that common code can be used for both local and distributed runs. This commit therefore uses "build.run" files in scenario directories, and these files are removed after the corresponding scenario's guest OS has completed. Note that --build-only runs do not create build.run files because they also do not create guest OSes and do not run any jitter.sh scripts. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-08 14:23:01 -08:00
Paul E. McKenney	3c43ce53fd	torture: Move build/run synchronization files into scenario directories Currently the bN.ready and bN.wait files are placed in the rcutorture directory, which really is not at all a good place for run-specific files. This commit therefore renames these files to build.ready and build.wait and then moves them into the scenario directories within the "res" directory, for example, into tools/testing/selftests/rcutorture/res/2021.02.10-15.08.23/TINY01. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-08 14:23:01 -08:00
Paul E. McKenney	aebf8c7bf6	refscale: Disable verbose torture-test output Given large numbers of threads, the quantity of torture-test output is sufficient to sometimes result in RCU CPU stall warnings. The probability of these stall warnings was greatly reduced by batching the output, but the warnings were not eliminated. However, the actual test only depends on console output that is printed even when refscale.verbose=0. This commit therefore causes this test to run with refscale.verbose=0. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-08 14:23:01 -08:00
Paul E. McKenney	0e7457b550	rcuscale: Disable verbose torture-test output Given large numbers of threads, the quantity of torture-test output is sufficient to sometimes result in RCU CPU stall warnings. The probability of these stall warnings was greatly reduced by batching the output, but the warnings were not eliminated. However, the actual test only depends on console output that is printed even when rcuscale.verbose=0. This commit therefore causes this test to run with rcuscale.verbose=0. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-08 14:23:01 -08:00
Paul E. McKenney	f9d2f1e2c4	torture: Improve readability of the testid.txt file The testid.txt file was intended for occasional in extremis use, but now that the new "bare-metal" file references it, it might see more use. This commit therefore labels sections of output and adds spacing to make it easier to see what needs to be done to make a bare-metal build tree match an rcutorture build tree. Of course, you can avoid this whole issue by building your bare-metal kernel in the same directory in which you ran rcutorture, but that might not always be an option. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-08 14:23:01 -08:00
Paul E. McKenney	a8dafbf3a5	torture: Provide bare-metal modprobe-based advice In some environments, the torture-testing use of virtualization is inconvenient. In such cases, the modprobe and rmmod commands may be used to do torture testing, but significant setup is required to build, boot, and modprobe a kernel so as to match a given torture-test scenario. This commit therefore creates a "bare-metal" file in each results directory containing steps to run the corresponding scenario using the modprobe command on bare metal. For example, the contents of this file after using kvm.sh to build an rcutorture TREE01 kernel, perhaps with the --buildonly argument, is as follows: To run this scenario on bare metal: 1. Set your bare-metal build tree to the state shown in this file: /home/git/linux-rcu/tools/testing/selftests/rcutorture/res/2021.02.04-17.10.19/testid.txt 2. Update your bare-metal build tree's .config based on this file: /home/git/linux-rcu/tools/testing/selftests/rcutorture/res/2021.02.04-17.10.19/TREE01/ConfigFragment 3. Make the bare-metal kernel's build system aware of your .config updates: $ yes "" \| make oldconfig 4. Build your bare-metal kernel. 5. Boot your bare-metal kernel with the following parameters: maxcpus=8 nr_cpus=43 rcutree.gp_preinit_delay=3 rcutree.gp_init_delay=3 rcutree.gp_cleanup_delay=3 rcu_nocbs=0-1,3-7 6. Start the test with the following command: $ modprobe rcutorture nocbs_nthreads=8 nocbs_toggle=1000 fwd_progress=0 onoff_interval=1000 onoff_holdoff=30 n_barrier_cbs=4 stat_interval=15 shutdown_secs=120 test_no_idle_hz=1 verbose=1 7. After some time, end the test with the following command: $ rmmod rcutorture 8. Copy your bare-metal kernel's .config file, overwriting this file: /home/git/linux-rcu/tools/testing/selftests/rcutorture/res/2021.02.04-17.10.19/TREE01/.config 9. Copy the console output from just before the modprobe to just after the rmmod into this file: /home/git/linux-rcu/tools/testing/selftests/rcutorture/res/2021.02.04-17.10.19/TREE01/console.log 10. Check for runtime errors using the following command: $ tools/testing/selftests/rcutorture/bin/kvm-recheck.sh /home/git/linux-rcu/tools/testing/selftests/rcutorture/res/2021.02.04-17.10.19 Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-08 14:23:01 -08:00
Paul E. McKenney	3d4977b681	torture: Allow 1G of memory for torture.sh kvfree testing Yes, I do recall a time when 512MB of memory was a lot of mass storage, much less main memory, but the rcuscale kvfree_rcu() testing invoked by torture.sh can sometimes exceed it on large systems, resulting in OOM. This commit therefore causes torture.sh to pase the "--memory 1G" argument to kvm.sh to reserve a full gigabyte for this purpose. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-08 14:23:01 -08:00
Paul E. McKenney	a519d21480	torturescript: Don't rerun failed rcutorture builds If the build fails when running multiple instances of a given rcutorture scenario, for example, using the kvm.sh --configs "8*RUDE01" argument, the build will be rerun an additional seven times. This is in some sense correct, but it can waste significant time. This commit therefore checks for a prior failed build and simply copies over that build's output. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-08 14:23:01 -08:00
Stephen Zhang	0a27fff30a	rcutorture: Replace rcu_torture_stall string with %s This commit replaces a hard-coded "rcu_torture_stall" string in a pr_alert() format with "%s" and __func__. Signed-off-by: Stephen Zhang <stephenzhangzsd@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-08 14:22:28 -08:00
Stephen Zhang	4ac9de07b2	torture: Replace torture_init_begin string with %s This commit replaces a hard-coded "torture_init_begin" string in a pr_alert() format with "%s" and __func__. Signed-off-by: Stephen Zhang <stephenzhangzsd@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-08 14:22:28 -08:00
Paul E. McKenney	a434dd10cd	rcu-tasks: Add block comment laying out RCU Tasks Trace design This commit adds a block comment that gives a high-level overview of how RCU tasks trace grace periods progress. It also adds a note about how exiting tasks are handled, plus it gives an overview of the memory ordering. Reported-by: Peter Zijlstra <peterz@infradead.org> Reported-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> [ paulmck: Fix commit log per Mathieu Desnoyers feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-08 14:22:02 -08:00
Lukas Bulwahn	85b8699428	rcu-tasks: Rectify kernel-doc for struct rcu_tasks The command 'find ./kernel/rcu/ \| xargs ./scripts/kernel-doc -none' reported an issue with the kernel-doc of struct rcu_tasks. This commit rectifies the kernel-doc, such that no issues remain for ./kernel/rcu/. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-08 14:22:02 -08:00
Paul E. McKenney	8126c57f00	torture: Make jitter.sh handle large systems The current jitter.sh script expects cpumask bits to fit into whatever the awk interpreter uses for an integer, which clearly does not hold for even medium-sized systems these days. This means that on a large system, only the first 32 or 64 CPUs (depending) are subjected to jitter.sh CPU-time perturbations. This commit therefore computes a given CPU's cpumask using text manipulation rather than arithmetic shifts. Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-08 14:21:41 -08:00
Paul E. McKenney	7308e02404	rcu: Make rcu_read_unlock_special() expedite strict grace periods In kernels built with CONFIG_RCU_STRICT_GRACE_PERIOD=y, every grace period is an expedited grace period. However, rcu_read_unlock_special() does not treat them that way, instead allowing the deferred quiescent state to be reported whenever. This commit therefore adds a check of this Kconfig option that causes rcu_read_unlock_special() to treat all grace periods as expedited for CONFIG_RCU_STRICT_GRACE_PERIOD=y kernels. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-08 14:21:41 -08:00
Paul E. McKenney	5e59fba573	rcutorture: Fix testing of RCU priority boosting Currently, rcutorture refuses to test RCU priority boosting in CONFIG_HOTPLUG_CPU=y kernels, which are the only kind normally built on x86 these days. This commit therefore updates rcutorture's tests of RCU priority boosting to make them safe for CPU hotplug. However, these tests will fail unless TIMER_SOFTIRQ runs at realtime priority, which does not happen in current mainline. This commit therefore also refuses to test RCU priority boosting except in kernels built with CONFIG_PREEMPT_RT=y. While in the area, this commt adds some debug output at boost-fail time that helps diagnose the cause of the failure, for example, failing to run TIMER_SOFTIRQ at realtime priority. Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Scott Wood <swood@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-08 14:21:41 -08:00
Paul E. McKenney	e2b949d543	rcutorture: Make TREE03 use real-time tree.use_softirq setting TREE03 tests RCU priority boosting, which is a real-time feature. It would also be good if it tested something closer to what is actually used by the real-time folks. This commit therefore adds tree.use_softirq=0 to the TREE03 kernel boot parameters in TREE03.boot. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-08 14:21:40 -08:00
Paul E. McKenney	39bbfc62cc	rcu: Expedite deboost in case of deferred quiescent state Historically, a task that has been subjected to RCU priority boosting is deboosted at rcu_read_unlock() time. However, with the advent of deferred quiescent states, if the outermost rcu_read_unlock() was invoked with either bottom halves, interrupts, or preemption disabled, the deboosting will be delayed for some time. During this time, a low-priority process might be incorrectly running at a high real-time priority level. Fortunately, rcu_read_unlock_special() already provides mechanisms for forcing a minimal deferral of quiescent states, at least for kernels built with CONFIG_IRQ_WORK=y. These mechanisms are currently used when expedited grace periods are pending that might be blocked by the current task. This commit therefore causes those mechanisms to also be used in cases where the current task has been or might soon be subjected to RCU priority boosting. Note that this applies to all kernels built with CONFIG_RCU_BOOST=y, regardless of whether or not they are also built with CONFIG_PREEMPT_RT=y. This approach assumes that kernels build for use with aggressive real-time applications are built with CONFIG_IRQ_WORK=y. It is likely to be far simpler to enable CONFIG_IRQ_WORK=y than to implement a fast-deboosting scheme that works correctly in its absence. While in the area, alphabetize the rcu_preempt_deferred_qs_handler() function's local variables. Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Scott Wood <swood@redhat.com> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2021-03-08 14:21:40 -08:00

1 2 3 4 5 ...

996236 Commits