linux

iv/linux

Author	SHA1	Message	Date
Wei Yang	7a0c2b0940	rcu: grpnum just records group number The ->grpnum field in the rcu_node structure contains the bit position in this structure's parent's bitmasks, which is not the CPU number. This commit therefore adjusts this field's comment accordingly. Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-06-29 11:58:51 -07:00
Wei Yang	a2dae43088	rcu: grplo/grphi just records CPU number The ->grplo and ->grphi fields store the lowest and highest CPU number covered by to a rcu_node structure, which is not the group number. This commit therefore adjusts these fields' comments to match reality. Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-06-29 11:58:51 -07:00
Wei Yang	00943a609d	rcu: gp_max is protected by root rcu_node's lock Because gp_max is protected by root rcu_node's lock, this commit moves the gp_max definition to the region of the rcu_node structure containing fields protected by this lock. Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-06-29 11:58:51 -07:00
Lihao Liang	360fbbb489	rcu: Update comment from rsp->rcu_gp_seq to rsp->gp_seq Signed-off-by: Lihao Liang <lihaoliang@google.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-06-29 11:58:50 -07:00
Paul E. McKenney	e816d56fad	rcu: Add callbacks-invoked counters This commit adds a count of the callbacks invoked to the per-CPU rcu_data structure. This count is printed by the show_rcu_gp_kthreads() that is invoked by rcutorture and the RCU CPU stall-warning code. It is also intended for use by drgn. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-06-29 11:58:49 -07:00
Paul E. McKenney	f736e0f1a5	Merge branches 'fixes.2020.04.27a', 'kfree_rcu.2020.04.27a', 'rcu-tasks.2020.04.27a', 'stall.2020.04.27a' and 'torture.2020.05.07a' into HEAD fixes.2020.04.27a: Miscellaneous fixes. kfree_rcu.2020.04.27a: Changes related to kfree_rcu(). rcu-tasks.2020.04.27a: Addition of new RCU-tasks flavors. stall.2020.04.27a: RCU CPU stall-warning updates. torture.2020.05.07a: Torture-test updates.	2020-05-07 10:18:32 -07:00
Paul E. McKenney	7d0c9c50c5	rcu-tasks: Avoid IPIing userspace/idle tasks if kernel is so built Systems running CPU-bound real-time task do not want IPIs sent to CPUs executing nohz_full userspace tasks. Battery-powered systems don't want IPIs sent to idle CPUs in low-power mode. Unfortunately, RCU tasks trace can and will send such IPIs in some cases. Both of these situations occur only when the target CPU is in RCU dyntick-idle mode, in other words, when RCU is not watching the target CPU. This suggests that CPUs in dyntick-idle mode should use memory barriers in outermost invocations of rcu_read_lock_trace() and rcu_read_unlock_trace(), which would allow the RCU tasks trace grace period to directly read out the target CPU's read-side state. One challenge is that RCU tasks trace is not targeting a specific CPU, but rather a task. And that task could switch from one CPU to another at any time. This commit therefore uses try_invoke_on_locked_down_task() and checks for task_curr() in trc_inspect_reader_notrunning(). When this condition holds, the target task is running and cannot move. If CONFIG_TASKS_TRACE_RCU_READ_MB=y, the new rcu_dynticks_zero_in_eqs() function can be used to check if the specified integer (in this case, t->trc_reader_nesting) is zero while the target CPU remains in that same dyntick-idle sojourn. If so, the target task is in a quiescent state. If not, trc_read_check_handler() must indicate failure so that the grace-period kthread can take appropriate action or retry after an appropriate delay, as the case may be. With this change, given CONFIG_TASKS_TRACE_RCU_READ_MB=y, if a given CPU remains idle or a given task continues executing in nohz_full mode, the RCU tasks trace grace-period kthread will detect this without the need to send an IPI. Suggested-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-04-27 11:03:52 -07:00
Paul E. McKenney	1fca4d12f4	rcu: Expedite first two FQS scans under callback-overload conditions Even if some CPUs have excessive numbers of callbacks, RCU's grace-period kthread will still wait normally between successive force-quiescent-state scans. The first two are the most important, as they are the ones that enlist aid from the scheduler when overloaded. This commit therefore omits the wait before the first and the second force-quiescent-state scan under callback-overload conditions. This approach was inspired by a discussion with Jeff Roberson. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-04-27 11:01:16 -07:00
Paul E. McKenney	b2b00ddf19	rcu: React to callback overload by aggressively seeking quiescent states In default configutions, RCU currently waits at least 100 milliseconds before asking cond_resched() and/or resched_rcu() for help seeking quiescent states to end a grace period. But 100 milliseconds can be one good long time during an RCU callback flood, for example, as can happen when user processes repeatedly open and close files in a tight loop. These 100-millisecond gaps in successive grace periods during a callback flood can result in excessive numbers of callbacks piling up, unnecessarily increasing memory footprint. This commit therefore asks cond_resched() and/or resched_rcu() for help as early as the first FQS scan when at least one of the CPUs has more than 20,000 callbacks queued, a number that can be changed using the new rcutree.qovld kernel boot parameter. An auxiliary qovld_calc variable is used to avoid acquisition of locks that have not yet been initialized. Early tests indicate that this reduces the RCU-callback memory footprint during rcutorture floods by from 50% to 4x, depending on configuration. Reported-by: Joel Fernandes (Google) <joel@joelfernandes.org> Reported-by: Tejun Heo <tj@kernel.org> [ paulmck: Fix bug located by Qian Cai. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Tested-by: Dexuan Cui <decui@microsoft.com> Tested-by: Qian Cai <cai@lca.pw>	2020-02-20 16:00:20 -08:00
Paul E. McKenney	0e247386d9	Merge branches 'doc.2019.12.10a', 'exp.2019.12.09a', 'fixes.2020.01.24a', 'kfree_rcu.2020.01.24a', 'list.2020.01.10a', 'preempt.2020.01.24a' and 'torture.2019.12.09a' into HEAD doc.2019.12.10a: Documentations updates exp.2019.12.09a: Expedited grace-period updates fixes.2020.01.24a: Miscellaneous fixes kfree_rcu.2020.01.24a: Batch kfree_rcu() work list.2020.01.10a: RCU-protected-list updates preempt.2020.01.24a: Preemptible RCU updates torture.2019.12.09a: Torture-test updates	2020-01-24 10:37:27 -08:00
Paul E. McKenney	f6105fc2a9	rcu: Remove unused stop-machine #include Long ago, RCU used the stop-machine mechanism to implement expedited grace periods, but no longer does so. This commit therefore removes the no-longer-needed #includes of linux/stop_machine.h. Link: https://lwn.net/Articles/805317/ Reported-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-01-24 10:33:52 -08:00
Lai Jiangshan	e2167b38c8	rcu: Move gp_state_names[] and gp_state_getname() to tree_stall.h Only tree_stall.h needs to get name from GP state, so this commit moves the gp_state_names[] array and the gp_state_getname() from kernel/rcu/tree.h and kernel/rcu/tree.c, respectively, to kernel/rcu/tree_stall.h. While moving gp_state_names[], this commit uses the GCC syntax to ensure that the right string is associated with the right CPP macro. Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-01-24 10:33:45 -08:00
Lai Jiangshan	4778339df0	rcu: Remove the declaration of call_rcu() in tree.h The call_rcu() function is an external RCU API that is declared in include/linux/rcupdate.h. There is thus no point in redeclaring it in kernel/rcu/tree.h, so this commit removes that redundant declaration. Signed-off-by: Lai Jiangshan <jiangshanlai@gmail.com> Signed-off-by: Lai Jiangshan <laijs@linux.alibaba.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-01-24 10:33:38 -08:00
Joel Fernandes (Google)	77a40f9703	rcu: Remove kfree_rcu() special casing and lazy-callback handling This commit removes kfree_rcu() special-casing and the lazy-callback handling from Tree RCU. It moves some of this special casing to Tiny RCU, the removal of which will be the subject of later commits. This results in a nice negative delta. Suggested-by: Paul E. McKenney <paulmck@linux.ibm.com> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> [ paulmck: Add slab.h #include, thanks to kbuild test robot <lkp@intel.com>. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2020-01-24 10:24:31 -08:00
Paul E. McKenney	c30fe54189	rcu: Mark non-global functions and variables as static Each of rcu_state, rcu_rnp_online_cpus(), rcu_dynticks_curr_cpu_in_eqs(), and rcu_dynticks_snap() are used only in the kernel/rcu/tree.o translation unit, and may thus be marked static. This commit therefore makes this change. Reported-by: Ben Dooks <ben.dooks@codethink.co.uk> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>	2019-12-12 10:24:52 -08:00
Paul E. McKenney	df1e849ae4	rcu: Enable tick for nohz_full CPUs slow to provide expedited QS An expedited grace period can be stalled by a nohz_full CPU looping in kernel context. This possibility is currently handled by some carefully crafted checks in rcu_read_unlock_special() that enlist help from ksoftirqd when permitted by the scheduler. However, it is exactly these checks that require the scheduler avoid holding any of its rq or pi locks across rcu_read_unlock() without also having held them across the entire RCU read-side critical section. It would therefore be very nice if expedited grace periods could handle nohz_full CPUs looping in kernel context without such checks. This commit therefore adds code to the expedited grace period's wait and cleanup code that forces the scheduler-clock interrupt on for CPUs that fail to quickly supply a quiescent state. "Quickly" is currently a hard-coded single-jiffy delay. Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2019-12-09 12:32:59 -08:00
Paul E. McKenney	66e4c33b51	rcu: Force tick on for nohz_full CPUs not reaching quiescent states CPUs running for long time periods in the kernel in nohz_full mode might leave the scheduling-clock interrupt disabled for then full duration of their in-kernel execution. This can (among other things) delay grace periods. This commit therefore forces the tick back on for any nohz_full CPU that is failing to pass through a quiescent state upon return from interrupt, which the resched_cpu() will induce. Reported-by: Joel Fernandes <joel@joelfernandes.org> [ paulmck: Clear ->rcu_forced_tick as reported by Joel Fernandes testing. ] [ paulmck: Apply Joel Fernandes TICK_DEP_MASK_RCU->TICK_DEP_BIT_RCU fix. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org>	2019-10-28 07:02:21 -07:00
Paul E. McKenney	f7a81b12d6	rcu/nocb: Print no-CBs diagnostics when rcutorture writer unduly delayed This commit causes locking, sleeping, and callback state to be printed for no-CBs CPUs when the rcutorture writer is delayed sufficiently for rcutorture to complain. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-08-13 14:38:24 -07:00
Paul E. McKenney	d1b222c6be	rcu/nocb: Add bypass callback queueing Use of the rcu_data structure's segmented ->cblist for no-CBs CPUs takes advantage of unrelated grace periods, thus reducing the memory footprint in the face of floods of call_rcu() invocations. However, the ->cblist field is a more-complex rcu_segcblist structure which must be protected via locking. Even though there are only three entities which can acquire this lock (the CPU invoking call_rcu(), the no-CBs grace-period kthread, and the no-CBs callbacks kthread), the contention on this lock is excessive under heavy stress. This commit therefore greatly reduces contention by provisioning an rcu_cblist structure field named ->nocb_bypass within the rcu_data structure. Each no-CBs CPU is permitted only a limited number of enqueues onto the ->cblist per jiffy, controlled by a new nocb_nobypass_lim_per_jiffy kernel boot parameter that defaults to about 16 enqueues per millisecond (16 * 1000 / HZ). When that limit is exceeded, the CPU instead enqueues onto the new ->nocb_bypass. The ->nocb_bypass is flushed into the ->cblist every jiffy or when the number of callbacks on ->nocb_bypass exceeds qhimark, whichever happens first. During call_rcu() floods, this flushing is carried out by the CPU during the course of its call_rcu() invocations. However, a CPU could simply stop invoking call_rcu() at any time. The no-CBs grace-period kthread therefore carries out less-aggressive flushing (every few jiffies or when the number of callbacks on ->nocb_bypass exceeds (2 * qhimark), whichever comes first). This means that the no-CBs grace-period kthread cannot be permitted to do unbounded waits while there are callbacks on ->nocb_bypass. A ->nocb_bypass_timer is used to provide the needed wakeups. [ paulmck: Apply Coverity feedback reported by Colin Ian King. ] Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-08-13 14:37:32 -07:00
Paul E. McKenney	4fd8c5f153	rcu/nocb: Reduce ->nocb_lock contention with separate ->nocb_gp_lock The sleep/wakeup of the no-CBs grace-period kthreads is synchronized using the ->nocb_lock of the first CPU corresponding to that kthread. This commit provides a separate ->nocb_gp_lock for this purpose, thus reducing contention on ->nocb_lock. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-08-13 14:35:49 -07:00
Paul E. McKenney	81c0b3d724	rcu/nocb: Avoid ->nocb_lock capture by corresponding CPU A given rcu_data structure's ->nocb_lock can be acquired very frequently by the corresponding CPU and occasionally by the corresponding no-CBs grace-period and callbacks kthreads. In particular, these two kthreads will have frequent gaps between ->nocb_lock acquisitions that are roughly a grace period in duration. This means that any excessive ->nocb_lock contention will be due to the CPU's acquisitions, and this in turn enables a very naive contention-avoidance strategy to be quite effective. This commit therefore modifies rcu_nocb_lock() to first attempt a raw_spin_trylock(), and to atomically increment a separate ->nocb_lock_contended across a raw_spin_lock(). This new ->nocb_lock_contended field is checked in __call_rcu_nocb_wake() when interrupts are enabled, with a spin-wait for contending acquisitions to complete, thus allowing the kthreads a chance to acquire the lock. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-08-13 14:35:49 -07:00
Paul E. McKenney	4f9c1bc727	rcu/nocb: Remove obsolete nocb_gp_head and nocb_gp_tail fields Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-08-13 14:35:49 -07:00
Paul E. McKenney	2a777de757	rcu/nocb: Remove obsolete nocb_cb_tail and nocb_cb_head fields Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-08-13 14:35:49 -07:00
Paul E. McKenney	c035280f17	rcu/nocb: Remove obsolete nocb_q_count and nocb_q_count_lazy fields This commit removes the obsolete nocb_q_count and nocb_q_count_lazy fields, also removing rcu_get_n_cbs_nocb_cpu(), adjusting rcu_get_n_cbs_cpu(), and making rcutree_migrate_callbacks() once again disable the ->cblist fields of offline CPUs. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-08-13 14:35:49 -07:00
Paul E. McKenney	e7f4c5b399	rcu/nocb: Remove obsolete nocb_head and nocb_tail fields Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-08-13 14:35:49 -07:00
Paul E. McKenney	5d6742b377	rcu/nocb: Use rcu_segcblist for no-CBs CPUs Currently the RCU callbacks for no-CBs CPUs are queued on a series of ad-hoc linked lists, which means that these callbacks cannot benefit from "drive-by" grace periods, thus suffering needless delays prior to invocation. In addition, the no-CBs grace-period kthreads first wait for callbacks to appear and later wait for a new grace period, which means that callbacks appearing during a grace-period wait can be delayed. These delays increase memory footprint, and could even result in an out-of-memory condition. This commit therefore enqueues RCU callbacks from no-CBs CPUs on the rcu_segcblist structure that is already used by non-no-CBs CPUs. It also restructures the no-CBs grace-period kthread to be checking for incoming callbacks while waiting for grace periods. Also, instead of waiting for a new grace period, it waits for the closest grace period that will cause some of the callbacks to be safe to invoke. All of these changes reduce callback latency and thus the number of outstanding callbacks, in turn reducing the probability of an out-of-memory condition. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-08-13 14:35:49 -07:00
Paul E. McKenney	e83e73f5b0	rcu/nocb: Leave ->cblist enabled for no-CBs CPUs As a first step towards making no-CBs CPUs use the ->cblist, this commit leaves the ->cblist enabled for these CPUs. The main reason to make no-CBs CPUs use ->cblist is to take advantage of callback numbering, which will reduce the effects of missed grace periods which in turn will reduce forward-progress problems for no-CBs CPUs. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-08-13 14:35:49 -07:00
Paul E. McKenney	12f54c3a84	rcu/nocb: Provide separate no-CBs grace-period kthreads Currently, there is one no-CBs rcuo kthread per CPU, and these kthreads are divided into groups. The first rcuo kthread to come online in a given group is that group's leader, and the leader both waits for grace periods and invokes its CPU's callbacks. The non-leader rcuo kthreads only invoke callbacks. This works well in the real-time/embedded environments for which it was intended because such environments tend not to generate all that many callbacks. However, given huge floods of callbacks, it is possible for the leader kthread to be stuck invoking callbacks while its followers wait helplessly while their callbacks pile up. This is a good recipe for an OOM, and rcutorture's new callback-flood capability does generate such OOMs. One strategy would be to wait until such OOMs start happening in production, but similar OOMs have in fact happened starting in 2018. It would therefore be wise to take a more proactive approach. This commit therefore features per-CPU rcuo kthreads that do nothing but invoke callbacks. Instead of having one of these kthreads act as leader, each group has a separate rcog kthread that handles grace periods for its group. Because these rcuog kthreads do not invoke callbacks, callback floods on one CPU no longer block callbacks from reaching the rcuc callback-invocation kthreads on other CPUs. This change does introduce additional kthreads, however: 1. The number of additional kthreads is about the square root of the number of CPUs, so that a 4096-CPU system would have only about 64 additional kthreads. Note that recent changes decreased the number of rcuo kthreads by a factor of two (CONFIG_PREEMPT=n) or even three (CONFIG_PREEMPT=y), so this still represents a significant improvement on most systems. 2. The leading "rcuo" of the rcuog kthreads should allow existing scripting to affinity these additional kthreads as needed, the same as for the rcuop and rcuos kthreads. (There are no longer any rcuob kthreads.) 3. A state-machine approach was considered and rejected. Although this would allow the rcuo kthreads to continue their dual leader/follower roles, it complicates callback invocation and makes it more difficult to consolidate rcuo callback invocation with existing softirq callback invocation. The introduction of rcuog kthreads should thus be acceptable. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-08-13 14:32:39 -07:00
Paul E. McKenney	6484fe54b5	rcu/nocb: Update comments to prepare for forward-progress work This commit simply rewords comments to prepare for leader nocb kthreads doing only grace-period work and callback shuffling. This will mean the addition of replacement kthreads to invoke callbacks. The "leader" and "follower" thus become less meaningful, so the commit changes no-CB comments with these strings to "GP" and "CB", respectively. (Give or take the usual grammatical transformations.) Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-08-13 14:32:39 -07:00
Paul E. McKenney	58bf6f77c6	rcu/nocb: Rename rcu_data fields to prepare for forward-progress work This commit simply renames rcu_data fields to prepare for leader nocb kthreads doing only grace-period work and callback shuffling. This will mean the addition of replacement kthreads to invoke callbacks. The "leader" and "follower" thus become less meaningful, so the commit changes no-CB fields with these strings to "gp" and "cb", respectively. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-08-13 14:32:39 -07:00
Paul E. McKenney	1bb336443c	rcu: Rename rcu_data's ->deferred_qs to ->exp_deferred_qs The rcu_data structure's ->deferred_qs field is used to indicate that the current CPU is blocking an expedited grace period (perhaps a future one). Given that it is used only for expedited grace periods, its current name is misleading, so this commit renames it to ->exp_deferred_qs. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-05-28 08:48:19 -07:00
Paul E. McKenney	0864f057b0	rcu: Use irq_work to get scheduler's attention in clean context When rcu_read_unlock_special() is invoked with interrupts disabled, is either not in an interrupt handler or is not using RCU_SOFTIRQ, is not the first RCU read-side critical section in the chain, and either there is an expedited grace period in flight or this is a NO_HZ_FULL kernel, the end of the grace period can be unduly delayed. The reason for this is that it is not safe to do wakeups in this situation. This commit fixes this problem by using the irq_work subsystem to force a later interrupt handler in a clean environment. Because set_tsk_need_resched(current) and set_preempt_need_resched() are invoked prior to this, the scheduler will force a context switch upon return from this interrupt (though perhaps at the end of any interrupted preempt-disable or BH-disable region of code), which will invoke rcu_note_context_switch() (again in a clean environment), which will in turn give RCU the chance to report the deferred quiescent state. Of course, by then this task might be within another RCU read-side critical section. But that will be detected at that time and reporting will be further deferred to the outermost rcu_read_unlock(). See rcu_preempt_need_deferred_qs() and rcu_preempt_deferred_qs() for more details on the checking. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-05-25 14:50:49 -07:00
Sebastian Andrzej Siewior	48d07c04b4	rcu: Enable elimination of Tree-RCU softirq processing Some workloads need to change kthread priority for RCU core processing without affecting other softirq work. This commit therefore introduces the rcutree.use_softirq kernel boot parameter, which moves the RCU core work from softirq to a per-CPU SCHED_OTHER kthread named rcuc. Use of SCHED_OTHER approach avoids the scalability problems that appeared with the earlier attempt to move RCU core processing to from softirq to kthreads. That said, kernels built with RCU_BOOST=y will run the rcuc kthreads at the RCU-boosting priority. Note that rcutree.use_softirq=0 must be specified to move RCU core processing to the rcuc kthreads: rcutree.use_softirq=1 is the default. Reported-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> [ paulmck: Adjust for invoke_rcu_callbacks() only ever being invoked from RCU core processing, in contrast to softirq->rcuc transition in old mainline RCU priority boosting. ] [ paulmck: Avoid wakeups when scheduler might have invoked rcu_read_unlock() while holding rq or pi locks, also possibly fixing a pre-existing latent bug involving raise_softirq()-induced wakeups. ] Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-05-25 14:50:46 -07:00
Paul E. McKenney	b51bcbbf16	rcu: Move forward-progress checkers into tree_stall.h This commit further consolidates stall-warning functionality by moving forward-progress checkers into kernel/rcu/tree_stall.h, updating a comment or two while in the area. More specifically, this commit moves show_rcu_gp_kthreads(), rcu_check_gp_start_stall(), rcu_fwd_progress_check(), sysrq_rcu, sysrq_show_rcu(), sysrq_rcudump_op, and rcu_sysrq_init() from kernel/rcu/tree.c to kernel/rcu/tree_stall.h. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-03-26 14:40:13 -07:00
Paul E. McKenney	7ac1907c9e	rcu: Move irq-disabled stall-warning checking to tree_stall.h The rcu_iw_handler() function's sole purpose in life is to indicate whether a stalled CPU had interrupts disabled, so it belongs in kernel/rcu/tree_stall.h. This commit therefore makes that move, clarifying its header comment while in the area. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-03-26 14:40:13 -07:00
Paul E. McKenney	e23344c2ca	rcu: Organize functions in tree_stall.h This commit does only code movement, removal of now-unneeded forward declarations, and addition of comments. It organizes the functions that implement RCU CPU stall warnings for normal grace periods into three categories: 1. Control of RCU CPU stall warnings, including computing timeouts. 2. Interaction of stall warnings with grace periods. 3. Actual printing of the RCU CPU stall-warning messages. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-03-26 14:40:13 -07:00
Paul E. McKenney	59b73a2768	rcu: Move FAST_NO_HZ stall-warning code to tree_stall.h This commit further consolidates the stall-warning code by moving print_cpu_stall_info() and its helper functions along with zero_cpu_stall_ticks() to kernel/rcu/tree_stall.h. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-03-26 14:40:13 -07:00
Paul E. McKenney	40e69ac7d0	rcu: Inline RCU stall-warning info helper functions The print_cpu_stall_info_begin() and print_cpu_stall_info_end() print a single character each onto the console, and are a holdover from a time when RCU CPU stall warning messages could be abbreviated using a long-gone Kconfig option. This commit therefore adds these single characters to already-printed strings in the calling functions, and then eliminates both print_cpu_stall_info_begin() and print_cpu_stall_info_end(). Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-03-26 14:40:13 -07:00
Paul E. McKenney	21d0d79ab0	rcu: Inline RCU task stall-warning helper functions The rcu_print_detail_task_stall(), rcu_print_task_stall_begin(), and rcu_print_task_stall_end() functions were defined to allow long-gone Kconfig options to provide an abbreviated RCU CPU stall warning printout. This commit saves a few lines of code by inlining them into their sole callers. While in the area, a useless call of rcu_print_detail_task_stall_rnp() on the root rcu_node structure was eliminated. If there is only one rcu_node structure, its tasks get printed twice, but if there are more, the root rcu_node structure is guaranteed to have an empty list of blocked tasks, hence the uselessness. (Long ago, root rcu_node structures with non-empty ->blkd_tasks lists could happen, but no longer.) Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-03-26 14:40:13 -07:00
Paul E. McKenney	32255d51b6	rcu: Move RCU CPU stall-warning code out of tree.c This commit completes the process of consolidating the code for RCU CPU stall warnings for normal grace periods by moving the remaining such code from kernel/rcu/tree.c to kernel/rcu/tree_stall.h. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-03-26 14:40:13 -07:00
Paul E. McKenney	e7ffb4eb9a	Merge branches 'doc.2019.01.26a', 'fixes.2019.01.26a', 'sil.2019.01.26a', 'spdx.2019.02.09a', 'srcu.2019.01.26a' and 'torture.2019.01.26a' into HEAD doc.2019.01.26a: Documentation updates. fixes.2019.01.26a: Miscellaneous fixes. sil.2019.01.26a: Removal of a few more spin_is_locked() instances. spdx.2019.02.09a: Add SPDX identifiers to RCU files srcu.2019.01.26a: SRCU updates. torture.2019.01.26a: Torture-test updates.	2019-02-09 08:47:52 -08:00
Paul E. McKenney	22e4092531	rcu/tree: Convert to SPDX license identifier Replace the license boiler plate with a SPDX license identifier. While in the area, update an email address. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> [ paulmck: Update .h file SPDX comment format per Joe Perches. ] Reviewed-by: Thomas Gleixner <tglx@linutronix.de>	2019-02-09 08:44:10 -08:00
Sebastian Andrzej Siewior	e81baf4cb1	srcu: Remove srcu_queue_delayed_work_on() srcu_queue_delayed_work_on() disables preemption (and therefore CPU hotplug in RCU's case) and then checks based on its own accounting if a CPU is online. If the CPU is online it uses queue_delayed_work_on() otherwise it fallbacks to queue_delayed_work(). The problem here is that queue_work() on -RT does not work with disabled preemption. queue_work_on() works also on an offlined CPU. queue_delayed_work_on() has the problem that it is possible to program a timer on an offlined CPU. This timer will fire once the CPU is online again. But until then, the timer remains programmed and nothing will happen. Add a local timer which will fire (as requested per delay) on the local CPU and then enqueue the work on the specific CPU. RCUtorture testing with SRCU-P for 24h showed no problems. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-01-25 15:36:42 -08:00
Paul E. McKenney	c98cac603f	rcu: Rename rcu_check_callbacks() to rcu_sched_clock_irq() The name rcu_check_callbacks() arguably made sense back in the early 2000s when RCU was quite a bit simpler than it is today, but it has become quite misleading, especially with the advent of dyntick-idle and NO_HZ_FULL. The rcu_check_callbacks() function is RCU's hook into the scheduling-clock interrupt, and is now but one of many ways that callbacks get promoted to invocable state. This commit therefore changes the name to rcu_sched_clock_irq(), which is the same number of characters and clearly indicates this function's relation to the rest of the Linux kernel. In addition, for the sake of consistency, rcu_flavor_check_callbacks() is also renamed to rcu_flavor_sched_clock_irq(). While in the area, the header comments for both functions are reworked. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-01-25 15:35:21 -08:00
Paul E. McKenney	7a968bb26a	Merge branches 'consolidate.2019.01.26a' and 'fwd.2019.01.26a' into HEAD consolidate.2019.01.26a: RCU flavor consolidation cleanups. fwd.2019.01.26a: RCU grace-period forward-progress fixes.	2019-01-25 15:32:01 -08:00
Paul E. McKenney	fd897573fa	rcu: Improve diagnostics for failed RCU grace-period start If a grace period fails to start (for example, because you commented out the last two lines of rcu_accelerate_cbs_unlocked()), rcu_core() will invoke rcu_check_gp_start_stall(), which will notice and complain. However, this complaint is lacking crucial debugging information such as when the last wakeup executed and what the value of ->gp_seq was at that time. This commit therefore removes the current pr_alert() from rcu_check_gp_start_stall(), instead invoking show_rcu_gp_kthreads(), which has been updated to print the needed information, which is collected by rcu_gp_kthread_wake(). Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-01-25 15:29:57 -08:00
Paul E. McKenney	b2c1955b88	rcu: Remove unused rcu_cpu_kthread_cpu per-CPU variable The rcu_cpu_kthread_cpu used to provide debugfs information, but is no longer used. This commit therefore removes it. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-01-25 15:29:56 -08:00
Paul E. McKenney	f7e972ee12	rcu: Move rcu_cpu_has_work to rcu_data structure Given that RCU has a perfectly good per-CPU rcu_data structure, most per-CPU quantities should be stored there. This commit therefore moves the rcu_cpu_has_work per-CPU variable to the rcu_data structure. This also makes this variable unconditionally present, which should be acceptable given the memory reduction due to the RCU flavor consolidation and also due to simplifications this will enable. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-01-25 15:29:56 -08:00
Paul E. McKenney	8b4d0f4858	rcu: Remove unused rcu_cpu_kthread_loops per-CPU variable The rcu_cpu_kthread_loops variable used to provide debugfs information, but is no longer used. This commit therefore removes it. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-01-25 15:29:55 -08:00
Paul E. McKenney	6ffdde28b7	rcu: Move rcu_cpu_kthread_status to rcu_data structure Given that RCU has a perfectly good per-CPU rcu_data structure, most per-CPU quantities should be stored there. This commit therefore moves the rcu_cpu_kthread_status per-CPU variable to the rcu_data structure. This also makes this variable unconditionally present, which should be acceptable given the memory reduction due to the RCU flavor consolidation and also due to simplifications this will enable. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-01-25 15:29:54 -08:00
Paul E. McKenney	37f62d7cf0	rcu: Move rcu_cpu_kthread_task to rcu_data structure Given that RCU has a perfectly good per-CPU rcu_data structure, most per-CPU quantities should be stored there. This commit therefore moves the rcu_cpu_kthread_task per-CPU variable to the rcu_data structure. This also makes this variable unconditionally present, which should be acceptable given the memory reduction due to the RCU flavor consolidation and also due to simplifications this will enable. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-01-25 15:29:53 -08:00
Paul E. McKenney	260e1e4fd8	rcu: Discard separate per-CPU callback counts Back when there were multiple flavors of RCU, it was necessary to separately count lazy and non-lazy callbacks for each CPU. These counts were used in CONFIG_RCU_FAST_NO_HZ kernels to determine how long a newly idle CPU should be allowed to sleep before handling its RCU callbacks. But now that there is only one flavor, the callback counts for a given CPU's sole rcu_data structure are the counts for that CPU. This commit therefore removes the rcu_data structure's ->nonlazy_posted and ->nonlazy_posted_snap fields, the rcu_idle_count_callbacks_posted() and rcu_cpu_has_callbacks() functions, repurposes the rcu_data structure's ->all_lazy field to record the laziness state at the beginning of the latest idle sojourn, and modifies CONFIG_RCU_FAST_NO_HZ RCU CPU stall warnings accordingly. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-01-25 15:28:30 -08:00
Paul E. McKenney	142d106d5e	rcu: Determine expedited-GP IPI handler at build time Back when there could be multiple RCU flavors running in the same kernel at the same time, it was necessary to specify the expedited grace-period IPI handler at runtime. Now that there is only one RCU flavor, the IPI handler can be determined at build time. There is therefore no longer any reason for the RCU-preempt and RCU-sched IPI handlers to have different names, nor is there any reason to pass these handlers in function arguments and in the data structures enclosing workqueues. This commit therefore makes all these changes, pushing the specification of the expedited grace-period IPI handler down to the point of use. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-01-25 15:28:27 -08:00
Paul E. McKenney	ad368d15b0	rcu: Rename and comment changes due to only one rcuo kthread per CPU Given RCU flavor consolidation, the name rcu_spawn_all_nocb_kthreads() is quite misleading. It no longer ever creates more than one kthread, and it does so only for the specified CPU. This commit therefore changes this name to the more descriptive rcu_spawn_cpu_nocb_kthread(), and also fixes up a similar issue in its header comment while in the area. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2019-01-25 15:28:23 -08:00
Paul E. McKenney	c51d7b5e6c	rcutorture: Print time since GP end upon forward-progress failure If rcutorture's forward-progress tests fail while a grace period is not in progress, it is useful to print the time since the last grace period ended as a way to detect failure to launch a new grace period. This commit therefore makes this change. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2018-12-01 12:45:40 -08:00
Paul E. McKenney	903ee83d91	rcu: Account for nocb-CPU callback counts in RCU CPU stall warnings The RCU CPU stall warnings print an estimate of the total number of RCU callbacks queued in the system, but this estimate leaves out the callbacks queued for nocbs CPUs. This commit therefore introduces rcu_get_n_cbs_cpu(), which gives an accurate callback estimate for both nocbs and normal CPUs, and uses this new function as needed. This commit also introduces a rcu_get_n_cbs_nocb_cpu() helper function that returns the number of callbacks for nocbs CPUs or zero otherwise, and also uses this function in place of direct access to ->nocb_q_count while in the area (fewer characters, you see). Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2018-12-01 12:45:37 -08:00
Joel Fernandes (Google)	adbccddb4a	rcu: Fix rcu_{node,data} comments about gp_seq_needed Recent changes have removed the old ->gp_seq_needed field from the rcu_state structure, which in turn obsoleted a couple of comments in the rcu_node and rcu_data structures. This commit therefore updates these comments accordingly. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: <kernel-team@android.com> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2018-11-08 21:43:20 -08:00
Joel Fernandes (Google)	75a8f72245	rcu: Remove unused rcu_state externs The rcu_bh_state and rcu_sched_state variables were removed during the RCU flavor consolidations, but external declarations remain in tree.h. This commit therefore removes these obsolete declarations. Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: <kernel-team@android.com> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>	2018-11-08 21:43:20 -08:00
Mike Galbraith	894d45bbf7	rcu: Convert rcu_state.ofl_lock to raw_spinlock_t `1e64b15a4b` ("rcu: Fix grace-period hangs due to race with CPU offline") added spinlock_t ofl_lock to the rcu_state structure, then takes it with preemption disabled during CPU offline, which gives the -rt patchset's sleeping spinlock heartburn. This commit therefore converts ->ofl_lock to raw_spinlock_t. Signed-off-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>	2018-08-30 16:03:54 -07:00
Paul E. McKenney	8d8a9d0e7e	rcu: Remove obsolete ->dynticks_fqs and ->cond_resched_completed The rcu_data structure's ->dynticks_fqs is incremented but never accesses. Its ->cond_resched_completed field isn't used at all. This commit therefore removes both fields. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:03:53 -07:00
Paul E. McKenney	dc5a4f2932	rcu: Switch ->dynticks to rcu_data structure, remove rcu_dynticks This commit move ->dynticks from the rcu_dynticks structure to the rcu_data structure, replacing the field of the same name. It also updates the code to access ->dynticks from the rcu_data structure and to use the rcu_data structure rather than following to now-gone ->dynticks field to the now-gone rcu_dynticks structure. While in the area, this commit also fixes up comments. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:03:52 -07:00
Paul E. McKenney	4c5273bf2b	rcu: Switch dyntick nesting counters to rcu_data structure This commit removes ->dynticks_nesting and ->dynticks_nmi_nesting from the rcu_dynticks structure and updates the code to access them from the rcu_data structure. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:03:51 -07:00
Paul E. McKenney	2dba13f0b6	rcu: Switch urgent quiescent-state requests to rcu_data structure This commit removes ->rcu_need_heavy_qs and ->rcu_urgent_qs from the rcu_dynticks structure and updates the code to access them from the rcu_data structure. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:03:50 -07:00
Paul E. McKenney	c458a89e96	rcu: Switch lazy counts to rcu_data structure This commit removes ->all_lazy, ->nonlazy_posted and ->nonlazy_posted_snap from the rcu_dynticks structure and updates the code to access them from the rcu_data structure. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:03:49 -07:00
Paul E. McKenney	5998a75adb	rcu: Switch last accelerate/advance to rcu_data structure This commit removes ->last_accelerate and ->last_advance_all from the rcu_dynticks structure and updates the code to access them from the rcu_data structure. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:03:48 -07:00
Paul E. McKenney	0fd79e7521	rcu: Switch ->tick_nohz_enabled_snap to rcu_data structure This commit removes ->tick_nohz_enabled_snap from the rcu_dynticks structure and updates the code to access it from the rcu_data structure. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:03:47 -07:00
Paul E. McKenney	cc72046cc3	rcu: Merge rcu_dynticks structure into rcu_data structure Now that there is only ever one rcu_data structure per CPU, there is no need for a separate rcu_dynticks structure. This commit therefore adds the rcu_dynticks fields into the rcu_data structure in preparation for removing the rcu_dynticks structure entirely. Note that the ->dynticks field will be handled specially because there is a field by that name in both structures. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:03:47 -07:00
Paul E. McKenney	df63fa5bc1	rcu: Convert "1UL << x" to "BIT(x)" This commit saves a few characters by converting "1UL << x" to "BIT(x)". Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:03:46 -07:00
Paul E. McKenney	d3052109c0	rcu: More aggressively enlist scheduler aid for nohz_full CPUs Because nohz_full CPUs can leave the scheduler-clock interrupt disabled even when in kernel mode, RCU cannot rely on rcu_check_callbacks() to enlist the scheduler's aid in extracting a quiescent state from such CPUs. This commit therefore more aggressively uses resched_cpu() on nohz_full CPUs that fail to pass through a quiescent state in a timely manner. By default, the resched_cpu() beating starts 300 milliseconds into the quiescent state. While in the neighborhood, add a ->last_fqs_resched field to the rcu_data structure in order to rate-limit resched_cpu() calls from the RCU grace-period kthread. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:03:44 -07:00
Paul E. McKenney	7e28c5af4e	rcu: Eliminate ->rcu_qs_ctr from the rcu_dynticks structure The ->rcu_qs_ctr counter was intended to allow providing a lightweight report of a quiescent state to all RCU flavors. But now that there is only one flavor of RCU in any one running kernel, there is no point in having this feature. This commit therefore removes the ->rcu_qs_ctr field from the rcu_dynticks structure and the ->rcu_qs_ctr_snap field from the rcu_data structure. This results in the "rqc" option to the rcu_fqs trace event no longer being used, so this commit also removes the "rqc" description from the header comment. While in the neighborhood, this commit also causes the forward-progress request .rcu_need_heavy_qs be set one jiffies_till_sched_qs interval later in the grace period than the first setting of .rcu_urgent_qs. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:03:42 -07:00
Paul E. McKenney	dd46a7882c	rcu: Inline _rcu_barrier() into its sole remaining caller Because rcu_barrier() is a one-line wrapper function for _rcu_barrier() and because nothing else calls _rcu_barrier(), this commit inlines _rcu_barrier() into rcu_barrier(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:03:39 -07:00
Paul E. McKenney	4e95020cdd	rcu: Inline increment_cpu_stall_ticks() into its sole caller Consolidation of the RCU flavors into one makes increment_cpu_stall_ticks() a trivial one-line function with only one caller. This commit therefore inlines it. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:03:23 -07:00
Paul E. McKenney	b97d23c51c	rcu: Remove for_each_rcu_flavor() flavor-traversal macro Now that there is only ever a single flavor of RCU in a given kernel build, there isn't a whole lot of point in having a flavor-traversal macro. This commit therefore removes it and converts calls to it to straightline code, inlining trivial functions as appropriate. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:03:18 -07:00
Paul E. McKenney	88d1bead85	rcu: Remove rcu_data structure's ->rsp field Now that there is only one rcu_state structure, there is no need for the rcu_data structure to indicate which it corresponds to. This commit therefore removes the rcu_data structure's ->rsp field, replacing all remaining uses of it with &rcu_state. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:03:17 -07:00
Paul E. McKenney	63d4c8c979	rcu: Remove rsp parameter from expedited grace-period functions There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from the code in kernel/rcu/tree_exp.h, and removes all of the rsp local variables while in the area. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:03:14 -07:00
Paul E. McKenney	4580b0541b	rcu: Remove rsp parameter from no-CBs CPU functions There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_nocb_cpu_needs_barrier(), rcu_spawn_one_nocb_kthread(), rcu_organize_nocb_kthreads(), rcu_nocb_cpu_needs_barrier(), and rcu_nohz_full_cpu(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:03:13 -07:00
Paul E. McKenney	b21ebed951	rcu: Remove rsp parameter from print_cpu_stall_info() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from print_cpu_stall_info(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:03:12 -07:00
Paul E. McKenney	6dbfdc1409	rcu: Remove rsp parameter from rcu_spawn_one_boost_kthread() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_spawn_one_boost_kthread(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:03:11 -07:00
Paul E. McKenney	81ab59a3ad	rcu: Remove rsp parameter from dump_blkd_tasks() and friend There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from dump_blkd_tasks() and rcu_preempt_blocked_readers_cgp(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:03:10 -07:00
Paul E. McKenney	a2887cd85f	rcu: Remove rsp parameter from rcu_print_detail_task_stall() There now is only one rcu_state structure in a given build of the Linux kernel, so there is no need to pass it as a parameter to RCU's functions. This commit therefore removes the rsp parameter from rcu_print_detail_task_stall(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:03:09 -07:00
Paul E. McKenney	da1df50d16	rcu: Remove rcu_state structure's ->rda field The rcu_state structure's ->rda field was used to find the per-CPU rcu_data structures corresponding to that rcu_state structure. But now there is only one rcu_state structure (creatively named "rcu_state") and one set of per-CPU rcu_data structures (creatively named "rcu_data"). Therefore, uses of the ->rda field can always be replaced by "rcu_data, and this commit makes that change and removes the ->rda field. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:02:49 -07:00
Paul E. McKenney	ec5dd444b6	rcu: Eliminate rcu_state structure's ->call field The rcu_state structure's ->call field references the corresponding RCU flavor's call_rcu() function. However, now that there is only ever one rcu_state structure in a given build of the Linux kernel, and that flavor uses plain old call_rcu(), there is not a lot of point in continuing to have the ->call field. This commit therefore removes it. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:02:48 -07:00
Paul E. McKenney	358be2d368	rcu: Remove RCU_STATE_INITIALIZER() Now that a given build of the Linux kernel has only one set of rcu_state, rcu_node, and rcu_data structures, there is no point in creating a macro to declare and compile-time initialize them. This commit therefore just does normal declaration and compile-time initialization of these structures. While in the area, this commit also removes #ifndefs of the no-longer-ever-defined preprocessor macro RCU_TREE_NONCORE. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-08-30 16:02:47 -07:00
Paul E. McKenney	45975c7d21	rcu: Define RCU-sched API in terms of RCU for Tree RCU PREEMPT builds Now that RCU-preempt knows about preemption disabling, its implementation of synchronize_rcu() works for synchronize_sched(), and likewise for the other RCU-sched update-side API members. This commit therefore confines the RCU-sched update-side code to CONFIG_PREEMPT=n builds, and defines RCU-sched's update-side API members in terms of those of RCU-preempt. This means that any given build of the Linux kernel has only one update-side flavor of RCU, namely RCU-preempt for CONFIG_PREEMPT=y builds and RCU-sched for CONFIG_PREEMPT=n builds. This in turn means that kernels built with CONFIG_RCU_NOCB_CPU=y have only one rcuo kthread per CPU. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Andi Kleen <ak@linux.intel.com>	2018-08-30 16:02:45 -07:00
Paul E. McKenney	d28139c4e9	rcu: Apply RCU-bh QSes to RCU-sched and RCU-preempt when safe One necessary step towards consolidating the three flavors of RCU is to make sure that the resulting consolidated "one flavor to rule them all" correctly handles networking denial-of-service attacks. One thing that allows RCU-bh to do so is that __do_softirq() invokes rcu_bh_qs() every so often, and so something similar has to happen for consolidated RCU. This must be done carefully. For example, if a preemption-disabled region of code takes an interrupt which does softirq processing before returning, consolidated RCU must ignore the resulting rcu_bh_qs() invocations -- preemption is still disabled, and that means an RCU reader for the consolidated flavor. This commit therefore creates a new rcu_softirq_qs() that is called only from the ksoftirqd task, thus avoiding the interrupted-a-preempted-region problem. This new rcu_softirq_qs() function invokes rcu_sched_qs(), rcu_preempt_qs(), and rcu_preempt_deferred_qs(). The latter call handles any deferred quiescent states. Note that __do_softirq() still invokes rcu_bh_qs(). It will continue to do so until a later stage of cleanup when the RCU-bh flavor is removed. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Fix !SMP issue located by kbuild test robot. ]	2018-08-30 16:02:38 -07:00
Paul E. McKenney	3e31009898	rcu: Defer reporting RCU-preempt quiescent states when disabled This commit defers reporting of RCU-preempt quiescent states at rcu_read_unlock_special() time when any of interrupts, softirq, or preemption are disabled. These deferred quiescent states are reported at a later RCU_SOFTIRQ, context switch, idle entry, or CPU-hotplug offline operation. Of course, if another RCU read-side critical section has started in the meantime, the reporting of the quiescent state will be further deferred. This also means that disabling preemption, interrupts, and/or softirqs will act as an RCU-preempt read-side critical section. This is enforced by checking preempt_count() as needed. Some special cases must be handled on an ad-hoc basis, for example, context switch is a quiescent state even though both the scheduler and do_exit() disable preemption. In these cases, additional calls to rcu_preempt_deferred_qs() override the preemption disabling. Similar logic overrides disabled interrupts in rcu_preempt_check_callbacks() because in this case the quiescent state happened just before the corresponding scheduling-clock interrupt. In theory, this change lifts a long-standing restriction that required that if interrupts were disabled across a call to rcu_read_unlock() that the matching rcu_read_lock() also be contained within that interrupts-disabled region of code. Because the reporting of the corresponding RCU-preempt quiescent state is now deferred until after interrupts have been enabled, it is no longer possible for this situation to result in deadlocks involving the scheduler's runqueue and priority-inheritance locks. This may allow some code simplification that might reduce interrupt latency a bit. Unfortunately, in practice this would also defer deboosting a low-priority task that had been subjected to RCU priority boosting, so real-time-response considerations might well force this restriction to remain in place. Because RCU-preempt grace periods are now blocked not only by RCU read-side critical sections, but also by disabling of interrupts, preemption, and softirqs, it will be possible to eliminate RCU-bh and RCU-sched in favor of RCU-preempt in CONFIG_PREEMPT=y kernels. This may require some additional plumbing to provide the network denial-of-service guarantees that have been traditionally provided by RCU-bh. Once these are in place, CONFIG_PREEMPT=n kernels will be able to fold RCU-bh into RCU-sched. This would mean that all kernels would have but one flavor of RCU, which would open the door to significant code cleanup. Moving to a single flavor of RCU would also have the beneficial effect of reducing the NOCB kthreads by at least a factor of two. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Apply rcu_read_unlock_special() preempt_count() feedback from Joel Fernandes. ] [ paulmck: Adjust rcu_eqs_enter() call to rcu_preempt_deferred_qs() in response to bug reports from kbuild test robot. ] [ paulmck: Fix bug located by kbuild test robot involving recursion via rcu_preempt_deferred_qs(). ]	2018-08-30 16:02:34 -07:00
Paul E. McKenney	164ba3fc48	rcu: Remove unused rcu_kick_nohz_cpu() function The rcu_kick_nohz_cpu() function is no longer used, and the functionality it used to provide is now provided by a call to resched_cpu() in the force-quiescent-state function rcu_implicit_dynticks_qs(). This commit therefore removes rcu_kick_nohz_cpu(). Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:17 -07:00
Paul E. McKenney	f2e2df5978	rcu: Add diagnostics for offline CPUs failing to report QS CPUs are expected to report quiescent states when coming online and when going offline, and grace-period initialization is supposed to handle any race conditions where a CPU's ->qsmask bit is set just after it goes offline. This commit adds diagnostics for the case where an offline CPU nevertheless has a grace period waiting on it. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:10 -07:00
Paul E. McKenney	fea3f222d3	rcu: Record ->gp_state for both phases of grace-period initialization Grace-period initialization first processes any recent CPU-hotplug operations, and then initializes state for the new grace period. These two phases of initialization are currently not distinguished in debug prints, but the distinction is valuable in a number of debug situations. This commit therefore introduces two new values for ->gp_state, RCU_GP_ONOFF and RCU_GP_INIT, in order to make this distinction. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:09 -07:00
Paul E. McKenney	5773894231	rcu: Add CPU online/offline state to dump_blkd_tasks() Interactions between CPU-hotplug operations and grace-period initialization can result in dump_blkd_tasks(). One of the first debugging actions in this case is to search back in dmesg to work out which of the affected rcu_node structure's CPUs are online and to determine the last CPU-hotplug operation affecting any of those CPUs. This can be laborious and error-prone, especially when console output is lost. This commit therefore causes dump_blkd_tasks() to dump the state of the affected rcu_node structure's CPUs and the last grace period during which the last offline and online operation affected each of these CPUs. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:09 -07:00
Paul E. McKenney	e05121ba5b	rcu: Remove CPU-hotplug failsafe from force-quiescent-state code path Now that quiescent states for newly offlined CPUs are reported either when that CPU goes offline or at the end of grace-period initialization, the CPU-hotplug failsafe in the force-quiescent-state code path is no longer needed. This commit therefore removes this failsafe. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:07 -07:00
Paul E. McKenney	1e64b15a4b	rcu: Fix grace-period hangs due to race with CPU offline Without special fail-safe quiescent-state-propagation checks, grace-period hangs can result from the following scenario: 1. CPU 1 goes offline. 2. Because CPU 1 is the only CPU in the system blocking the current grace period, the grace period ends as soon as rcu_cleanup_dying_idle_cpu()'s call to rcu_report_qs_rnp() returns. 3. At this point, the leaf rcu_node structure's ->lock is no longer held: rcu_report_qs_rnp() has released it, as it must in order to awaken the RCU grace-period kthread. 4. At this point, that same leaf rcu_node structure's ->qsmaskinitnext field still records CPU 1 as being online. This is absolutely necessary because the scheduler uses RCU (in this case on the wake-up path while awakening RCU's grace-period kthread), and ->qsmaskinitnext contains RCU's idea as to which CPUs are online. Therefore, invoking rcu_report_qs_rnp() after clearing CPU 1's bit from ->qsmaskinitnext would result in a lockdep-RCU splat due to RCU being used from an offline CPU. 5. RCU's grace-period kthread awakens, sees that the old grace period has completed and that a new one is needed. It therefore starts a new grace period, but because CPU 1's leaf rcu_node structure's ->qsmaskinitnext field still shows CPU 1 as being online, this new grace period is initialized to wait for a quiescent state from the now-offline CPU 1. 6. Without the fail-safe force-quiescent-state checks, there would be no quiescent state from the now-offline CPU 1, which would eventually result in RCU CPU stall warnings and memory exhaustion. It would be good to get rid of the special fail-safe quiescent-state propagation checks, and thus it would be good to fix things so that the above scenario cannot happen. This commit therefore adds a new ->ofl_lock to the rcu_state structure. This lock is held by rcu_gp_init() across the applying of buffered online and offline operations to the rcu_node tree, and it is also held by rcu_cleanup_dying_idle_cpu() when buffering a new offline operation. This prevents rcu_gp_init() from acquiring the leaf rcu_node structure's lock during the interval between when rcu_cleanup_dying_idle_cpu() invokes rcu_report_qs_rnp(), which releases ->lock and the re-acquisition of that same lock. This in turn prevents the failure scenario outlined above, and will hopefully eventually allow removal of the offline-CPU checks from the force-quiescent-state code path. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:39:04 -07:00
Paul E. McKenney	ff3bb6f4d0	rcu: Remove ->gpnum and ->completed Now that everything has been converted to use ->gp_seq instead of ->gpnum and ->completed, this commit removes ->gpnum and ->completed. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:38:48 -07:00
Paul E. McKenney	7a1d0f23ad	rcu: Move from ->need_future_gp[] to ->gp_seq_needed One problem with the ->need_future_gp[] array is that the grace-period assignment of each element changes as the grace periods complete. This means that it is necessary to hold a lock when checking this array to learn if a given grace period has already been requested. This increase lock contention, which is the opposite of helpful. This commit therefore replaces the ->need_future_gp[] with a single ->gp_seq_needed value and keeps it updated in the rcu_data structure. This will enable reliable lockless checking of whether or not a given grace period has already been requested. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 15:37:48 -07:00
Paul E. McKenney	29365e563b	rcu: Convert grace-period requests to ->gp_seq This commit converts the grace-period request code paths from ->completed and ->gpnum to ->gp_seq. The need_future_gp_element() macro encapsulates the shift operation required to use ->gp_seq as an index to the ->need_future_gp[] array. The rcu_cbs_completed() function is removed in favor of the rcu_seq_snap() function. The rcu_start_this_gp() gets some temporary consistency checks and uses rcu_seq_done(), rcu_seq_current(), rcu_seq_state(), and rcu_gp_in_progress() in place of the earlier open-coded comparisons of ->gpnum and ->completed. The rcu_future_gp_cleanup() function replaces use of ->completed with ->gp_seq. The rcu_accelerate_cbs() function replaces a call to rcu_cbs_completed() with one to rcu_seq_snap(). The rcu_advance_cbs() function replaces an access to >completed with one to ->gp_seq and adds some temporary warnings. The rcu_nocb_wait_gp() function replaces a call to rcu_cbs_completed() with one to rcu_seq_snap() and an open-coded comparison with rcu_seq_done(). The temporary warnings will be removed when the various ->gpnum and ->completed fields are removed. Their purpose is to locate code who might still be using ->gpnum and ->completed. (Much easier that way than trying to trace down the causes of too-short grace periods and grace-period hangs!) Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 14:27:55 -07:00
Paul E. McKenney	8aa670cdac	rcu: Convert ->rcu_iw_gpnum to ->gp_seq This commit switches the interrupt-disabled detection mechanism to ->gp_seq. This mechanism is used as part of RCU CPU stall warnings, and detects cases where the stall is due to a CPU having interrupts disabled. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 14:27:53 -07:00
Paul E. McKenney	de30ad512a	rcu: Introduce grace-period sequence numbers This commit adds grace-period sequence numbers (->gp_seq) to the rcu_state, rcu_node, and rcu_data structures, and updates them. It also checks for consistency between rsp->gpnum and rsp->gp_seq. These ->gp_seq counters will eventually replace the existing ->gpnum and ->completed counters, allowing a single memory access to determine whether or not a grace period is in progress and if so, which one. This in turn will enable changes that will reduce ->lock contention on the leaf rcu_node structures. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-07-12 14:27:46 -07:00
Paul E. McKenney	26d950a945	rcu: Diagnostics for grace-period startup hangs This commit causes a splat if RCU is idle and a request for a new grace period is ignored for more than one second. This splat normally indicates that some code path asked for a new grace period, but failed to wake up the RCU grace-period kthread. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> [ paulmck: Fix bug located by Dan Carpenter and his static checker. ] [ paulmck: Fix self-deadlock bug located 0day test robot. ] [ paulmck: Disable unless CONFIG_PROVE_RCU=y. ]	2018-07-12 14:24:42 -07:00
Paul E. McKenney	4bc8d55574	rcu: Add debugging info to assertion The WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp()) in rcu_gp_cleanup() triggers (inexplicably, of course) every so often. This commit therefore extracts more information. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2018-06-26 12:25:55 -07:00
Paul E. McKenney	22df7316ac	Merge branches 'exp.2018.05.15a', 'fixes.2018.05.15a', 'lock.2018.05.15a' and 'torture.2018.05.15a' into HEAD exp.2018.05.15a: Parallelize expedited grace-period initialization. fixes.2018.05.15a: Miscellaneous fixes. lock.2018.05.15a: Decrease lock contention on root rcu_node structure, which is a step towards merging RCU flavors. torture.2018.05.15a: Torture-test updates.	2018-05-15 10:33:05 -07:00

1 2 3 4 5 ...

295 Commits