[ Upstream commit 55d4669ef1b76823083caecfab12a8bd2ccdcf64 ] When rcu_barrier() calls rcu_rdp_cpu_online() and observes a CPU off rnp->qsmaskinitnext, it means that all accesses from the offline CPU preceding the CPUHP_TEARDOWN_CPU are visible to RCU barrier, including callbacks expiration and counter updates. However interrupts can still fire after stop_machine() re-enables interrupts and before rcutree_report_cpu_dead(). The related accesses happening between CPUHP_TEARDOWN_CPU and rnp->qsmaskinitnext clearing are _NOT_ guaranteed to be seen by rcu_barrier() without proper ordering, especially when callbacks are invoked there to the end, making rcutree_migrate_callback() bypass barrier_lock. The following theoretical race example can make rcu_barrier() hang: CPU 0 CPU 1 ----- ----- //cpu_down() smpboot_park_threads() //ksoftirqd is parked now <IRQ> rcu_sched_clock_irq() invoke_rcu_core() do_softirq() rcu_core() rcu_do_batch() // callback storm // rcu_do_batch() returns // before completing all // of them // do_softirq also returns early because of // timeout. It defers to ksoftirqd but // it's parked </IRQ> stop_machine() take_cpu_down() rcu_barrier() spin_lock(barrier_lock) // observes rcu_segcblist_n_cbs(&rdp->cblist) != 0 <IRQ> do_softirq() rcu_core() rcu_do_batch() //completes all pending callbacks //smp_mb() implied _after_ callback number dec </IRQ> rcutree_report_cpu_dead() rnp->qsmaskinitnext &= ~rdp->grpmask; rcutree_migrate_callback() // no callback, early return without locking // barrier_lock //observes !rcu_rdp_cpu_online(rdp) rcu_barrier_entrain() rcu_segcblist_entrain() // Observe rcu_segcblist_n_cbs(rsclp) == 0 // because no barrier between reading // rnp->qsmaskinitnext and rsclp->len rcu_segcblist_add_len() smp_mb__before_atomic() // will now observe the 0 count and empty // list, but too late, we enqueue regardless WRITE_ONCE(rsclp->len, rsclp->len + v); // ignored barrier callback // rcu barrier stall... This could be solved with a read memory barrier, enforcing the message passing between rnp->qsmaskinitnext and rsclp->len, matching the full memory barrier after rsclp->len addition in rcu_segcblist_add_len() performed at the end of rcu_do_batch(). However the rcu_barrier() is complicated enough and probably doesn't need too many more subtleties. CPU down is a slowpath and the barrier_lock seldom contended. Solve the issue with unconditionally locking the barrier_lock on rcutree_migrate_callbacks(). This makes sure that either rcu_barrier() sees the empty queue or its entrained callback will be migrated. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
Linux kernel ============ There are several guides for kernel developers and users. These guides can be rendered in a number of formats, like HTML and PDF. Please read Documentation/admin-guide/README.rst first. In order to build the documentation, use ``make htmldocs`` or ``make pdfdocs``. The formatted documentation can also be read online at: https://www.kernel.org/doc/html/latest/ There are various text files in the Documentation/ subdirectory, several of them using the reStructuredText markup notation. Please read the Documentation/process/changes.rst file, as it contains the requirements for building and running the kernel, and information about the problems which may result by upgrading your kernel.
Description
Languages
C
97.6%
Assembly
1%
Shell
0.5%
Python
0.3%
Makefile
0.3%