mm/page_alloc: use write_seqlock_irqsave() instead of write_seqlock() + local_irq_save().

__build_all_zonelists() acquires zonelist_update_seq by first disabling
interrupts via local_irq_save() and then acquiring the seqlock with
write_seqlock(). This is troublesome and leads to problems on PREEMPT_RT:
the inner spinlock_t becomes a sleeping lock on PREEMPT_RT and must not be
acquired with interrupts disabled.

The API provides write_seqlock_irqsave(), which does the right thing in
one step. printk_deferred_enter() has to be invoked in a non-migratable
context to ensure that deferred printing is enabled and disabled on the
same CPU. This is the case after zonelist_update_seq has been acquired.

There was discussion on the first submission that the order should be:

	local_irq_disable();
	printk_deferred_enter();
	write_seqlock();

to avoid pitfalls like an unaccounted printk() coming from
write_seqlock_irqsave() before printk_deferred_enter() is invoked. The
only origin of such a printk() can be a lockdep splat, because the lockdep
annotation happens after the sequence count is incremented. This is
exceptional and subject to change.

It was also pointed out that PREEMPT_RT can be affected by the printk
problem, since its write_seqlock_irqsave() does not really disable
interrupts. This isn't the case, because PREEMPT_RT's printk
implementation differs from the mainline implementation in two important
aspects:

- Printing happens in dedicated threads and not during the invocation of
  printk().
- In emergency cases where synchronous printing is used, a different
  driver is used which does not use tty_port::lock.

Acquire zonelist_update_seq with write_seqlock_irqsave() and then defer
printk output.

Link: https://lkml.kernel.org/r/20230623201517.yw286Knb@linutronix.de
Fixes: 1007843a91909 ("mm/page_alloc: fix potential deadlock on zonelist_update_seq seqlock")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
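
The ordering argued for in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every `model_*` name below is hypothetical and merely records state; none of them is a kernel function, and the `pinned` flag is a stand-in for "the task cannot migrate to another CPU", which the message says holds once zonelist_update_seq has been acquired.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_cpu {
	bool pinned;          /* stand-in for "task cannot migrate" */
	bool seq_held;        /* stand-in for the write side of the seqlock */
	int printk_deferred;  /* stand-in for the deferral nesting count */
	unsigned int seq;     /* stand-in for the sequence counter */
};

/* Models write_seqlock_irqsave(): pin and take the lock in one step. */
static void model_write_seqlock_irqsave(struct model_cpu *c, unsigned long *flags)
{
	*flags = c->pinned;   /* remember the previous state */
	c->pinned = true;
	c->seq_held = true;
	c->seq++;             /* odd value: a write is in progress */
}

/* Models write_sequnlock_irqrestore(): drop the lock, restore the state. */
static void model_write_sequnlock_irqrestore(struct model_cpu *c, unsigned long flags)
{
	c->seq++;             /* even value: the write has finished */
	c->seq_held = false;
	c->pinned = flags != 0;
}

/* Models printk_deferred_enter(): only valid while migration is impossible. */
static void model_printk_deferred_enter(struct model_cpu *c)
{
	assert(c->pinned);
	c->printk_deferred++;
}

/* Models printk_deferred_exit(): must run on the same CPU as enter. */
static void model_printk_deferred_exit(struct model_cpu *c)
{
	assert(c->pinned);
	c->printk_deferred--;
}

/* The order this commit settles on: lock first, then defer printk. */
void model_build_all_zonelists(struct model_cpu *c)
{
	unsigned long flags;

	model_write_seqlock_irqsave(c, &flags);
	model_printk_deferred_enter(c);
	/* ... the zonelist rebuild would happen here ... */
	model_printk_deferred_exit(c);
	model_write_sequnlock_irqrestore(c, flags);
}
```

In this model, calling `model_printk_deferred_enter()` before the lock is taken would trip the assertion, which mirrors the requirement that deferral is toggled only in a non-migratable context.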

parent ada5caed79
commit a2ebb51575

@@ -5139,19 +5139,17 @@ static void __build_all_zonelists(void *data)
 	unsigned long flags;
 
 	/*
-	 * Explicitly disable this CPU's interrupts before taking seqlock
-	 * to prevent any IRQ handler from calling into the page allocator
-	 * (e.g. GFP_ATOMIC) that could hit zonelist_iter_begin and livelock.
+	 * The zonelist_update_seq must be acquired with irqsave because the
+	 * reader can be invoked from IRQ with GFP_ATOMIC.
 	 */
-	local_irq_save(flags);
+	write_seqlock_irqsave(&zonelist_update_seq, flags);
 	/*
-	 * Explicitly disable this CPU's synchronous printk() before taking
-	 * seqlock to prevent any printk() from trying to hold port->lock, for
+	 * Also disable synchronous printk() to prevent any printk() from
+	 * trying to hold port->lock, for
 	 * tty_insert_flip_string_and_push_buffer() on other CPU might be
 	 * calling kmalloc(GFP_ATOMIC | __GFP_NOWARN) with port->lock held.
 	 */
 	printk_deferred_enter();
-	write_seqlock(&zonelist_update_seq);
 
 #ifdef CONFIG_NUMA
 	memset(node_load, 0, sizeof(node_load));
@@ -5188,9 +5186,8 @@ static void __build_all_zonelists(void *data)
 #endif
 	}
 
-	write_sequnlock(&zonelist_update_seq);
 	printk_deferred_exit();
-	local_irq_restore(flags);
+	write_sequnlock_irqrestore(&zonelist_update_seq, flags);
 }
 
 static noinline void __init
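
The code comment in the diff says the lock must be taken with irqsave because the reader can run from IRQ context. A minimal userspace sketch of why that matters, assuming a toy sequence counter built on C11 atomics: the `toy_*` names are hypothetical and are not the kernel's seqlock_t implementation. A reader may only start when the count is even, so a reader that interrupted the writer on the same CPU would see an odd count and could never make progress.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical minimal sequence counter; not the kernel's seqlock_t. */
struct toy_seq {
	atomic_uint sequence;
};

/* Writer entry: the count becomes odd while the update is in flight. */
static void toy_write_begin(struct toy_seq *s)
{
	atomic_fetch_add(&s->sequence, 1);
}

/* Writer exit: the count becomes even again. */
static void toy_write_end(struct toy_seq *s)
{
	atomic_fetch_add(&s->sequence, 1);
}

/*
 * Reader entry: succeeds only when no writer is active. A real reader
 * would spin here instead of returning false.
 */
static bool toy_read_try_begin(struct toy_seq *s, unsigned int *start)
{
	unsigned int v = atomic_load(&s->sequence);

	if (v & 1)
		return false;
	*start = v;
	return true;
}

/* Reader exit: true if a writer ran in between and the read must repeat. */
static bool toy_read_retry(struct toy_seq *s, unsigned int start)
{
	return atomic_load(&s->sequence) != start;
}

/*
 * Simulates a reader arriving while the writer is mid-update on the same
 * thread of execution. Returns 1 if that reader would have had to spin.
 */
int toy_irq_reader_would_spin(void)
{
	struct toy_seq s;
	unsigned int start = 0;
	int would_spin;

	atomic_init(&s.sequence, 0);
	toy_write_begin(&s);
	/* An interrupt-context reader arriving now cannot begin. */
	would_spin = !toy_read_try_begin(&s, &start);
	toy_write_end(&s);
	/* After the writer finishes, a reader starts and needs no retry. */
	assert(toy_read_try_begin(&s, &start));
	assert(!toy_read_retry(&s, start));
	return would_spin;
}
```

Because the spinning reader in this scenario would be running on top of the interrupted writer, the writer could never reach `toy_write_end()`; keeping interrupts off for the duration of the write side rules this out.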