linux/kernel/locking
Waiman Long a9e9bcb45b locking/rwsem: Prevent decrement of reader count before increment
During rwsem testing, I found that after a down_read(), the
reader count may occasionally become 0 or even negative. Consequently,
a writer may steal the lock at that time and execute in parallel with
the reader, breaking the mutual exclusion guarantee of the write
lock. In other words, both readers and a writer can become rwsem owners
simultaneously.

The current reader wakeup code makes a single pass that clears
waiter->task and puts the readers into wake_q before fully incrementing
the reader count. Once waiter->task is cleared, the corresponding reader
may see it, finish the critical section and unlock, decrementing the
count before the count is incremented. This is not a problem if there is
only one reader to wake up, as the count has been pre-incremented by 1.
It is a problem if there is more than one reader to be woken up, because
a writer can then steal the lock.
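
The ordering problem described above can be replayed deterministically in
userspace. The following is a minimal sketch, not the kernel code: READER_BIAS,
the function name and the "first reader unlocks immediately" schedule are
assumptions chosen for illustration, and the real count encoding in
rwsem-xadd.c is different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define READER_BIAS 1L  /* hypothetical unit; the kernel's real bias differs */

/*
 * Replay the single-pass ordering for nr_readers queued readers.
 * Assumed schedule: the first woken reader finishes its critical section
 * and unlocks as soon as its task pointer is cleared; every later reader
 * keeps holding the lock.
 *
 * Returns true if the count reads <= 0 while at least one woken reader is
 * still inside its critical section, i.e. a writer could steal the lock.
 */
static bool single_pass_exposes_window(size_t nr_readers)
{
    long count = READER_BIAS;  /* pre-incremented for one reader only */
    size_t holders = 0;        /* readers granted the lock and still inside */
    bool window = false;

    for (size_t i = 0; i < nr_readers; i++) {
        /* waiter->task cleared: reader i now believes it owns the lock. */
        holders++;
        if (i == 0) {
            /* Fast reader: unlocks before the wakeup path has finished. */
            count -= READER_BIAS;
            holders--;
        }
        if (holders > 0 && count <= 0)
            window = true;
    }

    /* The deferred adjustment for the remaining readers arrives too late. */
    if (nr_readers > 1)
        count += (long)(nr_readers - 1) * READER_BIAS;
    (void)count;

    assert(holders <= nr_readers);
    return window;
}
```

With one queued reader the function reports no window, because the
pre-increment already covers that reader. With two or more it reports a
window, which corresponds to the case the commit message calls a problem.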

The wakeup was actually done in 2 passes before the following v4.9 commit:

  70800c3c0c ("locking/rwsem: Scan the wait_list for readers only once")

To fix this problem, the wakeup is now done in two passes
again. In the first pass, we collect the readers and count them.
The reader count is then fully incremented. In the second pass,
waiter->task is cleared for each collected reader and the reader is
put into wake_q to be woken up later.
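
The two-pass scheme can be sketched in a small userspace model. This is an
illustration under stated assumptions rather than the patch itself: struct
sem_model, struct waiter, READER_BIAS, MAX_WAITERS and the choice to stop
collecting at the first queued writer are all hypothetical simplifications,
and the model has no wake_q, atomics or locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define READER_BIAS 1L   /* hypothetical: one unit of count per active reader */
#define MAX_WAITERS 8    /* hypothetical fixed queue size for the model */

struct waiter {
    bool is_reader;
    bool task_set;       /* stands in for waiter->task != NULL */
};

struct sem_model {
    long count;
    struct waiter *queue[MAX_WAITERS];
    size_t nr_waiters;
};

/* Wake the readers at the head of the queue; returns how many were woken. */
static size_t wake_readers_two_pass(struct sem_model *sem)
{
    struct waiter *wlist[MAX_WAITERS];
    size_t woken = 0, i;

    assert(sem->nr_waiters <= MAX_WAITERS);

    /* Pass 1: collect and count readers (assumed: stop at the first writer). */
    for (i = 0; i < sem->nr_waiters; i++) {
        if (!sem->queue[i]->is_reader)
            break;
        wlist[woken++] = sem->queue[i];
    }

    /* Fully increment the count before any reader can observe the grant. */
    sem->count += (long)woken * READER_BIAS;

    /* Pass 2: only now clear the task marker, which releases each reader. */
    for (i = 0; i < woken; i++)
        wlist[i]->task_set = false;

    /* Drop the woken readers from the front of the queue. */
    for (i = woken; i < sem->nr_waiters; i++)
        sem->queue[i - woken] = sem->queue[i];
    sem->nr_waiters -= woken;

    return woken;
}
```

Because the count already includes every collected reader when the first
task marker is cleared, an early unlock by one reader cannot bring the count
to 0 while other woken readers still hold the lock.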

Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: huang ying <huang.ying.caritas@gmail.com>
Fixes: 70800c3c0c ("locking/rwsem: Scan the wait_list for readers only once")
Link: http://lkml.kernel.org/r/20190428212557.13482-2-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-05-07 08:46:46 +02:00
lock_events_list.h locking/rwsem: Enable lock event counting 2019-04-10 10:56:06 +02:00
lock_events.c locking/lock_events: Don't show pvqspinlock events on bare metal 2019-04-10 10:56:05 +02:00
lock_events.h locking/lock_events: Make lock_events available for all archs & other locks 2019-04-10 10:56:04 +02:00
lockdep_internals.h locking/lockdep: Test all incompatible scenarios at once in check_irq_usage() 2019-04-29 08:29:20 +02:00
lockdep_proc.c locking/lockdep: Introduce lockdep_next_lockchain() and lock_chain_count() 2019-02-28 07:55:44 +01:00
lockdep_states.h
lockdep.c s390 updates for the 5.2 merge window 2019-05-06 16:42:54 -07:00
locktorture.c locktorture: NULL cxt.lwsa and cxt.lrsa to allow bad-arg detection 2019-03-26 14:42:53 -07:00
Makefile locking/lock_events: Make lock_events available for all archs & other locks 2019-04-10 10:56:04 +02:00
mcs_spinlock.h locking/mcs: Use smp_cond_load_acquire() in MCS spin loop 2018-04-27 09:48:49 +02:00
mutex-debug.c locking/mutex: Replace spin_is_locked() with lockdep 2018-11-12 09:06:22 -08:00
mutex-debug.h
mutex.c kernel/locking/mutex.c: remove caller signal_pending branch predictions 2019-01-04 13:13:48 -08:00
mutex.h
osq_lock.c
percpu-rwsem.c locking/rwsem: Remove arch specific rwsem files 2019-04-03 14:50:50 +02:00
qrwlock.c
qspinlock_paravirt.h locking/qspinlock_stat: Introduce generic lockevent_*() counting APIs 2019-04-10 10:56:03 +02:00
qspinlock_stat.h locking/lock_events: Make lock_events available for all archs & other locks 2019-04-10 10:56:04 +02:00
qspinlock.c locking/qspinlock_stat: Introduce generic lockevent_*() counting APIs 2019-04-10 10:56:03 +02:00
rtmutex_common.h locking/rtmutex: Handle non enqueued waiters gracefully in remove_waiter() 2018-03-28 23:01:30 +02:00
rtmutex-debug.c
rtmutex-debug.h
rtmutex.c futex: Handle early deadlock return correctly 2019-02-08 13:00:36 +01:00
rtmutex.h
rwsem-xadd.c locking/rwsem: Prevent decrement of reader count before increment 2019-05-07 08:46:46 +02:00
rwsem.c locking/rwsem: Enhance DEBUG_RWSEMS_WARN_ON() macro 2019-04-10 10:56:03 +02:00
rwsem.h locking/rwsem: Prevent unneeded warning during locking selftest 2019-04-14 11:09:35 +02:00
semaphore.c
spinlock_debug.c mmiowb: Hook up mmiowb helpers to spinlocks and generic I/O accessors 2019-04-08 11:59:47 +01:00
spinlock.c asm-generic/mmiowb: Add generic implementation of mmiowb() tracking 2019-04-08 11:59:39 +01:00
test-ww_mutex.c locking/ww_mutex: Fix runtime warning in the WW mutex selftest 2018-10-03 08:56:31 +02:00