linux

iv/linux

Waiman Long 64489e7800 locking/rwsem: Implement a new locking scheme	2019-06-17 12:27:56 +02:00

The current way of using various reader, writer and waiting biases in the rwsem code is confusing and hard to understand. I have to reread the rwsem count guide in the rwsem-xadd.c file from time to time to remind myself how this whole thing works. It also makes the rwsem code harder to optimize.

To make rwsem more sane, a new locking scheme similar to the one in qrwlock is now being used. The atomic long count has the following bit definitions:

  Bit  0   - writer locked bit
  Bit  1   - waiters present bit
  Bits 2-7 - reserved for future extension
  Bits 8-X - reader count (24/56 bits)

The cmpxchg instruction is now used to acquire the write lock. The read lock is still acquired with the xadd instruction, so there is no change here. This scheme will allow up to 16M/64P active readers, which should be more than enough. We can always use some more reserved bits if necessary.

With that change, we can deterministically know if a rwsem has been write-locked. Looking at the count alone, however, one cannot determine for certain if a rwsem is owned by readers or not, as the readers that set the reader count bits may be in the process of backing out. So we still need the reader-owned bit in the owner field to be sure.

With a locking microbenchmark running on a 5.1-based kernel, the total locking rates (in kops/s) of the benchmark on an 8-socket 120-core IvyBridge-EX system before and after the patch were as follows:

                  Before Patch      After Patch
   # of Threads  wlock    rlock    wlock    rlock
   ------------  -----    -----    -----    -----
        1        30,659   31,341   31,055   31,283
        2         8,909   16,457    9,884   17,659
        4         9,028   15,823    8,933   20,233
        8         8,410   14,212    7,230   17,140
       16         8,217   25,240    7,479   24,607

The locking rates of the benchmark on a Power8 system were as follows:

                  Before Patch      After Patch
   # of Threads  wlock    rlock    wlock    rlock
   ------------  -----    -----    -----    -----
        1        12,963   13,647   13,275   13,601
        2         7,570   11,569    7,902   10,829
        4         5,232    5,516    5,466    5,435
        8         5,233    3,386    5,467    3,168

The locking rates of the benchmark on a 2-socket ARM64 system were as follows:

                  Before Patch      After Patch
   # of Threads  wlock    rlock    wlock    rlock
   ------------  -----    -----    -----    -----
        1        21,495   21,046   21,524   21,074
        2         5,293   10,502    5,333   10,504
        4         5,325   11,463    5,358   11,631
        8         5,391   11,712    5,470   11,680

Performance is roughly the same before and after the patch. There are run-to-run variations in performance. Runs with higher variances usually have higher throughput.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: huang ying <huang.ying.caritas@gmail.com>
Link: https://lkml.kernel.org/r/20190520205918.22251-4-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
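
The count layout described in the commit message can be illustrated with a small userspace sketch. This is not the kernel code: it uses C11 atomics in place of the kernel's atomic_long_t helpers, every `sk_` / `SK_` name is hypothetical, and it models only the uncontended try-lock paths (no wait queue, no owner field, no optimistic spinning). It assumes the writer bit is taken with a compare-and-exchange from an all-zero count and that a reader adds a bias of 1 << 8 and backs out if it finds the writer or waiters bit set.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names mirroring the bit layout in the commit message. */
#define SK_WRITER_LOCKED (1L << 0)               /* bit 0: writer locked */
#define SK_FLAG_WAITERS  (1L << 1)               /* bit 1: waiters present */
#define SK_READER_SHIFT  8                       /* bits 2-7 are reserved */
#define SK_READER_BIAS   (1L << SK_READER_SHIFT) /* one reader */

struct sk_rwsem {
    atomic_long count;
};

/* Write lock: cmpxchg from "completely free" (0) to "writer locked". */
static bool sk_write_trylock(struct sk_rwsem *s)
{
    long expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &s->count, &expected, SK_WRITER_LOCKED,
        memory_order_acquire, memory_order_relaxed);
}

static void sk_write_unlock(struct sk_rwsem *s)
{
    atomic_fetch_and_explicit(&s->count, ~SK_WRITER_LOCKED,
                              memory_order_release);
}

/* Read lock: xadd the reader bias, then back out if a writer holds the
 * lock or waiters are queued. Between the add and the back-out the
 * reader count is transiently non-zero, which is why the count alone
 * cannot prove that readers own the lock. */
static bool sk_read_trylock(struct sk_rwsem *s)
{
    long old = atomic_fetch_add_explicit(&s->count, SK_READER_BIAS,
                                         memory_order_acquire);
    if (old & (SK_WRITER_LOCKED | SK_FLAG_WAITERS)) {
        atomic_fetch_sub_explicit(&s->count, SK_READER_BIAS,
                                  memory_order_relaxed);
        return false;
    }
    return true;
}

static void sk_read_unlock(struct sk_rwsem *s)
{
    atomic_fetch_sub_explicit(&s->count, SK_READER_BIAS,
                              memory_order_release);
}

/* The writer bit gives a deterministic "is it write-locked?" answer. */
static bool sk_is_write_locked(struct sk_rwsem *s)
{
    return atomic_load(&s->count) & SK_WRITER_LOCKED;
}

/* Number of readers currently accounted for in bits 8 and up. */
static unsigned long sk_reader_count(struct sk_rwsem *s)
{
    return (unsigned long)atomic_load(&s->count) >> SK_READER_SHIFT;
}
```

With a 64-bit long, the 56 bits above the shift give the 64P reader limit quoted in the message; with a 32-bit long, the 24 bits give 16M.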
lock_events_list.h	locking/rwsem: Enable lock event counting	2019-04-10 10:56:06 +02:00
lock_events.c	locking/lock_events: Don't show pvqspinlock events on bare metal	2019-04-10 10:56:05 +02:00
lock_events.h	locking/lock_events: Use raw_cpu_{add,inc}() for stats	2019-06-03 12:32:56 +02:00
lockdep_internals.h	locking/lockdep: Test all incompatible scenarios at once in check_irq_usage()	2019-04-29 08:29:20 +02:00
lockdep_proc.c	locking/lockdep: Introduce lockdep_next_lockchain() and lock_chain_count()	2019-02-28 07:55:44 +01:00
lockdep_states.h
lockdep.c	locking/lockdep: Remove unnecessary DEBUG_LOCKS_WARN_ON()	2019-06-17 12:09:37 +02:00
locktorture.c	locktorture: NULL cxt.lwsa and cxt.lrsa to allow bad-arg detection	2019-03-26 14:42:53 -07:00
Makefile	locking/lock_events: Make lock_events available for all archs & other locks	2019-04-10 10:56:04 +02:00
mcs_spinlock.h	locking/mcs: Use smp_cond_load_acquire() in MCS spin loop	2018-04-27 09:48:49 +02:00
mutex-debug.c	locking/mutex: Replace spin_is_locked() with lockdep	2018-11-12 09:06:22 -08:00
mutex-debug.h	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
mutex.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
mutex.h	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
osq_lock.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
percpu-rwsem.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
qrwlock.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157	2019-05-30 11:26:37 -07:00
qspinlock_paravirt.h	locking/qspinlock_stat: Introduce generic lockevent_*() counting APIs	2019-04-10 10:56:03 +02:00
qspinlock_stat.h	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157	2019-05-30 11:26:37 -07:00
qspinlock.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157	2019-05-30 11:26:37 -07:00
rtmutex_common.h	locking/rtmutex: Handle non enqueued waiters gracefully in remove_waiter()	2018-03-28 23:01:30 +02:00
rtmutex-debug.c	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
rtmutex-debug.h	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
rtmutex.c	treewide: Add SPDX license identifier for missed files	2019-05-21 10:50:45 +02:00
rtmutex.h	License cleanup: add SPDX GPL-2.0 license identifier to files with no license	2017-11-02 11:10:55 +01:00
rwsem-xadd.c	locking/rwsem: Implement a new locking scheme	2019-06-17 12:27:56 +02:00
rwsem.c	locking/rwsem: Enhance DEBUG_RWSEMS_WARN_ON() macro	2019-04-10 10:56:03 +02:00
rwsem.h	locking/rwsem: Implement a new locking scheme	2019-06-17 12:27:56 +02:00
semaphore.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 436	2019-06-05 17:37:17 +02:00
spinlock_debug.c	mmiowb: Hook up mmiowb helpers to spinlocks and generic I/O accessors	2019-04-08 11:59:47 +01:00
spinlock.c	asm-generic/mmiowb: Add generic implementation of mmiowb() tracking	2019-04-08 11:59:39 +01:00
test-ww_mutex.c	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 9	2019-05-21 11:28:40 +02:00