IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Add lock event counting calls so that we can track the number of lock
events happening in the rwsem code.
With CONFIG_LOCK_EVENT_COUNTS on and booting a 4-socket 112-thread x86-64
system, the rwsem counts after system bootup were as follows:
rwsem_opt_fail=261
rwsem_opt_wlock=50636
rwsem_rlock=445
rwsem_rlock_fail=0
rwsem_rlock_fast=22
rwsem_rtrylock=810144
rwsem_sleep_reader=441
rwsem_sleep_writer=310
rwsem_wake_reader=355
rwsem_wake_writer=2335
rwsem_wlock=261
rwsem_wlock_fail=0
rwsem_wtrylock=20583
It can be seen that most of the lock acquisitions in the slowpath were
write-locks in the optimistic spinning code path with no sleeping at
all. For this system, over 97% of the locks are acquired via optimistic
spinning. It illustrates the importance of optimistic spinning in
improving the performance of rwsem.
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20190404174320.22416-11-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The QUEUED_LOCK_STAT option to report queued spinlocks event counts
was previously allowed only on x86 architecture. To make the locking
event counting code more useful, it is now renamed to a more generic
LOCK_EVENT_COUNTS config option. This new option will be available to
all the architectures that use qspinlock at the moment.
Other locking code can now start to use the generic locking event
counting code by including lock_events.h and put the new locking event
names into the lock_events_list.h header file.
My experience with lock event counting is that it gives valuable insight
on how the locking code works and what can be done to make it better. I
would like to extend this benefit to other locking code like mutex and
rwsem in the near future.
The PV qspinlock specific code will stay in qspinlock_stat.h. The
locking event counters will now reside in the <debugfs>/lock_event_counts
directory.
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/20190404174320.22416-9-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Removing explicit calls to mmiowb() from driver code means that we must
now call into the generic mmiowb_spin_{lock,unlock}() functions from the
core spinlock code. In order to elide barriers following critical
sections without any I/O writes, we also hook into the asm-generic I/O
routines.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
In preparation for removing all explicit mmiowb() calls from driver
code, implement a tracking system in asm-generic based loosely on the
PowerPC implementation. This allows architectures with a non-empty
mmiowb() definition to have the barrier automatically inserted in
spin_unlock() following a critical section containing an I/O write.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
As the generic rwsem-xadd code is using the appropriate acquire and
release versions of the atomic operations, the arch specific rwsem.h
files will not be that much faster than the generic code as long as the
atomic functions are properly implemented. So we can remove those arch
specific rwsem.h and stop building asm/rwsem.h to reduce maintenance
effort.
Currently, only x86, alpha and ia64 have implemented architecture
specific fast paths. I don't have access to alpha and ia64 systems for
testing, but they are legacy systems that are not likely to be updated
to the latest kernel anyway.
By using a rwsem microbenchmark, the total locking rates on a 4-socket
56-core 112-thread x86-64 system before and after the patch were as
follows (mixed means equal # of read and write locks):
Before Patch After Patch
# of Threads wlock rlock mixed wlock rlock mixed
------------ ----- ----- ----- ----- ----- -----
1 29,201 30,143 29,458 28,615 30,172 29,201
2 6,807 13,299 1,171 7,725 15,025 1,804
4 6,504 12,755 1,520 7,127 14,286 1,345
8 6,762 13,412 764 6,826 13,652 726
16 6,693 15,408 662 6,599 15,938 626
32 6,145 15,286 496 5,549 15,487 511
64 5,812 15,495 60 5,858 15,572 60
There were some run-to-run variations for the multi-thread tests. For
x86-64, using the generic C code fast path seems to be a little bit
faster than the assembly version with low lock contention. Looking at
the assembly version of the fast paths, there are assembly to/from C
code wrappers that save and restore all the callee-clobbered registers
(7 registers on x86-64). The assembly generated from the generic C
code doesn't need to do that. That may explain the slight performance
gain here.
The generic asm rwsem.h can also be merged into kernel/locking/rwsem.h
with no code change as no other code other than those under
kernel/locking needs to access the internal rwsem macros and functions.
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-c6x-dev@linux-c6x.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-riscv@lists.infradead.org
Cc: linux-um@lists.infradead.org
Cc: linux-xtensa@linux-xtensa.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: nios2-dev@lists.rocketboards.org
Cc: openrisc@lists.librecores.org
Cc: uclinux-h8-devel@lists.sourceforge.jp
Link: https://lkml.kernel.org/r/20190322143008.21313-2-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently, lock_torture_cleanup() uses the values of cxt.lwsa and cxt.lrsa
to detect bad parameters that prevented locktorture from initializing,
let alone running. In this case, lock_torture_cleanup() does no cleanup
aside from invoking torture_cleanup_begin() and torture_cleanup_end(),
as required to permit future torture tests to run. However, this
heuristic fails if the run with bad parameters was preceded by a previous
run that actually ran: In this case, both cxt.lwsa and cxt.lrsa will
remain non-zero, which means that the current lock_torture_cleanup()
invocation will be unable to detect the fact that it should skip cleanup,
which can result in charming outcomes such as double frees.
This commit therefore NULLs out both cxt.lwsa and cxt.lrsa at the end
of any run that actually ran.
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Josh Triplett <josh@joshtriplett.org>
Pull perf updates from Ingo Molnar:
"Lots of tooling updates - too many to list, here's a few highlights:
- Various subcommand updates to 'perf trace', 'perf report', 'perf
record', 'perf annotate', 'perf script', 'perf test', etc.
- CPU and NUMA topology and affinity handling improvements,
- HW tracing and HW support updates:
- Intel PT updates
- ARM CoreSight updates
- vendor HW event updates
- BPF updates
- Tons of infrastructure updates, both on the build system and the
library support side
- Documentation updates.
- ... and lots of other changes, see the changelog for details.
Kernel side updates:
- Tighten up kprobes blacklist handling, reduce the number of places
where developers can install a kprobe and hang/crash the system.
- Fix/enhance vma address filter handling.
- Various PMU driver updates, small fixes and additions.
- refcount_t conversions
- BPF updates
- error code propagation enhancements
- misc other changes"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (238 commits)
perf script python: Add Python3 support to syscall-counts-by-pid.py
perf script python: Add Python3 support to syscall-counts.py
perf script python: Add Python3 support to stat-cpi.py
perf script python: Add Python3 support to stackcollapse.py
perf script python: Add Python3 support to sctop.py
perf script python: Add Python3 support to powerpc-hcalls.py
perf script python: Add Python3 support to net_dropmonitor.py
perf script python: Add Python3 support to mem-phys-addr.py
perf script python: Add Python3 support to failed-syscalls-by-pid.py
perf script python: Add Python3 support to netdev-times.py
perf tools: Add perf_exe() helper to find perf binary
perf script: Handle missing fields with -F +..
perf data: Add perf_data__open_dir_data function
perf data: Add perf_data__(create_dir|close_dir) functions
perf data: Fail check_backup in case of error
perf data: Make check_backup work over directories
perf tools: Add rm_rf_perf_data function
perf tools: Add pattern name checking to rm_rf
perf tools: Add depth checking to rm_rf
perf data: Add global path holder
...
Pull locking updates from Ingo Molnar:
"The biggest part of this tree is the new auto-generated atomics API
wrappers by Mark Rutland.
The primary motivation was to allow instrumentation without uglifying
the primary source code.
The linecount increase comes from adding the auto-generated files to
the Git space as well:
include/asm-generic/atomic-instrumented.h | 1689 ++++++++++++++++--
include/asm-generic/atomic-long.h | 1174 ++++++++++---
include/linux/atomic-fallback.h | 2295 +++++++++++++++++++++++++
include/linux/atomic.h | 1241 +------------
I preferred this approach, so that the full call stack of the (already
complex) locking APIs is still fully visible in 'git grep'.
But if this is excessive we could certainly hide them.
There's a separate build-time mechanism to determine whether the
headers are out of date (they should never be stale if we do our job
right).
Anyway, nothing from this should be visible to regular kernel
developers.
Other changes:
- Add support for dynamic keys, which removes a source of false
positives in the workqueue code, among other things (Bart Van
Assche)
- Updates to tools/memory-model (Andrea Parri, Paul E. McKenney)
- qspinlock, wake_q and lockdep micro-optimizations (Waiman Long)
- misc other updates and enhancements"
* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits)
locking/lockdep: Shrink struct lock_class_key
locking/lockdep: Add module_param to enable consistency checks
lockdep/lib/tests: Test dynamic key registration
lockdep/lib/tests: Fix run_tests.sh
kernel/workqueue: Use dynamic lockdep keys for workqueues
locking/lockdep: Add support for dynamic keys
locking/lockdep: Verify whether lock objects are small enough to be used as class keys
locking/lockdep: Check data structure consistency
locking/lockdep: Reuse lock chains that have been freed
locking/lockdep: Fix a comment in add_chain_cache()
locking/lockdep: Introduce lockdep_next_lockchain() and lock_chain_count()
locking/lockdep: Reuse list entries that are no longer in use
locking/lockdep: Free lock classes that are no longer in use
locking/lockdep: Update two outdated comments
locking/lockdep: Make it easy to detect whether or not inside a selftest
locking/lockdep: Split lockdep_free_key_range() and lockdep_reset_lock()
locking/lockdep: Initialize the locks_before and locks_after lists earlier
locking/lockdep: Make zap_class() remove all matching lock order entries
locking/lockdep: Reorder struct lock_class members
locking/lockdep: Avoid that add_chain_cache() adds an invalid chain to the cache
...
A shortcoming of the current lockdep implementation is that it requires
lock keys to be allocated statically. That forces all instances of lock
objects that occur in a given data structure to share a lock key. Since
lock dependency analysis groups lock objects per key sharing lock keys
can cause false positive lockdep reports. Make it possible to avoid
such false positive reports by allowing lock keys to be allocated
dynamically. Require that dynamically allocated lock keys are
registered before use by calling lockdep_register_key(). Complain about
attempts to register the same lock key pointer twice without calling
lockdep_unregister_key() between successive registration calls.
The purpose of the new lock_keys_hash[] data structure that keeps
track of all dynamic keys is twofold:
- Verify whether the lockdep_register_key() and lockdep_unregister_key()
functions are used correctly.
- Avoid that lockdep_init_map() complains when encountering a dynamically
allocated key.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: johannes.berg@intel.com
Cc: tj@kernel.org
Link: https://lkml.kernel.org/r/20190214230058.196511-19-bvanassche@acm.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Instead of leaving lock classes that are no longer in use in the
lock_classes array, reuse entries from that array that are no longer in
use. Maintain a linked list of free lock classes with list head
'free_lock_class'. Only add freed lock classes to the free_lock_classes
list after a grace period to avoid that a lock_classes[] element would
be reused while an RCU reader is accessing it. Since the lockdep
selftests run in a context where sleeping is not allowed and since the
selftests require that lock resetting/zapping works with debug_locks
off, make the behavior of lockdep_free_key_range() and
lockdep_reset_lock() depend on whether or not these are called from
the context of the lockdep selftests.
Thanks to Peter for having shown how to modify get_pending_free()
such that that function does not have to sleep.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: johannes.berg@intel.com
Cc: tj@kernel.org
Link: https://lkml.kernel.org/r/20190214230058.196511-12-bvanassche@acm.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull the latest RCU tree from Paul E. McKenney:
- Additional cleanups after RCU flavor consolidation
- Grace-period forward-progress cleanups and improvements
- Documentation updates
- Miscellaneous fixes
- spin_is_locked() conversions to lockdep
- SPDX changes to RCU source and header files
- SRCU updates
- Torture-test updates, including nolibc updates and moving
nolibc to tools/include
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Replace the license boiler plate with a SPDX license identifier.
While in the area, update an email address.
Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
commit 56222b212e ("futex: Drop hb->lock before enqueueing on the
rtmutex") changed the locking rules in the futex code so that the hash
bucket lock is not longer held while the waiter is enqueued into the
rtmutex wait list. This made the lock and the unlock path symmetric, but
unfortunately the possible early exit from __rt_mutex_proxy_start() due to
a detected deadlock was not updated accordingly. That allows a concurrent
unlocker to observe inconsitent state which triggers the warning in the
unlock path.
futex_lock_pi() futex_unlock_pi()
lock(hb->lock)
queue(hb_waiter) lock(hb->lock)
lock(rtmutex->wait_lock)
unlock(hb->lock)
// acquired hb->lock
hb_waiter = futex_top_waiter()
lock(rtmutex->wait_lock)
__rt_mutex_proxy_start()
---> fail
remove(rtmutex_waiter);
---> returns -EDEADLOCK
unlock(rtmutex->wait_lock)
// acquired wait_lock
wake_futex_pi()
rt_mutex_next_owner()
--> returns NULL
--> WARN
lock(hb->lock)
unqueue(hb_waiter)
The problem is caused by the remove(rtmutex_waiter) in the failure case of
__rt_mutex_proxy_start() as this lets the unlocker observe a waiter in the
hash bucket but no waiter on the rtmutex, i.e. inconsistent state.
The original commit handles this correctly for the other early return cases
(timeout, signal) by delaying the removal of the rtmutex waiter until the
returning task reacquired the hash bucket lock.
Treat the failure case of __rt_mutex_proxy_start() in the same way and let
the existing cleanup code handle the eventual handover of the rtmutex
gracefully. The regular rt_mutex_proxy_start() gains the rtmutex waiter
removal for the failure case, so that the other callsites are still
operating correctly.
Add proper comments to the code so all these details are fully documented.
Thanks to Peter for helping with the analysis and writing the really
valuable code comments.
Fixes: 56222b212e ("futex: Drop hb->lock before enqueueing on the rtmutex")
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Co-developed-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Cc: Stefan Liebler <stli@linux.ibm.com>
Cc: Sebastian Sewior <bigeasy@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1901292311410.1950@nanos.tec.linutronix.de