linux

iv/linux

Author	SHA1	Message	Date
Thomas Gleixner	07d91ef510	futex: Prevent requeue_pi() lock nesting issue on RT The requeue_pi() operation on RT kernels creates a problem versus the task::pi_blocked_on state when a waiter is woken early (signal, timeout) and that early wake up interleaves with the requeue_pi() operation. When the requeue manages to block the waiter on the rtmutex which is associated to the second futex, then a concurrent early wakeup of that waiter faces the problem that it has to acquire the hash bucket spinlock, which is not an issue on non-RT kernels, but on RT kernels spinlocks are substituted by 'sleeping' spinlocks based on rtmutex. If the hash bucket lock is contended then blocking on that spinlock would result in a impossible situation: blocking on two locks at the same time (the hash bucket lock and the rtmutex representing the PI futex). It was considered to make the hash bucket locks raw_spinlocks, but especially requeue operations with a large amount of waiters can introduce significant latencies, so that's not an option for RT. The RT tree carried a solution which (ab)used task::pi_blocked_on to store the information about an ongoing requeue and an early wakeup which worked, but required to add checks for these special states all over the place. The distangling of an early wakeup of a waiter for a requeue_pi() operation is already looking at quite some different states and the task::pi_blocked_on magic just expanded that to a hard to understand 'state machine'. This can be avoided by keeping track of the waiter/requeue state in the futex_q object itself. Add a requeue_state field to struct futex_q with the following possible states: Q_REQUEUE_PI_NONE Q_REQUEUE_PI_IGNORE Q_REQUEUE_PI_IN_PROGRESS Q_REQUEUE_PI_WAIT Q_REQUEUE_PI_DONE Q_REQUEUE_PI_LOCKED The waiter starts with state = NONE and the following state transitions are valid: On the waiter side: Q_REQUEUE_PI_NONE -> Q_REQUEUE_PI_IGNORE Q_REQUEUE_PI_IN_PROGRESS -> Q_REQUEUE_PI_WAIT On the requeue side: Q_REQUEUE_PI_NONE -> Q_REQUEUE_PI_INPROGRESS Q_REQUEUE_PI_IN_PROGRESS -> Q_REQUEUE_PI_DONE/LOCKED Q_REQUEUE_PI_IN_PROGRESS -> Q_REQUEUE_PI_NONE (requeue failed) Q_REQUEUE_PI_WAIT -> Q_REQUEUE_PI_DONE/LOCKED Q_REQUEUE_PI_WAIT -> Q_REQUEUE_PI_IGNORE (requeue failed) The requeue side ignores a waiter with state Q_REQUEUE_PI_IGNORE as this signals that the waiter is already on the way out. It also means that the waiter is still on the 'wait' futex, i.e. uaddr1. The waiter side signals early wakeup to the requeue side either through setting state to Q_REQUEUE_PI_IGNORE or to Q_REQUEUE_PI_WAIT depending on the current state. In case of Q_REQUEUE_PI_IGNORE it can immediately proceed to take the hash bucket lock of uaddr1. If it set state to WAIT, which means the wakeup is interleaving with a requeue in progress it has to wait for the requeue side to change the state. Either to DONE/LOCKED or to IGNORE. DONE/LOCKED means the waiter q is now on the uaddr2 futex and either blocked (DONE) or has acquired it (LOCKED). IGNORE is set by the requeue side when the requeue attempt failed via deadlock detection and therefore the waiter's futex_q is still on the uaddr1 futex. While this is not strictly required on !RT making this unconditional has the benefit of common code and it also allows the waiter to avoid taking the hash bucket lock on the way out in certain cases, which reduces contention. Add the required helpers required for the state transitions, invoke them at the right places and restructure the futex_wait_requeue_pi() code to handle the return from wait (early or not) based on the state machine values. On !RT enabled kernels the waiter spin waits for the state going from Q_REQUEUE_PI_WAIT to some other state, on RT enabled kernels this is handled by rcuwait_wait_event() and the corresponding wake up on the requeue side. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211305.693317658@linutronix.de	2021-08-17 19:05:59 +02:00
Thomas Gleixner	6231acbd08	futex: Simplify handle_early_requeue_pi_wakeup() Move the futex key match out of handle_early_requeue_pi_wakeup() which allows to simplify that function. The upcoming state machine for requeue_pi() will make that go away. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211305.638938670@linutronix.de	2021-08-17 19:05:57 +02:00
Thomas Gleixner	d69cba5c71	futex: Reorder sanity checks in futex_requeue() No point in allocating memory when the input parameters are bogus. Validate all parameters before proceeding. Suggested-by: Davidlohr Bueso <dave@stgolabs.net> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211305.581789253@linutronix.de	2021-08-17 19:05:54 +02:00
Thomas Gleixner	c18eaa3aca	futex: Clarify comment in futex_requeue() The comment about the restriction of the number of waiters to wake for the REQUEUE_PI case is confusing at best. Rewrite it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211305.524990421@linutronix.de	2021-08-17 19:05:51 +02:00
Thomas Gleixner	64b7b715f7	futex: Restructure futex_requeue() No point in taking two more 'requeue_pi' conditionals just to get to the requeue. Same for the requeue_pi case just the other way round. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211305.468835790@linutronix.de	2021-08-17 19:05:49 +02:00
Thomas Gleixner	59c7ecf154	futex: Correct the number of requeued waiters for PI The accounting is wrong when either the PI sanity check or the requeue PI operation fails. Adjust it in the failure path. Will be simplified in the next step. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211305.416427548@linutronix.de	2021-08-17 19:05:46 +02:00
Thomas Gleixner	8e74633dce	futex: Remove bogus condition for requeue PI For requeue PI it's required to establish PI state for the PI futex to which waiters are requeued. This either acquires the user space futex on behalf of the top most waiter on the inner 'waitqueue' futex, or attaches to the PI state of an existing waiter, or creates on attached to the owner of the futex. This code can retry in case of failure, but retry can never happen when the pi state was successfully created. The condition to run this code is: (task_count - nr_wake) < nr_requeue which is always true because: task_count = 0 nr_wake = 1 nr_requeue >= 0 Remove it completely. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211305.362730187@linutronix.de	2021-08-17 19:05:44 +02:00
Thomas Gleixner	f6f4ec00b5	futex: Clarify futex_requeue() PI handling When requeuing to a PI futex, then the requeue code tries to trylock the PI futex on behalf of the topmost waiter on the inner 'waitqueue' futex. If that succeeds, then PI state has to be allocated in order to requeue further waiters to the PI futex. The comment and the code are confusing, as the PI state allocation uses lookup_pi_state(), which either attaches to an existing waiter or to the owner. As the PI futex was just acquired, there cannot be a waiter on the PI futex because the hash bucket lock is held. Clarify the comment and use attach_to_pi_owner() directly. As the task on which behalf the PI futex has been acquired is guaranteed to be alive and not exiting, this call must succeed. Add a WARN_ON() in case that fails. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211305.305142462@linutronix.de	2021-08-17 19:05:41 +02:00
Thomas Gleixner	c363b7ed79	futex: Clean up stale comments The futex key reference mechanism is long gone. Clean up the stale comments which still mention it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211305.249178312@linutronix.de	2021-08-17 19:05:39 +02:00
Thomas Gleixner	dc7109aaa2	futex: Validate waiter correctly in futex_proxy_trylock_atomic() The loop in futex_requeue() has a sanity check for the waiter, which is missing in futex_proxy_trylock_atomic(). In theory the key2 check is sufficient, but futexes are cursed so add it for completeness and paranoia sake. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211305.193767519@linutronix.de	2021-08-17 19:05:36 +02:00
Sebastian Andrzej Siewior	c49f7ece46	lib/test_lockup: Adapt to changed variables The inner parts of certain locks (mutex, rwlocks) changed due to a rework for RT and non RT code. Most users remain unaffected, but those who fiddle around in the inner parts need to be updated. Match the struct names to the new layout. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211305.137982730@linutronix.de	2021-08-17 19:05:33 +02:00
Thomas Gleixner	bb630f9f7a	locking/rtmutex: Add mutex variant for RT Add the necessary defines, helpers and API functions for replacing struct mutex on a PREEMPT_RT enabled kernel with an rtmutex based variant. No functional change when CONFIG_PREEMPT_RT=n Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211305.081517417@linutronix.de	2021-08-17 19:05:29 +02:00
Peter Zijlstra	f8635d509d	locking/ww_mutex: Implement rtmutex based ww_mutex API functions Add the actual ww_mutex API functions which replace the mutex based variant on RT enabled kernels. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211305.024057938@linutronix.de	2021-08-17 19:05:26 +02:00
Peter Zijlstra	add461325e	locking/rtmutex: Extend the rtmutex core to support ww_mutex Add a ww acquire context pointer to the waiter and various functions and add the ww_mutex related invocations to the proper spots in the locking code, similar to the mutex based variant. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211304.966139174@linutronix.de	2021-08-17 19:05:23 +02:00
Peter Zijlstra	2408f7a378	locking/ww_mutex: Add rt_mutex based lock type and accessors Provide the defines for RT mutex based ww_mutexes and fix up the debug logic so it's either enabled by DEBUG_MUTEXES or DEBUG_RT_MUTEXES on RT kernels. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211304.908012566@linutronix.de	2021-08-17 19:05:11 +02:00
Peter Zijlstra	8850d77370	locking/ww_mutex: Add RT priority to W/W order RT mutex based ww_mutexes cannot order based on timestamps. They have to order based on priority. Add the necessary decision logic. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211304.847536630@linutronix.de	2021-08-17 19:05:08 +02:00
Peter Zijlstra	dc4564f5dc	locking/ww_mutex: Implement rt_mutex accessors Provide the type defines and the helper inlines for rtmutex based ww_mutexes. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211304.790760545@linutronix.de	2021-08-17 19:05:06 +02:00
Thomas Gleixner	653a5b0bd9	locking/ww_mutex: Abstract out internal lock accesses Accessing the internal wait_lock of mutex and rtmutex is slightly different. Provide helper functions for that. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211304.734635961@linutronix.de	2021-08-17 19:05:03 +02:00
Peter Zijlstra	bdb189148d	locking/ww_mutex: Abstract out mutex types Some ww_mutex helper functions use pointers for the underlying mutex and mutex_waiter. The upcoming rtmutex based implementation needs to share these functions. Add and use defines for the types and replace the direct types in the affected functions. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211304.678720245@linutronix.de	2021-08-17 19:05:00 +02:00
Peter Zijlstra	9934ccc75c	locking/ww_mutex: Abstract out mutex accessors Move the mutex related access from various ww_mutex functions into helper functions so they can be substituted for rtmutex based ww_mutex later. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211304.622477030@linutronix.de	2021-08-17 19:04:57 +02:00
Peter Zijlstra	843dac28f9	locking/ww_mutex: Abstract out waiter enqueueing The upcoming rtmutex based ww_mutex needs a different handling for enqueueing a waiter. Split it out into a helper function. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211304.566318143@linutronix.de	2021-08-17 19:04:54 +02:00
Peter Zijlstra	23d599eb23	locking/ww_mutex: Abstract out the waiter iteration Split out the waiter iteration functions so they can be substituted for a rtmutex based ww_mutex later. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211304.509186185@linutronix.de	2021-08-17 19:04:52 +02:00
Peter Zijlstra	5297ccb2c5	locking/ww_mutex: Remove the __sched annotation from ww_mutex APIs None of these functions will be on the stack when blocking in schedule(), hence __sched is not needed. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211304.453235952@linutronix.de	2021-08-17 19:04:49 +02:00
Peter Zijlstra (Intel)	2674bd181f	locking/ww_mutex: Split out the W/W implementation logic into kernel/locking/ww_mutex.h Split the W/W mutex helper functions out into a separate header file, so they can be shared with a rtmutex based variant later. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211304.396893399@linutronix.de	2021-08-17 19:04:46 +02:00
Peter Zijlstra (Intel)	aaa77de10b	locking/ww_mutex: Split up ww_mutex_unlock() Split the ww related part out into a helper function so it can be reused for a rtmutex based ww_mutex implementation. [ mingo: Fixed bisection failure. ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211304.340166556@linutronix.de	2021-08-17 19:04:44 +02:00
Peter Zijlstra	c0afb0ffc0	locking/ww_mutex: Gather mutex_waiter initialization Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211304.281927514@linutronix.de	2021-08-17 19:04:41 +02:00
Peter Zijlstra	cf702eddcd	locking/ww_mutex: Simplify lockdep annotations No functional change. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211304.222921634@linutronix.de	2021-08-17 19:04:28 +02:00
Thomas Gleixner	ebf4c55c1d	locking/mutex: Make mutex::wait_lock raw The wait_lock of mutex is really a low level lock. Convert it to a raw_spinlock like the wait_lock of rtmutex. [ mingo: backmerged the test_lockup.c build fix by bigeasy. ] Co-developed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211304.166863404@linutronix.de	2021-08-17 19:03:33 +02:00
Thomas Gleixner	4f1893ec8c	locking/ww_mutex: Move the ww_mutex definitions from <linux/mutex.h> into <linux/ww_mutex.h> Move the ww_mutex definitions into the ww_mutex specific header where they belong. Preparatory change to allow compiling ww_mutexes standalone. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211304.110216293@linutronix.de	2021-08-17 18:24:31 +02:00
Thomas Gleixner	43d2d52d70	locking/mutex: Move the 'struct mutex_waiter' definition from <linux/mutex.h> to the internal header Move the mutex waiter declaration from the public <linux/mutex.h> header to the internal kernel/locking/mutex.h header. There is no reason to expose it outside of the core code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211304.054325923@linutronix.de	2021-08-17 18:24:31 +02:00
Thomas Gleixner	a321fb9038	locking/mutex: Consolidate core headers, remove kernel/locking/mutex-debug.h Having two header files which contain just the non-debug and debug variants is mostly waste of disc space and has no real value. Stick the debug variants into the common mutex.h file as counterpart to the stubs for the non-debug case. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211303.995350521@linutronix.de	2021-08-17 18:24:22 +02:00
Peter Zijlstra	715f7f9ece	locking/rtmutex: Squash !RT tasks to DEFAULT_PRIO Ensure all !RT tasks have the same prio such that they end up in FIFO order and aren't split up according to nice level. The reason why nice levels were taken into account so far is historical. In the early days of the rtmutex code it was done to give the PI boosting and deboosting a larger coverage. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211303.938676930@linutronix.de	2021-08-17 17:51:02 +02:00
Thomas Gleixner	8282947f67	locking/rwlock: Provide RT variant Similar to rw_semaphores, on RT the rwlock substitution is not writer fair, because it's not feasible to have a writer inherit its priority to multiple readers. Readers blocked on a writer follow the normal rules of priority inheritance. Like RT spinlocks, RT rwlocks are state preserving across the slow lock operations (contended case). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211303.882793524@linutronix.de	2021-08-17 17:50:51 +02:00
Thomas Gleixner	0f383b6dc9	locking/spinlock: Provide RT variant Provide the actual locking functions which make use of the general and spinlock specific rtmutex code. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211303.826621464@linutronix.de	2021-08-17 17:48:13 +02:00
Thomas Gleixner	1c143c4b65	locking/rtmutex: Provide the spin/rwlock core lock function A simplified version of the rtmutex slowlock function, which neither handles signals nor timeouts, and is careful about preserving the state of the blocked task across the lock operation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211303.770228446@linutronix.de	2021-08-17 17:45:37 +02:00
Thomas Gleixner	342a93247e	locking/spinlock: Provide RT variant header: <linux/spinlock_rt.h> Provide the necessary wrappers around the actual rtmutex based spinlock implementation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211303.712897671@linutronix.de	2021-08-17 17:43:24 +02:00
Thomas Gleixner	051790eecc	locking/spinlock: Provide RT specific spinlock_t RT replaces spinlocks with a simple wrapper around an rtmutex, which turns spinlocks on RT into 'sleeping' spinlocks. The actual implementation of the spinlock API differs from a regular rtmutex, as it does neither handle timeouts nor signals and it is state preserving across the lock operation. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211303.654230709@linutronix.de	2021-08-17 17:41:24 +02:00
Sebastian Andrzej Siewior	e4e17af3b7	locking/rtmutex: Reduce <linux/rtmutex.h> header dependencies, only include <linux/rbtree_types.h> We have the following header dependency problem on RT: - <linux/rtmutex.h> needs the definition of 'struct rb_root_cached'. - <linux/rbtree.h> includes <linux/kernel.h>, which includes <linux/spinlock.h> That works nicely for non-RT enabled kernels, but on RT enabled kernels spinlocks are based on rtmutexes, which creates another circular header dependency as <linux/spinlocks.h> will require <linux/rtmutex.h>. Include <linux/rbtree_types.h> instead. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211303.598003167@linutronix.de	2021-08-17 17:37:26 +02:00
Sebastian Andrzej Siewior	089050cafa	rbtree: Split out the rbtree type definitions into <linux/rbtree_types.h> So we have this header dependency problem on RT: - <linux/rtmutex.h> needs the definition of 'struct rb_root_cached'. - <linux/rbtree.h> includes <linux/kernel.h>, which includes <linux/spinlock.h>. That works nicely for non-RT enabled kernels, but on RT enabled kernels spinlocks are based on rtmutexes, which creates another circular header dependency, as <linux/spinlocks.h> will require <linux/rtmutex.h>. Split out the type definitions and move them into their own header file so the rtmutex header can include just those. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211303.542123501@linutronix.de	2021-08-17 17:36:48 +02:00
Sebastian Andrzej Siewior	cbcebf5bd3	locking/lockdep: Reduce header dependencies in <linux/debug_locks.h> The inclusion of printk.h leads to a circular dependency if spinlock_t is based on rtmutexes on RT enabled kernels. Include only atomic.h (xchg()) and cache.h (__read_mostly) which is all what debug_locks.h requires. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211303.484161136@linutronix.de	2021-08-17 17:29:10 +02:00
Sebastian Andrzej Siewior	a403abbdc7	locking/rtmutex: Prevent future include recursion hell rtmutex only needs raw_spinlock_t, but it includes spinlock_types.h, which is not a problem on an non RT enabled kernel. RT kernels substitute regular spinlocks with 'sleeping' spinlocks, which are based on rtmutexes, and therefore must be able to include rtmutex.h. Include <linux/spinlock_types_raw.h> instead. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211303.428224188@linutronix.de	2021-08-17 17:27:28 +02:00
Thomas Gleixner	4f084ca74c	locking/spinlock: Split the lock types header, and move the raw types into <linux/spinlock_types_raw.h> Move raw_spinlock into its own file. Prepare for RT 'sleeping spinlocks', to avoid header recursion, as RT locks require rtmutex.h, which in turn requires the raw spinlock types. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211303.371269088@linutronix.de	2021-08-17 17:26:27 +02:00
Thomas Gleixner	e17ba59b7e	locking/rtmutex: Guard regular sleeping locks specific functions Guard the regular sleeping lock specific functionality, which is used for rtmutex on non-RT enabled kernels and for mutex, rtmutex and semaphores on RT enabled kernels so the code can be reused for the RT specific implementation of spinlocks and rwlocks in a different compilation unit. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211303.311535693@linutronix.de	2021-08-17 17:23:27 +02:00
Thomas Gleixner	456cfbc65c	locking/rtmutex: Prepare RT rt_mutex_wake_q for RT locks Add an rtlock_task pointer to rt_mutex_wake_q, which allows to handle the RT specific wakeup for spin/rwlock waiters. The pointer is just consuming 4/8 bytes on the stack so it is provided unconditionaly to avoid #ifdeffery all over the place. This cannot use a regular wake_q, because a task can have concurrent wakeups which would make it miss either lock or the regular wakeups, depending on what gets queued first, unless task struct gains a separate wake_q_node for this, which would be overkill, because there can only be a single task which gets woken up in the spin/rw_lock unlock path. No functional change for non-RT enabled kernels. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211303.253614678@linutronix.de	2021-08-17 17:21:09 +02:00
Thomas Gleixner	7980aa397c	locking/rtmutex: Use rt_mutex_wake_q_head Prepare for the required state aware handling of waiter wakeups via wake_q and switch the rtmutex code over to the rtmutex specific wrapper. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211303.197113263@linutronix.de	2021-08-17 17:20:14 +02:00
Thomas Gleixner	b576e640ce	locking/rtmutex: Provide rt_wake_q_head and helpers To handle the difference between wakeups for regular sleeping locks (mutex, rtmutex, rw_semaphore) and the wakeups for 'sleeping' spin/rwlocks on PREEMPT_RT enabled kernels correctly, it is required to provide a wake_q_head construct which allows to keep them separate. Provide a wrapper around wake_q_head and the required helpers, which will be extended with the state handling later. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211303.139337655@linutronix.de	2021-08-17 17:18:15 +02:00
Thomas Gleixner	c014ef69b3	locking/rtmutex: Add wake_state to rt_mutex_waiter Regular sleeping locks like mutexes, rtmutexes and rw_semaphores are always entering and leaving a blocking section with task state == TASK_RUNNING. On a non-RT kernel spinlocks and rwlocks never affect the task state, but on RT kernels these locks are converted to rtmutex based 'sleeping' locks. So in case of contention the task goes to block, which requires to carefully preserve the task state, and restore it after acquiring the lock taking regular wakeups for the task into account, which happened while the task was blocked. This state preserving is achieved by having a separate task state for blocking on a RT spin/rwlock and a saved_state field in task_struct along with careful handling of these wakeup scenarios in try_to_wake_up(). To avoid conditionals in the rtmutex code, store the wake state which has to be used for waking a lock waiter in rt_mutex_waiter which allows to handle the regular and RT spin/rwlocks by handing it to wake_up_state(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211303.079800739@linutronix.de	2021-08-17 17:15:36 +02:00
Thomas Gleixner	42254105df	locking/rwsem: Add rtmutex based R/W semaphore implementation The RT specific R/W semaphore implementation used to restrict the number of readers to one, because a writer cannot block on multiple readers and inherit its priority or budget. The single reader restricting was painful in various ways: - Performance bottleneck for multi-threaded applications in the page fault path (mmap sem) - Progress blocker for drivers which are carefully crafted to avoid the potential reader/writer deadlock in mainline. The analysis of the writer code paths shows that properly written RT tasks should not take them. Syscalls like mmap(), file access which take mmap sem write locked have unbound latencies, which are completely unrelated to mmap sem. Other R/W sem users like graphics drivers are not suitable for RT tasks either. So there is little risk to hurt RT tasks when the RT rwsem implementation is done in the following way: - Allow concurrent readers - Make writers block until the last reader left the critical section. This blocking is not subject to priority/budget inheritance. - Readers blocked on a writer inherit their priority/budget in the normal way. There is a drawback with this scheme: R/W semaphores become writer unfair though the applications which have triggered writer starvation (mostly on mmap_sem) in the past are not really the typical workloads running on a RT system. So while it's unlikely to hit writer starvation, it's possible. If there are unexpected workloads on RT systems triggering it, the problem has to be revisited. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211303.016885947@linutronix.de	2021-08-17 17:12:47 +02:00
Thomas Gleixner	943f0edb75	locking/rt: Add base code for RT rw_semaphore and rwlock On PREEMPT_RT, rw_semaphores and rwlocks are substituted with an rtmutex and a reader count. The implementation is writer unfair, as it is not feasible to do priority inheritance on multiple readers, but experience has shown that real-time workloads are not the typical workloads which are sensitive to writer starvation. The inner workings of rw_semaphores and rwlocks on RT are almost identical except for the task state and signal handling. rw_semaphores are not state preserving over a contention, they are expected to enter and leave with state == TASK_RUNNING. rwlocks have a mechanism to preserve the state of the task at entry and restore it after unblocking taking potential non-lock related wakeups into account. rw_semaphores can also be subject to signal handling interrupting a blocked state, while rwlocks ignore signals. To avoid code duplication, provide a shared implementation which takes the small difference vs. state and signals into account. The code is included into the relevant rw_semaphore/rwlock base code and compiled for each use case separately. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211302.957920571@linutronix.de	2021-08-17 17:12:22 +02:00
Thomas Gleixner	6bc8996add	locking/rtmutex: Provide rt_mutex_base_is_locked() Provide rt_mutex_base_is_locked(), which will be used for various wrapped locking primitives for RT. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20210815211302.899572818@linutronix.de	2021-08-17 17:04:35 +02:00

1 2 3 4 5 ...

1031144 Commits