iv/linux

Peter Zijlstra 04dc1b2fff futex,rt_mutex: Fix rt_mutex_cleanup_proxy_lock()

Markus reported that the glibc/nptl/tst-robustpi8 test was failing after commit:

  `cfafcd117d` ("futex: Rework futex_lock_pi() to use rt_mutex_*_proxy_lock()")

The following trace shows the problem:

  ld-linux-x86-64-2161 [019] .... 410.760971: SyS_futex: 00007ffbeb76b028: 80000875 op=FUTEX_LOCK_PI
  ld-linux-x86-64-2161 [019] ...1 410.760972: lock_pi_update_atomic: 00007ffbeb76b028: curval=80000875 uval=80000875 newval=80000875 ret=0
  ld-linux-x86-64-2165 [011] .... 410.760978: SyS_futex: 00007ffbeb76b028: 80000875 op=FUTEX_UNLOCK_PI
  ld-linux-x86-64-2165 [011] d..1 410.760979: do_futex: 00007ffbeb76b028: curval=80000875 uval=80000875 newval=80000871 ret=0
  ld-linux-x86-64-2165 [011] .... 410.760980: SyS_futex: 00007ffbeb76b028: 80000871 ret=0000
  ld-linux-x86-64-2161 [019] .... 410.760980: SyS_futex: 00007ffbeb76b028: 80000871 ret=ETIMEDOUT

Task 2165 does an UNLOCK_PI, assigning the lock to the waiter task 2161, which then returns with -ETIMEDOUT. That wrecks the lock state, because now the owner isn't aware it acquired the lock and removes the pending robust list entry.

If 2161 is killed, the robust list will not clear out this futex and the subsequent acquire on this futex will then (correctly) result in -ESRCH, which is unexpected by glibc, triggers an internal assertion and dies.

  Task 2161                           Task 2165

  rt_mutex_wait_proxy_lock()
     timeout();
     /* T2161 is still queued in the waiter list */
     return -ETIMEDOUT;

                                      futex_unlock_pi()
                                      spin_lock(hb->lock);
                                      rtmutex_unlock()
                                        remove_rtmutex_waiter(T2161);
                                        mark_lock_available();
                                      /* Make the next waiter owner of the user space side */
                                      futex_uval = 2161;
                                      spin_unlock(hb->lock);
  spin_lock(hb->lock);
  rt_mutex_cleanup_proxy_lock()
    if (rtmutex_owner() != current)
       ...
       return FAIL;
  ....
  return -ETIMEDOUT;

This means that rt_mutex_cleanup_proxy_lock() needs to call try_to_take_rt_mutex() so it can correctly take over the rtmutex which was assigned by the waker. If the rtmutex is owned by some other task then this call is harmless and just confirms that the waiter is not able to acquire it.

While there, fix what looks like a merge error which resulted in rt_mutex_cleanup_proxy_lock() having two calls to fixup_rt_mutex_waiters() and rt_mutex_wait_proxy_lock() not having any. Both should have one, since both potentially touch the waiter list.

Fixes: `38d589f2fd` ("futex,rt_mutex: Restructure rt_mutex_finish_proxy_lock()")
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Bug-Spotted-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Markus Trippelsdorf <markus@trippelsdorf.de>
Link: http://lkml.kernel.org/r/20170519154850.mlomgdsd26drq5j6@hirez.programming.kicks-ass.net
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

2017-05-22 21:57:18 +02:00
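The race described in the commit message can be replayed in a small user-space toy model. This is only a sketch under stated assumptions, not the kernel implementation: every `toy_*` name is hypothetical, the wait list is reduced to a single top waiter, and all locking (`hb->lock`, `wait_lock`) is omitted because the steps run sequentially. It contrasts a cleanup path that only checks the owner with one that first attempts an unconditional try-lock, as the commit message says the fix should.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these are NOT the kernel's structures. */
struct toy_task { int pid; };

struct toy_waiter {
    struct toy_task *task;
    bool queued;                   /* still on the lock's wait list? */
};

struct toy_rtmutex {
    struct toy_task *owner;        /* NULL means the lock is available */
    struct toy_waiter *top_waiter; /* single-waiter simplification */
};

/* Unlock: drop ownership; the top waiter stays queued and is expected to take over. */
static void toy_unlock(struct toy_rtmutex *lock)
{
    lock->owner = NULL;
}

/* Try-lock on behalf of `task`; succeeds only if the lock is free and `waiter` is first in line. */
static bool toy_try_to_take(struct toy_rtmutex *lock, struct toy_task *task,
                            struct toy_waiter *waiter)
{
    if (lock->owner)
        return false;
    if (lock->top_waiter && lock->top_waiter != waiter)
        return false;
    if (waiter->queued) {
        lock->top_waiter = NULL;
        waiter->queued = false;
    }
    lock->owner = task;
    return true;
}

static void toy_remove_waiter(struct toy_rtmutex *lock, struct toy_waiter *waiter)
{
    assert(waiter->queued);
    if (lock->top_waiter == waiter)
        lock->top_waiter = NULL;
    waiter->queued = false;
}

/* Pre-fix behaviour: only checks the owner. Returns true if the waiter gave up. */
static bool toy_cleanup_old(struct toy_rtmutex *lock, struct toy_task *current,
                            struct toy_waiter *waiter)
{
    if (lock->owner != current) {
        toy_remove_waiter(lock, waiter);
        return true;
    }
    return false;
}

/* Post-fix behaviour: unconditional try-lock first, then the same owner check. */
static bool toy_cleanup_fixed(struct toy_rtmutex *lock, struct toy_task *current,
                              struct toy_waiter *waiter)
{
    (void)toy_try_to_take(lock, current, waiter);
    if (lock->owner != current) {
        toy_remove_waiter(lock, waiter);
        return true;
    }
    return false;
}

/* Replays the race; returns true if the timed-out waiter ends up owning the lock. */
bool toy_replay_race(bool use_fix)
{
    struct toy_task t2161 = { 2161 }, t2165 = { 2165 };
    struct toy_waiter w = { &t2161, true };
    struct toy_rtmutex lock = { &t2165, &w };

    /* T2161 has timed out but is still queued; T2165 now unlocks. */
    toy_unlock(&lock);

    bool gave_up = use_fix ? toy_cleanup_fixed(&lock, &t2161, &w)
                           : toy_cleanup_old(&lock, &t2161, &w);
    return !gave_up && lock.owner == &t2161;
}
```

In this model, `toy_replay_race(false)` leaves the lock ownerless even though the unlocker handed it to the waiter, while `toy_replay_race(true)` lets the waiter notice and take ownership.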
..
lockdep_internals.h	sparc64: Use LOCKDEP_SMALL, not PROVE_LOCKING_SMALL	2017-04-18 13:11:07 -07:00
lockdep_proc.c	Replace <asm/uaccess.h> with <linux/uaccess.h> globally	2016-12-24 11:46:01 -08:00
lockdep_states.h
lockdep.c	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2017-05-10 10:30:46 -07:00
locktorture.c	sched/headers: Prepare for the removal of <linux/rtmutex.h> from <linux/sched.h>	2017-03-02 08:42:32 +01:00
Makefile	locking/ww_mutex: Begin kselftests for ww_mutex	2017-01-14 11:37:14 +01:00
mcs_spinlock.h	locking/core: Remove cpu_relax_lowlatency() users	2016-11-16 10:15:10 +01:00
mutex-debug.c	locking/mutex: Rework mutex::owner	2016-10-25 11:31:50 +02:00
mutex-debug.h	locking/mutex: Fix lockdep_assert_held() fail	2017-01-30 11:42:59 +01:00
mutex.c	sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h>	2017-03-02 08:42:34 +01:00
mutex.h	locking/mutex: Fix lockdep_assert_held() fail	2017-01-30 11:42:59 +01:00
osq_lock.c	locking/osq: Break out of spin-wait busy waiting loop for a preempted vCPU in osq_lock()	2016-11-22 12:48:10 +01:00
percpu-rwsem.c	locking/percpu-rwsem: Replace waitqueue with rcuwait	2017-01-14 11:14:35 +01:00
qrwlock.c	locking/core: Remove cpu_relax_lowlatency() users	2016-11-16 10:15:10 +01:00
qspinlock_paravirt.h	locking/pvqspinlock: Don't wait if vCPU is preempted	2017-01-12 09:35:57 +01:00
qspinlock_stat.h	sched/headers: Prepare for new header dependencies before moving code to <linux/sched/clock.h>	2017-03-02 08:42:27 +01:00
qspinlock.c	locking/qspinlock: Use __this_cpu_dec() instead of full-blown this_cpu_dec()	2016-06-27 11:37:41 +02:00
rtmutex_common.h	rtmutex: Fix PI chain order integrity	2017-04-04 11:44:06 +02:00
rtmutex-debug.c	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2017-05-10 10:30:46 -07:00
rtmutex-debug.h	futex: Remove rt_mutex_deadlock_account_*()	2017-03-23 19:10:07 +01:00
rtmutex.c	futex,rt_mutex: Fix rt_mutex_cleanup_proxy_lock()	2017-05-22 21:57:18 +02:00
rtmutex.h	futex: Remove rt_mutex_deadlock_account_*()	2017-03-23 19:10:07 +01:00
rwsem-spinlock.c	locking/rwsem: Fix down_write_killable() for CONFIG_RWSEM_GENERIC_SPINLOCK=y	2017-03-16 09:28:30 +01:00
rwsem-xadd.c	sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h>	2017-03-02 08:42:34 +01:00
rwsem.c	locking/lockdep: Add new check to lock_downgrade()	2017-03-16 09:57:07 +01:00
rwsem.h	locking/rwsem: Protect all writes to owner by WRITE_ONCE()	2016-06-08 15:16:59 +02:00
semaphore.c	sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h>	2017-03-02 08:42:34 +01:00
spinlock_debug.c	locking/spinlock/debug: Remove spinlock lockup detection code	2017-02-10 09:09:49 +01:00
spinlock.c	locking/spinlocks: Remove the unused spin_lock_bh_nested() API	2017-01-12 09:33:39 +01:00
test-ww_mutex.c	locking/ww-mutex: Limit stress test to 2 seconds	2017-03-30 09:49:47 +02:00