linux/drivers/infiniband/core
Hefty, Sean 186834b5de RDMA/ucma: Fix AB-BA deadlock
When we destroy a cm_id, we must purge associated events from the
event queue.  If the cm_id is for a listen request, we also purge
corresponding pending connect requests.  This requires destroying
the cm_ids associated with the connect requests by calling
rdma_destroy_id().  rdma_destroy_id() blocks until all outstanding
callbacks have completed.

The issue is that we hold file->mut while purging events from the
event queue.  We also acquire file->mut in our event handler.  Calling
rdma_destroy_id() while holding file->mut can lead to a deadlock,
since the event handler callback cannot acquire file->mut, which
prevents rdma_destroy_id() from completing.

Fix this by moving the events to be purged from the event queue to a
temporary list.  We can then release file->mut and call
rdma_destroy_id() without holding any locks.
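
The pattern can be illustrated with a minimal userspace sketch. This is
not the ucma.c patch itself: struct file_ctx, struct event,
queue_event(), purge_events() and destroy_event() are hypothetical
names, a pthread mutex stands in for file->mut, and destroy_event()
stands in for rdma_destroy_id() by insisting that the mutex is free
when it runs.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the ucma structures; not the kernel types. */
struct event {
	struct event *next;
	int listen_id;			/* listener this pending event belongs to */
};

struct file_ctx {
	pthread_mutex_t mut;		/* plays the role of file->mut */
	struct event *events;		/* pending event queue */
};

/*
 * Stand-in for rdma_destroy_id(): it waits for callbacks that take
 * f->mut, so it must only run while the caller does NOT hold f->mut.
 * The trylock checks that precondition.
 */
static void destroy_event(struct file_ctx *f, struct event *ev)
{
	int rc = pthread_mutex_trylock(&f->mut);

	assert(rc == 0);		/* fails if the caller still holds mut */
	(void)rc;
	pthread_mutex_unlock(&f->mut);
	free(ev);
}

/* Add an event to the head of the queue, under the lock. */
static void queue_event(struct file_ctx *f, int listen_id)
{
	struct event *ev = malloc(sizeof(*ev));

	if (!ev)
		abort();
	ev->listen_id = listen_id;
	pthread_mutex_lock(&f->mut);
	ev->next = f->events;
	f->events = ev;
	pthread_mutex_unlock(&f->mut);
}

/* Purge all events for listen_id; returns the number destroyed. */
static int purge_events(struct file_ctx *f, int listen_id)
{
	struct event *purge = NULL, **pp, *ev;
	int n = 0;

	/* Step 1: under the lock, only unlink matching events. */
	pthread_mutex_lock(&f->mut);
	pp = &f->events;
	while ((ev = *pp) != NULL) {
		if (ev->listen_id == listen_id) {
			*pp = ev->next;		/* remove from the queue */
			ev->next = purge;	/* push onto the private list */
			purge = ev;
		} else {
			pp = &ev->next;
		}
	}
	pthread_mutex_unlock(&f->mut);

	/* Step 2: with no locks held, do the blocking teardown. */
	while ((ev = purge) != NULL) {
		purge = ev->next;
		destroy_event(f, ev);
		n++;
	}
	return n;
}
```

The temporary list is private to the purging thread, so walking it
needs no lock.  The lock order file->mut -> handler_mutex therefore
never occurs, which breaks the cycle reported below.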

Bug report by Or Gerlitz <ogerlitz@mellanox.com>:

    [ INFO: possible circular locking dependency detected ]
    3.3.0-rc5-00008-g79f1e43-dirty #34 Tainted: G          I

    tgtd/9018 is trying to acquire lock:
     (&id_priv->handler_mutex){+.+.+.}, at: [<ffffffffa0359a41>] rdma_destroy_id+0x33/0x1f0 [rdma_cm]

    but task is already holding lock:
     (&file->mut){+.+.+.}, at: [<ffffffffa02470fe>] ucma_free_ctx+0xb6/0x196 [rdma_ucm]

    which lock already depends on the new lock.


    the existing dependency chain (in reverse order) is:

    -> #1 (&file->mut){+.+.+.}:
           [<ffffffff810682f3>] lock_acquire+0xf0/0x116
           [<ffffffff8135f179>] mutex_lock_nested+0x64/0x2e6
           [<ffffffffa0247636>] ucma_event_handler+0x148/0x1dc [rdma_ucm]
           [<ffffffffa035a79a>] cma_ib_handler+0x1a7/0x1f7 [rdma_cm]
           [<ffffffffa0333e88>] cm_process_work+0x32/0x119 [ib_cm]
           [<ffffffffa03362ab>] cm_work_handler+0xfb8/0xfe5 [ib_cm]
           [<ffffffff810423e2>] process_one_work+0x2bd/0x4a6
           [<ffffffff810429e2>] worker_thread+0x1d6/0x350
           [<ffffffff810462a6>] kthread+0x84/0x8c
           [<ffffffff81369624>] kernel_thread_helper+0x4/0x10

    -> #0 (&id_priv->handler_mutex){+.+.+.}:
           [<ffffffff81067b86>] __lock_acquire+0x10d5/0x1752
           [<ffffffff810682f3>] lock_acquire+0xf0/0x116
           [<ffffffff8135f179>] mutex_lock_nested+0x64/0x2e6
           [<ffffffffa0359a41>] rdma_destroy_id+0x33/0x1f0 [rdma_cm]
           [<ffffffffa024715f>] ucma_free_ctx+0x117/0x196 [rdma_ucm]
           [<ffffffffa0247255>] ucma_close+0x77/0xb4 [rdma_ucm]
           [<ffffffff810df6ef>] fput+0x117/0x1cf
           [<ffffffff810dc76e>] filp_close+0x6d/0x78
           [<ffffffff8102b667>] put_files_struct+0xbd/0x17d
           [<ffffffff8102b76d>] exit_files+0x46/0x4e
           [<ffffffff8102d057>] do_exit+0x299/0x75d
           [<ffffffff8102d599>] do_group_exit+0x7e/0xa9
           [<ffffffff8103ae4b>] get_signal_to_deliver+0x536/0x555
           [<ffffffff81001717>] do_signal+0x39/0x634
           [<ffffffff81001d39>] do_notify_resume+0x27/0x69
           [<ffffffff81361c03>] retint_signal+0x46/0x83

    other info that might help us debug this:

     Possible unsafe locking scenario:

           CPU0                    CPU1
           ----                    ----
      lock(&file->mut);
                                   lock(&id_priv->handler_mutex);
                                   lock(&file->mut);
      lock(&id_priv->handler_mutex);

     *** DEADLOCK ***

    1 lock held by tgtd/9018:
     #0:  (&file->mut){+.+.+.}, at: [<ffffffffa02470fe>] ucma_free_ctx+0xb6/0x196 [rdma_ucm]

    stack backtrace:
    Pid: 9018, comm: tgtd Tainted: G          I  3.3.0-rc5-00008-g79f1e43-dirty #34
    Call Trace:
     [<ffffffff81029e9c>] ? console_unlock+0x18e/0x207
     [<ffffffff81066433>] print_circular_bug+0x28e/0x29f
     [<ffffffff81067b86>] __lock_acquire+0x10d5/0x1752
     [<ffffffff810682f3>] lock_acquire+0xf0/0x116
     [<ffffffffa0359a41>] ? rdma_destroy_id+0x33/0x1f0 [rdma_cm]
     [<ffffffff8135f179>] mutex_lock_nested+0x64/0x2e6
     [<ffffffffa0359a41>] ? rdma_destroy_id+0x33/0x1f0 [rdma_cm]
     [<ffffffff8106546d>] ? trace_hardirqs_on_caller+0x11e/0x155
     [<ffffffff810654b1>] ? trace_hardirqs_on+0xd/0xf
     [<ffffffffa0359a41>] rdma_destroy_id+0x33/0x1f0 [rdma_cm]
     [<ffffffffa024715f>] ucma_free_ctx+0x117/0x196 [rdma_ucm]
     [<ffffffffa0247255>] ucma_close+0x77/0xb4 [rdma_ucm]
     [<ffffffff810df6ef>] fput+0x117/0x1cf
     [<ffffffff810dc76e>] filp_close+0x6d/0x78
     [<ffffffff8102b667>] put_files_struct+0xbd/0x17d
     [<ffffffff8102b5cc>] ? put_files_struct+0x22/0x17d
     [<ffffffff8102b76d>] exit_files+0x46/0x4e
     [<ffffffff8102d057>] do_exit+0x299/0x75d
     [<ffffffff8102d599>] do_group_exit+0x7e/0xa9
     [<ffffffff8103ae4b>] get_signal_to_deliver+0x536/0x555
     [<ffffffff810654b1>] ? trace_hardirqs_on+0xd/0xf
     [<ffffffff81001717>] do_signal+0x39/0x634
     [<ffffffff8135e037>] ? printk+0x3c/0x45
     [<ffffffff8106546d>] ? trace_hardirqs_on_caller+0x11e/0x155
     [<ffffffff810654b1>] ? trace_hardirqs_on+0xd/0xf
     [<ffffffff81361803>] ? _raw_spin_unlock_irq+0x2b/0x40
     [<ffffffff81039011>] ? set_current_blocked+0x44/0x49
     [<ffffffff81361bce>] ? retint_signal+0x11/0x83
     [<ffffffff81001d39>] do_notify_resume+0x27/0x69
     [<ffffffff8118a1fe>] ? trace_hardirqs_on_thunk+0x3a/0x3f
     [<ffffffff81361c03>] retint_signal+0x46/0x83

Signed-off-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2012-03-05 12:27:57 -08:00
addr.c infiniband: addr: Consolidate code to fetch neighbour hardware address from dst. 2011-12-05 15:20:19 -05:00
agent.c IB/mad: Improve an error message so error code is included 2011-03-18 09:42:20 -07:00
agent.h RDMA: Remove subversion $Id tags 2008-07-14 23:48:44 -07:00
cache.c IB/core: Add GID change event 2011-07-18 21:04:30 -07:00
cm_msgs.h IB/cm: Fix layout of APR message 2012-01-03 21:04:18 -08:00
cm.c switch device_get_devnode() and ->devnode() to umode_t * 2012-01-03 22:54:55 -05:00
cma.c infiniband changes for 3.3 merge window 2012-01-08 14:05:48 -08:00
core_priv.h IB/core: Allow device-specific per-port sysfs files 2010-05-21 10:34:44 -07:00
device.c RDMA: Allow for NULL .modify_device() and .modify_port() methods 2011-07-18 16:44:30 -07:00
fmr_pool.c infiniband: add in export.h for files using EXPORT_SYMBOL/THIS_MODULE 2011-10-31 19:31:35 -04:00
iwcm.c infiniband: Fix up module files that need to include module.h 2011-10-31 19:31:35 -04:00
iwcm.h
mad_priv.h IB/mad: Allow tuning of QP0 and QP1 sizes 2009-09-07 08:28:48 -07:00
mad_rmpp.c include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h 2010-03-30 22:02:32 +09:00
mad_rmpp.h RDMA: Remove subversion $Id tags 2008-07-14 23:48:44 -07:00
mad.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
Makefile RDMA: Add netlink infrastructure 2011-05-20 11:46:11 -07:00
multicast.c infiniband: add in export.h for files using EXPORT_SYMBOL/THIS_MODULE 2011-10-31 19:31:35 -04:00
netlink.c infiniband: add in export.h for files using EXPORT_SYMBOL/THIS_MODULE 2011-10-31 19:31:35 -04:00
packer.c infiniband: add in export.h for files using EXPORT_SYMBOL/THIS_MODULE 2011-10-31 19:31:35 -04:00
sa_query.c RDMA: Update missed conversion of flush_scheduled_work() 2011-01-28 16:39:08 -08:00
sa.h
smi.c IB/mad: Check hop count field in directed route MAD to avoid array overflow 2009-09-05 20:24:10 -07:00
smi.h
sysfs.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
ucm.c rdma/core: Fix sparse warnings 2012-01-04 09:17:45 -08:00
ucma.c RDMA/ucma: Fix AB-BA deadlock 2012-03-05 12:27:57 -08:00
ud_header.c infiniband: add in export.h for files using EXPORT_SYMBOL/THIS_MODULE 2011-10-31 19:31:35 -04:00
umem.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
user_mad.c switch device_get_devnode() and ->devnode() to umode_t * 2012-01-03 22:54:55 -05:00
uverbs_cmd.c RDMA/core: Fix kernel panic by always initializing qp->usecnt 2012-01-27 09:20:10 -08:00
uverbs_main.c switch device_get_devnode() and ->devnode() to umode_t * 2012-01-03 22:54:55 -05:00
uverbs_marshall.c infiniband: add in export.h for files using EXPORT_SYMBOL/THIS_MODULE 2011-10-31 19:31:35 -04:00
uverbs.h RDMA/uverbs: Export ib_open_qp() capability to user space 2011-10-13 09:50:56 -07:00
verbs.c RDMA/core: Fix kernel panic by always initializing qp->usecnt 2012-01-27 09:20:10 -08:00