linux/drivers
Shay Drory 374012b004 RDMA/mlx5: Fix mkey cache possible deadlock on cleanup
Fix the deadlock by refactoring the MR cache cleanup flow to flush the
workqueue without holding the rb_lock.
This adds a race between cache cleanup and creation of new entries which
we solve by denied creation of new entries after cache cleanup started.

Lockdep:
WARNING: possible circular locking dependency detected
 [ 2785.326074 ] 6.2.0-rc6_for_upstream_debug_2023_01_31_14_02 #1 Not tainted
 [ 2785.339778 ] ------------------------------------------------------
 [ 2785.340848 ] devlink/53872 is trying to acquire lock:
 [ 2785.341701 ] ffff888124f8c0c8 ((work_completion)(&(&ent->dwork)->work)){+.+.}-{0:0}, at: __flush_work+0xc8/0x900
 [ 2785.343403 ]
 [ 2785.343403 ] but task is already holding lock:
 [ 2785.344464 ] ffff88817e8f1260 (&dev->cache.rb_lock){+.+.}-{3:3}, at: mlx5_mkey_cache_cleanup+0x77/0x250 [mlx5_ib]
 [ 2785.346273 ]
 [ 2785.346273 ] which lock already depends on the new lock.
 [ 2785.346273 ]
 [ 2785.347720 ]
 [ 2785.347720 ] the existing dependency chain (in reverse order) is:
 [ 2785.349003 ]
 [ 2785.349003 ] -> #1 (&dev->cache.rb_lock){+.+.}-{3:3}:
 [ 2785.350160 ]        __mutex_lock+0x14c/0x15c0
 [ 2785.350962 ]        delayed_cache_work_func+0x2d1/0x610 [mlx5_ib]
 [ 2785.352044 ]        process_one_work+0x7c2/0x1310
 [ 2785.352879 ]        worker_thread+0x59d/0xec0
 [ 2785.353636 ]        kthread+0x28f/0x330
 [ 2785.354370 ]        ret_from_fork+0x1f/0x30
 [ 2785.355135 ]
 [ 2785.355135 ] -> #0 ((work_completion)(&(&ent->dwork)->work)){+.+.}-{0:0}:
 [ 2785.356515 ]        __lock_acquire+0x2d8a/0x5fe0
 [ 2785.357349 ]        lock_acquire+0x1c1/0x540
 [ 2785.358121 ]        __flush_work+0xe8/0x900
 [ 2785.358852 ]        __cancel_work_timer+0x2c7/0x3f0
 [ 2785.359711 ]        mlx5_mkey_cache_cleanup+0xfb/0x250 [mlx5_ib]
 [ 2785.360781 ]        mlx5_ib_stage_pre_ib_reg_umr_cleanup+0x16/0x30 [mlx5_ib]
 [ 2785.361969 ]        __mlx5_ib_remove+0x68/0x120 [mlx5_ib]
 [ 2785.362960 ]        mlx5r_remove+0x63/0x80 [mlx5_ib]
 [ 2785.363870 ]        auxiliary_bus_remove+0x52/0x70
 [ 2785.364715 ]        device_release_driver_internal+0x3c1/0x600
 [ 2785.365695 ]        bus_remove_device+0x2a5/0x560
 [ 2785.366525 ]        device_del+0x492/0xb80
 [ 2785.367276 ]        mlx5_detach_device+0x1a9/0x360 [mlx5_core]
 [ 2785.368615 ]        mlx5_unload_one_devl_locked+0x5a/0x110 [mlx5_core]
 [ 2785.369934 ]        mlx5_devlink_reload_down+0x292/0x580 [mlx5_core]
 [ 2785.371292 ]        devlink_reload+0x439/0x590
 [ 2785.372075 ]        devlink_nl_cmd_reload+0xaef/0xff0
 [ 2785.372973 ]        genl_family_rcv_msg_doit.isra.0+0x1bd/0x290
 [ 2785.374011 ]        genl_rcv_msg+0x3ca/0x6c0
 [ 2785.374798 ]        netlink_rcv_skb+0x12c/0x360
 [ 2785.375612 ]        genl_rcv+0x24/0x40
 [ 2785.376295 ]        netlink_unicast+0x438/0x710
 [ 2785.377121 ]        netlink_sendmsg+0x7a1/0xca0
 [ 2785.377926 ]        sock_sendmsg+0xc5/0x190
 [ 2785.378668 ]        __sys_sendto+0x1bc/0x290
 [ 2785.379440 ]        __x64_sys_sendto+0xdc/0x1b0
 [ 2785.380255 ]        do_syscall_64+0x3d/0x90
 [ 2785.381031 ]        entry_SYSCALL_64_after_hwframe+0x46/0xb0
 [ 2785.381967 ]
 [ 2785.381967 ] other info that might help us debug this:
 [ 2785.381967 ]
 [ 2785.383448 ]  Possible unsafe locking scenario:
 [ 2785.383448 ]
 [ 2785.384544 ]        CPU0                    CPU1
 [ 2785.385383 ]        ----                    ----
 [ 2785.386193 ]   lock(&dev->cache.rb_lock);
 [ 2785.386940 ]				lock((work_completion)(&(&ent->dwork)->work));
 [ 2785.388327 ]				lock(&dev->cache.rb_lock);
 [ 2785.389425 ]   lock((work_completion)(&(&ent->dwork)->work));
 [ 2785.390414 ]
 [ 2785.390414 ]  *** DEADLOCK ***
 [ 2785.390414 ]
 [ 2785.391579 ] 6 locks held by devlink/53872:
 [ 2785.392341 ]  #0: ffffffff84c17a50 (cb_lock){++++}-{3:3}, at: genl_rcv+0x15/0x40
 [ 2785.393630 ]  #1: ffff888142280218 (&devlink->lock_key){+.+.}-{3:3}, at: devlink_get_from_attrs_lock+0x12d/0x2d0
 [ 2785.395324 ]  #2: ffff8881422d3c38 (&dev->lock_key){+.+.}-{3:3}, at: mlx5_unload_one_devl_locked+0x4a/0x110 [mlx5_core]
 [ 2785.397322 ]  #3: ffffffffa0e59068 (mlx5_intf_mutex){+.+.}-{3:3}, at: mlx5_detach_device+0x60/0x360 [mlx5_core]
 [ 2785.399231 ]  #4: ffff88810e3cb0e8 (&dev->mutex){....}-{3:3}, at: device_release_driver_internal+0x8d/0x600
 [ 2785.400864 ]  #5: ffff88817e8f1260 (&dev->cache.rb_lock){+.+.}-{3:3}, at: mlx5_mkey_cache_cleanup+0x77/0x250 [mlx5_ib]

Fixes: b958451783 ("RDMA/mlx5: Change the cache structure to an RB-tree")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2023-09-26 12:33:53 +03:00
..
accel Short summary of fixes pull: 2023-09-08 06:36:36 +10:00
accessibility
acpi More thermal control updates for 6.6-rc1 2023-09-04 15:17:28 -07:00
amba amba: bus: fix refcount leak 2023-08-22 15:50:57 +02:00
android Char/Misc driver changes for 6.6-rc1 2023-09-01 09:53:54 -07:00
ata SCSI misc on 20230909 2023-09-09 12:01:33 -07:00
atm
auxdisplay drm for 6.6-rc1 2023-08-30 13:34:34 -07:00
base Driver core changes for 6.6-rc1 2023-09-01 09:43:18 -07:00
bcma
block block-6.6-2023-09-08 2023-09-08 21:39:54 -07:00
bluetooth TTY/Serial driver changes for 6.6-rc1 2023-09-01 09:38:00 -07:00
bus Char/Misc driver changes for 6.6-rc1 2023-09-01 09:53:54 -07:00
cache cache: Add L2 cache management for Andes AX45MP RISC-V core 2023-09-01 09:08:59 -07:00
cdrom
cdx
char tpm: Enable hwrng only for Pluton on AMD CPUs 2023-09-04 21:57:59 +03:00
clk This pull request is full of clk driver changes. In fact, there aren't any 2023-08-30 19:53:39 -07:00
clocksource Updates for clocksource/clockevent drivers: 2023-09-04 13:15:57 -07:00
comedi
connector
counter - New Drivers 2023-09-04 13:47:59 -07:00
cpufreq cpufreq: Support per-policy performance boost 2023-08-29 20:51:40 +02:00
cpuidle powerpc updates for 6.6 2023-08-31 12:43:10 -07:00
crypto This update includes the following changes: 2023-08-29 11:23:29 -07:00
cxl
dax mm: remove enum page_entry_size 2023-08-24 16:20:30 -07:00
dca
devfreq
dio
dma dmaengine updates for v6.6 2023-09-03 10:49:42 -07:00
dma-buf drm for 6.6-rc1 2023-08-30 13:34:34 -07:00
edac Intel EDAC fixes: 2023-08-30 19:23:00 -07:00
eisa
extcon
firewire
firmware RISC-V Patches for the 6.6 Merge Window, Part 2 (try 2) 2023-09-09 14:25:11 -07:00
fpga
fsi fsi: i2cr: Switch to use struct i2c_driver's .probe() 2023-08-22 15:51:33 +02:00
genpd ARM: SoC drivers for 6.6 2023-08-30 16:42:21 -07:00
gnss
gpio gpio: zynq: restore zynq_gpio_irq_reqres/zynq_gpio_irq_relres callbacks 2023-09-06 17:08:51 +02:00
gpu drm ci for 6.6-rc1 2023-09-10 11:55:26 -07:00
greybus
hid for-linus-2023083101 2023-09-01 12:31:44 -07:00
hsi
hte hte: Explicitly include correct DT includes 2023-08-28 13:31:06 -05:00
hv hyperv-next for v6.6 2023-09-04 11:26:29 -07:00
hwmon Char/Misc driver changes for 6.6-rc1 2023-09-01 09:53:54 -07:00
hwspinlock
hwtracing
i2c I2C has mainly cleanups this time and a few driver improvements. Because 2023-09-04 13:44:11 -07:00
i3c i3c: master: svc: fix probe failure when no i3c device exist 2023-09-06 01:21:47 +02:00
idle Perf events changes for v6.6: 2023-08-28 16:35:01 -07:00
iio Char/Misc driver changes for 6.6-rc1 2023-09-01 09:53:54 -07:00
infiniband RDMA/mlx5: Fix mkey cache possible deadlock on cleanup 2023-09-26 12:33:53 +03:00
input Input updates for 6.6 merge window: 2023-09-06 09:24:25 -07:00
interconnect This pull request is full of clk driver changes. In fact, there aren't any 2023-08-30 19:53:39 -07:00
iommu IOMMU Updates for Linux v6.6 2023-09-01 16:54:25 -07:00
ipack
irqchip Documentation work keeps chugging along; stuff for 6.6 includes: 2023-08-30 20:05:42 -07:00
isdn
leds - Core Frameworks 2023-09-04 13:52:58 -07:00
macintosh powerpc updates for 6.6 2023-08-31 12:43:10 -07:00
mailbox mailbox: qcom-ipcc: fix incorrect num_chans counting 2023-09-05 10:11:01 -05:00
mcb
md for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
media media: dvb: symbol fixup for dvb_attach() 2023-09-09 08:15:11 +01:00
memory
memstick
message
mfd spi: Updates for v6.6 2023-08-29 09:47:33 -07:00
misc Char/Misc driver changes for 6.6-rc1 2023-09-01 09:53:54 -07:00
mmc TTY/Serial driver changes for 6.6-rc1 2023-09-01 09:38:00 -07:00
most
mtd - New Drivers 2023-09-04 13:47:59 -07:00
mux mux: Explicitly include correct DT includes 2023-08-28 13:36:24 -05:00
net Including fixes from netfilter and bpf. 2023-09-07 18:33:07 -07:00
nfc NFC: nxp: add NXP1002 2023-08-30 18:32:24 -07:00
ntb ntb: Check tx descriptors outstanding instead of head/tail for tx queue 2023-08-22 12:38:19 -04:00
nubus
nvdimm nvdimm changes for v6.6 merge window 2023-08-30 20:52:08 -07:00
nvme for-6.6/block-2023-08-28 2023-08-29 20:21:42 -07:00
nvmem nvmem: core: Notify when a new layout is registered 2023-08-23 16:34:02 +02:00
of Devicetree updates for v6.6: 2023-08-30 16:59:03 -07:00
opp
parisc parisc: ccio-dma: Create private runway procfs root entry 2023-08-28 18:00:27 +02:00
parport TTY/Serial driver changes for 6.6-rc1 2023-09-01 09:38:00 -07:00
pci pci-v6.6-fixes-1 2023-09-09 11:35:28 -07:00
pcmcia
peci
perf arm64 fixes for -rc1 2023-09-08 12:48:37 -07:00
phy phy-for-6.6 2023-09-03 10:38:02 -07:00
pinctrl Pin control bulk changes for the v6.6 kernel cycle: 2023-08-30 19:36:19 -07:00
platform USB / Thunderbolt / PHY driver update for 6.6-rc1 2023-09-01 09:23:34 -07:00
pnp
power thermal: Use thermal_tripless_zone_device_register() 2023-09-05 21:42:18 +02:00
powercap powercap: intel_rapl: Fix invalid setting of Power Limit 4 2023-09-06 22:21:22 +02:00
pps
ps3
ptp
pwm pwm: Changes for v6.6-rc1 2023-09-07 18:05:58 -07:00
rapidio
ras
regulator regulator: Fixes for v6.6 2023-09-07 15:51:07 -07:00
remoteproc remoteproc updates for v6.6 2023-09-04 15:12:26 -07:00
reset This pull request is full of clk driver changes. In fact, there aren't any 2023-08-30 19:53:39 -07:00
rpmsg rpmsg updates for v6.6 2023-09-04 15:08:52 -07:00
rtc RTC for 6.6 2023-09-07 16:07:35 -07:00
s390 block-6.6-2023-09-08 2023-09-08 21:39:54 -07:00
sbus sbus: Explicitly include correct DT includes 2023-08-28 13:36:24 -05:00
scsi SCSI misc on 20230909 2023-09-09 12:01:33 -07:00
sh
siox
slimbus
soc soc: renesas: Kconfig: For ARCH_R9A07G043 select the required configs if dependencies are met 2023-09-08 11:25:29 -07:00
soundwire soundwire updates for 6.6 2023-09-03 10:20:57 -07:00
spi spi: Fixes for v6.6 2023-09-07 15:49:20 -07:00
spmi
ssb
staging media: dvb: symbol fixup for dvb_attach() 2023-09-09 08:15:11 +01:00
target SCSI misc on 20230902 2023-09-02 12:02:41 -07:00
tc
tee
thermal thermal: core: Drop thermal_zone_device_register() 2023-09-05 21:42:18 +02:00
thunderbolt
tty TTY/Serial driver changes for 6.6-rc1 2023-09-01 09:38:00 -07:00
ufs Merge branch 'fixes' into misc 2023-09-02 08:25:19 +01:00
uio
usb just cleanups and fixes 2023-09-07 10:35:14 -07:00
vdpa virtio: features 2023-09-04 10:43:44 -07:00
vfio iommufd for 6.6 2023-08-30 20:41:37 -07:00
vhost vdpa: add get_backend_features vdpa operation 2023-09-03 18:10:22 -04:00
video - New Functionality 2023-09-06 09:00:37 -07:00
virt minmax: add in_range() macro 2023-08-24 16:20:18 -07:00
virtio virtio_ring: fix avail_wrap_counter in virtqueue_add_packed 2023-09-03 18:10:24 -04:00
vlynq
w1
watchdog linux-watchdog 6.6-rc1 tag 2023-09-06 09:19:12 -07:00
xen dma-maping updates for Linux 6.6 2023-08-29 20:32:10 -07:00
zorro
Kconfig Merge patch series "Add non-coherent DMA support for AX45MP" 2023-09-08 11:24:34 -07:00
Makefile Merge patch series "Add non-coherent DMA support for AX45MP" 2023-09-08 11:24:34 -07:00