Shay Drory
c8b3f38d2d
net/mlx5: Always stop health timer during driver removal
...
Currently, if teardown_hca fails to execute during driver removal, mlx5
does not stop the health timer. Afterwards, mlx5 continue with driver
teardown. This may lead to a UAF bug, which results in page fault
Oops[1], since the health timer invokes after resources were freed.
Hence, stop the health monitor even if teardown_hca fails.
[1]
mlx5_core 0000:18:00.0: E-Switch: Unload vfs: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
mlx5_core 0000:18:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
mlx5_core 0000:18:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
mlx5_core 0000:18:00.0: E-Switch: cleanup
mlx5_core 0000:18:00.0: wait_func:1155:(pid 1967079): TEARDOWN_HCA(0x103) timeout. Will cause a leak of a command resource
mlx5_core 0000:18:00.0: mlx5_function_close:1288:(pid 1967079): tear_down_hca failed, skip cleanup
BUG: unable to handle page fault for address: ffffa26487064230
PGD 100c00067 P4D 100c00067 PUD 100e5a067 PMD 105ed7067 PTE 0
Oops: 0000 [#1 ] PREEMPT SMP PTI
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G OE ------- --- 6.7.0-68.fc38.x86_64 #1
Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0013.121520200651 12/15/2020
RIP: 0010:ioread32be+0x34/0x60
RSP: 0018:ffffa26480003e58 EFLAGS: 00010292
RAX: ffffa26487064200 RBX: ffff9042d08161a0 RCX: ffff904c108222c0
RDX: 000000010bbf1b80 RSI: ffffffffc055ddb0 RDI: ffffa26487064230
RBP: ffff9042d08161a0 R08: 0000000000000022 R09: ffff904c108222e8
R10: 0000000000000004 R11: 0000000000000441 R12: ffffffffc055ddb0
R13: ffffa26487064200 R14: ffffa26480003f00 R15: ffff904c108222c0
FS: 0000000000000000(0000) GS:ffff904c10800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffa26487064230 CR3: 00000002c4420006 CR4: 00000000007706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<IRQ>
? __die+0x23/0x70
? page_fault_oops+0x171/0x4e0
? exc_page_fault+0x175/0x180
? asm_exc_page_fault+0x26/0x30
? __pfx_poll_health+0x10/0x10 [mlx5_core]
? __pfx_poll_health+0x10/0x10 [mlx5_core]
? ioread32be+0x34/0x60
mlx5_health_check_fatal_sensors+0x20/0x100 [mlx5_core]
? __pfx_poll_health+0x10/0x10 [mlx5_core]
poll_health+0x42/0x230 [mlx5_core]
? __next_timer_interrupt+0xbc/0x110
? __pfx_poll_health+0x10/0x10 [mlx5_core]
call_timer_fn+0x21/0x130
? __pfx_poll_health+0x10/0x10 [mlx5_core]
__run_timers+0x222/0x2c0
run_timer_softirq+0x1d/0x40
__do_softirq+0xc9/0x2c8
__irq_exit_rcu+0xa6/0xc0
sysvec_apic_timer_interrupt+0x72/0x90
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x1a/0x20
RIP: 0010:cpuidle_enter_state+0xcc/0x440
? cpuidle_enter_state+0xbd/0x440
cpuidle_enter+0x2d/0x40
do_idle+0x20d/0x270
cpu_startup_entry+0x2a/0x30
rest_init+0xd0/0xd0
arch_call_rest_init+0xe/0x30
start_kernel+0x709/0xa90
x86_64_start_reservations+0x18/0x30
x86_64_start_kernel+0x96/0xa0
secondary_startup_64_no_verify+0x18f/0x19b
---[ end trace 0000000000000000 ]---
Fixes: 9b98d395b85d ("net/mlx5: Start health poll at earlier stage of driver load")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-06-05 14:07:16 +01:00
..
2024-05-19 09:21:03 -07:00
2024-05-24 08:43:25 -07:00
2024-05-22 12:26:46 -07:00
2024-05-21 10:09:28 -07:00
2024-05-06 18:26:47 -07:00
2024-05-22 10:45:12 -07:00
2024-05-23 13:38:31 -07:00
2024-05-23 13:44:47 -07:00
2024-05-23 12:04:36 -07:00
2024-05-22 20:14:47 -04:00
2024-05-15 12:59:55 -06:00
2024-05-29 09:12:58 -07:00
2024-05-18 12:48:37 -07:00
2024-05-22 09:56:00 -07:00
2024-05-21 11:40:49 -07:00
2024-05-16 08:50:32 -07:00
2024-05-23 12:04:36 -07:00
2024-05-22 20:14:47 -04:00
2024-05-19 09:21:03 -07:00
2024-05-09 00:30:37 +09:00
2024-05-21 11:15:56 -07:00
2024-05-22 20:14:47 -04:00
2024-05-13 16:53:53 -07:00
2024-05-14 08:31:10 -07:00
2024-05-09 01:03:39 +09:00
2024-05-14 18:57:22 -07:00
2024-05-23 12:04:36 -07:00
2024-05-22 12:26:46 -07:00
2024-05-23 12:04:36 -07:00
2024-05-24 17:28:02 -07:00
2024-05-22 10:45:12 -07:00
2024-05-22 12:26:46 -07:00
2024-05-22 12:13:40 -07:00
2024-05-22 12:26:46 -07:00
2024-05-20 08:55:18 -07:00
2024-05-23 00:29:19 +02:00
2024-05-22 12:26:46 -07:00
2024-05-22 20:14:47 -04:00
2024-05-24 09:01:21 -07:00
2024-05-22 20:14:47 -04:00
2024-05-23 12:28:01 -07:00
2024-05-23 04:48:40 -07:00
2024-05-17 09:05:46 -07:00
2024-05-19 22:33:28 -05:00
2024-05-21 11:43:11 -07:00
2024-05-22 20:14:47 -04:00
2024-05-17 08:53:47 -07:00
2024-05-14 18:25:53 -07:00
2024-05-21 10:09:28 -07:00
2024-05-23 12:28:01 -07:00
2024-05-16 08:56:49 -07:00
2024-05-21 09:51:42 -07:00
2024-06-05 14:07:16 +01:00
2024-05-29 13:08:31 +01:00
2024-05-23 12:04:36 -07:00
2024-05-14 09:14:49 -06:00
2024-05-20 08:55:18 -07:00
2024-05-17 13:01:24 +02:00
2024-05-21 10:09:28 -07:00
2024-05-23 12:09:22 -07:00
2024-05-21 11:19:18 -07:00
2024-05-22 10:41:14 -07:00
2024-05-22 12:13:40 -07:00
2024-05-27 08:18:31 -07:00
2024-05-20 08:55:18 -07:00
2024-05-14 19:42:24 -07:00
2024-05-10 07:30:27 +02:00
2024-05-14 23:36:19 +09:00
2024-05-23 13:39:42 -07:00
2024-05-06 13:34:12 -06:00
2024-05-18 12:48:37 -07:00
2024-05-23 12:04:36 -07:00
2024-05-07 23:40:46 +02:00
2024-05-21 12:09:36 -07:00
2024-05-23 12:04:36 -07:00
2024-05-08 19:21:51 +01:00
2024-05-22 20:14:47 -04:00
2024-05-21 11:23:36 -07:00
2024-05-14 14:41:01 -07:00
2024-05-08 19:46:11 +01:00
2024-05-22 12:11:48 -07:00
2024-05-21 13:11:44 -07:00
2024-05-22 20:14:47 -04:00
2024-05-10 10:25:22 +01:00
2024-05-24 08:38:28 -07:00
2024-05-14 18:25:53 -07:00
2024-05-22 20:14:47 -04:00
2024-05-22 08:32:48 -04:00
2024-05-20 14:56:50 -07:00
2024-05-23 12:04:36 -07:00
2024-05-22 10:45:12 -07:00
2024-05-19 09:21:03 -07:00
2024-05-23 12:04:36 -07:00
2024-05-11 11:32:06 +02:00
2024-05-24 10:24:49 -07:00
2024-05-10 04:34:52 +09:00