13333 Commits

Author SHA1 Message Date
Sindhu-Devale
7f51a961f8 RDMA/irdma: Align AE id codes to correct flush code and event
A number of asynchronous event (AE) ids were not aligned to the
correct flush_code and event_type. Fix these up so that the
correct IBV error and event codes are returned to application.

Also, add handling for new AE ids like IRDMA_AE_INVALID_REQUEST to
return the correct WC error code.

Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions")
Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20220907191324.1173-2-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-20 13:19:52 +03:00
Daisuke Matsuda
e866025b3b RDMA/mlx5: Remove duplicate assignment in umr_rereg_pas()
The same value is assigned to 'mr->ibmr.length'. Remove redundant one.

Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Link: https://lore.kernel.org/r/20220908083058.3993700-1-matsuda-daisuke@fujitsu.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-08 11:53:41 +03:00
Li Zhijian
415a04844a RDMA/rxe: convert pr_warn to pr_debug
They could be triggered by user APIs with invalid parameters.

Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Link: https://lore.kernel.org/r/1662518901-2-2-git-send-email-lizhijian@fujitsu.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-08 11:03:15 +03:00
Li Zhijian
e2edba67fc RDMA/rxe: use %u to print u32 variables
struct ib_qp_cap {
        u32     max_send_wr;
        u32     max_recv_wr;
        u32     max_send_sge;
        u32     max_recv_sge;
        u32     max_inline_data;
...

To avoid getting a negative value from dmesg:
[410580.579965] rdma_rxe: invalid send sge = 65535 > 32
[410580.583818] rdma_rxe: invalid send wr = -1 > 1048576
[410582.771323] rdma_rxe: invalid recv sge = 65535 > 32
[410582.775310] rdma_rxe: invalid recv wr = -1 > 1048576

Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Link: https://lore.kernel.org/r/1662518901-2-1-git-send-email-lizhijian@fujitsu.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-08 11:03:15 +03:00
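As an illustration of the `%u` change in e2edba67fc above, here is a minimal userspace sketch of why a u32 printed with a signed conversion shows up as a negative number. The helper names `format_unsigned()` and `format_signed()` are hypothetical and not part of rxe; the explicit cast to `int32_t` stands in for what the old `%d` format effectively did on common two's-complement targets.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a 32-bit unsigned value with an unsigned conversion (the fixed form). */
static int format_unsigned(char *buf, size_t len, uint32_t v)
{
	return snprintf(buf, len, "%" PRIu32, v);
}

/*
 * Format the same bits as a signed value, mimicking the old %d output.
 * The cast makes the reinterpretation explicit; on the usual
 * two's-complement targets 0xffffffff becomes -1.
 */
static int format_signed(char *buf, size_t len, uint32_t v)
{
	return snprintf(buf, len, "%" PRId32, (int32_t)v);
}
```

With `UINT32_MAX` as input, the unsigned form yields the full 32-bit value, while the signed form yields the kind of "-1" seen in the dmesg lines quoted in the commit.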
Sindhu-Devale
a261786fdc RDMA/irdma: Report RNR NAK generation in device caps
Report RNR NAK generation when device capabilities are queried

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20220906223244.1119-6-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07 11:22:18 +03:00
Sindhu-Devale
2c8844431d RDMA/irdma: Use s/g array in post send only when its valid
Send with invalidate verb call can pass in an
uninitialized s/g array with 0 sge's which is
filled into irdma WQE and causes a HW asynchronous
event.

Fix this by using the s/g array in irdma post send
only when it is valid.

Fixes: 551c46e ("RDMA/irdma: Add user/kernel shared libraries")
Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20220906223244.1119-5-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07 11:22:18 +03:00
Sindhu-Devale
dcb23bbb1d RDMA/irdma: Return correct WC error for bind operation failure
When a QP and a MR on a local host are in different PDs, the HW generates
an asynchronous event (AE). The same AE is generated when a QP and a MW
are in different PDs during a bind operation. Return the more appropriate
IBV_WC_MW_BIND_ERR for the latter case by checking the OP type from the
CQE in error.

Fixes: 551c46edc769 ("RDMA/irdma: Add user/kernel shared libraries")
Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20220906223244.1119-4-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07 11:22:18 +03:00
Sindhu-Devale
6b227bd32d RDMA/irdma: Return error on MR deregister CQP failure
The MR deregister CQP can fail if an MW is bound to it.
Return an appropriate error for this case.

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20220906223244.1119-3-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07 11:22:17 +03:00
Sindhu-Devale
12faad5e5c RDMA/irdma: Report the correct max cqes from query device
Report the correct max cqes available to an application taking
into account a reserved entry to detect overflow.

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20220906223244.1119-2-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07 11:22:17 +03:00
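The idea of a reserved entry mentioned in 12faad5e5c above can be sketched with a generic ring buffer. This is an assumption-laden illustration, not irdma's actual CQ layout: `struct ring` and its helpers are hypothetical. One slot is kept unused so that "full" and "empty" can be told apart, which means the number of entries an application can use is one less than the number allocated.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ring, not an irdma structure. */
struct ring {
	uint32_t size;	/* number of allocated slots */
	uint32_t head;	/* next slot to write */
	uint32_t tail;	/* next slot to read */
};

/* Empty when producer and consumer indices coincide. */
static bool ring_empty(const struct ring *r)
{
	return r->head == r->tail;
}

/* Full when advancing head would collide with tail (one slot stays unused). */
static bool ring_full(const struct ring *r)
{
	return (r->head + 1) % r->size == r->tail;
}

/* Entries that can actually be used: the reserved slot is not counted. */
static uint32_t ring_usable_entries(const struct ring *r)
{
	return r->size - 1;
}

/* Claim one slot; returns false instead of overflowing. */
static bool ring_push(struct ring *r)
{
	if (ring_full(r))
		return false;
	r->head = (r->head + 1) % r->size;
	return true;
}
```

Under this scheme a ring allocated with 4 slots accepts 3 pushes before reporting full, so 3 is the value worth reporting to a caller asking for the maximum.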
Guoqing Jiang
db77d84cfe RDMA/rtrs-clt: Kill xchg_paths
Let's call try_cmpxchg directly for the same purpose.

Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Link: https://lore.kernel.org/r/20220903040252.29397-1-guoqing.jiang@linux.dev
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-06 14:12:03 +03:00
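For readers unfamiliar with the compare-and-exchange pattern that db77d84cfe above switches to, the following userspace sketch uses C11 `atomic_compare_exchange_strong()` as an analogue of the kernel's `try_cmpxchg()`. The names `struct path` and `replace_if_current()` are hypothetical and do not come from rtrs-clt.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a path object. */
struct path {
	int id;
};

/*
 * Replace *slot with 'next' only if it still holds 'expected'.
 * Returns true on success. On failure the local copy of 'expected'
 * is overwritten with the value actually found, as with try_cmpxchg().
 */
static bool replace_if_current(struct path *_Atomic *slot,
			       struct path *expected, struct path *next)
{
	return atomic_compare_exchange_strong(slot, &expected, next);
}
```

The boolean return value is what makes a separate wrapper unnecessary: the caller learns directly whether the swap happened.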
Guoqing Jiang
57eb938237 RDMA/rtrs-clt: Break the loop once one path is connected
No need to iterate over all paths after finding one connected path.

Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Link: https://lore.kernel.org/r/20220902101922.26273-3-guoqing.jiang@linux.dev
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-06 14:05:22 +03:00
Guoqing Jiang
2aa9e4a2c3 RDMA/rtrs: Update comments for MAX_SESS_QUEUE_DEPTH
The maximum queue_depth should be 65535 per check_module_params,
also update other relevant comments.

Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Link: https://lore.kernel.org/r/20220902101922.26273-2-guoqing.jiang@linux.dev
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-06 14:05:22 +03:00
ye xingchen
e58f889e29 RDMA/hfi1: Remove the unneeded result variable
Return the value of set_link_state() directly instead of storing it in
another redundant variable.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Link: https://lore.kernel.org/r/20220901074209.313004-1-ye.xingchen@zte.com.cn
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-05 15:14:31 +03:00
Leon Romanovsky
d324a46be3 Merge branch 'mlx5-next' into wip/leon-for-next
Perform merge of Mellanox shared branch.

* mlx5-next:
  RDMA/mlx5: Move function mlx5_core_query_ib_ppcnt() to mlx5_ib
2022-09-05 15:09:55 +03:00
Chris Mi
8a2dd123f1 RDMA/mlx5: Move function mlx5_core_query_ib_ppcnt() to mlx5_ib
This patch doesn't change any functionality, but moves one function
to mlx5_ib because it is not used by mlx5_core.

The actual fix is in the next patch.

Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Chris Mi <cmi@nvidia.com>
Link: https://lore.kernel.org/r/fd47b9138412bd94ed30f838026cbb4cf3878150.1661763871.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-05 15:09:03 +03:00
Bodong Wang
b021d82e25 IB/mlx5: Support querying eswitch functions from DEVX
Query eswitch functions returns information about the external host
PF (if it exists). It can be used to check whether DEVX is running on an ECPF.

Reviewed-by: Erez Shitrit <erezsh@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Bodong Wang <bodong@mellanox.com>
Link: https://lore.kernel.org/r/4265925178ab3224dc1d3e3784bb312d808edca5.1661763785.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-05 14:54:39 +03:00
Maor Gottlieb
9b7d4be967 RDMA/mlx5: Fix UMR cleanup on error flow of driver init
The cited commit removed the checks in the umr cleanup flow that
verified whether the resources had been created. This could lead to a
null-ptr-deref if the mlx5_ib_stage_ib_reg_init stage failed.

Fix it by adding a new state to the umr that records whether the
resources were created, and check it in the umr cleanup flow before
destroying the resources.

Fixes: 04876c12c19e ("RDMA/mlx5: Move init and cleanup of UMR to umr.c")
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Link: https://lore.kernel.org/r/4cfa61386cf202e9ce330e8d228ce3b25a36326e.1661763459.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-05 14:49:57 +03:00
Chris Mi
74b30b3ad5 RDMA/mlx5: Set local port to one when accessing counters
When accessing the Ports Performance Counters Register (PPCNT), the
local port must be one for a Function-Per-Port HCA whose
HCA_CAP.num_ports is 1.

The offending patch can change the local port to other values
when accessing PPCNT after enabling switchdev mode. The following
syndrome will be printed:

 # cat /sys/class/infiniband/rdmap4s0f0/ports/2/counters/*
 # dmesg
 mlx5_core 0000:04:00.0: mlx5_cmd_check:756:(pid 12450): ACCESS_REG(0x805) op_mod(0x1) failed, status bad parameter(0x3), syndrome (0x1e5585)

Fix it by setting local port to one for Function-Per-Port HCA.

Fixes: 210b1f78076f ("IB/mlx5: When not in dual port RoCE mode, use provided port as native")
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Chris Mi <cmi@nvidia.com>
Link: https://lore.kernel.org/r/6c5086c295c76211169e58dbd610fb0402360bab.1661763459.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-05 14:49:53 +03:00
Maher Sanalla
9ca05b0f27 RDMA/mlx5: Rely on RoCE fw cap instead of devlink when setting profile
When the RDMA auxiliary driver probes, it sets its profile based on
devlink driverinit value. The latter might not be in sync with FW yet
(in case devlink reload is not performed), thus causing a mismatch
between RDMA driver and FW. This results in the following FW syndrome
when the RDMA driver tries to adjust RoCE state, which fails the probe:

"0xC1F678 | modify_nic_vport_context: roce_en set on a vport that
doesn't support roce"

To prevent this, select the PF profile based on FW RoCE capability
instead of relying on devlink driverinit value.
To provide backward compatibility of the RoCE disable feature, on older
FW versions where roce_rw is not set (FW RoCE capability is read-only),
keep the current behavior, i.e., rely on the devlink driverinit value.

Fixes: fbfa97b4d79f ("net/mlx5: Disable roce at HCA level")
Reviewed-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Link: https://lore.kernel.org/r/cb34ce9a1df4a24c135cb804db87f7d2418bd6cc.1661763459.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-05 14:49:46 +03:00
Yishai Hadas
85eaeb5058 IB/core: Fix a nested dead lock as part of ODP flow
Fix a nested deadlock as part of the ODP flow by using mmput_async().

The call trace below [1] shows that calling mmput() while holding
umem_odp->umem_mutex, as required by ib_umem_odp_map_dma_and_lock(),
might trigger exit_mmap()->__mmu_notifier_release()->
mlx5_ib_invalidate_range() in the same task, which may deadlock when
trying to lock the same mutex.

Switching to mmput_async() solves the problem, as the above exit_mmap()
flow is then called from another task and executes once the lock
becomes available.

[1]
[64843.077665] task:kworker/u133:2  state:D stack:    0 pid:80906 ppid: 2 flags:0x00004000
[64843.077672] Workqueue: mlx5_ib_page_fault mlx5_ib_eqe_pf_action [mlx5_ib]
[64843.077719] Call Trace:
[64843.077722]  <TASK>
[64843.077724]  __schedule+0x23d/0x590
[64843.077729]  schedule+0x4e/0xb0
[64843.077735]  schedule_preempt_disabled+0xe/0x10
[64843.077740]  __mutex_lock.constprop.0+0x263/0x490
[64843.077747]  __mutex_lock_slowpath+0x13/0x20
[64843.077752]  mutex_lock+0x34/0x40
[64843.077758]  mlx5_ib_invalidate_range+0x48/0x270 [mlx5_ib]
[64843.077808]  __mmu_notifier_release+0x1a4/0x200
[64843.077816]  exit_mmap+0x1bc/0x200
[64843.077822]  ? walk_page_range+0x9c/0x120
[64843.077828]  ? __cond_resched+0x1a/0x50
[64843.077833]  ? mutex_lock+0x13/0x40
[64843.077839]  ? uprobe_clear_state+0xac/0x120
[64843.077860]  mmput+0x5f/0x140
[64843.077867]  ib_umem_odp_map_dma_and_lock+0x21b/0x580 [ib_core]
[64843.077931]  pagefault_real_mr+0x9a/0x140 [mlx5_ib]
[64843.077962]  pagefault_mr+0xb4/0x550 [mlx5_ib]
[64843.078022]  pagefault_single_data_segment.constprop.0+0x2ac/0x560 [mlx5_ib]
[64843.078022]  mlx5_ib_eqe_pf_action+0x528/0x780 [mlx5_ib]
[64843.078051]  process_one_work+0x22b/0x3d0
[64843.078059]  worker_thread+0x53/0x410
[64843.078065]  ? process_one_work+0x3d0/0x3d0
[64843.078073]  kthread+0x12a/0x150
[64843.078079]  ? set_kthread_struct+0x50/0x50
[64843.078085]  ret_from_fork+0x22/0x30
[64843.078093]  </TASK>

Fixes: 36f30e486dce ("IB/core: Improve ODP to use hmm_range_fault()")
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Link: https://lore.kernel.org/r/74d93541ea533ef7daec6f126deb1072500aeb16.1661251841.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-05 14:47:40 +03:00
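The deadlock described in 85eaeb5058 above can be modelled with a deliberately simplified, single-threaded toy. This is only a sketch under stated assumptions: `struct lock`, `release_cb()`, `put_sync()`, `put_async()` and `run_deferred()` are hypothetical names, the lock is a plain flag rather than a mutex, and "deferred" here means "run later by the caller", whereas the real mmput_async() hands the final release to another task.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy non-recursive lock; a real mutex would block instead of failing. */
struct lock {
	bool held;
};

static bool lock_try(struct lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void lock_release(struct lock *l)
{
	l->held = false;
}

typedef bool (*release_fn)(struct lock *);

/*
 * Release callback that needs the lock, like an invalidate handler.
 * Returns false where a real mutex_lock() would wait forever.
 */
static bool release_cb(struct lock *l)
{
	if (!lock_try(l))
		return false;
	lock_release(l);
	return true;
}

/* Synchronous put: the callback runs immediately in the caller's context. */
static bool put_sync(struct lock *l)
{
	return release_cb(l);
}

/* One-slot deferred work item. */
struct deferred {
	release_fn fn;
	struct lock *arg;
};

/* Asynchronous put: only record the callback; nothing runs yet. */
static void put_async(struct deferred *d, struct lock *l)
{
	d->fn = release_cb;
	d->arg = l;
}

/* Run the recorded callback, if any; meant to be called after unlocking. */
static bool run_deferred(struct deferred *d)
{
	release_fn fn = d->fn;

	if (!fn)
		return false;
	d->fn = NULL;
	return fn(d->arg);
}
```

Calling `put_sync()` while the lock is held fails in this model, which corresponds to the hang in the trace; calling `put_async()` under the lock and `run_deferred()` after `lock_release()` succeeds.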
Linus Walleij
0d1b756acf RDMA/siw: Pass a pointer to virt_to_page()
Functions that work on a pointer to virtual memory, such as
virt_to_pfn(), and users of that function, such as
virt_to_page(), are supposed to be passed a pointer to virtual
memory, ideally a (void *) or other pointer. However, since
many architectures implement virt_to_pfn() as a macro,
this function becomes polymorphic and accepts both an
(unsigned long) and a (void *).

If we instead implement a proper virt_to_pfn(void *addr)
function the following happens (occurred on arch/arm):

drivers/infiniband/sw/siw/siw_qp_tx.c:32:23: warning: incompatible
  integer to pointer conversion passing 'dma_addr_t' (aka 'unsigned int')
  to parameter of type 'const void *' [-Wint-conversion]
drivers/infiniband/sw/siw/siw_qp_tx.c:32:37: warning: passing argument
  1 of 'virt_to_pfn' makes pointer from integer without a cast
  [-Wint-conversion]
drivers/infiniband/sw/siw/siw_qp_tx.c:538:36: warning: incompatible
  integer to pointer conversion passing 'unsigned long long'
  to parameter of type 'const void *' [-Wint-conversion]

Fix this with an explicit cast. In one case where the SIW
SGE uses an unaligned u64 we need a double cast modifying the
virtual address (va) to a platform-specific uintptr_t before
casting to a (void *).

Fixes: b9be6f18cf9e ("rdma/siw: transmit path")
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://lore.kernel.org/r/20220902215918.603761-1-linus.walleij@linaro.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-04 10:21:59 +03:00
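The double cast mentioned in 0d1b756acf above can be shown in isolation. This is a minimal sketch; the helper name `u64_to_ptr()` is hypothetical and not taken from the siw driver. Going through `uintptr_t` first converts the 64-bit value to the platform's pointer-sized integer, so the final cast to `(void *)` is between types of the same width even on 32-bit targets.

```c
#include <stdint.h>

/*
 * Convert a 64-bit virtual address value to a pointer.
 * The intermediate uintptr_t cast narrows (on 32-bit) or keeps (on 64-bit)
 * the integer width before it is turned into a pointer.
 */
static inline void *u64_to_ptr(uint64_t va)
{
	return (void *)(uintptr_t)va;
}
```

A round trip such as `u64_to_ptr((uint64_t)(uintptr_t)&x)` gives back `&x`.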
Tom Talpey
fc5e1acf6a RDMA/siw: Add missing Kconfig selections
The SoftiWARP Kconfig is missing "select" for CRYPTO and CRYPTO_CRC32C.

In addition, it improperly "depends on" LIBCRC32C; this should be a
"select", similar to net/sctp and others. As a dependency, SIW fails
to appear in generic configurations.

Link: https://lore.kernel.org/r/d366bf02-3271-754f-fc68-1a84016d0e19@talpey.com
Signed-off-by: Tom Talpey <tom@talpey.com>
Acked-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-01 10:12:01 +03:00
yangx.jy@fujitsu.com
12f35199a2 RDMA/srp: Set scmnd->result only when scmnd is not NULL
This change fixes the following kernel NULL pointer dereference
which is reproduced by blktests srp/007 occasionally.

BUG: kernel NULL pointer dereference, address: 0000000000000170
PGD 0 P4D 0
Oops: 0002 [#1] PREEMPT SMP NOPTI
CPU: 0 PID: 9 Comm: kworker/0:1H Kdump: loaded Not tainted 6.0.0-rc1+ #37
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.15.0-29-g6a62e0cb0dfe-prebuilt.qemu.org 04/01/2014
Workqueue:  0x0 (kblockd)
RIP: 0010:srp_recv_done+0x176/0x500 [ib_srp]
Code: 00 4d 85 ff 0f 84 52 02 00 00 48 c7 82 80 02 00 00 00 00 00 00 4c 89 df 4c 89 14 24 e8 53 d3 4a f6 4c 8b 14 24 41 0f b6 42 13 <41> 89 87 70 01 00 00 41 0f b6 52 12 f6 c2 02 74 44 41 8b 42 1c b9
RSP: 0018:ffffaef7c0003e28 EFLAGS: 00000282
RAX: 0000000000000000 RBX: ffff9bc9486dea60 RCX: 0000000000000000
RDX: 0000000000000102 RSI: ffffffffb76bbd0e RDI: 00000000ffffffff
RBP: ffff9bc980099a00 R08: 0000000000000001 R09: 0000000000000001
R10: ffff9bca53ef0000 R11: ffff9bc980099a10 R12: ffff9bc956e14000
R13: ffff9bc9836b9cb0 R14: ffff9bc9557b4480 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff9bc97ec00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000170 CR3: 0000000007e04000 CR4: 00000000000006f0
Call Trace:
 <IRQ>
 __ib_process_cq+0xb7/0x280 [ib_core]
 ib_poll_handler+0x2b/0x130 [ib_core]
 irq_poll_softirq+0x93/0x150
 __do_softirq+0xee/0x4b8
 irq_exit_rcu+0xf7/0x130
 sysvec_apic_timer_interrupt+0x8e/0xc0
 </IRQ>

Fixes: ad215aaea4f9 ("RDMA/srp: Make struct scsi_cmnd and struct srp_request adjacent")
Link: https://lore.kernel.org/r/20220831081626.18712-1-yangx.jy@fujitsu.com
Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
Acked-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-01 09:51:18 +03:00
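The shape of the guard described in 12f35199a2 above is simple enough to sketch outside the kernel. The names below are hypothetical: `struct cmd` merely stands in for a command object with a `result` field, and `set_result_if_present()` is not a function from ib_srp.

```c
#include <stddef.h>

/* Hypothetical stand-in for a command with a result field. */
struct cmd {
	int result;
};

/*
 * Store the status only when a command is actually attached.
 * Returns 0 if the result was written, -1 if there was no command.
 */
static int set_result_if_present(struct cmd *c, int status)
{
	if (!c)
		return -1;
	c->result = status;
	return 0;
}
```

The point is only that the write happens after the NULL check, so a response that arrives without a matching command no longer dereferences a NULL pointer.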
Daisuke Matsuda
2c02249fcb RDMA/rxe: Delete error messages triggered by incoming Read requests
An incoming Read request causes multiple Read responses. If a user MR to
copy data from is unavailable or responder cannot send a reply, then the
error messages can be printed for each response attempt, resulting in
message overflow.

Link: https://lore.kernel.org/r/20220829071218.1639065-1-matsuda-daisuke@fujitsu.com
Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-31 09:57:09 +03:00
Zhu Yanjun
f07853582d RDMA/rxe: Remove the unused variable obj
The member variable obj in struct rxe_task is not needed.
So remove it to save memory.

Link: https://lore.kernel.org/r/20220822011615.805603-4-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-31 09:53:13 +03:00
Zhu Yanjun
548ce2e667 RDMA/rxe: Fix the error caused by qp->sk
When sock_create_kern in the function rxe_qp_init_req fails,
qp->sk is set to NULL.

Then the function rxe_create_qp will call rxe_qp_do_cleanup
to handle allocated resource.

Before handling qp->sk, this variable should be checked.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Link: https://lore.kernel.org/r/20220822011615.805603-3-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-31 09:53:13 +03:00
Zhu Yanjun
a625ca30ef RDMA/rxe: Fix "kernel NULL pointer dereference" error
When rxe_queue_init in the function rxe_qp_init_req fails,
both qp->req.task.func and qp->req.task.arg are not initialized.

Because creation of the qp fails, the function rxe_create_qp will
call rxe_qp_do_cleanup to handle the allocated resources.

Before calling __rxe_do_task, both qp->req.task.func and
qp->req.task.arg should be checked.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Link: https://lore.kernel.org/r/20220822011615.805603-2-yanjun.zhu@linux.dev
Reported-by: syzbot+ab99dc4c6e961eed8b8e@syzkaller.appspotmail.com
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-31 09:53:12 +03:00
Wenpeng Liang
bfb3bde954 RDMA/hns: Remove redundant member doorbell_qpn of struct hns_roce_qp
The value of doorbell_qpn is always equal to qpn on current hardware
versions. So remove it.

Link: https://lore.kernel.org/r/20220829105021.1427804-5-liangwenpeng@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-30 12:14:23 +03:00
Mark Zhang
637ff8ea00 IB/cm: Refactor cm_insert_listen() and cm_find_listen()
Move the device and service_id match code at the top of
cm_insert_listen() and cm_find_listen() into the final else branch.

Link: https://lore.kernel.org/r/20220819090859.957943-4-markzhang@nvidia.com
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-30 12:14:23 +03:00
Mark Zhang
a461b746c5 IB/cm: remove cm_id_priv->id.service_mask and service_mask parameter of cm_init_listen()
The service_mask is always ~cpu_to_be64(0), so the result is always
a NOP when it is &'d with a service_id. Remove it for simplicity.

Link: https://lore.kernel.org/r/20220819090859.957943-3-markzhang@nvidia.com
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-30 12:14:23 +03:00
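The reasoning in a461b746c5 above, that AND-ing with an all-ones mask changes nothing, can be checked with a small sketch. The helper `apply_service_mask()` is hypothetical and only mirrors the masking step; it is not the cm code. An all-ones 64-bit value reads the same in either byte order, which is why the byte-order conversion in `~cpu_to_be64(0)` does not affect the argument.

```c
#include <stdint.h>

/* Apply a mask to a service ID, as the removed code did. */
static uint64_t apply_service_mask(uint64_t service_id, uint64_t service_mask)
{
	return service_id & service_mask;
}
```

For any `service_id`, `apply_service_mask(service_id, ~UINT64_C(0))` returns `service_id` unchanged, so the parameter carries no information.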
Mark Zhang
91a3f14ec9 IB/cm: Remove the service_mask parameter from ib_cm_listen()
Remove the service_mask parameter of ib_cm_listen(), as all callers
use 0.

Link: https://lore.kernel.org/r/20220819090859.957943-2-markzhang@nvidia.com
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-30 12:14:23 +03:00
Guoqing Jiang
6edd86a2d2 RDMA/rtrs: Remove 'dir' argument from rnbd_srv_rdma_ev
Since process_{read,write} already prints direction info if ctx->ops.rdma_ev
fails, no need to pass 'dir'.

Link: https://lore.kernel.org/r/20220826081117.21687-1-guoqing.jiang@linux.dev
Signed-off-by: Guoqing Jiang <guoqing.jiang@linux.dev>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-30 12:13:57 +03:00
Yixing Liu
45baad7dd9 RDMA/hns: Remove the num_qpc_timer variable
The bt number of qpc_timer of HIP09 increases compared with that of HIP08.
Therefore, qpc_timer_bt_num and num_qpc_timer do not match. As a result,
the driver may fail to allocate qpc_timer. So the driver needs to use
only qpc_timer_bt_num to represent the bt number of qpc_timer.

Fixes: 0e40dc2f70cd ("RDMA/hns: Add timer allocation support for hip08")
Link: https://lore.kernel.org/r/20220829105021.1427804-4-liangwenpeng@huawei.com
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-30 10:22:43 +03:00
Wenpeng Liang
0c8b5d6268 RDMA/hns: Fix wrong fixed value of qp->rq.wqe_shift
The value of qp->rq.wqe_shift of HIP08 is always determined by the number
of sge. So delete the wrong branch.

Fixes: cfc85f3e4b7f ("RDMA/hns: Add profile support for hip08 driver")
Fixes: 926a01dc000d ("RDMA/hns: Add QP operations support for hip08 SoC")
Link: https://lore.kernel.org/r/20220829105021.1427804-3-liangwenpeng@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-30 10:22:35 +03:00
Chengchang Tang
55af9d4985 RDMA/hns: Fix supported page size
The supported page size for hns is (4K, 128M), not (4K, 2G).

Fixes: cfc85f3e4b7f ("RDMA/hns: Add profile support for hip08 driver")
Link: https://lore.kernel.org/r/20220829105021.1427804-2-liangwenpeng@huawei.com
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-30 10:22:26 +03:00
Michael Guralnik
27cfde795a RDMA/cma: Fix arguments order in net device validation
Fix the order of source and destination addresses when resolving the
route between server and client to validate use of correct net device.

The reverse order we had so far didn't actually validate the net device
as the server would try to resolve the route to itself, thus always
getting the server's net device.

The issue was discovered when running cm applications on a single host
between 2 interfaces with same subnet and source based routing rules.
When resolving the reverse route the source based route rules were
ignored.

Fixes: f887f2ac87c2 ("IB/cma: Validate routing of incoming requests")
Link: https://lore.kernel.org/r/1c1ec2277a131d277ebcceec987fd338d35b775f.1661251872.git.leonro@nvidia.com
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-29 11:16:35 +03:00
Daisuke Matsuda
d4ecb56e86 RDMA/rxe: Remove an unused member from struct rxe_mr
Commit 1e75550648da ("Revert "RDMA/rxe: Create duplicate mapping tables for
FMRs"") brought back the member 'va' to struct rxe_mr. However, it is
actually used by nobody and thus can be removed.

Fixes: 1e75550648da ("Revert "RDMA/rxe: Create duplicate mapping tables for FMRs"")
Link: https://lore.kernel.org/r/20220829012335.1212697-1-matsuda-daisuke@fujitsu.com
Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-29 09:44:07 +03:00
Yunsheng Lin
05195dcb43 RDMA/core: Remove 'device' argument from rdma_build_skb()
The 'device' argument has never been used since rdma_build_skb()
was introduced, so remove it.

Link: https://lore.kernel.org/r/20220826143215.18111-1-linyunsheng@huawei.com
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-28 14:29:07 +03:00
Bart Van Assche
b8a9c18c2f RDMA/srp: Use the attribute group mechanism for sysfs attributes
Simplify the SRP driver by using the attribute group mechanism instead
of calling device_create_file() explicitly.

Link: https://lore.kernel.org/r/20220825213900.864587-5-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-28 13:02:28 +03:00
Bart Van Assche
351e458f72 RDMA/srp: Handle dev_set_name() failure
Instead of ignoring dev_set_name() failure, handle dev_set_name()
failure. Convert a device_register() call into device_initialize() and
device_add() calls.

Link: https://lore.kernel.org/r/20220825213900.864587-4-bvanassche@acm.org
Reported-by: Bo Liu <liubo03@inspur.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-28 13:02:20 +03:00
Bart Van Assche
0766fcaa1e RDMA/srp: Remove the srp_host.released completion
Move the kfree(host) calls into srp_release_dev(). Convert a
device_unregister() call into a device_del() and a device_put() call.
Remove the host->released completion object. This patch prepares for
handling dev_set_name() failure in srp_add_port().

Link: https://lore.kernel.org/r/20220825213900.864587-3-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-28 13:02:12 +03:00
Bart Van Assche
c8e4c23976 RDMA/srp: Rework the srp_add_port() error path
device_register() always calls device_initialize() so calling device_del()
is safe even if device_register() fails. Implement the following advice
from the comment block above device_register(): "NOTE: _Never_ directly free
@dev after calling this function, even if it returned an error! Always use
put_device() to give up the reference initialized in this function instead."
Keep the kfree() call in the error path since srp_release_dev() does not
free the host.

Link: https://lore.kernel.org/r/20220825213900.864587-2-bvanassche@acm.org
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-28 13:02:04 +03:00
Shiraz Saleem
ead54ced63 RDMA/irdma: Fix drain SQ hang with no completion
SW generated completions for outstanding WRs posted on SQ
after QP is in error target the wrong CQ. This causes the
ib_drain_sq to hang with no completion.

Fix this to generate completions on the right CQ.

[  863.969340] INFO: task kworker/u52:2:671 blocked for more than 122 seconds.
[  863.979224]       Not tainted 5.14.0-130.el9.x86_64 #1
[  863.986588] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  863.996997] task:kworker/u52:2   state:D stack:    0 pid:  671 ppid:     2 flags:0x00004000
[  864.007272] Workqueue: xprtiod xprt_autoclose [sunrpc]
[  864.014056] Call Trace:
[  864.017575]  __schedule+0x206/0x580
[  864.022296]  schedule+0x43/0xa0
[  864.026736]  schedule_timeout+0x115/0x150
[  864.032185]  __wait_for_common+0x93/0x1d0
[  864.037717]  ? usleep_range_state+0x90/0x90
[  864.043368]  __ib_drain_sq+0xf6/0x170 [ib_core]
[  864.049371]  ? __rdma_block_iter_next+0x80/0x80 [ib_core]
[  864.056240]  ib_drain_sq+0x66/0x70 [ib_core]
[  864.062003]  rpcrdma_xprt_disconnect+0x82/0x3b0 [rpcrdma]
[  864.069365]  ? xprt_prepare_transmit+0x5d/0xc0 [sunrpc]
[  864.076386]  xprt_rdma_close+0xe/0x30 [rpcrdma]
[  864.082593]  xprt_autoclose+0x52/0x100 [sunrpc]
[  864.088718]  process_one_work+0x1e8/0x3c0
[  864.094170]  worker_thread+0x50/0x3b0
[  864.099109]  ? rescuer_thread+0x370/0x370
[  864.104473]  kthread+0x149/0x170
[  864.109022]  ? set_kthread_struct+0x40/0x40
[  864.114713]  ret_from_fork+0x22/0x30

Fixes: 81091d7696ae ("RDMA/irdma: Add SW mechanism to generate completions on error")
Link: https://lore.kernel.org/r/20220824154358.117-1-shiraz.saleem@intel.com
Reported-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-28 12:43:37 +03:00
Wenpeng Liang
3d67e7e236 RDMA/hns: Support MR's restrack raw ops for hns driver
The MR raw restrack attributes come from the queue context maintained by
the ROCEE.

For example:

$ rdma res show mr dev hns_0 mrn 6 -dd -jp -r
[ {
        "ifindex": 4,
        "ifname": "hns_0",
        "data": [ 1,0,0,0,2,0,0,0,0,3,0,0,0,0,2,0,0,0,0,0,32,0,0,0,2,0,0,0,
		  2,0,0,0,0,0,0,0 ]
    } ]

Link: https://lore.kernel.org/r/20220822104455.2311053-8-liangwenpeng@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-24 08:47:18 +03:00
Wenpeng Liang
dc9981ef17 RDMA/hns: Support MR's restrack ops for hns driver
The MR restrack attributes come from the queue information maintained by
the driver.

For example:

$ rdma res show mr dev hns_0 mrn 6 -dd -jp
[ {
        "ifindex": 4,
        "ifname": "hns_0",
        "mrn": 6,
        "rkey": "300",
        "lkey": "300",
        "mrlen": 131072,
        "pdn": 8,
        "pid": 1524,
        "comm": "ib_send_bw"
    },
    "drv_pbl_hop_num": 2,
    "drv_ba_pg_shift": 14,
    "drv_buf_pg_shift": 12
}

Link: https://lore.kernel.org/r/20220822104455.2311053-7-liangwenpeng@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-23 11:35:22 +03:00
Wenpeng Liang
3e89d78b21 RDMA/hns: Support QP's restrack raw ops for hns driver
The QP raw restrack attributes come from the queue context maintained by
the ROCEE.

For example:

$ rdma res show qp link hns_0 -jp -dd -r
[ {
        "ifindex": 4,
        "ifname": "hns_0",
        "data": [ 2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,0,0,0,
		  5,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,255,156,0,0,63,156,0,0,
		  7,0,0,0,1,0,0,0,9,0,0,0,0,0,0,0,2,0,0,0,2,0,0,0,0,0,0,
		  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
		  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,63,156,0,
		  0,0,0,0,0 ]
    } ]

Link: https://lore.kernel.org/r/20220822104455.2311053-6-liangwenpeng@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-23 11:35:13 +03:00
Wenpeng Liang
e198d65d76 RDMA/hns: Support QP's restrack ops for hns driver
The QP restrack attributes come from the queue information maintained by
the driver.

For example:

$ rdma res show qp link hns_0 lqpn 41 -jp -dd
[ {
        "ifindex": 4,
        "ifname": "hns_0",
        "port": 1,
        "lqpn": 41,
        "rqpn": 40,
        "type": "RC",
        "state": "RTR",
        "rq-psn": 12474738,
        "sq-psn": 0,
        "path-mig-state": "ARMED",
        "pdn": 9,
        "pid": 1523,
        "comm": "ib_send_bw"
    },
    "drv_sq_wqe_cnt": 128,
    "drv_sq_max_gs": 1,
    "drv_rq_wqe_cnt": 512,
    "drv_rq_max_gs": 2,
    "drv_ext_sge_sge_cnt": 0
}

Link: https://lore.kernel.org/r/20220822104455.2311053-5-liangwenpeng@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-23 11:35:05 +03:00
Wenpeng Liang
f2b070f36d RDMA/hns: Support CQ's restrack raw ops for hns driver
The CQ raw restrack attributes come from the queue context maintained by
the ROCEE.

For example:

$ rdma res show cq dev hns_0 cqn 14 -dd -jp -r
[ {
        "ifindex": 4,
        "ifname": "hns_0",
        "data": [ 1,0,0,0,7,0,0,0,0,0,0,0,0,82,6,0,0,82,6,0,0,82,6,0,
		  1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,
		  6,0,0,0,0,0,0,0 ]
    } ]

Link: https://lore.kernel.org/r/20220822104455.2311053-4-liangwenpeng@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-23 11:34:58 +03:00
Wenpeng Liang
eb00b9a08b RDMA/hns: Add or remove CQ's restrack attributes
Remove the restrack attributes from the queue context held by ROCEE, and
add the restrack attributes from the queue information maintained by the
driver.

For example:

$ rdma res show cq dev hns_0 cqn 14 -dd -jp
[ {
        "ifindex": 4,
        "ifname": "hns_0",
        "cqn": 14,
        "cqe": 127,
        "users": 1,
        "adaptive-moderation": false,
        "ctxn": 8,
        "pid": 1524,
        "comm": "ib_send_bw"
    },
    "drv_cq_depth": 128,
    "drv_cons_index": 0,
    "drv_cqe_size": 32,
    "drv_arm_sn": 1
}

Link: https://lore.kernel.org/r/20220822104455.2311053-3-liangwenpeng@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-23 11:34:51 +03:00
Wenpeng Liang
40b4b79c86 RDMA/hns: Remove redundant DFX file and DFX ops structure
There is no need to use a dedicated DFX file and DFX structure to manage
the interface of the query queue context.

Link: https://lore.kernel.org/r/20220822104455.2311053-2-liangwenpeng@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-23 11:34:42 +03:00