1040 Commits

Author SHA1 Message Date
Li Zhijian
70aad902ce RDMA/rxe: Implement flush completion
Per the IBA spec, a FLUSH is acknowledged with a zero-length RDMA Read response.

Use IB_WC_FLUSH (aka IB_UVERBS_WC_FLUSH) code to tell userspace a FLUSH
completion.

Link: https://lore.kernel.org/r/20221206130201.30986-9-lizhijian@fujitsu.com
Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09 19:36:02 -04:00
Li Zhijian
ea1bb00ee9 RDMA/rxe: Implement flush execution in responder side
Only requested placement types that are also registered in the destination
memory region are acceptable.
Otherwise, the responder replies with a NAK "Remote Access Error" when it
finds a placement type violation.

We will persist data via arch_wb_cache_pmem(), which could be
architecture specific.

This commit also adds 2 helpers to update qp.resp from the incoming packet.

Link: https://lore.kernel.org/r/20221206130201.30986-8-lizhijian@fujitsu.com
Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09 19:36:02 -04:00
Li Zhijian
fa1fd682ad RDMA/rxe: Implement RC RDMA FLUSH service in requester side
Implement FLUSH request operation in the requester.

Link: https://lore.kernel.org/r/20221206130201.30986-7-lizhijian@fujitsu.com
Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09 19:36:02 -04:00
Li Zhijian
02e9a31c89 RDMA/rxe: Extend rxe packet format to support flush
Extend rxe opcode tables, headers, helper and constants to support
flush operations.

Refer to IBA A19.4.1 for more details on the FETH definition.

Link: https://lore.kernel.org/r/20221206130201.30986-6-lizhijian@fujitsu.com
Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09 19:36:02 -04:00
Li Zhijian
02ea0a5115 RDMA/rxe: Allow registering persistent flag for pmem MR only
A memory region can support at most 2 flush access flags:
IB_ACCESS_FLUSH_PERSISTENT and IB_ACCESS_FLUSH_GLOBAL

But we only allow the user to register the persistent flush flag on a pmem MR,
which is able to persist data across power cycles.

So registering a persistent access flag on a non-pmem MR will be rejected.

Link: https://lore.kernel.org/r/20221206130201.30986-5-lizhijian@fujitsu.com
CC: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09 19:36:02 -04:00
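The registration rule described in 02ea0a5115 can be sketched as a small check. This is only an illustration, not the driver's code: the flag values, the `demo_` names, the `is_pmem` parameter and the choice of `-EINVAL` as the error are all assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical bit values; the real IB_ACCESS_FLUSH_* constants differ. */
#define DEMO_ACCESS_FLUSH_GLOBAL     (1u << 0)
#define DEMO_ACCESS_FLUSH_PERSISTENT (1u << 1)

/*
 * Reject the persistent flush flag unless the region is backed by
 * persistent memory. The global flush flag is accepted either way.
 */
int demo_check_flush_access(unsigned int access, bool is_pmem)
{
	if ((access & DEMO_ACCESS_FLUSH_PERSISTENT) && !is_pmem)
		return -EINVAL; /* assumed error code, for illustration */
	return 0;
}

/* Small self-check of the rule. */
void demo_flush_access_self_test(void)
{
	assert(demo_check_flush_access(DEMO_ACCESS_FLUSH_GLOBAL, false) == 0);
	assert(demo_check_flush_access(DEMO_ACCESS_FLUSH_PERSISTENT, true) == 0);
	assert(demo_check_flush_access(DEMO_ACCESS_FLUSH_PERSISTENT, false) == -EINVAL);
}
```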
Bob Pearson
689c5421bf RDMA/rxe: Fix incorrect responder length checking
The code in rxe_resp.c at check_length() is incorrect as it compares
pkt->opcode, an 8-bit value, against various mask bits, which are all higher
than 256, so nothing is ever reported.

This patch rewrites this to compare against pkt->mask which is
correct. However this now exposes another error. For UD send packets the
value of the pmtu cannot be determined from qp->mtu.  All that is required
here is to later check if the payload fits into the posted receive buffer
in that case.

Fixes: 837a55847ead ("RDMA/rxe: Implement packet length validation on responder")
Link: https://lore.kernel.org/r/20221208210945.28607-1-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Reviewed-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09 19:26:27 -04:00
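The bug fixed by 689c5421bf is easy to reproduce in isolation. The sketch below is not taken from rxe_resp.c; the mask values and `demo_` names are made up, chosen only so that every mask bit lies above the range of an 8-bit opcode, as the commit message describes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mask bits; the only property that matters is that each is >= 256. */
enum {
	DEMO_START_MASK = 1u << 11,
	DEMO_END_MASK   = 1u << 13,
};

struct demo_pkt {
	uint8_t  opcode; /* 8-bit wire opcode */
	uint32_t mask;   /* property bits derived from the opcode */
};

/* Buggy form: an 8-bit value never has bit 8 or above set, so this is always 0. */
int demo_is_end_buggy(const struct demo_pkt *pkt)
{
	return (pkt->opcode & DEMO_END_MASK) != 0;
}

/* Fixed form: test the property bits, not the raw opcode. */
int demo_is_end_fixed(const struct demo_pkt *pkt)
{
	return (pkt->mask & DEMO_END_MASK) != 0;
}

void demo_check_length_self_test(void)
{
	struct demo_pkt pkt = { .opcode = 0xff, .mask = DEMO_END_MASK };

	assert(!demo_is_end_buggy(&pkt)); /* silently never fires */
	assert(demo_is_end_fixed(&pkt));
}
```

Because the buggy comparison can never be true, any validation guarded by it is dead code, which matches the "nothing is ever reported" symptom.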
Daisuke Matsuda
3282a549cf RDMA/rxe: Fix oops with zero length reads
Commit 686d348476ee ("RDMA/rxe: Remove unnecessary mr testing") causes
a kernel crash. If the responder gets a zero-byte RDMA Read request,
qp->resp.mr is not set in check_rkey() (see IBA C9-88). The mr is NULL in
this case, and a NULL pointer dereference occurs as shown below.

 BUG: kernel NULL pointer dereference, address: 0000000000000010
 #PF: supervisor write access in kernel mode
 #PF: error_code(0x0002) - not-present page
 PGD 0 P4D 0
 Oops: 0002 [#1] PREEMPT SMP PTI
 CPU: 2 PID: 3622 Comm: python3 Kdump: loaded Not tainted 6.1.0-rc3+ #34
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
 RIP: 0010:__rxe_put+0xc/0x60 [rdma_rxe]
 Code: cc cc cc 31 f6 e8 64 36 1b d3 41 b8 01 00 00 00 44 89 c0 c3 cc cc cc cc 41 89 c0 eb c1 90 0f 1f 44 00 00 41 54 b8 ff ff ff ff <f0> 0f c1 47 10 83 f8 01 74 11 45 31 e4 85 c0 7e 20 44 89 e0 41 5c
 RSP: 0018:ffffb27bc012ce78 EFLAGS: 00010246
 RAX: 00000000ffffffff RBX: ffff9790857b0580 RCX: 0000000000000000
 RDX: ffff979080fe145a RSI: 000055560e3e0000 RDI: 0000000000000000
 RBP: ffff97909c7dd800 R08: 0000000000000001 R09: e7ce43d97f7bed0f
 R10: ffff97908b29c300 R11: 0000000000000000 R12: 0000000000000000
 R13: 0000000000000000 R14: ffff97908b29c300 R15: 0000000000000000
 FS:  00007f276f7bd740(0000) GS:ffff9792b5c80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000010 CR3: 0000000114230002 CR4: 0000000000060ee0
 Call Trace:
  <IRQ>
  read_reply+0xda/0x310 [rdma_rxe]
  rxe_responder+0x82d/0xe50 [rdma_rxe]
  do_task+0x84/0x170 [rdma_rxe]
  tasklet_action_common.constprop.0+0xa7/0x120
  __do_softirq+0xcb/0x2ac
  do_softirq+0x63/0x90
  </IRQ>

Support a NULL mr during read_reply().

Fixes: 686d348476ee ("RDMA/rxe: Remove unnecessary mr testing")
Fixes: b5f9a01fae42 ("RDMA/rxe: Fix mr leak in RESPST_ERR_RNR")
Link: https://lore.kernel.org/r/20221209045926.531689-1-matsuda-daisuke@fujitsu.com
Link: https://lore.kernel.org/r/20221202145713.13152-1-lizhijian@fujitsu.com
Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09 15:57:51 -04:00
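The shape of the fix in 3282a549cf, tolerating a NULL mr on a zero-length read, can be sketched as follows. This is a simplified stand-in and not the driver's read_reply(): the `demo_` types and functions are hypothetical, and the payload copy is reduced to a byte count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference-counted memory region. */
struct demo_mr {
	int refcnt;
};

/* Drops one reference; must never be called with NULL. */
void demo_put(struct demo_mr *mr)
{
	mr->refcnt--;
}

/*
 * Returns the number of payload bytes "copied". For a zero-length read
 * no memory region is looked up, so mr may legitimately be NULL here.
 */
size_t demo_read_reply(struct demo_mr *mr, size_t len)
{
	size_t copied = 0;

	if (mr && len > 0)
		copied = len; /* stand-in for copying from the region */

	if (mr)               /* guard the release as well as the copy */
		demo_put(mr);

	return copied;
}

void demo_read_reply_self_test(void)
{
	struct demo_mr mr = { .refcnt = 1 };

	assert(demo_read_reply(NULL, 0) == 0); /* no dereference, no crash */
	assert(demo_read_reply(&mr, 8) == 8);
	assert(mr.refcnt == 0);
}
```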
Jason Gunthorpe
d69e8c63fc Linux 6.1-rc8

Merge tag 'v6.1-rc8' into rdma.git for-next

For dependencies in following patches

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-09 15:52:17 -04:00
Xiao Yang
4cd9f1d320 RDMA/rxe: Enable atomic write capability for rxe device
The capability shows that the rxe device supports the atomic write operation.

Link: https://lore.kernel.org/r/1669905568-62-4-git-send-email-yangx.jy@fujitsu.com
Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-01 19:51:10 -04:00
Xiao Yang
3aec427bb1 RDMA/rxe: Implement atomic write completion
Generate an atomic write completion when the atomic write request
has been finished.

Link: https://lore.kernel.org/r/1669905568-62-3-git-send-email-yangx.jy@fujitsu.com
Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-01 19:51:10 -04:00
Xiao Yang
034e285f8b RDMA/rxe: Make responder support atomic write on RC service
Make responder process an atomic write request and send a read response
on RC service.

Link: https://lore.kernel.org/r/1669905568-62-2-git-send-email-yangx.jy@fujitsu.com
Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-01 19:51:09 -04:00
Xiao Yang
abb633cf28 RDMA/rxe: Make requester support atomic write on RC service
Make requester process and send an atomic write request on RC service.

Link: https://lore.kernel.org/r/1669905568-62-1-git-send-email-yangx.jy@fujitsu.com
Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-01 19:51:09 -04:00
Xiao Yang
5c7af6c793 RDMA/rxe: Extend rxe packet format to support atomic write
Extend rxe_wr_opcode_info[] and rxe_opcode[] for new atomic write opcode.

Link: https://lore.kernel.org/r/1669905432-14-5-git-send-email-yangx.jy@fujitsu.com
Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-12-01 19:51:09 -04:00
David Hildenbrand
129e636fe9 RDMA/siw: remove FOLL_FORCE usage
GUP now supports reliable R/O long-term pinning in COW mappings, such
that we break COW early. MAP_SHARED VMAs only use the shared zeropage so
far in one corner case (DAXFS file with holes), which can be ignored
because GUP does not support long-term pinning in fsdax (see
check_vma_flags()).

Consequently, FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM is no longer required
for reliable R/O long-term pinning: FOLL_LONGTERM is sufficient. So stop
using FOLL_FORCE, which is really only for ptrace access.

Link: https://lkml.kernel.org/r/20221116102659.70287-13-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Cc: Bernard Metzler <bmt@zurich.ibm.com>
Cc: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-11-30 15:58:59 -08:00
Zhang Xiaoxu
f67376d801 RDMA/rxe: Fix NULL-ptr-deref in rxe_qp_do_cleanup() when socket create failed
There is a null-ptr-deref when running mount.cifs over RDMA:

  BUG: KASAN: null-ptr-deref in rxe_qp_do_cleanup+0x2f3/0x360 [rdma_rxe]
  Read of size 8 at addr 0000000000000018 by task mount.cifs/3046

  CPU: 2 PID: 3046 Comm: mount.cifs Not tainted 6.1.0-rc5+ #62
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc3
  Call Trace:
   <TASK>
   dump_stack_lvl+0x34/0x44
   kasan_report+0xad/0x130
   rxe_qp_do_cleanup+0x2f3/0x360 [rdma_rxe]
   execute_in_process_context+0x25/0x90
   __rxe_cleanup+0x101/0x1d0 [rdma_rxe]
   rxe_create_qp+0x16a/0x180 [rdma_rxe]
   create_qp.part.0+0x27d/0x340
   ib_create_qp_kernel+0x73/0x160
   rdma_create_qp+0x100/0x230
   _smbd_get_connection+0x752/0x20f0
   smbd_get_connection+0x21/0x40
   cifs_get_tcp_session+0x8ef/0xda0
   mount_get_conns+0x60/0x750
   cifs_mount+0x103/0xd00
   cifs_smb3_do_mount+0x1dd/0xcb0
   smb3_get_tree+0x1d5/0x300
   vfs_get_tree+0x41/0xf0
   path_mount+0x9b3/0xdd0
   __x64_sys_mount+0x190/0x1d0
   do_syscall_64+0x35/0x80
   entry_SYSCALL_64_after_hwframe+0x46/0xb0

The root cause of the issue is that socket creation failed in
rxe_qp_init_req().

So move the reset rxe_qp_do_cleanup() after the NULL ptr check.

Fixes: 8700e3e7c485 ("Soft RoCE driver")
Link: https://lore.kernel.org/r/20221122151437.1057671-1-zhangxiaoxu5@huawei.com
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-22 15:55:54 -04:00
Jason Gunthorpe
cb6562c380 RDMA/rxe: Do not NULL deref on debugging failure path
Correct the mistake, mr is obviously NULL in this code path.

Fixes: 2778b72b1df0 ("RDMA/rxe: Replace pr_xxx by rxe_dbg_xxx in rxe_mr.c")
Link: https://lore.kernel.org/r/Y3eeJW0AdyJYhYyQ@kili
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-22 15:53:02 -04:00
Li Zhijian
7d984dac8f RDMA/rxe: Fix mr->map double free
rxe_mr_cleanup() which tries to free mr->map again will be called when
rxe_mr_init_user() fails:

   CPU: 0 PID: 4917 Comm: rdma_flush_serv Kdump: loaded Not tainted 6.1.0-rc1-roce-flush+ #25
   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
   Call Trace:
    <TASK>
    dump_stack_lvl+0x45/0x5d
    panic+0x19e/0x349
    end_report.part.0+0x54/0x7c
    kasan_report.cold+0xa/0xf
    rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
    __rxe_cleanup+0x10a/0x1e0 [rdma_rxe]
    rxe_reg_user_mr+0xb7/0xd0 [rdma_rxe]
    ib_uverbs_reg_mr+0x26a/0x480 [ib_uverbs]
    ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x1a2/0x250 [ib_uverbs]
    ib_uverbs_cmd_verbs+0x1397/0x15a0 [ib_uverbs]

This issue was first exposed by commit b18c7da63fcb ("RDMA/rxe: Fix
memory leak in error path code") and then fixed in commit
8ff5f5d9d8cf ("RDMA/rxe: Prevent double freeing rxe_map_set()"), but that
fix was eventually reverted as well by commit 1e75550648da (Revert
"RDMA/rxe: Create duplicate mapping tables for FMRs").

Simply let rxe_mr_cleanup() always handle freeing the mr->map once it is
successfully allocated.

Fixes: 1e75550648da ("Revert "RDMA/rxe: Create duplicate mapping tables for FMRs"")
Link: https://lore.kernel.org/r/1667099073-2-1-git-send-email-lizhijian@fujitsu.com
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-18 20:15:51 -04:00
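The ownership rule adopted in 7d984dac8f (the init path never frees the map on failure, and cleanup is the single place that does) can be sketched like this. The `demo_` names, the `fail_late` switch and the free counter are hypothetical and exist only to make the behaviour observable.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct demo_mr {
	int *map;
};

int demo_free_calls; /* counts real frees, for the self-check only */

/* The one and only place that releases mr->map. Safe to call repeatedly. */
void demo_mr_cleanup(struct demo_mr *mr)
{
	if (mr->map) {
		free(mr->map);
		mr->map = NULL;
		demo_free_calls++;
	}
}

/*
 * On a late failure the map is left allocated on purpose: the caller's
 * error path runs demo_mr_cleanup(), so freeing here too would double free.
 */
int demo_mr_init(struct demo_mr *mr, int fail_late)
{
	mr->map = calloc(4, sizeof(*mr->map));
	if (!mr->map)
		return -ENOMEM;
	if (fail_late)
		return -EINVAL;
	return 0;
}

void demo_mr_self_test(void)
{
	struct demo_mr mr = { 0 };

	demo_free_calls = 0;
	if (demo_mr_init(&mr, 1) != -ENOMEM) {
		demo_mr_cleanup(&mr); /* error path */
		demo_mr_cleanup(&mr); /* a second call is a no-op */
		assert(demo_free_calls == 1);
	}
	assert(mr.map == NULL);
}
```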
Zhu Yanjun
8e1a76493b RDMA/rxe: Remove reliable datagram support
The rdma_rxe driver does not actually support the reliable datagram
transport but contains a variable with RD opcodes in driver code.  And
this variable is never used. So remove it.

Link: https://lore.kernel.org/r/20221112023537.432912-1-yanjun.zhu@intel.com
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-18 19:57:46 -04:00
Bernard Metzler
60da2d11fc RDMA/siw: Set defined status for work completion with undefined status
A malicious user may write undefined values into memory mapped completion
queue elements status or opcode. Undefined status or opcode values will
result in out-of-bounds access to an array mapping siw internal
representation of opcode and status to RDMA core representation when
reaping CQ elements. While siw detects those undefined values, it did not
correctly set completion status to a defined value, thus defeating the
whole purpose of the check.

This bug leads to the following Smatch static checker warning:

	drivers/infiniband/sw/siw/siw_cq.c:96 siw_reap_cqe()
	error: buffer overflow 'map_cqe_status' 10 <= 21

Fixes: bdf1da5df9da ("RDMA/siw: Fix immediate work request flush to completion queue")
Link: https://lore.kernel.org/r/20221115170747.1263298-1-bmt@zurich.ibm.com
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-15 16:47:00 -04:00
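The point of 60da2d11fc is that detecting an out-of-range value is not enough; the value must also be replaced before it is used as an array index. A minimal sketch under assumed names follows; the enum, the table contents and the `demo_` functions are hypothetical and much smaller than the siw tables.

```c
#include <assert.h>

/* Hypothetical internal status codes. */
enum demo_status {
	DEMO_STATUS_OK,
	DEMO_STATUS_FLUSH_ERR,
	DEMO_STATUS_GENERAL_ERR,
	DEMO_NUM_STATUS
};

/* Hypothetical mapping to "core" status values. */
static const int demo_map_status[DEMO_NUM_STATUS] = { 100, 105, 121 };

/*
 * raw comes from memory that user space can write, so it may hold any
 * value. Clamp it to a defined status before indexing the table.
 */
int demo_reap_status(unsigned int raw)
{
	if (raw >= DEMO_NUM_STATUS)
		raw = DEMO_STATUS_GENERAL_ERR; /* the assignment is the actual fix */

	return demo_map_status[raw];
}

void demo_reap_self_test(void)
{
	assert(demo_reap_status(DEMO_STATUS_OK) == 100);
	assert(demo_reap_status(21) == 121);   /* undefined value, safely mapped */
	assert(demo_reap_status(~0u) == 121);
}
```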
Bob Pearson
5de087250f RDMA/rxe: Replace pr_xxx by rxe_dbg_xxx in rxe_mmap.c
Replace calls to pr_xxx() in rxe_mmap.c with rxe_dbg_xxx().

Link: https://lore.kernel.org/r/20221103171013.20659-17-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-10 15:33:07 -04:00
Bob Pearson
813728043b RDMA/rxe: Replace pr_xxx by rxe_dbg_xxx in rxe_icrc.c
Replace calls to pr_xxx() in rxe_icrc.c with rxe_dbg_xxx().

Link: https://lore.kernel.org/r/20221103171013.20659-16-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-10 15:33:06 -04:00
Bob Pearson
c6aba5ea00 RDMA/rxe: Replace pr_xxx by rxe_dbg_xxx in rxe.c
Replace calls to pr_xxx() in rxe.c with rxe_dbg_xxx().
Calls with a rxe device not yet in scope are left as is.

Link: https://lore.kernel.org/r/20221103171013.20659-15-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-10 15:33:06 -04:00
Bob Pearson
fc50597934 RDMA/rxe: Replace pr_xxx by rxe_dbg_xxx in rxe_task.c
Replace calls to pr_xxx() in rxe_task.c with rxe_dbg_xxx().

Link: https://lore.kernel.org/r/20221103171013.20659-14-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-10 15:33:06 -04:00
Bob Pearson
25fd735a4c RDMA/rxe: Replace pr_xxx by rxe_dbg_xxx in rxe_av.c
Replace calls to pr_xxx() in rxe_av.c with rxe_dbg_xxx().

Link: https://lore.kernel.org/r/20221103171013.20659-13-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-10 15:33:05 -04:00
Bob Pearson
14e501fdb0 RDMA/rxe: Replace pr_xxx by rxe_dbg_xxx in rxe_verbs.c
Replace calls to pr_xxx() in rxe_verbs.c with rxe_dbg_xxx().

Link: https://lore.kernel.org/r/20221103171013.20659-12-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-10 15:33:05 -04:00
Bob Pearson
0e6090024b RDMA/rxe: Replace pr_xxx by rxe_dbg_xxx in rxe_srq.c
Replace calls to pr_xxx() in rxe_srq.c with rxe_dbg_xxx().

Link: https://lore.kernel.org/r/20221103171013.20659-11-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-10 15:33:05 -04:00
Bob Pearson
74ddf7233c RDMA/rxe: Replace pr_xxx by rxe_dbg_xxx in rxe_resp.c
Replace calls to pr_xxx() in rxe_resp.c with rxe_dbg_xxx().

Link: https://lore.kernel.org/r/20221103171013.20659-10-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-10 15:33:05 -04:00
Bob Pearson
0edfb15e30 RDMA/rxe: Replace pr_xxx by rxe_dbg_xxx in rxe_req.c
Replace calls to pr_xxx() in rxe_req.c with rxe_dbg_xxx().

Link: https://lore.kernel.org/r/20221103171013.20659-9-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-10 15:33:04 -04:00
Bob Pearson
6af70060d2 RDMA/rxe: Replace pr_xxx by rxe_dbg_xxx in rxe_qp.c
Replace calls to pr_xxx() in rxe_qp.c with rxe_dbg_xxx().

Link: https://lore.kernel.org/r/20221103171013.20659-8-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-10 15:33:04 -04:00
Bob Pearson
34549e88e0 RDMA/rxe: Replace pr_xxx by rxe_dbg_xxx in rxe_net.c
Replace (some) calls to pr_xxx() in rxe_net.c with rxe_dbg_xxx().
Calls with a rxe device not yet in scope are left as is.

Link: https://lore.kernel.org/r/20221103171013.20659-7-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-10 15:33:04 -04:00
Bob Pearson
e8a87efdf8 RDMA/rxe: Replace pr_xxx by rxe_dbg_xxx in rxe_mw.c
Replace calls to pr_xxx() in rxe_mw.c with rxe_dbg_xxx().

Link: https://lore.kernel.org/r/20221103171013.20659-6-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-10 15:33:03 -04:00
Bob Pearson
2778b72b1d RDMA/rxe: Replace pr_xxx by rxe_dbg_xxx in rxe_mr.c
Replace calls to pr_xxx() in rxe_mr.c by rxe_dbg_mr().

Link: https://lore.kernel.org/r/20221103171013.20659-5-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-10 15:33:03 -04:00
Bob Pearson
52920f537a RDMA/rxe: Replace pr_xxx by rxe_dbg_xxx in rxe_cq.c
Replace calls to pr_xxx() in rxe_cq.c with rxe_dbg_xxx().

Link: https://lore.kernel.org/r/20221103171013.20659-4-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-10 15:33:03 -04:00
Bob Pearson
27c4c520bd RDMA/rxe: Replace pr_xxx by rxe_dbg_xxx in rxe_comp.c
Replace calls to pr_xxx() in rxe_comp.c with rxe_dbg_xxx().

Link: https://lore.kernel.org/r/20221103171013.20659-3-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-10 15:33:02 -04:00
Bob Pearson
4554bac48a RDMA/rxe: Add ibdev_dbg macros for rxe
Add macros borrowed from siw to call dynamic debug macro ibdev_dbg.

Link: https://lore.kernel.org/r/20221103171013.20659-2-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-10 15:33:02 -04:00
Daisuke Matsuda
837a55847e RDMA/rxe: Implement packet length validation on responder
The function check_length() is supposed to check the length of inbound
packets on responder, but it actually has been a stub since the driver was
born. Let it check the payload length and the DMA length.

Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Link: https://lore.kernel.org/r/20221107055338.357184-1-matsuda-daisuke@fujitsu.com
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-09 19:54:57 +02:00
Bernard Metzler
bdf1da5df9 RDMA/siw: Fix immediate work request flush to completion queue
Correctly set send queue element opcode during immediate work request
flushing in post sendqueue operation, if the QP is in ERROR state.
An undefined opcode value results in out-of-bounds access to an array
for mapping the opcode between siw internal and RDMA core representation
in work completion generation. It resulted in a KASAN BUG report
of type 'global-out-of-bounds' during NFSoRDMA testing.

This patch further fixes a potential case of a malicious user which may
write undefined values for completion queue elements status or opcode,
if the CQ is memory mapped to user land. It avoids the same out-of-bounds
access to arrays for status and opcode mapping as described above.

Fixes: 303ae1cdfdf7 ("rdma/siw: application interface")
Fixes: b0fff7317bb4 ("rdma/siw: completion queue methods")
Reported-by: Olga Kornievskaia <kolga@netapp.com>
Reviewed-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Link: https://lore.kernel.org/r/20221107145057.895747-1-bmt@zurich.ibm.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-09 15:26:49 +02:00
Yunsheng Lin
692373d186 RDMA/rxe: cleanup some error handling in rxe_verbs.c
Instead of 'goto and return', just return directly to
simplify the error handling, and avoid some unnecessary
return value checks.

Link: https://lore.kernel.org/r/20221028075053.3990467-1-xuhaoyue1@hisilicon.com
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-10-28 15:11:44 -03:00
Xiao Yang
b071850ef6 RDMA/rxe: Remove the duplicate assignment of mr->map_shift
mr->map_shift is set to ilog2(RXE_BUF_PER_MAP) in both rxe_mr_init() and
rxe_mr_alloc() so remove the duplicate one in rxe_mr_init().

Link: https://lore.kernel.org/r/1666855893-145-1-git-send-email-yangx.jy@fujitsu.com
Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-10-28 15:08:41 -03:00
Li Zhijian
875ab4a8d9 RDMA/rxe: Make sure requested access is a subset of {mr,mw}->access
We should reject requests with access flags that are not registered by
the MR/MW. For example, lookup_mr() should return NULL when the requested
access is 0x03 and mr->access is 0x01.

Link: https://lore.kernel.org/r/20220927055337.22630-2-lizhijian@fujitsu.com
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-10-28 14:39:47 -03:00
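The subset rule from 875ab4a8d9 comes down to a single bitwise test. The sketch below uses hypothetical `demo_` function names and also shows, purely for contrast, a looser "any bit overlaps" test that would wrongly accept the 0x03 versus 0x01 case from the commit message. It does not claim to reproduce the code that was replaced.

```c
#include <assert.h>

/* Accept only if every requested bit is also registered. */
int demo_access_is_subset(unsigned int requested, unsigned int registered)
{
	return (requested & registered) == requested;
}

/* Looser test, for contrast only: true as soon as one bit overlaps. */
int demo_access_overlaps(unsigned int requested, unsigned int registered)
{
	return (requested & registered) != 0;
}

void demo_access_self_test(void)
{
	/* The example from the commit message: request 0x03 against 0x01. */
	assert(!demo_access_is_subset(0x03, 0x01));
	assert(demo_access_overlaps(0x03, 0x01)); /* too permissive */

	assert(demo_access_is_subset(0x01, 0x03));
	assert(demo_access_is_subset(0x00, 0x01)); /* empty request is a subset */
}
```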
Bob Pearson
63a18baef2 RDMA/rxe: Rename task->state_lock to task->lock
Rename task->state_lock to task->lock.

Link: https://lore.kernel.org/r/20221021200118.2163-7-rpearsonhpe@gmail.com
Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-10-28 13:47:16 -03:00
Bob Pearson
dcef28528c RDMA/rxe: Make rxe_do_task static
The subroutine rxe_do_task() is only called in rxe_task.c. This patch
makes it static and renames it do_task().

Link: https://lore.kernel.org/r/20221021200118.2163-6-rpearsonhpe@gmail.com
Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-10-28 13:47:15 -03:00
Bob Pearson
dccb23f6c3 RDMA/rxe: Split rxe_run_task() into two subroutines
Split rxe_run_task(task, sched) into rxe_run_task(task) and
rxe_sched_task(task).

Link: https://lore.kernel.org/r/20221021200118.2163-5-rpearsonhpe@gmail.com
Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-10-28 13:47:15 -03:00
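The refactoring in dccb23f6c3 replaces a boolean parameter with two named entry points. The sketch below assumes the old flag chose between running the task inline and deferring it; that reading, the counters and all `demo_` names are assumptions for illustration rather than the driver's code.

```c
#include <assert.h>

/* Hypothetical task with counters so the two paths can be told apart. */
struct demo_task {
	int ran_inline;
	int deferred;
};

void demo_do_task(struct demo_task *task)
{
	task->ran_inline++;
}

/* Before: one function, behaviour selected by a flag at every call site. */
void demo_run_task_old(struct demo_task *task, int sched)
{
	if (sched)
		task->deferred++; /* stand-in for scheduling deferred work */
	else
		demo_do_task(task);
}

/* After: the caller's intent is visible in the function name. */
void demo_run_task(struct demo_task *task)
{
	demo_do_task(task);
}

void demo_sched_task(struct demo_task *task)
{
	task->deferred++; /* stand-in for scheduling deferred work */
}

void demo_task_self_test(void)
{
	struct demo_task a = { 0 }, b = { 0 };

	demo_run_task_old(&a, 0);
	demo_run_task_old(&a, 1);

	demo_run_task(&b);
	demo_sched_task(&b);

	/* Both styles end in the same state. */
	assert(a.ran_inline == b.ran_inline && a.deferred == b.deferred);
}
```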
Bob Pearson
de669ae8af RDMA/rxe: Removed unused name from rxe_task struct
The name field in struct rxe_task is never used. This patch removes it.

Link: https://lore.kernel.org/r/20221021200118.2163-4-rpearsonhpe@gmail.com
Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-10-28 13:47:15 -03:00
Bob Pearson
98a54f1706 RDMA/rxe: Remove init of task locks from rxe_qp.c
The calls to spin_lock_init() for the tasklet spinlocks in
rxe_qp_init_misc() are redundant since they are initialized in
rxe_init_task().  This patch removes them.

Link: https://lore.kernel.org/r/20221021200118.2163-3-rpearsonhpe@gmail.com
Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-10-28 13:47:15 -03:00
Bob Pearson
05e88ebb9e RDMA/rxe: Remove redundant header files
Remove unneeded include files.

Link: https://lore.kernel.org/r/20221021200118.2163-2-rpearsonhpe@gmail.com
Signed-off-by: Ian Ziemba <ian.ziemba@hpe.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-10-28 13:47:15 -03:00
Li Zhijian
b5f9a01fae RDMA/rxe: Fix mr leak in RESPST_ERR_RNR
rxe_recheck_mr() will increase mr's ref_cnt, so we should call rxe_put(mr)
to drop mr's ref_cnt in RESPST_ERR_RNR to avoid the warning below:

  WARNING: CPU: 0 PID: 4156 at drivers/infiniband/sw/rxe/rxe_pool.c:259 __rxe_cleanup+0x1df/0x240 [rdma_rxe]
...
  Call Trace:
   rxe_dereg_mr+0x4c/0x60 [rdma_rxe]
   ib_dereg_mr_user+0xa8/0x200 [ib_core]
   ib_mr_pool_destroy+0x77/0xb0 [ib_core]
   nvme_rdma_destroy_queue_ib+0x89/0x240 [nvme_rdma]
   nvme_rdma_free_queue+0x40/0x50 [nvme_rdma]
   nvme_rdma_teardown_io_queues.part.0+0xc3/0x120 [nvme_rdma]
   nvme_rdma_error_recovery_work+0x4d/0xf0 [nvme_rdma]
   process_one_work+0x582/0xa40
   ? pwq_dec_nr_in_flight+0x100/0x100
   ? rwlock_bug.part.0+0x60/0x60
   worker_thread+0x2a9/0x700
   ? process_one_work+0xa40/0xa40
   kthread+0x168/0x1a0
   ? kthread_complete_and_exit+0x20/0x20
   ret_from_fork+0x22/0x30

Link: https://lore.kernel.org/r/20221024052049.20577-1-lizhijian@fujitsu.com
Fixes: 8a1a0be894da ("RDMA/rxe: Replace mr by rkey in responder resources")
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-10-25 08:59:29 +03:00
Li Zhijian
686d348476 RDMA/rxe: Remove unnecessary mr testing
Before the test, mr has already been passed to rxe_mr_copy(), where it may
be dereferenced, so this check is not needed.

The only way for mr to be NULL is to reach line 780 below with
 'qp->resp.mr = NULL', which is not possible according to Bob's explanation[1].

 778         if (res->state == rdatm_res_state_new) {
 779                 if (!res->replay) {
 780                         mr = qp->resp.mr;
 781                         qp->resp.mr = NULL;
 782                 } else {

[1] https://lore.kernel.org/lkml/30ff25c4-ce66-eac4-eaa2-64c0db203a19@gmail.com/

Link: https://lore.kernel.org/r/1666582315-2-1-git-send-email-lizhijian@fujitsu.com
CC: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-10-25 08:56:32 +03:00
Daisuke Matsuda
5ac814e02e RDMA/rxe: Handle remote errors in the midst of a Read reply sequence
Requesting nodes do not handle a reported error correctly if it is
generated in the middle of multi-packet Read responses, and the node tries
to resend the request endlessly. Let completer terminate the connection in
that case.

Link: https://lore.kernel.org/r/20221013014724.3786212-2-matsuda-daisuke@fujitsu.com
Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-10-25 08:56:32 +03:00
Daisuke Matsuda
5ebc548f4f RDMA/rxe: Make responder handle RDMA Read failures
Currently, the responder can reply with packets carrying invalid payloads if
it fails to copy messages into the packets. Add error handling in read_reply()
to inform the requesting node of the failure.

Link: https://lore.kernel.org/r/20221013014724.3786212-1-matsuda-daisuke@fujitsu.com
Suggested-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-10-25 08:56:32 +03:00