158 Commits

Author SHA1 Message Date
Sindhu Devale
3bfb25fa2b RDMA/irdma: Fix op_type reporting in CQEs
The op_type field CQ poll info structure is incorrectly
filled in with the queue type as opposed to the op_type
received in the CQEs. The wrong opcode could be decoded
and returned to the ULP.

Copy the op_type field received in the CQE in the CQ poll
info structure.

Fixes: 24419777e943 ("RDMA/irdma: Fix RQ completion opcode")
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230725155439.1057-1-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-26 14:58:42 +03:00
Shiraz Saleem
f0842bb3d3 RDMA/irdma: Fix data race on CQP request done
KCSAN detects a data race on cqp_request->request_done memory location
which is accessed locklessly in irdma_handle_cqp_op while being
updated in irdma_cqp_ce_handler.

Annotate lockless intent with READ_ONCE/WRITE_ONCE to avoid any
compiler optimizations like load fusing and/or KCSAN warning.

[222808.417128] BUG: KCSAN: data-race in irdma_cqp_ce_handler [irdma] / irdma_wait_event [irdma]

[222808.417532] write to 0xffff8e44107019dc of 1 bytes by task 29658 on cpu 5:
[222808.417610]  irdma_cqp_ce_handler+0x21e/0x270 [irdma]
[222808.417725]  cqp_compl_worker+0x1b/0x20 [irdma]
[222808.417827]  process_one_work+0x4d1/0xa40
[222808.417835]  worker_thread+0x319/0x700
[222808.417842]  kthread+0x180/0x1b0
[222808.417852]  ret_from_fork+0x22/0x30

[222808.417918] read to 0xffff8e44107019dc of 1 bytes by task 29688 on cpu 1:
[222808.417995]  irdma_wait_event+0x1e2/0x2c0 [irdma]
[222808.418099]  irdma_handle_cqp_op+0xae/0x170 [irdma]
[222808.418202]  irdma_cqp_cq_destroy_cmd+0x70/0x90 [irdma]
[222808.418308]  irdma_puda_dele_rsrc+0x46d/0x4d0 [irdma]
[222808.418411]  irdma_rt_deinit_hw+0x179/0x1d0 [irdma]
[222808.418514]  irdma_ib_dealloc_device+0x11/0x40 [irdma]
[222808.418618]  ib_dealloc_device+0x2a/0x120 [ib_core]
[222808.418823]  __ib_unregister_device+0xde/0x100 [ib_core]
[222808.418981]  ib_unregister_device+0x22/0x40 [ib_core]
[222808.419142]  irdma_ib_unregister_device+0x70/0x90 [irdma]
[222808.419248]  i40iw_close+0x6f/0xc0 [irdma]
[222808.419352]  i40e_client_device_unregister+0x14a/0x180 [i40e]
[222808.419450]  i40iw_remove+0x21/0x30 [irdma]
[222808.419554]  auxiliary_bus_remove+0x31/0x50
[222808.419563]  device_remove+0x69/0xb0
[222808.419572]  device_release_driver_internal+0x293/0x360
[222808.419582]  driver_detach+0x7c/0xf0
[222808.419592]  bus_remove_driver+0x8c/0x150
[222808.419600]  driver_unregister+0x45/0x70
[222808.419610]  auxiliary_driver_unregister+0x16/0x30
[222808.419618]  irdma_exit_module+0x18/0x1e [irdma]
[222808.419733]  __do_sys_delete_module.constprop.0+0x1e2/0x310
[222808.419745]  __x64_sys_delete_module+0x1b/0x30
[222808.419755]  do_syscall_64+0x39/0x90
[222808.419763]  entry_SYSCALL_64_after_hwframe+0x63/0xcd

[222808.419829] value changed: 0x01 -> 0x03

Fixes: 915cc7ac0f8e ("RDMA/irdma: Add miscellaneous utility definitions")
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230711175253.1289-4-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-17 08:02:00 +03:00
Shiraz Saleem
f2c3037811 RDMA/irdma: Fix data race on CQP completion stats
CQP completion statistics is read lockesly in irdma_wait_event and
irdma_check_cqp_progress while it can be updated in the completion
thread irdma_sc_ccq_get_cqe_info on another CPU as KCSAN reports.

Make completion statistics an atomic variable to reflect coherent updates
to it. This will also avoid load/store tearing logic bug potentially
possible by compiler optimizations.

[77346.170861] BUG: KCSAN: data-race in irdma_handle_cqp_op [irdma] / irdma_sc_ccq_get_cqe_info [irdma]

[77346.171383] write to 0xffff8a3250b108e0 of 8 bytes by task 9544 on cpu 4:
[77346.171483]  irdma_sc_ccq_get_cqe_info+0x27a/0x370 [irdma]
[77346.171658]  irdma_cqp_ce_handler+0x164/0x270 [irdma]
[77346.171835]  cqp_compl_worker+0x1b/0x20 [irdma]
[77346.172009]  process_one_work+0x4d1/0xa40
[77346.172024]  worker_thread+0x319/0x700
[77346.172037]  kthread+0x180/0x1b0
[77346.172054]  ret_from_fork+0x22/0x30

[77346.172136] read to 0xffff8a3250b108e0 of 8 bytes by task 9838 on cpu 2:
[77346.172234]  irdma_handle_cqp_op+0xf4/0x4b0 [irdma]
[77346.172413]  irdma_cqp_aeq_cmd+0x75/0xa0 [irdma]
[77346.172592]  irdma_create_aeq+0x390/0x45a [irdma]
[77346.172769]  irdma_rt_init_hw.cold+0x212/0x85d [irdma]
[77346.172944]  irdma_probe+0x54f/0x620 [irdma]
[77346.173122]  auxiliary_bus_probe+0x66/0xa0
[77346.173137]  really_probe+0x140/0x540
[77346.173154]  __driver_probe_device+0xc7/0x220
[77346.173173]  driver_probe_device+0x5f/0x140
[77346.173190]  __driver_attach+0xf0/0x2c0
[77346.173208]  bus_for_each_dev+0xa8/0xf0
[77346.173225]  driver_attach+0x29/0x30
[77346.173240]  bus_add_driver+0x29c/0x2f0
[77346.173255]  driver_register+0x10f/0x1a0
[77346.173272]  __auxiliary_driver_register+0xbc/0x140
[77346.173287]  irdma_init_module+0x55/0x1000 [irdma]
[77346.173460]  do_one_initcall+0x7d/0x410
[77346.173475]  do_init_module+0x81/0x2c0
[77346.173491]  load_module+0x1232/0x12c0
[77346.173506]  __do_sys_finit_module+0x101/0x180
[77346.173522]  __x64_sys_finit_module+0x3c/0x50
[77346.173538]  do_syscall_64+0x39/0x90
[77346.173553]  entry_SYSCALL_64_after_hwframe+0x63/0xcd

[77346.173634] value changed: 0x0000000000000094 -> 0x0000000000000095

Fixes: 915cc7ac0f8e ("RDMA/irdma: Add miscellaneous utility definitions")
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230711175253.1289-3-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-17 08:01:46 +03:00
Shiraz Saleem
4984eb5145 RDMA/irdma: Add missing read barriers
On code inspection, there are many instances in the driver where
CEQE and AEQE fields written to by HW are read without guaranteeing
that the polarity bit has been read and checked first.

Add a read barrier to avoid reordering of loads on the CEQE/AEQE fields
prior to checking the polarity bit.

Fixes: 3f49d6842569 ("RDMA/irdma: Implement HW Admin Queue OPs")
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230711175253.1289-2-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-07-17 08:01:22 +03:00
Jason Gunthorpe
5f004bcaee Linux 6.4
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmSYzfYeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG/ucH/iOM/1Py/fSg0qSs
 7NJ4XXlourT5zrnRMom3cm3d9gYqgTzgvKFL3kjMEexTRVYbhlcO4ZPRsiry8zxF
 ToGX+V8tDMqb8WSdFHzkljRY+zDRyfEUDMlTzROAD9DunLmQtkJKyrggkeGdjkpP
 OyfGqKpwlLXZRAXBil/U8Mx9MHdjJubloZwghLZr33VdUZa68+JJ9l6w163Oe/ET
 K264NM0wxN/kvN57JvePgqMccQwpINylg8IhRI+XelgczjUXeJBsOA8TDv4bDN4Q
 bjCLhkWbIaZtTYqvOXa/kD0T8wd7KETsMBQN8YzyDh6W0GmAlJjTawyAhA6jA5in
 x3uz2W8=
 =L3zp
 -----END PGP SIGNATURE-----

Merge tag 'v6.4' into rdma.git for-next

Linux 6.4

Resolve conflicts between rdma rc and next in rxe_cq matching linux-next:

drivers/infiniband/sw/rxe/rxe_cq.c:
  https://lore.kernel.org/r/20230622115246.365d30ad@canb.auug.org.au

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-06-27 14:06:29 -03:00
Arnd Bergmann
b002760f87 RDMA/irdma: avoid fortify-string warning in irdma_clr_wqes
Commit df8fc4e934c1 ("kbuild: Enable -fstrict-flex-arrays=3") triggers a
warning for fortified memset():

In function 'fortify_memset_chk',
    inlined from 'irdma_clr_wqes' at drivers/infiniband/hw/irdma/uk.c:103:4:
include/linux/fortify-string.h:493:25: error: call to '__write_overflow_field' declared with attribute warning: detected write beyond size of field (1st parameter); maybe use struct_group()? [-Werror=attribute-warning]
  493 |                         __write_overflow_field(p_size_field, size);
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The problem here isthat the inner array only has four 8-byte elements, so
clearing 4096 bytes overflows that. As this structure is part of an outer
array, change the code to pass a pointer to the irdma_qp_quanta instead,
and change the size argument for readability, matching the comment above
it.

Fixes: 551c46edc769 ("RDMA/irdma: Add user/kernel shared libraries")
Link: https://lore.kernel.org/r/20230523111859.2197825-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-06-01 12:59:12 -03:00
Mustafa Ismail
5842d1d9c1 RDMA/irdma: Fix Local Invalidate fencing
If the local invalidate fence is indicated in the WR, only the read fence
is currently being set in WQE. Fix this to set both the read and local
fence in the WQE.

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20230522155654.1309-4-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-29 14:06:29 -03:00
Mustafa Ismail
c8f304d75f RDMA/irdma: Prevent QP use after free
There is a window where the poll cq may use a QP that has been freed.
This can happen if a CQE is polled before irdma_clean_cqes() can clear the
CQE's related to the QP and the destroy QP races to free the QP memory.
then the QP structures are used in irdma_poll_cq.  Fix this by moving the
clearing of CQE's before the reference is removed and the QP is destroyed.

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20230522155654.1309-3-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-29 14:06:29 -03:00
Kamal Heib
a7dae5daf4 RDMA/irdma: Move iw device ops initialization
Move the initialization of the iw device ops to be under the declaration
of the irdma_iw_dev_ops.

Link: https://lore.kernel.org/r/20230515191142.413633-4-kheib@redhat.com
Signed-off-by: Kamal Heib <kheib@redhat.com>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-17 16:31:37 -03:00
Kamal Heib
bc89be9443 RDMA/irdma: Return void from irdma_init_rdma_device()
The return value from irdma_init_rdma_device() is always 0 - change it to
be void.

Link: https://lore.kernel.org/r/20230515191142.413633-3-kheib@redhat.com
Signed-off-by: Kamal Heib <kheib@redhat.com>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-17 16:31:37 -03:00
Kamal Heib
ab4e8fc174 RDMA/irdma: Return void from irdma_init_iw_device()
The return value from irdma_init_iw_device() is always 0 - change it to be
void.

Link: https://lore.kernel.org/r/20230515191142.413633-2-kheib@redhat.com
Signed-off-by: Kamal Heib <kheib@redhat.com>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-17 16:31:37 -03:00
Linus Torvalds
af3877265d v6.4 merge window RDMA pull request
Usual wide collection of unrelated items in drivers:
 
 - Driver bug fixes and treewide cleanups in hfi1, siw, qib, mlx5, rxe,
   usnic, usnic, bnxt_re, ocrdma, iser
    * Unnecessary NULL checks
    * kmap obsolescence
    * pci_enable_pcie_error_reporting() obsolescence
    * Unused variables and macros
    * trace event related warnings
    * casting warnings
 
 - Code cleanups for irdm and erdma
 
 - EFA reporting of 128 byte PCIe TLP support
 
 - mlx5 more agressively uses the out of order HW feature
 
 - Big rework of how state machines and tasks work in rxe
 
 - Fix a syzkaller found crash netdev refcount leak in siw
 
 - bnxt_re revises their HW description header
 
 - Congestion control for bnxt_re
 
 - Use mmu_notifiers more safely in hfi1
 
 - mlx5 gets better support for PCIe relaxed ordering inside VMs
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCZEva5wAKCRCFwuHvBreF
 YZFmAQC9T3b/XQ3bRknYciuzbatC98o9xB0FTqmEFYGj+Y2lVAD9EEVe3HKfHfi3
 t/GxXYB5r22oxg5bgsblZfEdEdTVCg8=
 =akMm
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "Usual wide collection of unrelated items in drivers:

   - Driver bug fixes and treewide cleanups in hfi1, siw, qib, mlx5,
     rxe, usnic, usnic, bnxt_re, ocrdma, iser:
       - remove unnecessary NULL checks
       - kmap obsolescence
       - pci_enable_pcie_error_reporting() obsolescence
       - unused variables and macros
       - trace event related warnings
       - casting warnings

   - Code cleanups for irdm and erdma

   - EFA reporting of 128 byte PCIe TLP support

   - mlx5 more agressively uses the out of order HW feature

   - Big rework of how state machines and tasks work in rxe

   - Fix a syzkaller found crash netdev refcount leak in siw

   - bnxt_re revises their HW description header

   - Congestion control for bnxt_re

   - Use mmu_notifiers more safely in hfi1

   - mlx5 gets better support for PCIe relaxed ordering inside VMs"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (81 commits)
  RDMA/efa: Add rdma write capability to device caps
  RDMA/mlx5: Use correct device num_ports when modify DC
  RDMA/irdma: Drop spurious WQ_UNBOUND from alloc_ordered_workqueue() call
  RDMA/rxe: Fix spinlock recursion deadlock on requester
  RDMA/mlx5: Fix flow counter query via DEVX
  RDMA/rxe: Protect QP state with qp->state_lock
  RDMA/rxe: Move code to check if drained to subroutine
  RDMA/rxe: Remove qp->req.state
  RDMA/rxe: Remove qp->comp.state
  RDMA/rxe: Remove qp->resp.state
  RDMA/mlx5: Allow relaxed ordering read in VFs and VMs
  net/mlx5: Update relaxed ordering read HCA capabilities
  RDMA/mlx5: Check pcie_relaxed_ordering_enabled() in UMR
  RDMA/mlx5: Remove pcie_relaxed_ordering_enabled() check for RO write
  RDMA: Add ib_virt_dma_to_page()
  RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task"
  RDMA/irdma: Slightly optimize irdma_form_ah_cm_frame()
  RDMA/rxe: Fix incorrect TASKLET_STATE_SCHED check in rxe_task.c
  IB/hfi1: Place struct mmu_rb_handler on cache line start
  IB/hfi1: Fix bugs with non-PAGE_SIZE-end multi-iovec user SDMA requests
  ...
2023-04-29 17:21:24 -07:00
Tejun Heo
109205b40a RDMA/irdma: Drop spurious WQ_UNBOUND from alloc_ordered_workqueue() call
Workqueue is in the process of cleaning up the distinction between unbound
workqueues w/ @nr_active==1 and ordered workqueues. Explicit WQ_UNBOUND
isn't needed for alloc_ordered_workqueue() and will trigger a warning in
the future. Let's remove it. This doesn't cause any functional changes.

Link: https://lore.kernel.org/r/ZEGW-IcFReR1juVM@slm.duckdns.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-04-21 12:35:31 -03:00
Christophe JAILLET
a2e20b29cf RDMA/irdma: Slightly optimize irdma_form_ah_cm_frame()
There is no need to zero 'pktsize' bytes of 'buf', only the header needs
to be cleared, to be safe.
All the other bytes are already written with some memcpy() at the end of
the function.

Doing so also gives the opportunity to the compiler to avoid the memset()
call. It can be inlined now that the length is known as compile time.

Link: https://lore.kernel.org/r/098e3c397be0436f1867899245ecfe656c472110.1675369386.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-04-13 12:17:45 -03:00
Tatyana Nikolova
e4522c097e RDMA/irdma: Add ipv4 check to irdma_find_listener()
Add ipv4 check to irdma_find_listener(). Otherwise the function
incorrectly finds and returns a listener with a different addr family for
the zero IP addr, if a listener with a zero IP addr and the same port as
the one searched for has already been created.

Fixes: 146b9756f14c ("RDMA/irdma: Add connection manager")
Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230315145231.931-5-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-19 11:37:56 +02:00
Mustafa Ismail
8385a875c9 RDMA/irdma: Increase iWARP CM default rexmit count
When running perftest with large number of connections in iWARP mode, the
passive side could be slow to respond. Increase the rexmit counter default
to allow scaling connections.

Fixes: 146b9756f14c ("RDMA/irdma: Add connection manager")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230315145231.931-4-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-19 11:37:56 +02:00
Mustafa Ismail
b69a6979db RDMA/irdma: Fix memory leak of PBLE objects
On rmmod of irdma, the PBLE object memory is not being freed. PBLE object
memory are not statically pre-allocated at function initialization time
unlike other HMC objects. PBLEs objects and the Segment Descriptors (SD)
for it can be dynamically allocated during scale up and SD's remain
allocated till function deinitialization.

Fix this leak by adding IRDMA_HMC_IW_PBLE to the iw_hmc_obj_types[] table
and skip pbles in irdma_create_hmc_obj but not in irdma_del_hmc_objects().

Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230315145231.931-3-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-19 11:37:55 +02:00
Mustafa Ismail
30ed9ee9a1 RDMA/irdma: Do not generate SW completions for NOPs
Currently, artificial SW completions are generated for NOP wqes which can
generate unexpected completions with wr_id = 0. Skip the generation of
artificial completions for NOPs.

Fixes: 81091d7696ae ("RDMA/irdma: Add SW mechanism to generate completions on error")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230315145231.931-2-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-19 11:37:55 +02:00
Sindhu Devale
cc8997c94b RDMA/irdma: Refactor PBLE functions
Refactor PBLE functions using a bit mask to represent the PBLE level
desired versus 2 parameters use_pble and lvl_one_only which makes the
code confusing.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Link: https://lore.kernel.org/r/20230315145305.955-5-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-19 11:36:33 +02:00
Michal Swiatkowski
99f96b4552 RDMA/irdma: Change name of interrupts
Add more information in interrupt names.

Before this patch it was:
irdma
CEQ
CEQ
...

Now:
irdma-0000:18:00.0-AEQ
irdma-0000:18:00.0-CEQ-0
irdma-0000:18:00.0-CEQ-1
...

Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Suggested-by: Piotr Raczynski <piotr.raczynski@intel.com>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230315145305.955-4-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-19 11:36:26 +02:00
Tatyana Nikolova
0219ad5d3a RDMA/irdma: Remove a redundant irdma_arp_table() call
Remove a redundant function call in irdma_modify_qp_roce, since
irdma_arp_table() with IRDMA_ARP_RESOLVE action is called after the if/else
ipv check as part of irdma_add_arp().

Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230315145305.955-3-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-19 11:36:03 +02:00
Krzysztof Czurylo
5a711e5807 RDMA/irdma: Refactor HW statistics
Refactor HW statistics which,

- Unifies HW statistics support for all HW generations.

- Unifies support of 32- and 64-bit counters.

- Removes duplicated code and simplifies implementation.

- Fixes roll-over handling.

- Removes unneeded last_hw_stats.

With new implementation, there is no separate handling and no separate
arrays for 32- and 64-bit counters (offsets, regs, values). Instead,
there is a HW stats map array for each HW revision, which defines
HW-specific width and location of each counter in the statistics buffer.

Once the statistics are gathered (either via CQP op, or by reading HW
registers), counter values are extracted from the statistics buffer using
the stats map and the delta between the last and new values is computed.
Finally, the counter values in rdma_hw_stats are incremented by those
deltas.

From the OS perspective, all the counters are 64-bit and their order in
rdma_hw_stats->value[] array, as well as in irdma_hw_stat_names[], is the
same for all HW gens.  New statistics should always be added at the end.

Signed-off-by: Krzysztof Czurylo <krzysztof.czurylo@intel.com>
Signed-off-by: Youvaraj Sagar <youvaraj.sagar@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20230315145305.955-2-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-19 11:36:03 +02:00
Linus Torvalds
8cbd92339d v6.3 RDMA pull request
Small cycle this time:
 
 - Minor driver updates for hfi1, cxgb4, erdma, hns, irdma, mlx5, siw, mana
 
 - inline CQE support for hns
 
 - Have mlx5 display device error codes
 
 - Pinned DMABUF support for irdma
 
 - Continued rxe cleanups, particularly converting the MRs to use xarray
 
 - Improvements to what can be cached in the mlx5 mkey cache
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCY/gPmgAKCRCFwuHvBreF
 YW5IAP4xOAiTif4f87vD1twRU/ebq4VEX0r+C2NX5x5fwlCJrAEA7RLV8uG9Uii2
 ez0BuWNxfajuvFHntnZ1E+7UDP0S8gk=
 =CgUH
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "Quite a small cycle this time, even with the rc8. I suppose everyone
  went to sleep over xmas.

   - Minor driver updates for hfi1, cxgb4, erdma, hns, irdma, mlx5, siw,
     mana

   - inline CQE support for hns

   - Have mlx5 display device error codes

   - Pinned DMABUF support for irdma

   - Continued rxe cleanups, particularly converting the MRs to use
     xarray

   - Improvements to what can be cached in the mlx5 mkey cache"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (61 commits)
  IB/mlx5: Extend debug control for CC parameters
  IB/hfi1: Fix sdma.h tx->num_descs off-by-one errors
  IB/hfi1: Fix math bugs in hfi1_can_pin_pages()
  RDMA/irdma: Add support for dmabuf pin memory regions
  RDMA/mlx5: Use query_special_contexts for mkeys
  net/mlx5e: Use query_special_contexts for mkeys
  net/mlx5: Change define name for 0x100 lkey value
  net/mlx5: Expose bits for querying special mkeys
  RDMA/rxe: Fix missing memory barriers in rxe_queue.h
  RDMA/mana_ib: Fix a bug when the PF indicates more entries for registering memory on first packet
  RDMA/rxe: Remove rxe_alloc()
  RDMA/cma: Distinguish between sockaddr_in and sockaddr_in6 by size
  Subject: RDMA/rxe: Handle zero length rdma
  iw_cxgb4: Fix potential NULL dereference in c4iw_fill_res_cm_id_entry()
  RDMA/mlx5: Use rdma_umem_for_each_dma_block()
  RDMA/umem: Remove unused 'work' member from struct ib_umem
  RDMA/irdma: Cap MSIX used to online CPUs + 1
  RDMA/mlx5: Check reg_create() create for errors
  RDMA/restrack: Correct spelling
  RDMA/cxgb4: Fix potential null-ptr-deref in pass_establish()
  ...
2023-02-24 15:11:03 -08:00
Zhu Yanjun
d2225b838c RDMA/irdma: Add support for dmabuf pin memory regions
This is a followup to the EFA dmabuf[1]. Irdma driver currently does
not support on-demand-paging(ODP). So it uses habanalabs as the
dmabuf exporter, and irdma as the importer to allow for peer2peer
access through libibverbs.

In this commit, the function ib_umem_dmabuf_get_pinned() is used.
This function is introduced in EFA dmabuf[1] which allows the driver
to get a dmabuf umem which is pinned and does not require move_notify
callback implementation. The returned umem is pinned and DMA mapped
like standard cpu umems, and is released through ib_umem_release().

[1]https://lore.kernel.org/lkml/20211007114018.GD2688930@ziepe.ca/t/

Link: https://lore.kernel.org/r/20230217011425.498847-1-yanjun.zhu@intel.com
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-02-17 16:36:14 -04:00
Mustafa Ismail
9cd9842c46 RDMA/irdma: Cap MSIX used to online CPUs + 1
The irdma driver can use a maximum number of msix vectors equal
to num_online_cpus() + 1 and the kernel warning stack below is shown
if that number is exceeded.

The kernel throws a warning as the driver tries to update the affinity
hint with a CPU mask greater than the max CPU IDs. Fix this by capping
the MSIX vectors to num_online_cpus() + 1.

 WARNING: CPU: 7 PID: 23655 at include/linux/cpumask.h:106 irdma_cfg_ceq_vector+0x34c/0x3f0 [irdma]
 RIP: 0010:irdma_cfg_ceq_vector+0x34c/0x3f0 [irdma]
 Call Trace:
 irdma_rt_init_hw+0xa62/0x1290 [irdma]
 ? irdma_alloc_local_mac_entry+0x1a0/0x1a0 [irdma]
 ? __is_kernel_percpu_address+0x63/0x310
 ? rcu_read_lock_held_common+0xe/0xb0
 ? irdma_lan_unregister_qset+0x280/0x280 [irdma]
 ? irdma_request_reset+0x80/0x80 [irdma]
 ? ice_get_qos_params+0x84/0x390 [ice]
 irdma_probe+0xa40/0xfc0 [irdma]
 ? rcu_read_lock_bh_held+0xd0/0xd0
 ? irdma_remove+0x140/0x140 [irdma]
 ? rcu_read_lock_sched_held+0x62/0xe0
 ? down_write+0x187/0x3d0
 ? auxiliary_match_id+0xf0/0x1a0
 ? irdma_remove+0x140/0x140 [irdma]
 auxiliary_bus_probe+0xa6/0x100
 __driver_probe_device+0x4a4/0xd50
 ? __device_attach_driver+0x2c0/0x2c0
 driver_probe_device+0x4a/0x110
 __driver_attach+0x1aa/0x350
 bus_for_each_dev+0x11d/0x1b0
 ? subsys_dev_iter_init+0xe0/0xe0
 bus_add_driver+0x3b1/0x610
 driver_register+0x18e/0x410
 ? 0xffffffffc0b88000
 irdma_init_module+0x50/0xaa [irdma]
 do_one_initcall+0x103/0x5f0
 ? perf_trace_initcall_level+0x420/0x420
 ? do_init_module+0x4e/0x700
 ? __kasan_kmalloc+0x7d/0xa0
 ? kmem_cache_alloc_trace+0x188/0x2b0
 ? kasan_unpoison+0x21/0x50
 do_init_module+0x1d1/0x700
 load_module+0x3867/0x5260
 ? layout_and_allocate+0x3990/0x3990
 ? rcu_read_lock_held_common+0xe/0xb0
 ? rcu_read_lock_sched_held+0x62/0xe0
 ? rcu_read_lock_bh_held+0xd0/0xd0
 ? __vmalloc_node_range+0x46b/0x890
 ? lock_release+0x5c8/0xba0
 ? alloc_vm_area+0x120/0x120
 ? selinux_kernel_module_from_file+0x2a5/0x300
 ? __inode_security_revalidate+0xf0/0xf0
 ? __do_sys_init_module+0x1db/0x260
 __do_sys_init_module+0x1db/0x260
 ? load_module+0x5260/0x5260
 ? do_syscall_64+0x22/0x450
 do_syscall_64+0xa5/0x450
 entry_SYSCALL_64_after_hwframe+0x66/0xdb

Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Link: https://lore.kernel.org/r/20230207201938.1329-1-sindhu.devale@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-02-08 10:28:13 +02:00
Nikita Zhandarovich
5d9745cead RDMA/irdma: Fix potential NULL-ptr-dereference
in_dev_get() can return NULL which will cause a failure once idev is
dereferenced in in_dev_for_each_ifa_rtnl(). This patch adds a
check for NULL value in idev beforehand.

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Fixes: 146b9756f14c ("RDMA/irdma: Add connection manager")
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Link: https://lore.kernel.org/r/20230126185230.62464-1-n.zhandarovich@fintech.ru
Reviewed-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-29 14:55:54 +02:00
Zhu Yanjun
2f25e3bab0 RDMA/irdma: Split CQ handler into irdma_reg_user_mr_type_cq
Split the source codes related with CQ handling into a new function.

Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://lore.kernel.org/r/20230116193502.66540-5-yanjun.zhu@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-26 12:58:46 +02:00
Zhu Yanjun
e965ef0e7b RDMA/irdma: Split QP handler into irdma_reg_user_mr_type_qp
Split the source codes related with QP handling into a new function.

Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://lore.kernel.org/r/20230116193502.66540-4-yanjun.zhu@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-26 12:58:46 +02:00
Zhu Yanjun
693a5386ef RDMA/irdma: Split mr alloc and free into new functions
In the function irdma_reg_user_mr, the mr allocation and free
will be used by other functions. As such, the source codes related
with mr allocation and free are split into the new functions.

Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://lore.kernel.org/r/20230116193502.66540-3-yanjun.zhu@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-26 12:58:46 +02:00
Zhu Yanjun
01798df198 RDMA/irdma: Split MEM handler into irdma_reg_user_mr_type_mem
The source codes related with IRDMA_MEMREG_TYPE_MEM are split
into a new function irdma_reg_user_mr_type_mem.

Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://lore.kernel.org/r/20230116193502.66540-2-yanjun.zhu@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-26 12:58:46 +02:00
Zhu Yanjun
bd99ede8ef RDMA/irdma: Remove extra ret variable in favor of existing err
In the function irdma_reg_user_mr, err and ret exist. Actually,
one variable err is enough.

Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://lore.kernel.org/r/20230104064333.660344-1-yanjun.zhu@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-04 08:59:02 +02:00
Mustafa Ismail
9907526d25 RDMA/irdma: Initialize net_type before checking it
The av->net_type is not initialized before it is checked in
irdma_modify_qp_roce. This leads to an incorrect update to the ARP cache
and QP context. RoCEv2 connections might fail as result.

Set the net_type using rdma_gid_attr_network_type.

Fixes: 80005c43d4c8 ("RDMA/irdma: Use net_type to check network type")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20221122004410.1471-1-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-22 16:13:44 +02:00
Mustafa Ismail
8f7e2daa63 RDMA/irdma: Do not request 2-level PBLEs for CQ alloc
When allocating PBLE's for a large CQ, it is possible
that a 2-level PBLE is returned which would cause the
CQ allocation to fail since 1-level is assumed and checked for.
Fix this by requesting a level one PBLE only.

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20221115011701.1379-4-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-17 10:41:28 +02:00
Mustafa Ismail
24419777e9 RDMA/irdma: Fix RQ completion opcode
The opcode written by HW, in the RQ CQE, is the
RoCEv2/iWARP protocol opcode from the received
packet and not the SW opcode as currently assumed.
Fix this by returning the raw operation type and
queue type in the CQE to irdma_process_cqe and add
2 helpers set_ib_wc_op_sq set_ib_wc_op_rq to map
IRDMA HW op types to IB op types.

Note that for iWARP, only Write with Immediate is
supported so the opcode can only be IB_WC_RECV_RDMA_WITH_IMM
when there is immediate data present.

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20221115011701.1379-3-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-17 10:41:28 +02:00
Mustafa Ismail
4f44e519b6 RDMA/irdma: Fix inline for multiple SGE's
Currently, inline send and inline write assume a single
SGE and only copy data from the first one. Add support
for multiple SGE's.

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20221115011701.1379-2-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-17 10:41:28 +02:00
Shiraz Saleem
4eace75e08 RDMA/irdma: Report the correct link speed
The active link speed is currently hard-coded in irdma_query_port due
to which the port rate in ibstatus does reflect the active link speed.

Call ib_get_eth_speed in irdma_query_port to get the active link speed.

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Reported-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20221104234957.1135-1-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-07 09:12:42 +02:00
Jason Gunthorpe
33331a728c Linux 6.0
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmM5/fMeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG+Y0H/jG0HQjUHfIOGjPQ
 T33eaF5POoZKzZmsTNITbtya7B3/OTZZRGdSq9B8t5tElyPnZrdIfaXds17mCI8y
 5DJUEQGdv6fqRttYLGKf6rk1kzABhhaTS8n9BgDFEsmvdqwG4AV6dLQr3HL09gTV
 Lu+Is86RPwmpgH0Z9u7DyFCF8yLjefyu0vl6eFm/SXjCE8gQM/LZQSi9mv5/YDxa
 MVKlsZKIkQm6P8a6r8wbGedKYBped4+4gYedsf/IcS0lvKdLIs/P7YgR5wKklSbM
 tqECvBOmKq1Fwj/oxH+fx0KLX/ZD1xxQwoQd+a9DOSo+BuPBt6KGojYT9gQRyFJp
 R7gyUCo=
 =2UOD
 -----END PGP SIGNATURE-----

Merge tag 'v6.0' into rdma.git for-next

Trvial merge conflicts against rdma.git for-rc resolved matching
linux-next:
            drivers/infiniband/hw/hns/hns_roce_hw_v2.c
            drivers/infiniband/hw/hns/hns_roce_main.c

https://lore.kernel.org/r/20220929124005.105149-1-broonie@kernel.org

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-10-06 19:48:45 -03:00
Shiraz Saleem
34acb833cc RDMA/irdma: Validate udata inlen and outlen
Currently ib_copy_from_udata and ib_copy_to_udata could underfill
the request and response buffer if the user-space passes an undersized
value for udata->inlen or udata->outlen respectively [1]
This could lead to undesirable behavior.

Zero initing the buffer only goes as far as preventing using the buffer
uninitialized.

Validate udata->inlen and udata->outlen passed from user-space to ensure
they are at least the required minimum size.

[1] https://lore.kernel.org/linux-rdma/MWHPR11MB0029F37D40D9D4A993F8F549E9D79@MWHPR11MB0029.namprd11.prod.outlook.com/

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20220907191324.1173-3-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-20 13:19:53 +03:00
Sindhu-Devale
7f51a961f8 RDMA/irdma: Align AE id codes to correct flush code and event
A number of asynchronous event (AE) ids were not aligned to the
correct flush_code and event_type. Fix these up so that the
correct IBV error and event codes are returned to application.

Also, add handling for new AE ids like IRDMA_AE_INVALID_REQUEST to
return the correct WC error code.

Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions")
Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20220907191324.1173-2-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-20 13:19:52 +03:00
Sindhu-Devale
a261786fdc RDMA/irdma: Report RNR NAK generation in device caps
Report RNR NAK generation when device capabilities are queried

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20220906223244.1119-6-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07 11:22:18 +03:00
Sindhu-Devale
2c8844431d RDMA/irdma: Use s/g array in post send only when its valid
Send with invalidate verb call can pass in an
uninitialized s/g array with 0 sge's which is
filled into irdma WQE and causes a HW asynchronous
event.

Fix this by using the s/g array in irdma post send
only when its valid.

Fixes: 551c46e ("RDMA/irdma: Add user/kernel shared libraries")
Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20220906223244.1119-5-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07 11:22:18 +03:00
Sindhu-Devale
dcb23bbb1d RDMA/irdma: Return correct WC error for bind operation failure
When a QP and a MR on a local host are in different PDs, the HW generates
an asynchronous event (AE). The same AE is generated when a QP and a MW
are in different PDs during a bind operation. Return the more appropriate
IBV_WC_MW_BIND_ERR for the latter case by checking the OP type from the
CQE in error.

Fixes: 551c46edc769 ("RDMA/irdma: Add user/kernel shared libraries")
Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20220906223244.1119-4-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07 11:22:18 +03:00
Sindhu-Devale
6b227bd32d RDMA/irdma: Return error on MR deregister CQP failure
The MR deregister CQP can fail if an MW is bound to it.
Return an appropriate error for this case.

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20220906223244.1119-3-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07 11:22:17 +03:00
Sindhu-Devale
12faad5e5c RDMA/irdma: Report the correct max cqes from query device
Report the correct max cqes available to an application taking
into account a reserved entry to detect overflow.

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Signed-off-by: Sindhu-Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/20220906223244.1119-2-shiraz.saleem@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-07 11:22:17 +03:00
Shiraz Saleem
ead54ced63 RDMA/irdma: Fix drain SQ hang with no completion
SW generated completions for outstanding WRs posted on SQ
after QP is in error target the wrong CQ. This causes the
ib_drain_sq to hang with no completion.

Fix this to generate completions on the right CQ.

[  863.969340] INFO: task kworker/u52:2:671 blocked for more than 122 seconds.
[  863.979224]       Not tainted 5.14.0-130.el9.x86_64 #1
[  863.986588] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  863.996997] task:kworker/u52:2   state:D stack:    0 pid:  671 ppid:     2 flags:0x00004000
[  864.007272] Workqueue: xprtiod xprt_autoclose [sunrpc]
[  864.014056] Call Trace:
[  864.017575]  __schedule+0x206/0x580
[  864.022296]  schedule+0x43/0xa0
[  864.026736]  schedule_timeout+0x115/0x150
[  864.032185]  __wait_for_common+0x93/0x1d0
[  864.037717]  ? usleep_range_state+0x90/0x90
[  864.043368]  __ib_drain_sq+0xf6/0x170 [ib_core]
[  864.049371]  ? __rdma_block_iter_next+0x80/0x80 [ib_core]
[  864.056240]  ib_drain_sq+0x66/0x70 [ib_core]
[  864.062003]  rpcrdma_xprt_disconnect+0x82/0x3b0 [rpcrdma]
[  864.069365]  ? xprt_prepare_transmit+0x5d/0xc0 [sunrpc]
[  864.076386]  xprt_rdma_close+0xe/0x30 [rpcrdma]
[  864.082593]  xprt_autoclose+0x52/0x100 [sunrpc]
[  864.088718]  process_one_work+0x1e8/0x3c0
[  864.094170]  worker_thread+0x50/0x3b0
[  864.099109]  ? rescuer_thread+0x370/0x370
[  864.104473]  kthread+0x149/0x170
[  864.109022]  ? set_kthread_struct+0x40/0x40
[  864.114713]  ret_from_fork+0x22/0x30

Fixes: 81091d7696ae ("RDMA/irdma: Add SW mechanism to generate completions on error")
Link: https://lore.kernel.org/r/20220824154358.117-1-shiraz.saleem@intel.com
Reported-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-28 12:43:37 +03:00
Linus Torvalds
e495274793 v5.20 pull request
This PR includes a new RDMA driver for Alibaba Cloud hardware
 
 - Bug fixes and small features for irdma, hns, siw, qedr, hfi1, mlx5
 
 - General spelling/grammer fixes
 
 - rdma cm can follow changes in neighbours for control packets
 
 - Significant amounts of rxe fixes and spec compliance changes
 
 - Use the modern NAPI API
 
 - Use the bitmap API instead of open coding
 
 - Performance improvements for rtrs
 
 - Add the ERDMA driver for Alibaba cloud
 
 - Fix a use after free bug in SRP
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCYuwAuAAKCRCFwuHvBreF
 YcRDAQC41YJNs7xve7r62/E6M+o/AXiwXa+m8rGRvcP3mdilNAEAhdom6HskenMZ
 /sopeBWF78M9plLvNzWkwukaqIwrXgM=
 =abuq
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "This cycle we got a new RDMA driver "ERDMA" for the Alibaba cloud
  environment. Otherwise the changes are dominated by rxe fixes.

  There is another RDMA driver on the list that might get merged next
  cycle, 'MANA' for the Azure cloud environment.

  Summary:

   - Bug fixes and small features for irdma, hns, siw, qedr, hfi1, mlx5

   - General spelling/grammer fixes

   - rdma cm can follow changes in neighbours for control packets

   - Significant amounts of rxe fixes and spec compliance changes

   - Use the modern NAPI API

   - Use the bitmap API instead of open coding

   - Performance improvements for rtrs

   - Add the ERDMA driver for Alibaba cloud

   - Fix a use after free bug in SRP"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (99 commits)
  RDMA/ib_srpt: Unify checking rdma_cm_id condition in srpt_cm_req_recv()
  RDMA/rxe: Fix error unwind in rxe_create_qp()
  RDMA/mlx5: Add missing check for return value in get namespace flow
  RDMA/rxe: Split qp state for requester and completer
  RDMA/rxe: Generate error completion for error requester QP state
  RDMA/rxe: Update wqe_index for each wqe error completion
  RDMA/srpt: Fix a use-after-free
  RDMA/srpt: Introduce a reference count in struct srpt_device
  RDMA/srpt: Duplicate port name members
  IB/qib: Fix repeated "in" within comments
  RDMA/erdma: Add driver to kernel build environment
  RDMA/erdma: Add the ABI definitions
  RDMA/erdma: Add the erdma module
  RDMA/erdma: Add connection management (CM) support
  RDMA/erdma: Add verbs implementation
  RDMA/erdma: Add verbs header file
  RDMA/erdma: Add event queue implementation
  RDMA/erdma: Add cmdq implementation
  RDMA/erdma: Add main include file
  RDMA/erdma: Add the hardware related definitions
  ...
2022-08-04 19:54:32 -07:00
Christophe JAILLET
82319639cd RDMA/irdma: Use the bitmap API to allocate bitmaps
Use bitmap_zalloc()/bitmap_free() instead of hand-writing them.

It is less verbose and it improves the semantic.

Link: https://lore.kernel.org/r/1f671b1af5881723ee265a0a12809c92950e58aa.1657567269.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-07-18 13:23:58 +03:00
Mustafa Ismail
3a844596ed RDMA/irdma: Fix setting of QP context err_rq_idx_valid field
Setting err_rq_idx_valid field in QP context when the AE source of the
AEQE is not associated with an RQ causes the firmware flush to fail.

Set err_rq_idx_valid field in QP context only if it is associated with an
RQ. Additionally, cleanup the redundant setting of this field in
irdma_process_aeq.

Fixes: 44d9e52977a1 ("RDMA/irdma: Implement device initialization definitions")
Link: https://lore.kernel.org/r/20220705230815.265-8-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-07-18 10:40:12 +03:00
Mustafa Ismail
82ab2b5265 RDMA/irdma: Fix VLAN connection with wildcard address
When an application listens on a wildcard address, and there are VLAN and
non-VLAN IP addresses, iWARP connection establishemnt can fail if the listen
node VLAN ID does not match.

Fix this by checking the vlan_id only if not a wildcard listen node.

Fixes: 146b9756f14c ("RDMA/irdma: Add connection manager")
Link: https://lore.kernel.org/r/20220705230815.265-7-shiraz.saleem@intel.com
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-07-18 10:40:05 +03:00
Mustafa Ismail
8ecef7890b RDMA/irdma: Fix a window for use-after-free
During a destroy CQ an interrupt may cause processing of a CQE after CQ
resources are freed by irdma_cq_free_rsrc(). Fix this by moving the call
to irdma_cq_free_rsrc() after the irdma_sc_cleanup_ceqes(), which is
called under the cq_lock.

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20220705230815.265-6-shiraz.saleem@intel.com
Signed-off-by: Bartosz Sobczak <bartosz.sobczak@intel.com>
Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-07-18 10:39:57 +03:00