IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Synchronize the shared mlx5 branch with net:
- From Jiri: fixe a deadlock in mlx5_ib's netdev notifier unregister.
- From Mark and Patrisious: add IPsec RoCEv2 support.
- From Or: Rely on firmware to get special mkeys
* branch mlx5-next:
RDMA/mlx5: Use query_special_contexts for mkeys
net/mlx5e: Use query_special_contexts for mkeys
net/mlx5: Change define name for 0x100 lkey value
net/mlx5: Expose bits for querying special mkeys
net/mlx5: Configure IPsec steering for egress RoCEv2 traffic
net/mlx5: Configure IPsec steering for ingress RoCEv2 traffic
net/mlx5: Add IPSec priorities in RDMA namespaces
net/mlx5: Implement new destination type TABLE_TYPE
net/mlx5: Introduce new destination type TABLE_TYPE
RDMA/mlx5: Track netdev to avoid deadlock during netdev notifier unregister
net/mlx5e: Propagate an internal event in case uplink netdev changes
net/mlx5e: Fix trap event handling
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Use query_sepcial_contexts to get the correct value of mkeys such as
null_mkey, terminate_scatter_list_mkey and dump_fill_mkey, as FW will
change them in certain configurations.
Link: https://lore.kernel.org/r/000236f0a9487d48809f87bcc3620a3964b2d3d3.1673960981.git.leon@kernel.org
Signed-off-by: Or Har-Toov <ohartoov@nvidia.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Change define of 0x100 lkey value from MLX5_INVALID_LKEY to be
MLX5_TERMINATE_SCATTER_LIST_LKEY as 0x100 is the value of
terminate_scatter_list_mkey.
Link: https://lore.kernel.org/r/3a116dc3fbae4cb6b76a63d27d418830b06ade0c.1673960981.git.leon@kernel.org
Signed-off-by: Or Har-Toov <ohartoov@nvidia.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
An earlier patch which introduced smp_load_acquire/smp_store_release
into rxe_queue.h incorrectly assumed that surrounding spin-locks in
rxe_verbs.c around queue updates for kernel ulps was sufficient to
protect the passing of data through the queues between the ulp and
the rxe tasklets. But this was incorrect. The typical sequence was
ulp rxe requester tasklet
------------------------ ---------------------
spin_lock_irqsave() wqe = queue_head(queue)
if (!queue_full(q)) { if (!wqe)
spin_unlock_irqrestore return;
return -ENOMEM
} <process wqe>
wqe = queue_producer_addr(q)
<fill in wqe> queue_advance_consumer(queue)
queue_advance_producer(q)
spin_unlock_irqrestore()
queue_head() calls queue_empty() which calls smp_load_acquire()
For user space apps queue_advance_producer() calls smp_store_release()
so that there is a memory barrier between the producer and the
consumer but for kernel ulps queue_advance_produce() just incremented
the producer index because the lock function is a release function.
But to work the barrier has to come between filling in the wqe and
updating the producer index. This patch adds the missing barriers.
It also changes the enum names for the ulp queue types to
QUEUE_TYPE_FROM/TO_ULP instead of QUEUE_TYPE_TO/FROM_DRIVER
which is very ambiguous. This bug is suspected as the cause of very
rare lockups in a very high scale storage application. It is a bug
in any case and should be corrected.
Fixes: 0a67c46d2e99 ("RDMA/rxe: Protect user space index loads/stores")
Link: https://lore.kernel.org/r/20230214071053.5395-1-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
When registering memory in a large chunk that doesn't fit into a single PF
message, the PF may return GDMA_STATUS_MORE_ENTRIES on the first message if
there are more messages needed for registering more chunks.
Fix the VF to make it process the correct return code.
Fixes: 0266a177631d ("RDMA/mana_ib: Add a driver for Microsoft Azure Network Adapter")
Link: https://lore.kernel.org/r/1676507522-21018-1-git-send-email-longli@linuxonhyperv.com
Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Currently all the object types in the rxe driver are allocated in
rdma-core except for MRs. By moving tha kzalloc() call outside of
the pool code the rxe_alloc() subroutine can be eliminated and code
checking for MR as a special case can be removed.
This patch moves the kzalloc() and kfree_rcu() calls into the mr
registration and destruction verbs. It removes that code from
rxe_pool.c including the rxe_alloc() subroutine which is no longer
used.
Link: https://lore.kernel.org/r/20230213225551.12437-1-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Reviewed-by: Devesh Sharma <devesh.s.sharma@oracle.com>
Reviewed-by: Devesh Sharma <devesh.s.sharma@oracle.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Clang can do some aggressive inlining, which provides it with greater
visibility into the sizes of various objects that are passed into
helpers. Specifically, compare_netdev_and_ip() can see through the type
given to the "sa" argument, which means it can generate code for "struct
sockaddr_in" that would have been passed to ipv6_addr_cmp() (that expects
to operate on the larger "struct sockaddr_in6"), which would result in a
compile-time buffer overflow condition detected by memcmp(). Logically,
this state isn't reachable due to the sa_family assignment two callers
above and the check in compare_netdev_and_ip(). Instead, provide a
compile-time check on sizes so the size-mismatched code will be elided
when inlining. Avoids the following warning from Clang:
../include/linux/fortify-string.h:652:4: error: call to '__read_overflow' declared with 'error' attribute: detected read beyond size of object (1st parameter)
__read_overflow();
^
note: In function 'cma_netevent_callback'
note: which inlined function 'node_from_ndev_ip'
1 error generated.
When the underlying object size is not known (e.g. with GCC and older
Clang), the result of __builtin_object_size() is SIZE_MAX, which will also
compile away, leaving the code as it was originally.
Link: https://lore.kernel.org/r/20230208232549.never.139-kees@kernel.org
Link: https://github.com/ClangBuiltLinux/linux/issues/1687
Signed-off-by: Kees Cook <keescook@chromium.org>
Tested-by: Nathan Chancellor <nathan@kernel.org> # build
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Currently the rxe driver does not handle all cases of zero length rdma
operations correctly. The client does not have to provide an rkey for zero
length RDMA read or write operations so the rkey provided may be invalid
and should not be used to lookup an mr.
This patch corrects the driver to ignore the provided rkey if the reth
length is zero for read or write operations and make sure to set the mr to
NULL. In read_reply() if length is zero rxe_recheck_mr() is not
called. Warnings are added in the routines in rxe_mr.c to catch NULL MRs
when the length is non-zero.
Fixes: 8700e3e7c485 ("Soft RoCE driver")
Link: https://lore.kernel.org/r/20230202044240.6304-1-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Reviewed-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
This condition needs to match the previous "if (epcp->state == LISTEN) {"
exactly to avoid a NULL dereference of either "listen_ep" or "ep". The
problem is that "epcp" has been re-assigned so just testing
"if (epcp->state == LISTEN) {" a second time is not sufficient.
Fixes: 116aeb887371 ("iw_cxgb4: provide detailed provider-specific CM_ID information")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Link: https://lore.kernel.org/r/Y+usKuWIKr4dimZh@kili
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Replace an open coding of rdma_umem_for_each_dma_block() with the proper
function.
Fixes: b3d47ebd4908 ("RDMA/mlx5: Use mlx5_umr_post_send_wait() to update MR pas")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/0-v1-c13a5b88359b+556d0-mlx5_umem_block_jgg@nvidia.com
Reviewed-by: Devesh Sharma <devesh.s.sharma@oracle.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
When removing a network namespace with mlx5 devlink instance being in
it, following callchain is performed:
cleanup_net (takes down_read(&pernet_ops_rwsem)
devlink_pernet_pre_exit()
devlink_reload()
mlx5_devlink_reload_down()
mlx5_unload_one_devl_locked()
mlx5_detach_device()
del_adev()
mlx5r_remove()
__mlx5_ib_remove()
mlx5_ib_roce_cleanup()
mlx5_remove_netdev_notifier()
unregister_netdevice_notifier (takes down_write(&pernet_ops_rwsem)
This deadlocks.
Resolve this by converting to register_netdevice_notifier_dev_net()
which does not take pernet_ops_rwsem and moves the notifier block around
according to netdev it takes as arg.
Use previously introduced netdev added/removed events to track uplink
netdev to be used for register_netdevice_notifier_dev_net() purposes.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
The reg_create() can fail. Check for errors before dereferencing it.
Fixes: dd1b913fb0d0 ("RDMA/mlx5: Cache all user cacheable mkeys on dereg MR flow")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Link: https://lore.kernel.org/r/Y+ERYy4wN0LsKsm+@kili
Reviewed-by: Devesh Sharma <devesh.s.sharma@oracle.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
If get_ep_from_tid() fails to lookup non-NULL value for ep, ep is
dereferenced later regardless of whether it is empty.
This patch adds a simple sanity check to fix the issue.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 944661dd97f4 ("RDMA/iw_cxgb4: atomically lookup ep and get a reference")
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Link: https://lore.kernel.org/r/20230202184850.29882-1-n.zhandarovich@fintech.ru
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Block MR cache debugfs creation for IB representor flow as MR cache shouldn't be used
at all in that mode. As part of this change, add missing debugfs cleanup in error path
too.
This change fixes the following debugfs errors:
bond0: (slave enp8s0f1): Enslaving as a backup interface with an up link
mlx5_core 0000:08:00.0: lag map: port 1:1 port 2:1
mlx5_core 0000:08:00.0: shared_fdb:1 mode:queue_affinity
mlx5_core 0000:08:00.0: Operation mode is single FDB
debugfs: Directory '2' with parent '/' already present!
...
debugfs: Directory '22' with parent '/' already present!
Fixes: 73d09b2fe833 ("RDMA/mlx5: Introduce mlx5r_cache_rb_key")
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Link: https://lore.kernel.org/r/482a78c54acbcfa1742a0e06a452546428900ffa.1675328463.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
To avoid racing with other user memory reservations, immediately
account full amount of pages to be pinned.
Fixes: 2251334dcac9 ("rdma/siw: application buffer management")
Reported-by: Jason Gunthorpe <jgg@nvidia.com>
Suggested-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Link: https://lore.kernel.org/r/20230202101000.402990-1-bmt@zurich.ibm.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
ip_dev_find() may return NULL and assign it to pdev which is
dereferenced later.
Fix this by checking the return value of ip_dev_find() for NULL
similar to the way it is done with other instances of said function.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 1cab775c3e75 ("RDMA/cxgb4: Fix LE hash collision bug for passive open connection")
Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru>
Link: https://lore.kernel.org/r/20230201172103.17261-1-n.zhandarovich@fintech.ru
Signed-off-by: Leon Romanovsky <leon@kernel.org>
The non-cache mkeys are stored in the cache only to shorten restarting
application time. Don't store them longer than needed.
Configure cache entries that store non-cache MRs as temporary entries. If
30 seconds have passed and no user reclaimed the temporarily cached mkeys,
an asynchronous work will destroy the mkeys entries.
Link: https://lore.kernel.org/r/20230125222807.6921-7-michaelgur@nvidia.com
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Currently, when dereging an MR, if the mkey doesn't belong to a cache
entry, it will be destroyed. As a result, the restart of applications
with many non-cached mkeys is not efficient since all the mkeys are
destroyed and then recreated. This process takes a long time (for 100,000
MRs, it is ~20 seconds for dereg and ~28 seconds for re-reg).
To shorten the restart runtime, insert all cacheable mkeys to the cache.
If there is no fitting entry to the mkey properties, create a temporary
entry that fits it.
After a predetermined timeout, the cache entries will shrink to the
initial high limit.
The mkeys will still be in the cache when consuming them again after an
application restart. Therefore, the registration will be much faster
(for 100,000 MRs, it is ~4 seconds for dereg and ~5 seconds for re-reg).
The temporary cache entries created to store the non-cache mkeys are not
exposed through sysfs like the default cache entries.
Link: https://lore.kernel.org/r/20230125222807.6921-6-michaelgur@nvidia.com
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Switch from using the mkey order to using the new struct as the key to the
RB tree of cache entries.
The key is all the mkey properties that UMR operations can't modify.
Using this key to define the cache entries and to search and create cache
mkeys.
Link: https://lore.kernel.org/r/20230125222807.6921-5-michaelgur@nvidia.com
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Currently, the cache structure is a static linear array. Therefore, his
size is limited to the number of entries in it and is not expandable. The
entries are dedicated to mkeys of size 2^x and no access_flags. Mkeys with
different properties are not cacheable.
In this patch, we change the cache structure to an RB-tree. This will
allow to extend the cache to support more entries with different mkey
properties.
Link: https://lore.kernel.org/r/20230125222807.6921-4-michaelgur@nvidia.com
Signed-off-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Implicit ODP mkey doesn't have unique properties. It shares the same
properties as the order 18 cache entry. There is no need to devote a
special entry for that.
Link: https://lore.kernel.org/r/20230125222807.6921-3-michaelgur@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
mkc.log_page_size can be changed using UMR. Therefore, don't treat it as a
cache entry property.
Removing it from struct mlx5_cache_ent.
All cache mkeys will be created with default PAGE_SHIFT, and updated with
the needed page_shift using UMR when passing them to a user.
Link: https://lore.kernel.org/r/20230125222807.6921-2-michaelgur@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Replace struct rxe-phys_buf and struct rxe_map by struct xarray
in rxe_verbs.h. This allows using rcu locking on reads for
the memory maps stored in each mr.
This is based off of a sketch of a patch from Jason Gunthorpe in the
link below. Some changes were needed to make this work. It applies
cleanly to the current for-next and passes the pyverbs, perftest
and the same blktests test cases which run today.
Link: https://lore.kernel.org/r/20230119235936.19728-7-rpearsonhpe@gmail.com
Link: https://lore.kernel.org/linux-rdma/Y3gvZr6%2FNCii9Avy@nvidia.com/
Co-developed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Cleanup usage of mr->page_shift and mr->page_mask and introduce
an extractor for mr->ibmr.page_size. Normal usage in the kernel
has page_mask masking out offset in page rather than masking out
the page number. The rxe driver had reversed that which was confusing.
Implicitly there can be a per mr page_size which was not uniformly
supported.
Link: https://lore.kernel.org/r/20230119235936.19728-6-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Isolate mr specific code from atomic_write_reply() in rxe_resp.c into
a subroutine rxe_mr_do_atomic_write() in rxe_mr.c.
Check length for atomic write operation.
Make iova_to_vaddr() static.
Link: https://lore.kernel.org/r/20230119235936.19728-5-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Isolate mr specific code from atomic_reply() in rxe_resp.c into
a subroutine rxe_mr_do_atomic_op() in rxe_mr.c.
Minor cleanups to rxe_check_range() and iova_to_vaddr().
Move enum resp_state to rxe.h
Link: https://lore.kernel.org/r/20230119235936.19728-4-rpearsonhpe@gmail.com
Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Split the source codes related with CQ handling into a new function.
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://lore.kernel.org/r/20230116193502.66540-5-yanjun.zhu@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Split the source codes related with QP handling into a new function.
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://lore.kernel.org/r/20230116193502.66540-4-yanjun.zhu@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
In the function irdma_reg_user_mr, the mr allocation and free
will be used by other functions. As such, the source codes related
with mr allocation and free are split into the new functions.
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://lore.kernel.org/r/20230116193502.66540-3-yanjun.zhu@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
The source codes related with IRDMA_MEMREG_TYPE_MEM are split
into a new function irdma_reg_user_mr_type_mem.
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://lore.kernel.org/r/20230116193502.66540-2-yanjun.zhu@intel.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Print syndromes in case of fatal QP events. This is helpful for upper
level debugging, as there maybe no CQEs.
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Link: https://lore.kernel.org/r/edc794f622a33e4ee12d7f5d218d1a59aa7c6af5.1672821186.git.leonro@nvidia.com
Reviewed-by: Saeed Mahameed <saeed@kernel.org>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Move the call of qp event handler from atomic to workqueue context,
so that the handler is able to block. This is needed by following
patches.
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Patrisious Haddad <phaddad@nvidia.com>
Link: https://lore.kernel.org/r/0cd17b8331e445f03942f4bb28d447f24ac5669d.1672821186.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Zero-length arrays are deprecated[1]. Replace all remaining
0-length arrays with flexible arrays. Detected with GCC 13, using
-fstrict-flex-arrays=3:
In function 'build_rdma_write',
inlined from 'c4iw_post_send' at ../drivers/infiniband/hw/cxgb4/qp.c:1173:10:
../drivers/infiniband/hw/cxgb4/qp.c:597:38: warning: array subscript 0 is outside array bounds of 'struct fw_ri_immd[0]' [-Warray-bounds=]
597 | wqe->write.u.immd_src[0].r2 = 0;
| ~~~~~~~~~~~~~~~~~~~~~^~~
../drivers/infiniband/hw/cxgb4/t4fw_ri_api.h: In function 'c4iw_post_send':
../drivers/infiniband/hw/cxgb4/t4fw_ri_api.h:567:35: note: while referencing 'immd_src'
567 | struct fw_ri_immd immd_src[0];
| ^~~~~~~~
Additionally drop the unused C99_NOT_SUPPORTED ifndef lines.
[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays
Cc: Potnuri Bharat Teja <bharat@chelsio.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20230105223225.never.252-kees@kernel.org
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Fix possible RMT overflow: Use the correct netdev size.
Don't allow adjusted user contexts to go negative.
Fix QOS calculation: Send kernel context count as an argument since
dd->n_krcv_queues is not yet set up in earliest call. Do not include
the control context in the QOS calculation. Use the same sized
variable to find the max of krcvq[] entries.
Update the RMT count explanation to make more sense.
Signed-off-by: Dean Luick <dean.luick@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Link: https://lore.kernel.org/r/167329106946.1472990.18385495251650939054.stgit@awfm-02.cornelisnetworks.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Correct and improve validity checking of user supplied TIDs.
A tidctrl value of 0 is invalid. Verify that the final
index is in range, not an intermediate value.
Signed-off-by: Dean Luick <dean.luick@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Link: https://lore.kernel.org/r/167329105916.1472990.9915542468337924727.stgit@awfm-02.cornelisnetworks.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
The function rcventry2tidinfo() only creates part of
a TID and all calls to it are only used to make a user
TID. Consolidate all usage into a single routine.
Signed-off-by: Dean Luick <dean.luick@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Link: https://lore.kernel.org/r/167329105402.1472990.9685946655723333660.stgit@awfm-02.cornelisnetworks.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Improve code clarity and enable earlier use of
tidbuf->npages by moving its assignment to
structure creation time.
Signed-off-by: Dean Luick <dean.luick@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Link: https://lore.kernel.org/r/167329104884.1472990.4639750192433251493.stgit@awfm-02.cornelisnetworks.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
In hfi1_user_exp_rcv_setup(), variable pageidx mirrors
variable tididx. Remove pageidx and its use as an argument
to program_rcvarray().
Signed-off-by: Dean Luick <dean.luick@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Link: https://lore.kernel.org/r/167329104365.1472990.14264918308557487946.stgit@awfm-02.cornelisnetworks.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Refactors based on comments [1] of the multiple path records support
patchset:
- Return failure if not able to set inbound/outbound PRs;
- Simplify the flow when receiving the PRs from netlink channel: When
a good PR response is received, unpack it and call the path_query
callback directly. This saves two memory allocations;
- Define RDMA_PRIMARY_PATH_MAX_REC_NUM in a proper place.
[1] https://lore.kernel.org/linux-rdma/Yyxp9E9pJtUids2o@nvidia.com/
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org> #srp
Link: https://lore.kernel.org/r/7610025d57342b8b6da0f19516c9612f9c3fdc37.1672819376.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Refactor rdma_bind_addr function so that it doesn't require that the
cma destination address be changed before calling it.
So now it will update the destination address internally only when it is
really needed and after passing all the required checks.
Which in turn results in a cleaner and more sensible call and error
handling flows for the functions that call it directly or indirectly.
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reported-by: Wei Chen <harperchen1110@gmail.com>
Reviewed-by: Mark Zhang <markzhang@nvidia.com>
Link: https://lore.kernel.org/r/3d0e9a2fd62bc10ba02fed1c7c48a48638952320.1672819273.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Enable the CQEIE field and configure the CQEIS field of QPC. And add
compatibility handling.
Link: https://lore.kernel.org/r/20221224102201.3114536-4-xuhaoyue1@hisilicon.com
Signed-off-by: Luoyouming <luoyouming@huawei.com>
Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The rq inline makes some changes as follows, Firstly, it is only used in
user space. Secondly, it should notify hardware in QP RTR status. Thirdly,
Add compatibility processing between different user space and kernel
space.
Link: https://lore.kernel.org/r/20221224102201.3114536-3-xuhaoyue1@hisilicon.com
Signed-off-by: Luoyouming <luoyouming@huawei.com>
Signed-off-by: Haoyue Xu <xuhaoyue1@hisilicon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>