451 Commits

Author SHA1 Message Date
Mark Zhang
746aa3c8cb RDMA/mlx5: Use correct device num_ports when modify DC
Just like other QP types, when modify DC, the port_num should be compared
with dev->num_ports, instead of HCA_CAP.num_ports.  Otherwise Multi-port
vHCA on DC may not work.

Fixes: 776a3906b692 ("IB/mlx5: Add support for DC target QP")
Link: https://lore.kernel.org/r/20230420013906.1244185-1-markzhang@nvidia.com
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-04-21 12:36:47 -03:00
Leon Romanovsky
602fb42057 Enable IB out-of-order by default in mlx5
This series from Or changes default of IB out-of-order feature and
allows to the RDMA users to decide if they need to wait for completion
for all segments or it is enough to wait for last segment completion only.

Thanks

Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-23 10:22:32 +02:00
Or Har-Toov
742948cc02 RDMA/mlx5: Disable out-of-order in integrity enabled QPs
Set retry_mode to GO_BACK_N when qp is created with INTEGRITY_EN flag
because out-of-order is not supported when doing HW offload of signature
operations.

Signed-off-by: Or Har-Toov <ohartoov@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/362de42cdc7a541afa5b1fd0ec6ae706061764a2.1679230449.git.leon@kernel.org
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-23 10:21:13 +02:00
Rohit Chavan
c4526fe2e4 RDMA/mlx5: Coding style fix reported by checkpatch
Block comments should align the * on each line on line 2849
Avoid line continuations in quoted strings on line 3848

Signed-off-by: Rohit Chavan <roheetchavan@gmail.com>
Link: https://lore.kernel.org/r/20230319100847.5566-1-roheetchavan@gmail.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-19 14:03:34 +02:00
Patrisious Haddad
8067fd8b26 RDMA/mlx5: Print error syndrome in case of fatal QP errors
Print syndromes in case of fatal QP events. This is helpful for upper
level debugging, as there maybe no CQEs.

Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Link: https://lore.kernel.org/r/edc794f622a33e4ee12d7f5d218d1a59aa7c6af5.1672821186.git.leonro@nvidia.com
Reviewed-by: Saeed Mahameed <saeed@kernel.org>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-15 12:23:15 +02:00
Mark Zhang
312b8f79eb RDMA/mlx: Calling qp event handler in workqueue context
Move the call of qp event handler from atomic to workqueue context,
so that the handler is able to block. This is needed by following
patches.

Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Patrisious Haddad <phaddad@nvidia.com>
Link: https://lore.kernel.org/r/0cd17b8331e445f03942f4bb28d447f24ac5669d.1672821186.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-15 12:23:10 +02:00
Maor Gottlieb
8de8482fe5 RDMA/mlx5: Fix validation of max_rd_atomic caps for DC
Currently, when modifying DC, we validate max_rd_atomic user attribute
against the RC cap, validate against DC. RC and DC QP types have different
device limitations.

This can cause userspace created DC QPs to malfunction.

Fixes: c32a4f296e1d ("IB/mlx5: Add support for DC Initiator QP")
Link: https://lore.kernel.org/r/0c5aee72cea188c3bb770f4207cce7abc9b6fc74.1672231736.git.leonro@nvidia.com
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-01 09:40:35 +02:00
Linus Torvalds
780d8ce716 v5.19 pull request
Small collection of incremental improvement patches:
 
 - Minor code cleanup patches, comment improvements, etc from static tools
 
 - Clean the some of the kernel caps, reducing the historical stealth uAPI
   leftovers
 
 - Bug fixes and minor changes for rdmavt, hns, rxe, irdma
 
 - Remove unimplemented cruft from rxe
 
 - Reorganize UMR QP code in mlx5 to avoid going through the IB verbs layer
 
 - flush_workqueue(system_unbound_wq) removal
 
 - Ensure rxe waits for objects to be unused before allowing the core to
   free them
 
 - Several rc quality bug fixes for hfi1
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCYo+NxgAKCRCFwuHvBreF
 YbqSAQDJ+QolaATUvOQUPLbuLopUCJLe95VS15Kl3SNXiVUUFAEA8DLL1s6+WShd
 AgypUxGHipx5BAytrn45/WiwuDeEbQ8=
 =jgTl
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "Small collection of incremental improvement patches:

   - Minor code cleanup patches, comment improvements, etc from static
     tools

   - Clean the some of the kernel caps, reducing the historical stealth
     uAPI leftovers

   - Bug fixes and minor changes for rdmavt, hns, rxe, irdma

   - Remove unimplemented cruft from rxe

   - Reorganize UMR QP code in mlx5 to avoid going through the IB verbs
     layer

   - flush_workqueue(system_unbound_wq) removal

   - Ensure rxe waits for objects to be unused before allowing the core
     to free them

   - Several rc quality bug fixes for hfi1"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (67 commits)
  RDMA/rtrs-clt: Fix one kernel-doc comment
  RDMA/hfi1: Remove all traces of diagpkt support
  RDMA/hfi1: Consolidate software versions
  RDMA/hfi1: Remove pointless driver version
  RDMA/hfi1: Fix potential integer multiplication overflow errors
  RDMA/hfi1: Prevent panic when SDMA is disabled
  RDMA/hfi1: Prevent use of lock before it is initialized
  RDMA/rxe: Fix an error handling path in rxe_get_mcg()
  IB/core: Fix typo in comment
  RDMA/core: Fix typo in comment
  IB/hf1: Fix typo in comment
  IB/qib: Fix typo in comment
  IB/iser: Fix typo in comment
  RDMA/mlx4: Avoid flush_scheduled_work() usage
  IB/isert: Avoid flush_scheduled_work() usage
  RDMA/mlx5: Remove duplicate pointer assignment in mlx5_ib_alloc_implicit_mr()
  RDMA/qedr: Remove unnecessary synchronize_irq() before free_irq()
  RDMA/hns: Use hr_reg_read() instead of remaining roce_get_xxx()
  RDMA/hns: Use hr_reg_xxx() instead of remaining roce_set_xxx()
  RDMA/irdma: Add SW mechanism to generate completions on error
  ...
2022-05-26 21:08:40 -07:00
Mark Bloch
34a30d7635 net/mlx5: Lag, expose number of lag ports
Downstream patches will add support for hardware lag with
more than 2 ports. Add a way for users to query the number of lag ports.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-09 22:54:00 -07:00
Aharon Landau
8a8a5d37c7 RDMA/mlx5: Move mkey ctrl segment logic to umr.c
Move set_reg_umr_segment() and its helpers to umr.c.

Link: https://lore.kernel.org/r/5a7fac8ae8543521d19d174663245ae84b910310.1649747695.git.leonro@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-04-25 11:52:59 -03:00
Linus Torvalds
2dacc1e57b v5.18 merge window pull request
Patchces for the merge window:
 
 - Minor bug fixes in mlx5, mthca, pvrdma, rtrs, mlx4, hfi1, hns
 
 - Minor cleanups: coding style, useless includes and documentation
 
 - Reorganize how multicast processing works in rxe
 
 - Replace a red/black tree with xarray in rxe which improves performance
 
 - DSCP support and HW address handle re-use in irdma
 
 - Simplify the mailbox command handling in hns
 
 - Simplify iser now that FMR is eliminated
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAmI7d8gACgkQOG33FX4g
 mxq9TRAAm/3rEBtDyVan4Dm3mzq8xeJOTdYtQ29QNWZcwvhStNTo4c29jQJITZ9X
 Xd+m8VEPWzjkkZ4+hDFM9HhWjOzqDcCCFcbF88WDO0+P1M3HYYUE6AbIcSilAaT2
 GIby3+kAnsTOAryrSzHLehYJKUYJuw5NRGsVAPY3r6hz3UECjNb/X1KTWhXaIfNy
 2OTqFx5wMQ6hZO8e4a2Wz4bBYAYm95UfK+TgfRqUwhDLCDnkDELvHMPh9pXxJecG
 xzVg7W7Nv+6GWpUrqtM6W6YpriyjUOfbrtCJmBV3T/EdpiD6lZhKpkX23aLgu2Bi
 XoMM67aZ0i1Ft2trCIF8GbLaejG6epBkeKkUMMSozKqFP5DjhNbD2f3WAlMI15iW
 CcdyPS5re1ymPp66UlycTDSkN19wD8LfgQPLJCNM3EV8g8qs9LaKzEPWunBoZmw1
 a+QX2zK8J07S21F8iq2sf/Qe1EDdmgOEOf7pmB/A5/PKqgXZPG7lYNYuhIKyFsdL
 mO+BqWWp0a953JJjsiutxyUCyUd8jtwwxRa0B66oqSQ1jO+xj6XDxjEL8ACuT8Iq
 9wFJluTCN6lfxyV8mrSfweT0fQh/W/zVF7EJZHWdKXEzuADxMp9X06wtMQ82ea+k
 /wgN7DUpBZczRtlqaxt1VSkER75QwAMy/oSpA974+O120QCwLMQ=
 =aux1
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:

 - Minor bug fixes in mlx5, mthca, pvrdma, rtrs, mlx4, hfi1, hns

 - Minor cleanups: coding style, useless includes and documentation

 - Reorganize how multicast processing works in rxe

 - Replace a red/black tree with xarray in rxe which improves performance

 - DSCP support and HW address handle re-use in irdma

 - Simplify the mailbox command handling in hns

 - Simplify iser now that FMR is eliminated

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (93 commits)
  RDMA/nldev: Prevent underflow in nldev_stat_set_counter_dynamic_doit()
  IB/iser: Fix error flow in case of registration failure
  IB/iser: Generalize map/unmap dma tasks
  IB/iser: Use iser_fr_desc as registration context
  IB/iser: Remove iser_reg_data_sg helper function
  RDMA/rxe: Use standard names for ref counting
  RDMA/rxe: Replace red-black trees by xarrays
  RDMA/rxe: Shorten pool names in rxe_pool.c
  RDMA/rxe: Move max_elem into rxe_type_info
  RDMA/rxe: Replace obj by elem in declaration
  RDMA/rxe: Delete _locked() APIs for pool objects
  RDMA/rxe: Reverse the sense of RXE_POOL_NO_ALLOC
  RDMA/rxe: Replace mr by rkey in responder resources
  RDMA/rxe: Fix ref error in rxe_av.c
  RDMA/hns: Use the reserved loopback QPs to free MR before destroying MPT
  RDMA/irdma: Add support for address handle re-use
  RDMA/qib: Fix typos in comments
  RDMA/mlx5: Fix memory leak in error flow for subscribe event routine
  Revert "RDMA/core: Fix ib_qp_usecnt_dec() called when error"
  RDMA/rxe: Remove useless argument for update_state()
  ...
2022-03-24 19:17:39 -07:00
Saeed Mahameed
31803e5923 net/mlx5: Use mlx5_cmd_do() in core create_{cq,dct}
mlx5_core_create_{cq/dct} functions are non-trivial mlx5 commands
functions. They check command execution status themselves and hide
valuable FW failure information.

For mlx5_core/eth kernel user this is what we actually want, but for a
devx/rdma user the hidden information is essential and should be propagated
up to the caller, thus we convert these commands to use mlx5_cmd_do
to return the FW/driver and command outbox status as is, and let the caller
decide what to do with it.

For kernel callers of mlx5_core_create_{cq/dct} or those who only care about
the binary status (FAIL/SUCCESS) they must check status themselves via
mlx5_cmd_check() to restore the current behavior.

err = mlx5_create_cq(in, out)
err = mlx5_cmd_check(err, in, out)
if (err)
    // handle err

For DEVX users and those who care about full visibility, They will just
propagate the error to user space, and app can check if err == -EREMOTEIO,
then outbox.{status,syndrome} are valid.

API Note:
mlx5_cmd_check() must be used by kernel users since it allows the driver
to intercept the command execution status and return a driver simulated
status in case of driver induced error handling or reset/recovery flows.

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-23 15:21:59 -08:00
Leon Romanovsky
bd660922ab RDMA/mlx5: Delete useless module.h include
There is no need in include of module.h in the following files.

Link: https://lore.kernel.org/r/3ab153e25c7ea59599022dc7fe3c409fcfe1aac1.1642960861.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-01-28 13:03:12 -04:00
Leon Romanovsky
84aa6c3963 RDMA/mlx5: Delete get_num_static_uars function
There is no need to keep get_num_static_uars in the headers file
as it is not shared and used only once.

Link: https://lore.kernel.org/r/11d78568c3c6ba588ee8465e0d10d96145fc825c.1642960830.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-01-28 13:01:54 -04:00
Jakub Kicinski
b6459415b3 net: Don't include filter.h from net/sock.h
sock.h is pretty heavily used (5k objects rebuilt on x86 after
it's touched). We can drop the include of filter.h from it and
add a forward declaration of struct sk_filter instead.
This decreases the number of rebuilt objects when bpf.h
is touched from ~5k to ~1k.

There's a lot of missing includes this was masking. Primarily
in networking tho, this time.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/bpf/20211229004913.513372-1-kuba@kernel.org
2021-12-29 08:48:14 -08:00
Patrisious Haddad
1ab52ac1e9 RDMA/mlx5: Set user priority for DCT
Currently, the driver doesn't set the PCP-based priority for DCT, hence
DCT response packets are transmitted without user priority.

Fix it by setting user provided priority in the eth_prio field in the DCT
context, which in turn sets the value in the transmitted packet.

Fixes: 776a3906b692 ("IB/mlx5: Add support for DC target QP")
Link: https://lore.kernel.org/r/5fd2d94a13f5742d8803c218927322257d53205c.1633512672.git.leonro@nvidia.com
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-06 16:39:52 -03:00
Lior Nahmanson
65f90c8e38 RDMA/mlx5: Relax DCS QP creation checks
In order to create DCS QPs, we don't need to rely on both
log_max_dci_stream_channels and log_max_dci_errored_streams capabilities.

Fixes: 11656f593a86 ("RDMA/mlx5: Add DCS offload support")
Link: https://lore.kernel.org/r/3e7b3363fd73686176cc584295e86832a7cf99b2.1630320354.git.leonro@nvidia.com
Signed-off-by: Lior Nahmanson <liorna@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-30 09:47:40 -03:00
Leon Romanovsky
5f6bb7e322 RDMA/mlx5: Delete not-available udata check
XRC_TGT QPs are created through kernel verbs and don't have udata at all.

Fixes: 6eefa839c4dd ("RDMA/mlx5: Protect from kernel crash if XRC_TGT doesn't have udata")
Fixes: e383085c2425 ("RDMA/mlx5: Set ECE options during QP create")
Link: https://lore.kernel.org/r/b68228597e730675020aa5162745390a2d39d3a2.1628014762.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-03 15:26:17 -03:00
Leon Romanovsky
514aee660d RDMA: Globally allocate and release QP memory
Convert QP object to follow IB/core general allocation scheme.  That
change allows us to make sure that restrack properly kref the memory.

Link: https://lore.kernel.org/r/48e767124758aeecc433360ddd85eaa6325b34d9.1627040189.git.leonro@nvidia.com
Reviewed-by: Gal Pressman <galpress@amazon.com> #efa
Tested-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> #rdma and core
Tested-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Tested-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-03 13:44:27 -03:00
Leon Romanovsky
0dc0da15ed RDMA/mlx5: Rework custom driver QP type creation
Starting from commit 2b1f747071c5 ("RDMA/core: Allow drivers to disable
restrack DB") the restrack is able to handle non-standard QP types either.

That change allows us to rewrite custom QP calls to their IB/core
counterparts, so we will use general QP creation flow even for the driver
QP types.

Link: https://lore.kernel.org/r/51682ab82298748941f38bd23ee3bf77ef1cab7b.1627040189.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-03 13:44:27 -03:00
Lior Nahmanson
11656f593a RDMA/mlx5: Add DCS offload support
DCS is an offload to SW load balancing of DC initiator work requests.

A single DCI can be connected to only one target at the time and can't
start new connection until the previous work request is completed.  This
limitation will cause to delay when the initiator process needs to
transfer data to multiple targets at the same time.  The SW solution is to
use a process that handling and spreading the work request on many DCIs
according to destinations.

This feature is an offload to this process and coming to reduce the load
from the CPU and improve the performance.

Link: https://lore.kernel.org/r/491c2c2afdb5b07de7f03eab3f93cf0704549dbc.1624258894.git.leonro@nvidia.com
Reviewed-by: Meir Lichtinger <meirl@nvidia.com>
Signed-off-by: Lior Nahmanson <liorna@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-07-20 15:04:14 -03:00
Lior Nahmanson
2013b4d525 RDMA/mlx5: Separate DCI QP creation logic
This patch isolates DCI QP creation logic to separate function, so this
change will reduce complexity when adding new features to DCI QP without
interfering with other QP types.

The code was copied from create_user_qp() while taking only DCI relevant bits.

Link: https://lore.kernel.org/r/b4530bdd999349c59691224f016ff1efb5dc3b92.1624258894.git.leonro@nvidia.com
Reviewed-by: Meir Lichtinger <meirl@nvidia.com>
Signed-off-by: Lior Nahmanson <liorna@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-07-20 15:04:14 -03:00
Jason Gunthorpe
2833c977c3 Merge branch 'mlx5_realtime_ts' into rdma.git for-next
Aharon Landau says:

====================
In case device supports only real-time timestamp, the kernel will fail to
create QP despite rdma-core requested such timestamp type.

It is because device returns free-running timestamp, and the conversion
from free-running to real-time is performed in the user space.

This series fixes it, by returning real-time timestamp.
====================

* mlx5_realtime_ts:
  RDMA/mlx5: Support real-time timestamp directly from the device
  RDMA/mlx5: Refactor get_ts_format functions to simplify code

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-22 15:08:39 -03:00
Aharon Landau
336529518e RDMA/mlx5: Support real-time timestamp directly from the device
Currently, if the user asks for a real-time timestamp, the device will
return a free-running one, and the timestamp will be translated to
real-time in the user-space.

When the device supports only real-time timestamp and not free-running,
the creation of the QP will fail even though the user needs supported the
real-time one. To prevent this, we will return the real-time timestamp
directly from the device.

Link: https://lore.kernel.org/r/c6cfc8e6f038575c5c2de6505830f7e74e4de80d.1623829775.git.leonro@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-22 14:23:50 -03:00
Aharon Landau
9a1ac95a59 RDMA/mlx5: Refactor get_ts_format functions to simplify code
QPC, SQC and RQC timestamp formats and capabilities are always equal
because they represent general hardware support. So instead of code
duplication, let's merge them into general enum and logic.

Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2021-06-22 09:35:16 +03:00
Leon Romanovsky
f974428872 RDMA/core: Sanitize WQ state received from the userspace
The mlx4 and mlx5 implemented differently the WQ input checks.  Instead of
duplicating mlx4 logic in the mlx5, let's prepare the input in the central
place.

The mlx5 implementation didn't check for validity of state input.  It is
not real bug because our FW checked that, but still worth to fix.

Fixes: f213c0527210 ("IB/uverbs: Add WQ support")
Link: https://lore.kernel.org/r/ac41ad6a81b095b1a8ad453dcf62cf8d3c5da779.1621413310.git.leonro@nvidia.com
Reported-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-02 15:20:11 -03:00
Maor Gottlieb
9ecf6ac17c RDMA/mlx5: Take qp type from mlx5_ib_qp
Change all the places in the mlx5_ib driver to take the qp type from the
mlx5_ib_qp struct, except the QP initialization flow. It will ensure that
we check the right QP type also for vendor specific QPs.

Link: https://lore.kernel.org/r/b2e16cd65b59cd24fa81c01c7989248da44e58ea.1621413899.git.leonro@nvidia.com
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-26 16:49:42 -03:00
Lang Cheng
0bedd3d005 RDMA/mlx5: Remove unused parameter udata
The old version of ib_umem_get() need these udata as a parameter but now
they are unnecessary.

Fixes: c320e527e154 ("IB: Allow calls to ib_umem_get from kernel ULPs")
Link: https://lore.kernel.org/r/1620807142-39157-4-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Acked-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-20 11:52:17 -03:00
Sergey Gorenko
021c1f24f0 RDMA/mlx5: Support SQD2RTS for modify QP
The transition of the QP state from SQD to RTS is allowed by the IB
specification. The hardware also supports that, but it is not
implemented in mlx5_ib.

This commit adds SQD2RTS command to the modify QP in mlx5_ib to support
the missing feature. The feature is required by the signature pipelining
API that will be added to rdma-core.

Link: https://lore.kernel.org/r/ab4876360bfba0e9d64a5e8599438e32e0cb351e.1620641808.git.leonro@nvidia.com
Reviewed-by: Evgenii Kochetov <evgeniik@nvidia.com>
Signed-off-by: Sergey Gorenko <sergeygo@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-20 11:41:07 -03:00
Linus Torvalds
f34b2cf178 RDMA merge window pull request
This is significantly bug fixes and general cleanups. The noteworthy new
 features are fairly small:
 
 - XRC support for HNS and improves RQ operations
 
 - Bug fixes and updates for hns, mlx5, bnxt_re, hfi1, i40iw, rxe, siw and
   qib
 
 - Quite a few general cleanups on spelling, error handling, static checker
   detections, etc
 
 - Increase the number of device ports supported beyond 255. High port
   count software switches now exist
 
 - Several bug fixes for rtrs
 
 - mlx5 Device Memory support for host controlled atomics
 
 - Report SRQ tables through to rdma-tool
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAmCMMHEACgkQOG33FX4g
 mxri3Q//RAgIExCGHebQ9xkptZHVyTLLJMpiMl2cqk3ZVRdDZ7QdiQjIqY2KqlUK
 nxBj7EXJeX6rV5a1xqCcOO1gBetB28TSwnCNE2ZqrXP5B59ISW8D052IWza3UkUz
 WmHLARxHQlyKBWA4+ZAgfoUGL0NmWA8QPf56t/RK/3/OsuYnGzcnWmmFbt8XKFcH
 NtO3KC45mKWDqqG0A0XRrLbEQz/ElO3OuPBqlBKgB3ZgGPzgsOUTOGkm1tCcZ89L
 /pvZGB7SklKZdCX8TxdpVGd9h0zHl8pqh1yEzvTA1ypNAYSUId2mvZXluU8J5yJl
 FLk7E1IxE5050FNEc7T5uZdUVntulYiqL2558coRI34l5w26pKGjIMxw/nTB8hg8
 4ZfBtKVemIG6yzW5Up6iBpK7qWYpvLWVShwYAWhbNsjN7JGzJuh1gJnjbmYgyz2P
 RTMU9wjFPLL2wZxg4LDHACVJNBb82j6KKuE+kZWpk11ro7INw9+7YwRuTo7/ezxC
 BwXKu8wF4igwSigV55jM+WnGXLhxdC3qmx/2cbtWyLM/PzdRL96tM0RWW5v8/Nv7
 teFhkt+f3RVqcfYH5K1qCXy3UFrxG6bxFSvcHHSBx2bdIrqhuTY5FqszAYImeW2j
 iHoyIsuSuGu79HQgOzAQZsEyksWi6OYDvA9Q9VBoPP4bJ3DOAa4=
 =vsXA
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "This is significantly bug fixes and general cleanups. The noteworthy
  new features are fairly small:

   - XRC support for HNS and improves RQ operations

   - Bug fixes and updates for hns, mlx5, bnxt_re, hfi1, i40iw, rxe, siw
     and qib

   - Quite a few general cleanups on spelling, error handling, static
     checker detections, etc

   - Increase the number of device ports supported beyond 255. High port
     count software switches now exist

   - Several bug fixes for rtrs

   - mlx5 Device Memory support for host controlled atomics

   - Report SRQ tables through to rdma-tool"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (145 commits)
  IB/qib: Remove redundant assignment to ret
  RDMA/nldev: Add copy-on-fork attribute to get sys command
  RDMA/bnxt_re: Fix a double free in bnxt_qplib_alloc_res
  RDMA/siw: Fix a use after free in siw_alloc_mr
  IB/hfi1: Remove redundant variable rcd
  RDMA/nldev: Add QP numbers to SRQ information
  RDMA/nldev: Return SRQ information
  RDMA/restrack: Add support to get resource tracking for SRQ
  RDMA/nldev: Return context information
  RDMA/core: Add CM to restrack after successful attachment to a device
  RDMA/cma: Skip device which doesn't support CM
  RDMA/rxe: Fix a bug in rxe_fill_ip_info()
  RDMA/mlx5: Expose private query port
  RDMA/mlx4: Remove an unused variable
  RDMA/mlx5: Fix type assignment for ICM DM
  IB/mlx5: Set right RoCE l3 type and roce version while deleting GID
  RDMA/i40iw: Fix error unwinding when i40iw_hmc_sd_one fails
  RDMA/cxgb4: add missing qpid increment
  IB/ipoib: Remove unnecessary struct declaration
  RDMA/bnxt_re: Get rid of custom module reference counting
  ...
2021-05-01 09:15:05 -07:00
Mark Bloch
1fb7f8973f RDMA: Support more than 255 rdma ports
Current code uses many different types when dealing with a port of a RDMA
device: u8, unsigned int and u32. Switch to u32 to clean up the logic.

This allows us to make (at least) the core view consistent and use the
same type. Unfortunately not all places can be converted. Many uverbs
functions expect port to be u8 so keep those places in order not to break
UAPIs.  HW/Spec defined values must also not be changed.

With the switch to u32 we now can support devices with more than 255
ports. U32_MAX is reserved to make control logic a bit easier to deal
with. As a device with U32_MAX ports probably isn't going to happen any
time soon this seems like a non issue.

When a device with more than 255 ports is created uverbs will report the
RDMA device as having 255 ports as this is the max currently supported.

The verbs interface is not changed yet because the IBTA spec limits the
port size in too many places to be u8 and all applications that relies in
verbs won't be able to cope with this change. At this stage, we are
extending the interfaces that are using vendor channel solely

Once the limitation is lifted mlx5 in switchdev mode will be able to have
thousands of SFs created by the device. As the only instance of an RDMA
device that reports more than 255 ports will be a representor device and
it exposes itself as a RAW Ethernet only device CM/MAD/IPoIB and other
ULPs aren't effected by this change and their sysfs/interfaces that are
exposes to userspace can remain unchanged.

While here cleanup some alignment issues and remove unneeded sanity
checks (mainly in rdmavt),

Link: https://lore.kernel.org/r/20210301070420.439400-1-leon@kernel.org
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-26 09:31:21 -03:00
Mark Zhang
6fe6e56863 RDMA/mlx5: Fix mlx5 rates to IB rates map
Correct the map between mlx5 rates and corresponding ib rates, as they
don't always have a fixed offset between them.

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Link: https://lore.kernel.org/r/20210304124517.1100608-4-leon@kernel.org
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-11 20:12:39 -04:00
Maor Gottlieb
8256c69b2d RDMA/mlx5: Fix timestamp default mode
1. Don't set the ts_format bit to default when it reserved - device is
   running in the old mode (free running).
2. XRC doesn't have a CQ therefore the ts format in the QP
   context should be default / free running.
3. Set ts_format to WQ.

Fixes: 2fe8d4b87802 ("RDMA/mlx5: Fail QP creation if the device can not support the CQE TS")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-10 11:01:57 -08:00
Jason Gunthorpe
68ad4d1cc6 Merge branch 'mlx5_timestamp' into rdma.git for-next
Leon Romanovsky says:

====================
Add an extra timestamp format for mlx5_ib device.
====================

Based on the mlx5-next branch at
     git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
due to dependencies.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

* branch 'mlx5_timestamp':
  RDMA/mlx5: Fail QP creation if the device can not support the CQE TS
  net/mlx5: Add new timestamp mode bits
2021-02-16 14:49:36 -04:00
Aharon Landau
2fe8d4b878 RDMA/mlx5: Fail QP creation if the device can not support the CQE TS
In ConnectX6Dx device, HW can work in real time timestamp mode according
to the device capabilities per RQ/SQ/QP.

When the flag IB_UVERBS_CQ_FLAGS_TIMESTAMP_COMPLETION is set, the user
expect to get TS on the CQEs in free running format, so we need to fail
the QP creation if the current mode of the device doesn't support it.

Link: https://lore.kernel.org/r/20210209131107.698833-3-leon@kernel.org
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16 14:48:47 -04:00
Patrisious Haddad
c70f51de85 RDMA/mlx5: Support 400Gbps IB rate in mlx5 driver
Support 400Gbps IB rate in mlx5 driver.

Link: https://lore.kernel.org/r/20210209130429.698237-1-leon@kernel.org
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-09 14:00:35 -04:00
Parav Pandit
7416790e22 RDMA/core: Introduce and use API to read port immutable data
Currently mlx5 driver caches port GID table length for 2 ports.  It is
also cached by IB core as port immutable data.

When mlx5 representor ports are present, which are usually more than 2,
invalid access to port_caps array can happen while validating the GID
table length which is only for 2 ports.

To avoid this, take help of the IB cores port immutable data by exposing
an API to read the port immutable fields.

Remove mlx5 driver's internal cache, thereby reduce code and data.

Link: https://lore.kernel.org/r/20210203130133.4057329-5-leon@kernel.org
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-05 12:06:01 -04:00
Parav Pandit
2019d70e91 IB/mlx5: Avoid calling query device for reading pkey table length
Pkey table length for all the ports of the device is the same.  Currently
get_ports_cap() reads and stores it for each port by querying the device
which reads more than just pkey table length.

For representor device ports which can be in range of hundreds, it queries
is for each such port and end up returning same value for all the ports.

When in representor mode, modify QP accesses pkey port caps for a port
index that can be outside of the port_caps table.

Hence, simplify the logic to query the max pkey table length only once
during device initialization sequence.

Link: https://lore.kernel.org/r/20210203130133.4057329-3-leon@kernel.org
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-05 12:06:00 -04:00
Parav Pandit
3ce60f443b IB/mlx5: Move mlx5_port_caps from mlx5_core_dev to mlx5_ib_dev
mlx5_port_caps are RDMA specific capabilities. These are not used by the
mlx5_core_device at all. Move them to mlx5_ib_dev where it is used and
reduce the scope of it to multiple drivers.

Link: https://lore.kernel.org/r/20210203130133.4057329-2-leon@kernel.org
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-05 12:06:00 -04:00
Mark Bloch
2614488d1f RDMA/mlx5: Allow creating all QPs even when non RDMA profile is used
The cited commit disallowed creating any QP which isn't raw ethernet, reg
umr or the special UD qp for testing WC, this proved too strict.

While modify can't be done (no GIDS/GID table for example) just creating a
QP is okay.

This patch partially reverts the bellow mentioned commit and places the
restriction at the modify QP stage and not at the creation.  DEVX commands
should be used to manipulate such QPs.

Fixes: 42caf9cb5937 ("RDMA/mlx5: Allow only raw Ethernet QPs when RoCE isn't enabled")
Link: https://lore.kernel.org/r/20210125120709.836718-1-leon@kernel.org
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-28 15:19:43 -04:00
Lee Jones
30cd9fc5e7 RDMA/hw/mlx5/qp: Demote non-conformant kernel-doc header
Fixes the following W=1 kernel build warning(s):

 drivers/infiniband/hw/mlx5/qp.c:5384: warning: Function parameter or member 'qp' not described in 'mlx5_ib_qp_set_counter'
 drivers/infiniband/hw/mlx5/qp.c:5384: warning: Function parameter or member 'counter' not described in 'mlx5_ib_qp_set_counter'

Link: https://lore.kernel.org/r/20210121094519.2044049-3-lee.jones@linaro.org
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Acked-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-22 14:37:29 -04:00
Tom Rix
7f1d2dfa30 RDMA/mlx5: Remove unneeded semicolon
A semicolon is not needed after a switch statement.

Link: https://lore.kernel.org/r/20201031134638.2135060-1-trix@redhat.com
Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-12-10 12:00:27 -04:00
Leon Romanovsky
2b1f747071 RDMA/core: Allow drivers to disable restrack DB
Driver QP types are special case with no IBTA restrictions. For example,
EFA implemented creation of this QP type as regular one, while mlx5
separated create to two step: create and modify. That separation causes to
the situation where DC QP (mlx5) is always added to the same xarray index
zero.

This change allows to drivers like mlx5 simply disable restrack DB
tracking, but it doesn't disable kref on the memory.

Fixes: 52e0a118a203 ("RDMA/restrack: Track driver QP types in resource tracker")
Link: https://lore.kernel.org/r/20201117070148.1974114-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-27 11:38:46 -04:00
Avihai Horon
f957d4d09a RDMA/mlx5: Enable querying AH for XRC QP types
Address handle is set for connected QP types such as RC and UC, and thus
can also be queried.

Since XRC QP types INI and TGT are connected, it should be possible to
query their address handle as well.

Until now it was not the case, and although the firmware supported it, the
driver allowed querying the address handle only for RC and UC.

Hence, we enable it now for INI and TGT QPs as well.

Link: https://lore.kernel.org/r/20201115121425.139833-2-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-26 11:58:19 -04:00
Gustavo A. R. Silva
808b2c925d IB/mlx5: Fix fall-through warnings for Clang
In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning
by explicitly adding the new pseudo-keyword fallthrough; instead of
letting the code fall through to the next case.

Link: https://lore.kernel.org/r/2b0c87362bc86f6adfe56a5a6685837b71022bbf.1605896059.git.gustavoars@kernel.org
Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Acked-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-23 15:54:11 -04:00
Jason Gunthorpe
a59b7b05ef RDMA/mlx5: Use mlx5_umem_find_best_quantized_pgoff() for QP
Delete custom logic in the QP in favor of more general variant.

Link: https://lore.kernel.org/r/20201115114311.136250-5-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-16 16:53:30 -04:00
Jason Gunthorpe
7579dcdf73 RDMA/mlx5: Directly compute the PAS list for raw QP RQ's
The RQ WQ created when making a raw ethernet QP copies the PAS list from
a dummy QPC command created earlier in the flow. The WQC and QPC PAS lists
are not fully compatible as the page_offset is a different size.

Create the RQ WQ's PAS list directly and do not try to copy it from
another command structure.

Like the prior patch, this also means that badly aligned buffers were not
correctly rejected.

Link: https://lore.kernel.org/r/20201115114311.136250-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-16 16:53:29 -04:00
Jason Gunthorpe
ad480ea5d6 RDMA/mlx5: Use mlx5_umem_find_best_quantized_pgoff() for WQ
This fixes a subtle bug, the WQ mailbox has only 5 bits to describe the
page_offset, while mlx5_ib_get_buf_offset() is hard wired to only work
with 6 bit page_offsets.

Thus it did not properly reject badly aligned buffers.

Fixes: 79b20a6c3014 ("IB/mlx5: Add receive Work Queue verbs")
Fixes: 0fb2ed66a14c ("IB/mlx5: Add create and destroy functionality for Raw Packet QP")
Link: https://lore.kernel.org/r/20201115114311.136250-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-16 16:53:29 -04:00
Jason Gunthorpe
aab8d3966d RDMA/mlx5: Change mlx5_ib_populate_pas() to use rdma_for_each_block()
This routine converts the umem SGL into a list of fixed pages for DMA,
which is exactly what rdma_umem_for_each_dma_block() is for, use the
common code directly.

Link: https://lore.kernel.org/r/20201026132314.1336717-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02 14:53:54 -04:00
Jason Gunthorpe
f8fb311063 RDMA/mlx5: Remove npages from mlx5_ib_cont_pages()
Most callers don't need this, and the few that do can get it as
ib_umem_num_pages(umem).

Link: https://lore.kernel.org/r/20201026131936.1335664-8-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-02 14:52:26 -04:00