7857 Commits

Author SHA1 Message Date
Mike Marciniszyn
d39bf40e55 IB/qib: Protect from buffer overflow in struct qib_user_sdma_pkt fields
Overflowing either addrlimit or bytes_togo can allow userspace to trigger
a buffer overflow of kernel memory. Check for overflows in all the places
doing math on user controlled buffers.

Fixes: f931551bafe1 ("IB/qib: Add new qib driver for QLogic PCIe InfiniBand adapters")
Link: https://lore.kernel.org/r/20211012175519.7298.77738.stgit@awfm-01.cornelisnetworks.com
Reported-by: Ilja Van Sprundel <ivansprundel@ioactive.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-13 13:26:04 -03:00
Leon Romanovsky
f4e56ec445 RDMA/mlx4: Return missed an error if device doesn't support steering
The error flow fixed in this patch is not possible because all kernel
users of create QP interface check that device supports steering before
set IB_QP_CREATE_NETIF_QP flag.

Fixes: c1c98501121e ("IB/mlx4: Add support for steerable IB UD QPs")
Link: https://lore.kernel.org/r/91c61f6e60eb0240f8bbc321fda7a1d2986dd03c.1634023677.git.leonro@nvidia.com
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-12 13:23:23 -03:00
Zhu Yanjun
9d8f247cc3 RDMA/irdma: Remove irdma_cqp_up_map_cmd()
The function irdma_cqp_up_map_cmd() is not used. So remove it.

Link: https://lore.kernel.org/r/20211011110128.4057-5-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-12 13:21:40 -03:00
Zhu Yanjun
16ddcfca56 RDMA/irdma: Remove irdma_get_hw_addr()
The function irdma_get_hw_addr() is not used. So remove it.

Link: https://lore.kernel.org/r/20211011110128.4057-4-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-12 13:21:39 -03:00
Zhu Yanjun
6d2682216d RDMA/irdma: Remove irdma_sc_send_lsmm_nostag()
The function irdma_sc_send_lsmm_nostag is not used. So remove it.

Link: https://lore.kernel.org/r/20211011110128.4057-3-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-12 13:21:39 -03:00
Zhu Yanjun
0bed5dfa5a RDMA/irdma: Remove irdma_uk_mw_bind()
The function irdma_uk_mw_bind is not used. So remove it.

Link: https://lore.kernel.org/r/20211011110128.4057-2-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-12 13:21:39 -03:00
Christophe JAILLET
8869574a6c RDMA: Remove redundant 'flush_workqueue()' calls
'destroy_workqueue()' already drains the queue before destroying it, so
there is no need to flush it explicitly.

Remove the redundant 'flush_workqueue()' calls.

This was generated with coccinelle:

@@
expression E;
@@
- 	flush_workqueue(E);
	destroy_workqueue(E);

Link: https://lore.kernel.org/r/ca7bac6e6c9c5cc8d04eec3944edb13de0e381a3.1633874776.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-12 13:21:23 -03:00
Cai Huoqing
9a33f39809 RDMA/hns: Use dma_alloc_coherent() instead of kmalloc/dma_map_single()
Replacing kmalloc/kfree/dma_map_single/dma_unmap_single() with
dma_alloc_coherent/dma_free_coherent() helps to reduce code size, and
simplify the code, and coherent DMA will not clear the cache every time.

The SOC that this driver supports does not have incoherent DMA, so this
makes the code follow the DMA API properly with no performance
impact. Currently there are missing dma sync calls around the DMA
transfers.

Link: https://lore.kernel.org/r/20210926061116.282-1-caihuoqing@baidu.com
Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Reviewed-by: Wenpeng Liang <liangwenpeng@huawei.com>
Tested-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-12 12:54:51 -03:00
Aharon Landau
a020094090 RDMA/mlx5: Add optional counter support in get_hw_stats callback
When get_hw_stats is called, query and return the optional counter
statistic as well.

Link: https://lore.kernel.org/r/20211008122439.166063-14-markzhang@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-12 12:48:07 -03:00
Aharon Landau
a29b934ceb RDMA/mlx5: Add modify_op_stat() support
Add support for ib callback modify_op_stat() to add or remove an optional
counter. When adding, a steering flow table is created with a rule that
catches and counts all the matching packets. When removing, the table and
flow counter are destroyed.

Link: https://lore.kernel.org/r/20211008122439.166063-13-markzhang@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-12 12:48:06 -03:00
Aharon Landau
ffa501ef19 RDMA/mlx5: Add steering support in optional flow counters
Adding steering infrastructure for adding and removing optional counter.
This allows to add and remove the counters dynamically in order not to
hurt performance.

Link: https://lore.kernel.org/r/20211008122439.166063-12-markzhang@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-12 12:48:06 -03:00
Aharon Landau
886773d249 RDMA/mlx5: Support optional counters in hw_stats initialization
Add optional counter support when allocate and initialize hw_stats
structure. Optional counters have IB_STAT_FLAG_OPTIONAL flag set and are
disabled by default.

Link: https://lore.kernel.org/r/20211008122439.166063-11-markzhang@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-12 12:48:06 -03:00
Aharon Landau
13f30b0fa0 RDMA/counter: Add a descriptor in struct rdma_hw_stats
Add a counter statistic descriptor structure in rdma_hw_stats. In addition
to the counter name, more meta-information will be added.  This code
extension is needed for optional-counter support in the following patches.

Link: https://lore.kernel.org/r/20211008122439.166063-4-markzhang@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-12 12:48:04 -03:00
Gal Pressman
2a152512a1 RDMA/efa: CQ notifications
This patch adds support for CQ notifications through the standard verbs
api.

In order to achieve that, a new event queue (EQ) object is introduced,
which is in charge of reporting completion events to the driver.  On
driver load, EQs are allocated and their affinity is set to a single
cpu. When a user app creates a CQ with a completion channel, the
completion vector number is converted to a EQ number, which is in charge
of reporting the CQ events.

In addition, the CQ creation admin command now returns an offset for the
CQ doorbell, which is mapped to the userspace provider and is used to arm
the CQ when requested by the user.

The EQs use a single doorbell (located on the registers BAR), which
encodes the EQ number and arm as part of the doorbell value.  The EQs are
polled by the driver on each new EQE, and arm it when the poll is
completed.

Link: https://lore.kernel.org/r/20211003105605.29222-1-galpress@amazon.com
Reviewed-by: Firas JahJah <firasj@amazon.com>
Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-06 19:47:18 -03:00
Patrisious Haddad
1ab52ac1e9 RDMA/mlx5: Set user priority for DCT
Currently, the driver doesn't set the PCP-based priority for DCT, hence
DCT response packets are transmitted without user priority.

Fix it by setting user provided priority in the eth_prio field in the DCT
context, which in turn sets the value in the transmitted packet.

Fixes: 776a3906b692 ("IB/mlx5: Add support for DC target QP")
Link: https://lore.kernel.org/r/5fd2d94a13f5742d8803c218927322257d53205c.1633512672.git.leonro@nvidia.com
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-06 16:39:52 -03:00
Shiraz Saleem
e93c7d8e8c RDMA/irdma: Process extended CQ entries correctly
The valid bit for extended CQE's written by HW is retrieved from the
incorrect quad-word. This leads to missed completions for any UD traffic
particularly after a wrap-around.

Get the valid bit for extended CQE's from the correct quad-word in the
descriptor.

Fixes: 551c46edc769 ("RDMA/irdma: Add user/kernel shared libraries")
Link: https://lore.kernel.org/r/20211005182302.374-1-shiraz.saleem@intel.com
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-06 16:38:07 -03:00
Zhu Yanjun
0de71d7ada RDMA/irdma: Delete unused struct irdma_bth
The struct irdma_bth is not used, so remove it.

Link: https://lore.kernel.org/r/20211006201531.469650-1-yanjun.zhu@linux.dev
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-06 16:34:52 -03:00
Andy Shevchenko
286dba65a4 IB/hf1: Use string_upper() instead of an open coded variant
Use string_upper() from the string helper module instead of an	open coded
variant.

Link: https://lore.kernel.org/r/20211001123153.67379-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-05 15:22:37 -03:00
Jakub Kicinski
ded6e16b37 mlx4: replace mlx4_mac_to_u64() with ether_addr_to_u64()
mlx4_mac_to_u64() predates and opencodes ether_addr_to_u64().
It doesn't make the argument constant so it'll be problematic
when dev->dev_addr becomes a const. Convert to the generic helper.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-05 13:15:35 +01:00
Shay Drory
3663ad34bc net/mlx5: Shift control IRQ to the last index
Control IRQ is the first IRQ vector. This complicates handling of
completion irqs as we need to offset them by one.
in the next patch, there are scenarios where completion and control EQs
will share the same irq. for example: functions with single IRQ. To ease
such scenarios, we shift control IRQ to the end of the irq array.

Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-10-04 18:10:57 -07:00
Aharon Landau
b68362304b RDMA/mlx5: Avoid taking MRs from larger MR cache pools when a pool is empty
Currently, if a cache entry is empty, the driver will try to take MRs
from larger cache entries. This behavior consumes a lot of memory.
In addition, when searching for an mkey in an entry, the entry is locked.
When using a multithreaded application with the old behavior, the threads
will block each other more often, which can hurt performance as can be
seen in the table below.

Therefore, avoid it by creating a new mkey when the requested cache entry
is empty.

The test was performed on a machine with
Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz 44 cores.

Here are the time measures for allocating MRs of 2^6 pages. The search in
the cache started from entry 6.

+------------+---------------------+---------------------+
|            |     Old behavior    |     New behavior    |
|            +----------+----------+----------+----------+
|            | 1 thread | 5 thread | 1 thread | 5 thread |
+============+==========+==========+==========+==========+
|  1,000 MRs |   14 ms  |   30 ms  |   14 ms  |   80 ms  |
+------------+----------+----------+----------+----------+
| 10,000 MRs |  135 ms  |   6 sec  |  173 ms  |  880 ms  |
+------------+----------+----------+----------+----------+
|100,000 MRs | 11.2 sec |  57 sec  | 1.74 sec |  8.8 sec |
+------------+----------+----------+----------+----------+

Link: https://lore.kernel.org/r/71af2770c737b936f7b10f457f0ef303ffcf7ad7.1632644527.git.leonro@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-04 16:59:50 -03:00
Jason Gunthorpe
c78d218fc5 Linux 5.15-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmFaG98eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGosAH/jqy5B2BIEE39O+8
 QTr3vO54SyRRuY/d98wZ+O4SPjfqfpCHuyjKt9YJpEdmzH754NC9gSPOOBegnvHI
 DfrWaivmJ5mdjN2h7+JVqjs58krUv98wWNa5xfvqUp5H7wF3WQg3AxsaMKS1PePD
 kFHfeFbxsg2gYhyhPK6gHtwLn6dEsx9bGny2bKvCh6KuJQEiUXoEcgnFzjFgLNxp
 T5zI1cNSCNUzwRIe+vqQRlfVR2JlSI4tiy0zNJWy9dQ5Z4HOSbFcEz5Df2N7qNYn
 /MqruaASmyREgo9yLHpR1BSyzrea8MCckY04ycYqKZb7gDwcrpAe4QVw2I/Fuzu9
 q//PV4I=
 =+mYg
 -----END PGP SIGNATURE-----

Merge tag 'v5.15-rc4' into rdma.get for-next

Merged due to dependencies in following patches.

Conflict in drivers/infiniband/hw/hfi1/ipoib_tx.c resolved by hand to take
the %p change and txq stats rename together.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-04 16:01:26 -03:00
Shai Malin
fb09a1ed5c qed: Remove e4_ and _e4 from FW HSI
The existing qed/qede/qedr/qedi/qedf code uses chip-specific naming in
structures,  functions, variables and defines in FW HSI (Hardware
Software Interface).

The new FW version introduced a generic naming convention in HSI
in-which the same code will be used across different versions
for simpler maintainability. It also eases in providing support for
new features.

With this patch every "_e4" or "e4_" prefix or suffix is not needed
anymore and it will be removed.

Reviewed-by: Manish Rangankar <mrangankar@marvell.com>
Reviewed-by: Javed Hasan <jhasan@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Signed-off-by: Omkar Kulkarni <okulkarni@marvell.com>
Signed-off-by: Shai Malin <smalin@marvell.com>
Signed-off-by: Prabhakar Kushwaha <pkushwaha@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-04 12:55:48 +01:00
Jason Gunthorpe
49b99314b4 IB/mlx5: Flow through a more detailed return code from get_prefetchable_mr()
The error returns for various cases detected by get_prefetchable_mr() get
confused as it flows back to userspace. Properly label each error path and
flow the error code properly back to the system call.

Link: https://lore.kernel.org/r/20210928170846.GA1721590@nvidia.com
Suggested-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-01 11:40:07 -03:00
Jason Gunthorpe
d30ef6d5c0 Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Leon Romanovsky says:
====================
This is short series for mlx5 from Meir that adds a DevX UID to the UAR.
====================

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

* mellanox/mlx5-next:
  IB/mlx5: Enable UAR to have DevX UID
  net/mlx5: Add uid field to UAR allocation structures
2021-09-28 13:27:40 -03:00
Meir Lichtinger
d2c8a1554c IB/mlx5: Enable UAR to have DevX UID
UID field was added to alloc_uar and dealloc_uar PRM command, to specify
DevX UID for UAR. This change enables firmware validating user access to
its own UAR resources.

For the kernel allocated UARs the UID will stay 0 as of today.

Signed-off-by: Meir Lichtinger <meirl@nvidia.com>
Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
2021-09-28 18:31:21 +03:00
Gustavo A. R. Silva
11333be19c RDMA/hfi1: Use struct_size() and flex_array_size() helpers
Make use of the struct_size() and flex_array_size() helpers instead of
open-coded versions, in order to avoid any potential type mistakes or
integer overflows that, in the worse scenario, could lead to heap
overflows.

Link: https://lore.kernel.org/r/20210927225333.GA192634@embeddedor
Link: https://github.com/KSPP/linux/issues/160
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-27 20:15:54 -03:00
Mike Marciniszyn
6d1ebccbd6 IB/hfi1: Add ring consumer and producers traces
These traces are used to debugging ring issues.

The ipoib_txreq needed to be moved to a header file to allow access from
the trace header file.

The trace changes include:
- new producer/consumer traces
- new allocation deallocation traces
- additional fidelity for SDMA engine prints

Fixes: 4bd00b55c978 ("IB/hfi1: Add AIP tx traces")
Link: https://lore.kernel.org/r/20210913132852.131370.9664.stgit@awfm-01.cornelisnetworks.com
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-27 20:06:42 -03:00
Mike Marciniszyn
b4b90a50cb IB/hfi1: Remove atomic completion count
The atomic is not needed.

Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Link: https://lore.kernel.org/r/20210913132847.131370.54250.stgit@awfm-01.cornelisnetworks.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-27 20:06:42 -03:00
Mike Marciniszyn
f5dc70a0e1 IB/hfi1: Tune netdev xmit cachelines
This patch moves fields in the ring and creates a line for the producer
and the consumer.

The adds a consumer side variable that tracks the ring avail so that the
code doesn't have the read the other cacheline to get a count for every
packet. A read now only occurs when the avail is at 0.

Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Link: https://lore.kernel.org/r/20210913132842.131370.15636.stgit@awfm-01.cornelisnetworks.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-27 20:06:42 -03:00
Mike Marciniszyn
a7125869b2 IB/hfi1: Get rid of tx priv backpointer
The txq has the backpointer, so this is a micro optimization for the tx
path.

Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Link: https://lore.kernel.org/r/20210913132836.131370.89704.stgit@awfm-01.cornelisnetworks.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-27 20:06:41 -03:00
Mike Marciniszyn
4bf0ca0c9f IB/hfi1: Get rid of hot path divide
The pointer math in this statemet does a divide;
	struct hfi1_ipoib_txq *txq = &priv->txqs[napi - priv->tx_napis];

Elminate the divide by embedding the struct napi_strut in the txq and
getting the txq with a container_of() using the newly embedded napi.

Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Link: https://lore.kernel.org/r/20210913132831.131370.3993.stgit@awfm-01.cornelisnetworks.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-27 20:06:41 -03:00
Mike Marciniszyn
d47dfc2b00 IB/hfi1: Remove cache and embed txreq in ring
This patch removes kmem cache allocation and deallocation in favor of
having the ipoib_txreq in the ring.

The consumer is now the packet sending side allocating tx descriptors from
ring and the producer is the napi interrupt handling freeing tx
descriptors.

The locks are now eliminated because the napi tx lock insures a single
consumer and the napi handling insures a single producer.

The napi poll is converted to memory poll looking for items that have been
marked completed.

Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Link: https://lore.kernel.org/r/20210913132826.131370.4397.stgit@awfm-01.cornelisnetworks.com
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-27 20:06:41 -03:00
Wenpeng Liang
e671f0ecfe RDMA/hns: Add the check of the CQE size of the user space
If the CQE size of the user space is not the size supported by the
hardware, the creation of CQ should be stopped.

Fixes: 09a5f210f67e ("RDMA/hns: Add support for CQE in size of 64 Bytes")
Link: https://lore.kernel.org/r/20210927125557.15031-3-liangwenpeng@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-27 14:49:49 -03:00
Wenpeng Liang
cc26aee100 RDMA/hns: Fix the size setting error when copying CQE in clean_cq()
The size of CQE is different for different versions of hardware, so the
driver needs to specify the size of CQE explicitly.

Fixes: 09a5f210f67e ("RDMA/hns: Add support for CQE in size of 64 Bytes")
Link: https://lore.kernel.org/r/20210927125557.15031-2-liangwenpeng@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-27 14:49:48 -03:00
Guo Zhi
7d5cfafe8b RDMA/hfi1: Fix kernel pointer leak
Pointers should be printed with %p or %px rather than cast to 'unsigned
long long' and printed with %llx.  Change %llx to %p to print the secured
pointer.

Fixes: 042a00f93aad ("IB/{ipoib,hfi1}: Add a timeout handler for rdma_netdev")
Link: https://lore.kernel.org/r/20210922134857.619602-1-qtxuning1999@sjtu.edu.cn
Signed-off-by: Guo Zhi <qtxuning1999@sjtu.edu.cn>
Acked-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-27 14:32:14 -03:00
Leon Romanovsky
a86cd017a4 RDMA/usnic: Lock VF with mutex instead of spinlock
Usnic VF doesn't need lock in atomic context to create QPs, so it is safe
to use mutex instead of spinlock. Such change fixes the following smatch
error.

Smatch static checker warning:

   lib/kobject.c:289 kobject_set_name_vargs()
    warn: sleeping in atomic context

Fixes: 514aee660df4 ("RDMA: Globally allocate and release QP memory")
Link: https://lore.kernel.org/r/2a0e295786c127e518ebee8bb7cafcb819a625f6.1631520231.git.leonro@nvidia.com
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-24 10:55:28 -03:00
Jason Gunthorpe
14351f08ed RDMA/hns: Work around broken constant propagation in gcc 8
gcc 8.3 and 5.4 throw this:

In function 'modify_qp_init_to_rtr',
././include/linux/compiler_types.h:322:38: error: call to '__compiletime_assert_1859' declared with attribute error: FIELD_PREP: value too large for the field
  _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
[..]
drivers/infiniband/hw/hns/hns_roce_common.h:91:52: note: in expansion of macro 'FIELD_PREP'
   *((__le32 *)ptr + (field_h) / 32) |= cpu_to_le32(FIELD_PREP(   \
                                                    ^~~~~~~~~~
drivers/infiniband/hw/hns/hns_roce_common.h:95:39: note: in expansion of macro '_hr_reg_write'
 #define hr_reg_write(ptr, field, val) _hr_reg_write(ptr, field, val)
                                       ^~~~~~~~~~~~~
drivers/infiniband/hw/hns/hns_roce_hw_v2.c:4412:2: note: in expansion of macro 'hr_reg_write'
  hr_reg_write(context, QPC_LP_PKTN_INI, lp_pktn_ini);

Because gcc has miscalculated the constantness of lp_pktn_ini:

	mtu = ib_mtu_enum_to_int(ib_mtu);
	if (WARN_ON(mtu < 0)) [..]
	lp_pktn_ini = ilog2(MAX_LP_MSG_LEN / mtu);

Since mtu is limited to {256,512,1024,2048,4096} lp_pktn_ini is between 4
and 8 which is compatible with the 4 bit field in the FIELD_PREP.

Work around this broken compiler by adding a 'can never be true'
constraint on lp_pktn_ini's value which clears out the problem.

Fixes: f0cb411aad23 ("RDMA/hns: Use new interface to modify QP context")
Link: https://lore.kernel.org/r/0-v1-c773ecb137bc+11f-hns_gcc8_jgg@nvidia.com
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-24 08:47:55 -03:00
Sindhu Devale
9f7fa37a6b RDMA/irdma: Report correct WC error when there are MW bind errors
Report the correct WC error when MW bind error related asynchronous events
are generated by HW.

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20210916191222.824-5-shiraz.saleem@intel.com
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 14:13:23 -03:00
Sindhu Devale
d3bdcd5963 RDMA/irdma: Report correct WC error when transport retry counter is exceeded
When the retry counter exceeds, as the remote QP didn't send any Ack or
Nack an asynchronous event (AE) for too many retries is generated. Add
code to handle the AE and set the correct IB WC error code
IB_WC_RETRY_EXC_ERR.

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20210916191222.824-4-shiraz.saleem@intel.com
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 14:13:23 -03:00
Sindhu Devale
f4475f2494 RDMA/irdma: Validate number of CQ entries on create CQ
Add lower bound check for CQ entries at creation time.

Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs")
Link: https://lore.kernel.org/r/20210916191222.824-3-shiraz.saleem@intel.com
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 14:13:23 -03:00
Sindhu Devale
5b1e985f76 RDMA/irdma: Skip CQP ring during a reset
Due to duplicate reset flags, CQP commands are processed during reset.

This leads CQP failures such as below:

 irdma0: [Delete Local MAC Entry Cmd Error][op_code=49] status=-27 waiting=1 completion_err=0 maj=0x0 min=0x0

Remove the redundant flag and set the correct reset flag so CPQ is paused
during reset

Fixes: 8498a30e1b94 ("RDMA/irdma: Register auxiliary driver and implement private channel OPs")
Link: https://lore.kernel.org/r/20210916191222.824-2-shiraz.saleem@intel.com
Reported-by: LiLiang <liali@redhat.com>
Signed-off-by: Sindhu Devale <sindhu.devale@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 14:13:22 -03:00
Selvin Xavier
6bda39149d RDMA/bnxt_re: Check if the vlan is valid before reporting
When VF is configured with default vlan, HW strips the vlan from the
packet and driver receives it in Rx completion. VLAN needs to be reported
for UD work completion only if the vlan is configured on the host. Add a
check for valid vlan in the UD receive path.

Link: https://lore.kernel.org/r/1631709163-2287-12-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 13:37:02 -03:00
Selvin Xavier
7a3c3a121e RDMA/bnxt_re: Correct FRMR size calculation
FRMR WQE requires to provide the log2 value of the PBL and page size.  Use
the standard ilog2() to calculate the log2 value

Link: https://lore.kernel.org/r/1631709163-2287-11-git-send-email-selvin.xavier@broadcom.com
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 13:37:02 -03:00
Selvin Xavier
690ea7fe00 RDMA/bnxt_re: Use GFP_KERNEL in non atomic context
Use GFP_KERNEL instead of GFP_ATOMIC while allocating control path
structures which will be only called from non atomic context

Link: https://lore.kernel.org/r/1631709163-2287-10-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 13:37:02 -03:00
Selvin Xavier
2b4ccce6ca RDMA/bnxt_re: Fix FRMR issue with single page MR allocation
When the FRMR is allocated with single page, driver is attempting to
create a level 0 HWQ and not allocating any page because the nopte field
is set. This causes the crash during post_send as the pbl is not
populated.

To avoid this crash, check for the nopte bit during HWQ creation with
single page and create a level 1 page table and populate the pbl address
correctly.

Link: https://lore.kernel.org/r/1631709163-2287-9-git-send-email-selvin.xavier@broadcom.com
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 13:37:02 -03:00
Selvin Xavier
598d16fa1b RDMA/bnxt_re: Fix query SRQ failure
Fill the missing parameters for the FW command while querying SRQ.

Fixes: 37cb11acf1f7 ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters")
Link: https://lore.kernel.org/r/1631709163-2287-8-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 13:37:02 -03:00
Selvin Xavier
d195ff03bf RDMA/bnxt_re: Suppress unwanted error messages
Terminal CQEs are expected during QP destroy. Avoid the unwanted error
messages.

Link: https://lore.kernel.org/r/1631709163-2287-7-git-send-email-selvin.xavier@broadcom.com
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 13:37:01 -03:00
Selvin Xavier
6a7296c918 RDMA/bnxt_re: Support multiple page sizes
HW can support multiple page sizes. Enable bits for enabling sizes from 4k
to 1G by reporting page_size_cap.

Link: https://lore.kernel.org/r/1631709163-2287-6-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 13:37:01 -03:00
Selvin Xavier
b9b43ad3ce RDMA/bnxt_re: Reduce the delay in polling for hwrm command completion
Driver has 1ms delay between the polling for atomic command completion.
Polling immediately after issuing command usually doesn't report any
completions. So all commands in the blocking path needs two iterations. So
effectively 1ms spend on each command. HW requires much lesser time for
each command. So reduce the delay to 1us and increase the iteration count
to wait for the same time.

Link: https://lore.kernel.org/r/1631709163-2287-5-git-send-email-selvin.xavier@broadcom.com
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 13:37:01 -03:00