linux

iv/linux

Author	SHA1	Message	Date
Yixian Liu	17475e104d	RDMA/hns: Bugfix for memory window mtpt configuration When a memory window is bound to a memory region, the local write access should be set for its mtpt table. Fixes: `c7c2819140` ("RDMA/hns: Add MW support for hip08") Link: https://lore.kernel.org/r/1606386372-21094-1-git-send-email-liweihang@huawei.com Signed-off-by: Yixian Liu <liuyixian@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-11-26 10:57:32 -04:00
Wenpeng Liang	ab6f7248cc	RDMA/hns: Fix retry_cnt and rnr_cnt when querying QP The maximum number of retransmission should be returned when querying QP, not the value of retransmission counter. Fixes: `99fcf82521` ("RDMA/hns: Fix the wrong value of rnr_retry when querying qp") Fixes: `926a01dc00` ("RDMA/hns: Add QP operations support for hip08 SoC") Link: https://lore.kernel.org/r/1606382977-21431-1-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-11-26 10:57:32 -04:00
Wenpeng Liang	ebed7b7ca4	RDMA/hns: Fix wrong field of SRQ number the device supports The SRQ capacity is got from the firmware, whose field should be ended at bit 19. Fixes: `ba6bb7e974` ("RDMA/hns: Add interfaces to get pf capabilities from firmware") Link: https://lore.kernel.org/r/1606382812-23636-1-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-11-26 10:57:32 -04:00
Jason Gunthorpe	e0477b34d9	RDMA: Explicitly pass in the dma_device to ib_register_device The code in setup_dma_device has become rather convoluted, move all of this to the drivers. Drives now pass in a DMA capable struct device which will be used to setup DMA, or drivers must fully configure the ibdev for DMA and pass in NULL. Other than setting the masks in rvt all drivers were doing this already anyhow. mthca, mlx4 and mlx5 were already setting up maximum DMA segment size for DMA based on their hardweare limits in: __mthca_init_one() dma_set_max_seg_size (1G) __mlx4_init_one() dma_set_max_seg_size (1G) mlx5_pci_init() set_dma_caps() dma_set_max_seg_size (2G) Other non software drivers (except usnic) were extended to UINT_MAX [1, 2] instead of 2G as was before. [1] https://lore.kernel.org/linux-rdma/20200924114940.GE9475@nvidia.com/ [2] https://lore.kernel.org/linux-rdma/20200924114940.GE9475@nvidia.com/ Link: https://lore.kernel.org/r/20201008082752.275846-1-leon@kernel.org Link: https://lore.kernel.org/r/6b2ed339933d066622d5715903870676d8cc523a.1602590106.git.mchehab+huawei@kernel.org Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-10-16 13:53:46 -03:00
Lang Cheng	cf4c0fb00d	RDMA/hns: Remove unused variables and definitions Some code was removed but the variables were still there, and some parameters have been changed to be queried from firmware. So the definitions of them are no longer needed. Fixes: `2a3d923f87` ("RDMA/hns: Replace magic numbers with #defines") Fixes: `82e620d9c3` ("RDMA/hns: Modify the data structure of hns_roce_av") Fixes: `8254746978` ("IB/hns: Implement the add_gid/del_gid and optimize the GIDs management") Fixes: `21b97f5387` ("RDMA/hns: Fixup qp release bug") Link: https://lore.kernel.org/r/1601371934-40003-1-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-29 14:01:20 -03:00
Leon Romanovsky	b925c555a1	RDMA/drivers: Remove udata check from special QP GSI QP can't be created from the user space, hence the udata check is always false (udata == NULL). Remove that check and simplify the flow. Link: https://lore.kernel.org/r/20200926102450.2966017-9-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-29 13:11:06 -03:00
Weihang Li	30b707886a	RDMA/hns: Support inline data in extented sge space for RC HIP08 supports RC inline up to size of 32 Bytes, and all data should be put into SQWQE. For HIP09, this capability is extended to 1024 Bytes, if length of data is longer than 32 Bytes, they will be filled into extended sge space. Link: https://lore.kernel.org/r/1599744069-9968-1-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-24 16:06:57 -03:00
Weihang Li	05df49279f	RDMA/hns: Fix missing sq_sig_type when querying QP The sq_sig_type field should be filled when querying QP, or the users may get a wrong value. Fixes: `926a01dc00` ("RDMA/hns: Add QP operations support for hip08 SoC") Link: https://lore.kernel.org/r/1600509802-44382-9-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-24 15:56:27 -03:00
Weihang Li	fbed9d2be2	RDMA/hns: Fix configuration of ack_req_freq in QPC The hardware will add AckReq flag in BTH header according to the value of ack_req_freq to request ACK from responder for the packets with this flag. It should be greater than or equal to lp_pktn_ini instead of using a fixed value. Fixes: `7b9bd73ed1` ("RDMA/hns: Fix wrong assignment of lp_pktn_ini in QPC") Link: https://lore.kernel.org/r/1600509802-44382-8-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-24 15:56:27 -03:00
Wenpeng Liang	99fcf82521	RDMA/hns: Fix the wrong value of rnr_retry when querying qp The rnr_retry returned to the user is not correct, it should be got from another fields in QPC. Fixes: `bfe860351e` ("RDMA/hns: Fix cast from or to restricted __le32 for driver") Link: https://lore.kernel.org/r/1600509802-44382-7-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-24 15:56:27 -03:00
Jiaran Zhang	768202a082	RDMA/hns: Solve the overflow of the calc_pg_sz() calc_pg_sz() may gets a data calculation overflow if the PAGE_SIZE is 64 KB and hop_num is 2. It is because that all variables involved in calculation are defined in type of int. So change the type of bt_chunk_size, buf_chunk_size and obj_per_chunk_default to u64. Fixes: `ba6bb7e974` ("RDMA/hns: Add interfaces to get pf capabilities from firmware") Link: https://lore.kernel.org/r/1600509802-44382-6-git-send-email-liweihang@huawei.com Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-24 15:56:27 -03:00
Jiaran Zhang	172505cfa3	RDMA/hns: Add check for the validity of sl configuration According to the RoCE v1 specification, the sl (service level) 0-7 are mapped directly to priorities 0-7 respectively, sl 8-15 are reserved. The driver should verify whether the the value of sl is larger than 7, if so, an exception should be returned. Fixes: `926a01dc00` ("RDMA/hns: Add QP operations support for hip08 SoC") Link: https://lore.kernel.org/r/1600509802-44382-5-git-send-email-liweihang@huawei.com Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-24 15:56:27 -03:00
Lang Cheng	c19893fd9c	RDMA/hns: Correct typo of hns_roce_create_cq() Change "initialze" to "initialize". Fixes: `8f3e9f3ea0` ("IB/hns: Add code for refreshing CQ CI using TPTR") Link: https://lore.kernel.org/r/1600509802-44382-4-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-24 15:56:27 -03:00
Yangyang Li	221109e643	RDMA/hns: Add interception for resizing SRQs HIP08 doesn't support modifying the maximum number of outstanding WR in an SRQ. However, the driver does not return a failure message, and users may mistakenly think that the resizing is executed successfully. So the driver needs to intercept this operation. Fixes: `ffb1308b88` ("RDMA/hns: Move SRQ code to the reasonable place") Link: https://lore.kernel.org/r/1600509802-44382-3-git-send-email-liweihang@huawei.com Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-24 15:56:26 -03:00
Weihang Li	12542f1de1	RDMA/hns: Refactor process about opcode in post_send() According to the IB specifications, the verbs should return an immediate error when the users set an unsupported opcode. Furthermore, refactor codes about opcode in process of post_send to make the difference between opcodes clearer. Link: https://lore.kernel.org/r/1600509802-44382-2-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-24 15:56:26 -03:00
Yangyang Li	3cb2c996c9	RDMA/hns: Add support for SCCC in size of 64 Bytes For HIP09, size of SCCC (Soft Congestion Control Context) is increased to 64 Bytes from 32 Bytes. The hardware will get the configuration of SCCC from driver instead of using a fixed value. Link: https://lore.kernel.org/r/1600245806-56321-5-git-send-email-liweihang@huawei.com Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-24 15:35:11 -03:00
Wenpeng Liang	98912ee82a	RDMA/hns: Add support for QPC in size of 512 Bytes The new version of RoCEE supports using QPC in size of 256B or 512B, so that HIP09 can supports new congestion control algorithms by using QPC in larger size. Link: https://lore.kernel.org/r/1600245806-56321-4-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-24 15:35:11 -03:00
Wenpeng Liang	09a5f210f6	RDMA/hns: Add support for CQE in size of 64 Bytes The new version of RoCEE supports using CQE in size of 32B or 64B. The performance of bus can be improved by using larger size of CQE. Link: https://lore.kernel.org/r/1600245806-56321-3-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-24 15:35:11 -03:00
Wenpeng Liang	247fc16d73	RDMA/hns: Add support for EQE in size of 64 Bytes The new version of RoCEE supports using CEQE in size of 4B or 64B, AEQE in size of 16B or 64B. The performance of bus can be improved by using larger size of EQE. Link: https://lore.kernel.org/r/1600245806-56321-2-git-send-email-liweihang@huawei.com Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-24 15:35:10 -03:00
Lijun Ou	22d3e1ed2c	RDMA/hns: Set the unsupported wr opcode hip06 does not support IB_WR_LOCAL_INV, so the ps_opcode should be set to an invalid value instead of being left uninitialized. Fixes: `9a4435375c` ("IB/hns: Add driver files for hns RoCE driver") Fixes: `a2f3d4479f` ("RDMA/hns: Avoid unncessary initialization") Link: https://lore.kernel.org/r/1600350615-115217-1-git-send-email-oulijun@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-18 12:52:58 -03:00
Leon Romanovsky	d18bb3e152	RDMA: Clean MW allocation and free flows Move allocation and destruction of memory windows under ib_core responsibility and clean drivers to ensure that no updates to MW ib_core structures are done in driver layer. Link: https://lore.kernel.org/r/20200902081623.746359-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-17 14:04:32 -03:00
Jason Gunthorpe	cf9ce3c8ab	RDMA/hns: Use ib_umem_num_dma_blocks() instead of opencoding mtr_umem_page_count() does the same thing, replace it with the core code. Also, ib_umem_find_best_pgsz() should always be called to check that the umem meets the page_size requirement. If there is a limited set of page_sizes that work it the pgsz_bitmap should be set to that set. 0 is a failure and the umem cannot be used. Lightly tidy the control flow to implement this flow properly. Link: https://lore.kernel.org/r/12-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com Acked-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-11 10:24:54 -03:00
Jason Gunthorpe	ebc24096c4	RDMA/umem: Add rdma_umem_for_each_dma_block() This helper does the same as rdma_for_each_block(), except it works on a umem. This simplifies most of the call sites. Link: https://lore.kernel.org/r/4-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com> Acked-by: Shiraz Saleem <shiraz.saleem@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-09 15:33:17 -03:00
Leon Romanovsky	43d781b9fa	RDMA: Allow fail of destroy CQ Like any other verbs objects, CQ shouldn't fail during destroy, but mlx5_ib didn't follow this contract with mixed IB verbs objects with DEVX. Such mix causes to the situation where FW and kernel are fully interdependent on the reference counting of each side. Kernel verbs and drivers that don't have DEVX flows shouldn't fail. Fixes: `e39afe3d6d` ("RDMA: Convert CQ allocations to be under core responsibility") Link: https://lore.kernel.org/r/20200907120921.476363-7-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-09 14:14:29 -03:00
Leon Romanovsky	119181d1d4	RDMA: Restore ability to fail on SRQ destroy In similar way to other IB objects, restore the ability to return error on SRQ destroy. Strictly speaking, this change is not necessary, and provided here to ensure a symmetrical interface like other destroy functions. Fixes: `68e326dea1` ("RDMA: Handle SRQ allocations by IB/core") Link: https://lore.kernel.org/r/20200907120921.476363-5-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-09 14:14:24 -03:00
Leon Romanovsky	9a9ebf8cd7	RDMA: Restore ability to fail on AH destroy Like any other IB verbs objects, AH are refcounted by ib_core. The release of those objects are controlled by ib_core with promise that AH destroy can't fail. Being SW object for now, this change makes dealloc_ah() to behave like any other destroy IB flows. Fixes: `d345691471` ("RDMA: Handle AH allocations by IB/core") Link: https://lore.kernel.org/r/20200907120921.476363-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-09 13:57:22 -03:00
Leon Romanovsky	91a7c58fce	RDMA: Restore ability to fail on PD deallocate The IB verbs objects are counted by the kernel and ib_core ensures that deallocate PD will success so it will be called once all other objects that depends on PD will be released. This is achieved by managing various reference counters on such objects. The mlx5 driver didn't follow this standard flow when allowed DEVX objects that are not managed by ib_core to be interleaved with the ones under ib_core responsibility. In such interleaved scenarios deallocate command can fail and ib_core will leave uobject in internal DB and attempt to clean it later to free resources anyway. This change partially restores returned value from dealloc_pd() for all drivers, but keeping in mind that non-DEVX devices and kernel verbs paths shouldn't fail. Fixes: `21a428a019` ("RDMA: Handle PD allocations by IB/core") Link: https://lore.kernel.org/r/20200907120921.476363-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-09 13:57:22 -03:00
Lijun Ou	a2f3d4479f	RDMA/hns: Avoid unncessary initialization Some variables have been initialized when used. As a result, here removes some unncessary initial assignment. Link: https://lore.kernel.org/r/1599547944-30671-1-git-send-email-oulijun@huawei.com Signed-off-by: Lijun Ou <oulijun@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-09 13:23:15 -03:00
Jason Gunthorpe	6989aa62d3	Merge tag 'v5.9-rc3' into rdma.git for-next Required due to dependencies in following patches. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-08-31 12:28:12 -03:00
Weihang Li	074bf2c2c7	RDMA/hns: Get udp sport num dynamically instead of using a fixed value The UDP source port number in RoCE v2 is used to create entropy for network routers (ECMP), load balancers and 802.3ad link aggregation switching that are not aware of RoCE IB headers. Considering that the IB core has achieved a new interface to get a hashed value of it, the fixed value of it in QPC and UD WQE in hns driver could be fixed and the port number is to be set dynamically now. For QPC of RC, the value could be hashed from flow_lable if the user pass it in or from remote qpn and local qpn. For WQE of UD, it is set according to fl or as a random value. Link: https://lore.kernel.org/r/1598002289-8611-1-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-08-31 12:03:17 -03:00
Lang Cheng	e0ef0f68c4	RDMA/hns: Add a check for current state before modifying QP It should be considered an illegal operation if the ULP attempts to modify a QP from another state to the current hardware state. Otherwise, the ULP can modify some fields of QPC at any time. For example, for a QP in state of RTS, modify it from RTR to RTS can change the PSN, which is always not as expected. Fixes: `9a4435375c` ("IB/hns: Add driver files for hns RoCE driver") Link: https://lore.kernel.org/r/1598353674-24270-1-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-08-27 09:46:07 -03:00
Weihang Li	6da06c6291	Revert "RDMA/hns: Reserve one sge in order to avoid local length error" This patch caused some issues on SEND operation, and it should be reverted to make the drivers work correctly. There will be a better solution that has been tested carefully to solve the original problem. This reverts commit `711195e57d`. Fixes: `711195e57d` ("RDMA/hns: Reserve one sge in order to avoid local length error") Link: https://lore.kernel.org/r/1597829984-20223-1-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-08-20 08:35:19 -03:00
Colin Ian King	d963c524a4	RDMA/hns: Fix spelling mistake "epmty" -> "empty" There is a spelling mistake in a dev_dbg message. Fix it. Link: https://lore.kernel.org/r/20200805141111.22804-1-colin.king@canonical.com Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-08-18 13:18:53 -03:00
Linus Torvalds	d7806bbd22	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma updates from Jason Gunthorpe: "A quiet cycle after the larger 5.8 effort. Substantially cleanup and driver work with a few smaller features this time. - Driver updates for hfi1, rxe, mlx5, hns, qedr, usnic, bnxt_re - Removal of dead or redundant code across the drivers - RAW resource tracker dumps to include a device specific data blob for device objects to aide device debugging - Further advance the IOCTL interface, remove the ability to turn it off. Add QUERY_CONTEXT, QUERY_MR, and QUERY_PD commands - Remove stubs related to devices with no pkey table - A shared CQ scheme to allow multiple ULPs to share the CQ rings of a device to give higher performance - Several more static checker, syzkaller and rare crashers fixed" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (121 commits) RDMA/mlx5: Fix flow destination setting for RDMA TX flow table RDMA/rxe: Remove pkey table RDMA/umem: Add a schedule point in ib_umem_get() RDMA/hns: Fix the unneeded process when getting a general type of CQE error RDMA/hns: Fix error during modify qp RTS2RTS RDMA/hns: Delete unnecessary memset when allocating VF resource RDMA/hns: Remove redundant parameters in set_rc_wqe() RDMA/hns: Remove support for HIP08_A RDMA/hns: Refactor hns_roce_v2_set_hem() RDMA/hns: Remove redundant hardware opcode definitions RDMA/netlink: Remove CAP_NET_RAW check when dump a raw QP RDMA/include: Replace license text with SPDX tags RDMA/rtrs: remove WQ_MEM_RECLAIM for rtrs_wq RDMA/rtrs-clt: add an additional random 8 seconds before reconnecting RDMA/cma: Execute rdma_cm destruction from a handler properly RDMA/cma: Remove unneeded locking for req paths RDMA/cma: Using the standard locking pattern when delivering the removal event RDMA/cma: Simplify DEVICE_REMOVAL for internal_id RDMA/efa: Add EFA 0xefa1 PCI ID RDMA/efa: User/kernel compatibility handshake mechanism ...	2020-08-06 16:43:36 -07:00
Xi Wang	395f2e8fd3	RDMA/hns: Fix the unneeded process when getting a general type of CQE error If the hns ROCEE reports a general error CQE (types not specified by the IB General Specifications), it's no need to change the QP state to error, and the driver should just skip it. Fixes: `7c044adca2` ("RDMA/hns: Simplify the cqe code of poll cq") Link: https://lore.kernel.org/r/1595932941-40613-8-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-30 15:11:49 -03:00
Lang Cheng	4327bd2c41	RDMA/hns: Fix error during modify qp RTS2RTS One qp state migrations legal configuration was deleted mistakenly. Fixes: `357f342946` ("RDMA/hns: Simplify the state judgment code of qp") Link: https://lore.kernel.org/r/1595932941-40613-7-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-30 15:11:49 -03:00
Lang Cheng	a5531e9b70	RDMA/hns: Delete unnecessary memset when allocating VF resource The hns_roce_cmq_setup_basic_desc() can clear the whole desc, so removes these redundant memset operations. Link: https://lore.kernel.org/r/1595932941-40613-6-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-30 15:11:49 -03:00
Weihang Li	eaaa98dedf	RDMA/hns: Remove redundant parameters in set_rc_wqe() There are some functions called by set_rc_wqe() use two parameters: "void wqe" and "struct hns_roce_v2_rc_send_wqe rc_sq_wqe", but the first one can be got from the second one. So remove the redundant wqe from related functions. Link: https://lore.kernel.org/r/1595932941-40613-5-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-30 11:31:02 -03:00
Lang Cheng	a247fd28c1	RDMA/hns: Remove support for HIP08_A HIP08_A is an temporary version and all features of it are supported by HIP08_B. So remove the relevant code. Link: https://lore.kernel.org/r/1595932941-40613-4-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-30 11:31:02 -03:00
Weihang Li	cdc1f3e946	RDMA/hns: Refactor hns_roce_v2_set_hem() The parts about preparing and sending mailbox to hardware is not strongly related to other codes in hns_roce_v2_set_hem(), and can be encapsulated into a separate function. Link: https://lore.kernel.org/r/1595932941-40613-3-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-30 11:31:02 -03:00
Lang Cheng	57005c96b7	RDMA/hns: Remove redundant hardware opcode definitions HNS_ROCE_SQ_OPCODE_XXXs and HNS_ROCE_V2_WQE_OP_XXXs have same values, so remove a set of redundant definitions. In addition, remove the suffix of HNS_ROCE_V2_WQE_OP_BIND_MW_TYPE. Link: https://lore.kernel.org/r/1595932941-40613-2-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-30 11:31:02 -03:00
Xi Wang	79d5208386	RDMA/hns: Fix wrong PBL offset when VA is not aligned to PAGE_SIZE ROCE uses "VA % buf_page_size" to caclulate the offset in the PBL's first page, the actual PA corresponding to the MR's VA is equal to MR's PA plus this offset. The first PA in PBL has already been aligned to PAGE_SIZE after calling ib_umem_get(), but the MR's VA may not. If the buf_page_size is smaller than the PAGE_SIZE, this will lead the HW to access the wrong memory because the offset is smaller than expected. Fixes: `9b2cf76c9f` ("RDMA/hns: Optimize PBL buffer allocation process") Link: https://lore.kernel.org/r/1594726935-45666-1-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-16 09:55:01 -03:00
Weihang Li	7b9bd73ed1	RDMA/hns: Fix wrong assignment of lp_pktn_ini in QPC The RoCE Engine will schedule to another QP after one has sent (2 ^ lp_pktn_ini) packets. lp_pktn_ini is set in QPC and should be calculated from 2 factors: 1. current MTU as a integer 2. the RoCE Engine's maximum slice length 64KB But the driver use MTU as a enum ib_mtu and the max inline capability, the lp_pktn_ini will be much bigger than expected which may cause traffic of some QPs to never get scheduled. Fixes: `b713128de7` ("RDMA/hns: Adjust lp_pktn_ini dynamically") Link: https://lore.kernel.org/r/1594726138-49294-1-git-send-email-liweihang@huawei.com Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-16 09:52:14 -03:00
Xi Wang	cc33b23e1e	RDMA/hns: Optimize MTR level-0 addressing to access huge page If hns ROCEE is set to level-0 addressing, the length of the entire buffer can be used as the page size. The driver needn't to split the buffer into small units because all pages are continuous. Link: https://lore.kernel.org/r/1593525696-12570-1-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-07 13:15:10 -03:00
Gal Pressman	42a3b15396	RDMA: Remove the udata parameter from alloc_mr callback Allocating an MR flow can only be initiated by kernel users, and not from userspace so a udata parameter is redundant. Link: https://lore.kernel.org/r/20200706120343.10816-4-galpress@amazon.com Signed-off-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-06 19:25:53 -03:00
Maor Gottlieb	9e2a187a93	RDMA: Add a dedicated CQ resource tracker function In order to avoid double multiplexing of the resource when it is a CQ, add a dedicated callback function. Link: https://lore.kernel.org/r/20200623113043.1228482-6-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-06-23 11:46:27 -03:00
Yangyang Li	3ec5f54f7a	RDMA/hns: Fix an cmd queue issue when resetting If a IMP reset caused by some hardware errors and hns RoCE driver reset occurred at the same time, there is a possiblity that the IMP will stop dealing with command and users can't use the hardware. The logs are as follows: hns3 0000:fd:00.1: cleaned 0, need to clean 1 hns3 0000:fd:00.1: firmware version query failed -11 hns3 0000:fd:00.1: Cmd queue init failed hns3 0000:fd:00.1: Upgrade reset level hns3 0000:fd:00.1: global reset interrupt The hns NIC driver divides the reset process into 3 status: initialization, hardware resetting and softwaring restting. RoCE driver gets reset status by interfaces provided by NIC driver and commands will not be sent to the IMP if the driver is in any above status. The main reason for this issue is that there is a time gap between status 1 and 2, if the RoCE driver sends commands to the IMP during this gap, the IMP will stop working because it is not ready. To eliminate the time gap, the hns NIC driver has added a new interface in commit `a4de02287a` ("net: hns3: provide .get_cmdq_stat interface for the client"), so RoCE driver can ensure that no commands will be sent during resetting. Link: https://lore.kernel.org/r/1592314778-52822-1-git-send-email-liweihang@huawei.com Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-06-18 10:48:39 -03:00
Yangyang Li	98a6151907	RDMA/hns: Fix a calltrace when registering MR from userspace ibmr.device is assigned after MR is successfully registered, but both write_mtpt() and frmr_write_mtpt() accesses it during the mr registration process, which may cause the following error when trying to register MR in userspace and pbl_hop_num is set to 0. pc : hns_roce_mtr_find+0xa0/0x200 [hns_roce] lr : set_mtpt_pbl+0x54/0x118 [hns_roce_hw_v2] sp : ffff00023e73ba20 x29: ffff00023e73ba20 x28: ffff00023e73bad8 x27: 0000000000000000 x26: 0000000000000000 x25: 0000000000000002 x24: 0000000000000000 x23: ffff00023e73bad0 x22: 0000000000000000 x21: ffff0000094d9000 x20: 0000000000000000 x19: ffff8020a6bdb2c0 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: 0000000000000000 x13: 0140000000000000 x12: 0040000000000041 x11: ffff000240000000 x10: 0000000000001000 x9 : 0000000000000000 x8 : ffff802fb7558480 x7 : ffff802fb7558480 x6 : 000000000003483d x5 : ffff00023e73bad0 x4 : 0000000000000002 x3 : ffff00023e73bad8 x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0000094d9708 Call trace: hns_roce_mtr_find+0xa0/0x200 [hns_roce] set_mtpt_pbl+0x54/0x118 [hns_roce_hw_v2] hns_roce_v2_write_mtpt+0x14c/0x168 [hns_roce_hw_v2] hns_roce_mr_enable+0x6c/0x148 [hns_roce] hns_roce_reg_user_mr+0xd8/0x130 [hns_roce] ib_uverbs_reg_mr+0x14c/0x2e0 [ib_uverbs] ib_uverbs_write+0x27c/0x3e8 [ib_uverbs] __vfs_write+0x60/0x190 vfs_write+0xac/0x1c0 ksys_write+0x6c/0xd8 __arm64_sys_write+0x24/0x30 el0_svc_common+0x78/0x130 el0_svc_handler+0x38/0x78 el0_svc+0x8/0xc Solve above issue by adding a pointer of structure hns_roce_dev as a parameter of write_mtpt() and frmr_write_mtpt(), so that both of these functions can access it before finishing MR's registration. Fixes: `9b2cf76c9f` ("RDMA/hns: Optimize PBL buffer allocation process") Link: https://lore.kernel.org/r/1592314629-51715-1-git-send-email-liweihang@huawei.com Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-06-18 10:47:04 -03:00
Masahiro Yamada	a7f7f6248d	treewide: replace '---help---' in Kconfig files with 'help' Since commit `84af7a6194` ("checkpatch: kconfig: prefer 'help' over '---help---'"), the number of '---help---' has been gradually decreasing, but there are still more than 2400 instances. This commit finishes the conversion. While I touched the lines, I also fixed the indentation. There are a variety of indentation styles found. a) 4 spaces + '---help---' b) 7 spaces + '---help---' c) 8 spaces + '---help---' d) 1 space + 1 tab + '---help---' e) 1 tab + '---help---' (correct indentation) f) 1 tab + 1 space + '---help---' g) 1 tab + 2 spaces + '---help---' In order to convert all of them to 1 tab + 'help', I ran the following commend: $ find . -name 'Kconfig' \| xargs sed -i 's/^[[:space:]]---help---/\thelp/' Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>	2020-06-14 01:57:21 +09:00
Dan Carpenter	87d9e56849	RDMA/hns: Uninitialized variable in modify_qp_init_to_rtr() The "dmac" variable is used before it is initialized. Fixes: `494c3b3122` ("RDMA/hns: Refactor the QP context filling process related to WQE buffer configure") Link: https://lore.kernel.org/r/20200529083918.GA1298465@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>	2020-06-02 20:32:54 -03:00

1 2 3 4 5 ...

638 Commits