5010 Commits

Author SHA1 Message Date
Henry Orosco
a8b9234b12 i40iw: Refactor of driver generated AEs
The flush CQP OP can be used to optionally generate
Asynchronous Events (AEs) in addition to QP flush.
Consolidate all HW AE generation code under a new
function, i40iw_gen_ae, which uses the flush CQP OP
only to generate AEs.

Signed-off-by: Henry Orosco <henry.orosco@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-15 15:58:04 -06:00
Jason Gunthorpe
7f86260b5f RDMA/cxgb4: Use structs to describe the uABI instead of opencoding
Open coding a loose value is not acceptable for describing the uABI in
RDMA. Provide the missing struct.

Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-15 15:58:04 -06:00
Jason Gunthorpe
633fb4d9fd RDMA/hns: Use structs to describe the uABI instead of opencoding
Open coding a loose value is not acceptable for describing the uABI in
RDMA. Provide the missing struct.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-15 15:58:04 -06:00
Jason Gunthorpe
9a657b4c4a RDMA/i40iw: Move uapi header to include/uapi
All of these defines are part of the uABI for the driver, this
header duplicates providers/i40iw/i40iw-abi.h in rdma-core.

Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-15 15:58:03 -06:00
Jason Gunthorpe
48962f5c6f RDMA/mlx4: Move flag constants to uapi header
MLX4_USER_DEV_CAP_LARGE_CQE (via mlx4_ib_alloc_ucontext_resp.dev_caps)
and MLX4_IB_QUERY_DEV_RESP_MASK_CORE_CLOCK_OFFSET (via
mlx4_uverbs_ex_query_device_resp.comp_mask) are copied directly to
userspace and form part of the uAPI.

Move them to the uapi header where they belong.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-15 15:58:03 -06:00
Sinan Kaya
561e5d4896 RDMA/qedr: eliminate duplicate barriers on weakly-ordered archs
The code includes wmb() followed by writel() in multiple places. writel()
already has a barrier on some architectures like arm64.

As a result, the CPU observes two barriers back to back before executing
the register write.

Since the code already has an explicit barrier call, change writel() to
writel_relaxed().

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-15 15:35:44 -06:00
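The barrier change in 561e5d4896 can be illustrated with a toy userspace model. This is only a sketch: the model_* helpers and the ring_doorbell_* functions are hypothetical stand-ins that count barriers, not the kernel's wmb(), writel() or writel_relaxed(), and the model assumes an architecture where writel() carries its own barrier.

```c
#include <stdint.h>

static unsigned int barriers;	/* counts barriers issued by the model */

static void model_wmb(void) { barriers++; }

/* Models an architecture where writel() carries its own barrier. */
static void model_writel(uint32_t val, volatile uint32_t *reg)
{
	model_wmb();
	*reg = val;
}

/* Relaxed variant: plain store, ordering is left to the caller. */
static void model_writel_relaxed(uint32_t val, volatile uint32_t *reg)
{
	*reg = val;
}

/* Before the change: explicit barrier plus the barrier inside writel(). */
static unsigned int ring_doorbell_before(volatile uint32_t *db, uint32_t val)
{
	barriers = 0;
	model_wmb();
	model_writel(val, db);
	return barriers;
}

/* After the change: the explicit barrier is the only one. */
static unsigned int ring_doorbell_after(volatile uint32_t *db, uint32_t val)
{
	barriers = 0;
	model_wmb();
	model_writel_relaxed(val, db);
	return barriers;
}
```

In this model both variants perform the same register store; only the number of barriers in front of it differs.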
Yixian Liu
7b48221cf4 RDMA/hns: Fix cqn type and init resp
This patch changes the type of cqn from u32 to u64 to keep
userspace and kernel consistent, initializes resp to zeros for
both cq and qp, and changes the outlen check to allow for
future caps extension.

Suggested-by: Jason Gunthorpe <jgg@mellanox.com>
Fixes: e088a685eae9 (hns: Support rq record doorbell for the user space)
Fixes: 9b44703d0a21 (hns: Support cq record doorbell for the user space)
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-15 15:34:26 -06:00
Parav Pandit
115b68aa6e IB/ocrdma: Removed GID add/del null routines
add_gid() and del_gid() are optional callback routines.
ib_core skips invoking them while updating GID table entries if
they are not implemented by provider drivers. Therefore remove the
empty implementations.

Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-15 15:17:48 -06:00
Leon Romanovsky
eeea6953c4 RDMA/mlx5: Simplify clean and destroy MR calls
A failure to destroy the MRs is already printed as an error by the
mlx5_core layer, which makes the warning prints redundant.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-15 10:59:58 -04:00
Leon Romanovsky
c985bd0ed7 RDMA/mlx5: Guard ODP specific assignments with specific CONFIG
"live" is needed for ODP only and is better guarded
by the appropriate CONFIG.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-15 10:59:58 -04:00
Leon Romanovsky
4638a3b242 RDMA/mlx5: Unify error flows in rereg MR failure paths
According to the IBTA spec 1.3, on a driver failure during
MR reregistration both the old and new MRs shall be released.

 C11-20: If the CI returns any other error, the CI shall
 invalidate both "old" and "new" registrations, and release
 any associated resources.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-15 10:59:58 -04:00
Leon Romanovsky
ea30f01376 RDMA/mlx5: Return proper value for not-supported command
Return -EOPNOTSUPP value to the user for unsupported reg_user_mr.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-15 10:59:58 -04:00
Leon Romanovsky
4289861d88 RDMA/mlx5: Protect from NULL pointer derefence
mlx5_ib_alloc_implicit_mr() can fail to acquire pages, in which
case the returned mr pointer won't be valid. Ensure that it
is not an error pointer before accessing it.

Cc: <stable@vger.kernel.org> # 4.10
Fixes: 81713d3788d2 ("IB/mlx5: Add implicit MR support")
Reported-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-15 10:59:58 -04:00
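The check described in 4289861d88 follows the kernel's error-pointer convention, in which a function returns either a valid pointer or a negative errno encoded in the pointer value. A minimal userspace sketch of that convention follows. The lower-case helpers mimic the kernel's ERR_PTR(), PTR_ERR() and IS_ERR() but are re-implemented here, and struct mr, alloc_implicit_mr() and reg_mr() are hypothetical stand-ins, not the driver's code.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Userspace re-implementations of the kernel's error-pointer helpers. */
static inline void *err_ptr(long error) { return (void *)(intptr_t)error; }
static inline long ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int is_err(const void *ptr)
{
	/* The top MAX_ERRNO values of the address space encode errnos. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct mr { int live; };

/* Hypothetical allocator: returns an error pointer, never NULL, on failure. */
static struct mr *alloc_implicit_mr(int fail)
{
	struct mr *mr;

	if (fail)
		return err_ptr(-ENOMEM);
	mr = calloc(1, sizeof(*mr));
	return mr ? mr : err_ptr(-ENOMEM);
}

/* Check the pointer before touching it; propagate the errno otherwise. */
static long reg_mr(int fail)
{
	struct mr *mr = alloc_implicit_mr(fail);

	if (is_err(mr))
		return ptr_err(mr);
	mr->live = 1;
	free(mr);
	return 0;
}
```

Without the is_err() test, the failing path would write through an address near the top of the address space rather than through a valid object.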
Doug Ledford
2d873449a2 Merge branch 'k.o/wip/dl-for-rc' into k.o/wip/dl-for-next
Due to bug fixes found by the syzkaller bot and taken into the for-rc
branch after development for the 4.17 merge window had already started
being taken into the for-next branch, there were fairly non-trivial
merge issues that would need to be resolved between the for-rc branch
and the for-next branch.  This merge resolves those conflicts and
provides a unified base upon which ongoing development for 4.17 can
be based.

Conflicts:
	drivers/infiniband/hw/mlx5/main.c - Commit 42cea83f9524
	(IB/mlx5: Fix cleanup order on unload) added to for-rc and
	commit b5ca15ad7e61 (IB/mlx5: Add proper representors support)
	added as part of the devel cycle both needed to modify the
	init/de-init functions used by mlx5.  To support the new
	representors, the new functions added by the cleanup patch
	needed to be made non-static, and the init/de-init list
	added by the representors patch needed to be modified to
	match the init/de-init list changes made by the cleanup
	patch.
Updates:
	drivers/infiniband/hw/mlx5/mlx5_ib.h - Update function
	prototypes added by representors patch to reflect new function
	names as changed by cleanup patch
	drivers/infiniband/hw/mlx5/ib_rep.c - Update init/de-init
	stage list to match new order from cleanup patch

Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-14 19:28:58 -04:00
Arnd Bergmann
bd8602ca42 infiniband: bnxt_re: use BIT_ULL() for 64-bit bit masks
On 32-bit targets, we otherwise get a warning about an impossible constant
integer expression:

In file included from include/linux/kernel.h:11,
                 from include/linux/interrupt.h:6,
                 from drivers/infiniband/hw/bnxt_re/ib_verbs.c:39:
drivers/infiniband/hw/bnxt_re/ib_verbs.c: In function 'bnxt_re_query_device':
include/linux/bitops.h:7:24: error: left shift count >= width of type [-Werror=shift-count-overflow]
 #define BIT(nr)   (1UL << (nr))
                        ^~
drivers/infiniband/hw/bnxt_re/bnxt_re.h:61:34: note: in expansion of macro 'BIT'
 #define BNXT_RE_MAX_MR_SIZE_HIGH BIT(39)
                                  ^~~
drivers/infiniband/hw/bnxt_re/bnxt_re.h:62:30: note: in expansion of macro 'BNXT_RE_MAX_MR_SIZE_HIGH'
 #define BNXT_RE_MAX_MR_SIZE  BNXT_RE_MAX_MR_SIZE_HIGH
                              ^~~~~~~~~~~~~~~~~~~~~~~~
drivers/infiniband/hw/bnxt_re/ib_verbs.c:149:25: note: in expansion of macro 'BNXT_RE_MAX_MR_SIZE'
  ib_attr->max_mr_size = BNXT_RE_MAX_MR_SIZE;
                         ^~~~~~~~~~~~~~~~~~~

Fixes: 872f3578241d ("RDMA/bnxt_re: Add support for MRs with Huge pages")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-14 18:24:13 -04:00
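The width problem fixed in bd8602ca42 can be reproduced in userspace. In the sketch below, BIT() and BIT_ULL() are local re-definitions written for illustration, and MAX_MR_SIZE_HIGH and max_mr_size() are hypothetical names rather than the driver's identifiers.

```c
#include <stdint.h>

/*
 * Local equivalents of the kernel macros, for illustration only.
 * On a 32-bit target unsigned long is 32 bits wide, so BIT(39) would
 * shift by more than the width of the type.
 */
#define BIT(nr)     (1UL << (nr))
#define BIT_ULL(nr) (1ULL << (nr))

/* Safe on 32-bit and 64-bit targets: the constant is unsigned long long. */
#define MAX_MR_SIZE_HIGH BIT_ULL(39)

static uint64_t max_mr_size(void)
{
	return MAX_MR_SIZE_HIGH;
}
```

Because the shifted constant is unsigned long long, the result does not depend on the width of unsigned long on the build target.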
Arnd Bergmann
5388a50847 infiniband: qplib_fp: fix pointer cast
Building for a 32-bit target results in a couple of warnings from casting
between a 32-bit pointer and a 64-bit integer:

drivers/infiniband/hw/bnxt_re/qplib_fp.c: In function 'bnxt_qplib_service_nq':
drivers/infiniband/hw/bnxt_re/qplib_fp.c:333:23: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
    bnxt_qplib_arm_srq((struct bnxt_qplib_srq *)q_handle,
                       ^
drivers/infiniband/hw/bnxt_re/qplib_fp.c:336:12: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
            (struct bnxt_qplib_srq *)q_handle,
            ^
In file included from include/linux/byteorder/little_endian.h:5,
                 from arch/arm/include/uapi/asm/byteorder.h:22,
                 from include/asm-generic/bitops/le.h:6,
                 from arch/arm/include/asm/bitops.h:342,
                 from include/linux/bitops.h:38,
                 from include/linux/kernel.h:11,
                 from include/linux/interrupt.h:6,
                 from drivers/infiniband/hw/bnxt_re/qplib_fp.c:39:
drivers/infiniband/hw/bnxt_re/qplib_fp.c: In function 'bnxt_qplib_create_srq':
include/uapi/linux/byteorder/little_endian.h:31:43: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
 #define __cpu_to_le64(x) ((__force __le64)(__u64)(x))
                                           ^
include/linux/byteorder/generic.h:86:21: note: in expansion of macro '__cpu_to_le64'
 #define cpu_to_le64 __cpu_to_le64
                     ^~~~~~~~~~~~~
drivers/infiniband/hw/bnxt_re/qplib_fp.c:569:19: note: in expansion of macro 'cpu_to_le64'
  req.srq_handle = cpu_to_le64(srq);

Using a uintptr_t as an intermediate works on all architectures.

Fixes: 37cb11acf1f7 ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-14 18:24:12 -04:00
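The uintptr_t intermediate mentioned in 5388a50847 can be sketched as a pair of conversion helpers. struct srq, srq_to_handle() and handle_to_srq() are hypothetical names chosen for this example and do not appear in the driver.

```c
#include <stdint.h>

struct srq { int id; };

/* Pointer to 64-bit handle: convert via uintptr_t so that no cast
 * changes size and representation in a single step. */
static uint64_t srq_to_handle(struct srq *srq)
{
	return (uint64_t)(uintptr_t)srq;
}

/* 64-bit handle to pointer: narrow to uintptr_t first, then cast. */
static struct srq *handle_to_srq(uint64_t handle)
{
	return (struct srq *)(uintptr_t)handle;
}
```

On a 32-bit target the first cast is pointer-sized and the second is a plain integer widening, so neither of the compiler diagnostics quoted in the commit message applies.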
Mark Bloch
42cea83f95 IB/mlx5: Fix cleanup order on unload
On load we create a private CQ/QP/PD for use by UMR. We create
those resources after we register ourselves as an IB device, and we destroy
them after we unregister as an IB device. This was changed by commit
16c1975f1032 ("IB/mlx5: Create profile infrastructure to add and remove
stages"), which moved the destruction before unregistration. This made it
possible to trigger an invalid memory access when unloading mlx5_ib while
there are open resources:

BUG: unable to handle kernel paging request at 00000001002c012c
...
Call Trace:
 mlx5_ib_post_send_wait+0x75/0x110 [mlx5_ib]
 __slab_free+0x9a/0x2d0
 delay_time_func+0x10/0x10 [mlx5_ib]
 unreg_umr.isra.15+0x4b/0x50 [mlx5_ib]
 mlx5_mr_cache_free+0x46/0x150 [mlx5_ib]
 clean_mr+0xc9/0x190 [mlx5_ib]
 dereg_mr+0xba/0xf0 [mlx5_ib]
 ib_dereg_mr+0x13/0x20 [ib_core]
 remove_commit_idr_uobject+0x16/0x70 [ib_uverbs]
 uverbs_cleanup_ucontext+0xe8/0x1a0 [ib_uverbs]
 ib_uverbs_cleanup_ucontext.isra.9+0x19/0x40 [ib_uverbs]
 ib_uverbs_remove_one+0x162/0x2e0 [ib_uverbs]
 ib_unregister_device+0xd4/0x190 [ib_core]
 __mlx5_ib_remove+0x2e/0x40 [mlx5_ib]
 mlx5_remove_device+0xf5/0x120 [mlx5_core]
 mlx5_unregister_interface+0x37/0x90 [mlx5_core]
 mlx5_ib_cleanup+0xc/0x225 [mlx5_ib]
 SyS_delete_module+0x153/0x230
 do_syscall_64+0x62/0x110
 entry_SYSCALL_64_after_hwframe+0x21/0x86
...

We restore the original behavior by breaking the UMR stage into two parts,
pre and post IB registration stages. This restores the original
functionality while maintaining a clean separation of logic between stages.

Fixes: 16c1975f1032 ("IB/mlx5: Create profile infrastructure to add and remove stages")
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-14 16:44:02 -04:00
Arnd Bergmann
baa00fcde4 RDMA/i40iw: include linux/irq.h
We get a build failure on ARM unless the header is included explicitly:

drivers/infiniband/hw/i40iw/i40iw_verbs.c: In function 'i40iw_get_vector_affinity':
drivers/infiniband/hw/i40iw/i40iw_verbs.c:2747:9: error: implicit declaration of function 'irq_get_affinity_mask'; did you mean 'irq_create_affinity_masks'? [-Werror=implicit-function-declaration]
  return irq_get_affinity_mask(msix_vec->irq);
         ^~~~~~~~~~~~~~~~~~~~~
         irq_create_affinity_masks
drivers/infiniband/hw/i40iw/i40iw_verbs.c:2747:9: error: returning 'int' from a function with return type 'const struct cpumask *' makes pointer from integer without a cast [-Werror=int-conversion]
  return irq_get_affinity_mask(msix_vec->irq);

Fixes: 7e952b19eb63 ("i40iw: Implement get_vector_affinity API")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-14 16:09:56 -04:00
Ilya Lesokhin
c44ef998f2 IB/mlx5: Maintain a single emergency page
The mlx5 driver needs to be able to issue invalidation to ODP MRs
even if it cannot allocate memory. To this end it preallocates
emergency pages to use when the situation arises.

This flow should be rare enough that we don't need
to worry about contention, and therefore a single emergency page
is good enough.

Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-14 16:05:16 -04:00
Daniel Jurgens
65edd0e758 IB/mlx5: Only synchronize RCU once when removing mkeys
Instead of synchronizing RCU in a loop when removing mkeys in a batch, do
it once at the end before freeing them. The result is waiting for only one
RCU grace period instead of many serially.

Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-14 16:05:16 -04:00
Leon Romanovsky
f3f134f526 RDMA/mlx5: Fix crash while accessing garbage pointer and freed memory
A failure in the rereg_mr flow caused a garbage value (an error value) to
be stored in the mr->umem pointer. This pointer is accessed at the release
stage, which causes the following crash.

It is not enough to simply set umem to NULL, because the MR struct
still needs to be accessed during the MR deregistration phase, so
delay the kfree too.

[    6.237617] BUG: unable to handle kernel NULL pointer dereference a 0000000000000228
[    6.238756] IP: ib_dereg_mr+0xd/0x30
[    6.239264] PGD 80000000167eb067 P4D 80000000167eb067 PUD 167f9067 PMD 0
[    6.240320] Oops: 0000 [#1] SMP PTI
[    6.240782] CPU: 0 PID: 367 Comm: dereg Not tainted 4.16.0-rc1-00029-gc198fafe0453 #183
[    6.242120] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[    6.244504] RIP: 0010:ib_dereg_mr+0xd/0x30
[    6.245253] RSP: 0018:ffffaf5d001d7d68 EFLAGS: 00010246
[    6.246100] RAX: 0000000000000000 RBX: ffff95d4172daf00 RCX: 0000000000000000
[    6.247414] RDX: 00000000ffffffff RSI: 0000000000000001 RDI: ffff95d41a317600
[    6.248591] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
[    6.249810] R10: ffff95d417033c10 R11: 0000000000000000 R12: ffff95d4172c3a80
[    6.251121] R13: ffff95d4172c3720 R14: ffff95d4172c3a98 R15: 00000000ffffffff
[    6.252437] FS:  0000000000000000(0000) GS:ffff95d41fc00000(0000) knlGS:0000000000000000
[    6.253887] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    6.254814] CR2: 0000000000000228 CR3: 00000000172b4000 CR4: 00000000000006b0
[    6.255943] Call Trace:
[    6.256368]  remove_commit_idr_uobject+0x1b/0x80
[    6.257118]  uverbs_cleanup_ucontext+0xe4/0x190
[    6.257855]  ib_uverbs_cleanup_ucontext.constprop.14+0x19/0x40
[    6.258857]  ib_uverbs_close+0x2a/0x100
[    6.259494]  __fput+0xca/0x1c0
[    6.259938]  task_work_run+0x84/0xa0
[    6.260519]  do_exit+0x312/0xb40
[    6.261023]  ? __do_page_fault+0x24d/0x490
[    6.261707]  do_group_exit+0x3a/0xa0
[    6.262267]  SyS_exit_group+0x10/0x10
[    6.262802]  do_syscall_64+0x75/0x180
[    6.263391]  entry_SYSCALL_64_after_hwframe+0x21/0x86
[    6.264253] RIP: 0033:0x7f1b39c49488
[    6.264827] RSP: 002b:00007ffe2de05b68 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
[    6.266049] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1b39c49488
[    6.267187] RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
[    6.268377] RBP: 00007f1b39f258e0 R08: 00000000000000e7 R09: ffffffffffffff98
[    6.269640] R10: 00007f1b3a147260 R11: 0000000000000246 R12: 00007f1b39f258e0
[    6.270783] R13: 00007f1b39f2ac20 R14: 0000000000000000 R15: 0000000000000000
[    6.271943] Code: 74 07 31 d2 e9 25 d8 6c 00 b8 da ff ff ff c3 0f 1f
44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 07 53 48 8b
5f 08 <48> 8b 80 28 02 00 00 e8 f7 d7 6c 00 85 c0 75 04 3e ff 4b 18 5b
[    6.274927] RIP: ib_dereg_mr+0xd/0x30 RSP: ffffaf5d001d7d68
[    6.275760] CR2: 0000000000000228
[    6.276200] ---[ end trace a35641f1c474bd20 ]---

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: <stable@vger.kernel.org>
Reported-by: Noa Osherovich <noaos@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-14 15:37:53 -04:00
Leon Romanovsky
fbf1795c96 RDMA/pvrdma: Properly annotate QP states
QP states provided by the core layer are converted to enum ib_qp_state,
so it is better to use an internal variable of that type instead of int.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-14 15:34:25 -04:00
Leon Romanovsky
75a4598209 RDMA/mlx5: Fix NULL dereference while accessing XRC_TGT QPs
mlx5 modify_qp() relies on FW to throw an error if a wrong
state is supplied. The missing check in FW causes the following crash
while using XRC_TGT QPs.

[   14.769632] BUG: unable to handle kernel NULL pointer dereference at (null)
[   14.771085] IP: mlx5_ib_modify_qp+0xf60/0x13f0
[   14.771894] PGD 800000001472e067 P4D 800000001472e067 PUD 14529067 PMD 0
[   14.773126] Oops: 0002 [#1] SMP PTI
[   14.773763] CPU: 0 PID: 365 Comm: ubsan Not tainted 4.16.0-rc1-00038-g8151138c0793 #119
[   14.775192] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[   14.777522] RIP: 0010:mlx5_ib_modify_qp+0xf60/0x13f0
[   14.778417] RSP: 0018:ffffbf48001c7bd8 EFLAGS: 00010246
[   14.779346] RAX: 0000000000000000 RBX: ffff9a8f9447d400 RCX: 0000000000000000
[   14.780643] RDX: 0000000000000000 RSI: 000000000000000a RDI: 0000000000000000
[   14.781930] RBP: 0000000000000000 R08: 00000000000217b0 R09: ffffffffbc9c1504
[   14.783214] R10: fffff4a180519480 R11: ffff9a8f94523600 R12: ffff9a8f9493e240
[   14.784507] R13: ffff9a8f9447d738 R14: 000000000000050a R15: 0000000000000000
[   14.785800] FS:  00007f545b466700(0000) GS:ffff9a8f9fc00000(0000) knlGS:0000000000000000
[   14.787073] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   14.787792] CR2: 0000000000000000 CR3: 00000000144be000 CR4: 00000000000006b0
[   14.788689] Call Trace:
[   14.789007]  _ib_modify_qp+0x71/0x120
[   14.789475]  modify_qp.isra.20+0x207/0x2f0
[   14.790010]  ib_uverbs_modify_qp+0x90/0xe0
[   14.790532]  ib_uverbs_write+0x1d2/0x3c0
[   14.791049]  ? __handle_mm_fault+0x93c/0xe40
[   14.791644]  __vfs_write+0x36/0x180
[   14.792096]  ? handle_mm_fault+0xc1/0x210
[   14.792601]  vfs_write+0xad/0x1e0
[   14.793018]  SyS_write+0x52/0xc0
[   14.793422]  do_syscall_64+0x75/0x180
[   14.793888]  entry_SYSCALL_64_after_hwframe+0x21/0x86
[   14.794527] RIP: 0033:0x7f545ad76099
[   14.794975] RSP: 002b:00007ffd78787468 EFLAGS: 00000287 ORIG_RAX: 0000000000000001
[   14.795958] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f545ad76099
[   14.797075] RDX: 0000000000000078 RSI: 0000000020009000 RDI: 0000000000000003
[   14.798140] RBP: 00007ffd78787470 R08: 00007ffd78787480 R09: 00007ffd78787480
[   14.799207] R10: 00007ffd78787480 R11: 0000000000000287 R12: 00005599ada98760
[   14.800277] R13: 00007ffd78787560 R14: 0000000000000000 R15: 0000000000000000
[   14.801341] Code: 4c 8b 1c 24 48 8b 83 70 02 00 00 48 c7 83 cc 02 00
00 00 00 00 00 48 c7 83 24 03 00 00 00 00 00 00 c7 83 2c 03 00 00 00 00
00 00 <c7> 00 00 00 00 00 48 8b 83 70 02 00 00 c7 40 04 00 00 00 00 4c
[   14.804012] RIP: mlx5_ib_modify_qp+0xf60/0x13f0 RSP: ffffbf48001c7bd8
[   14.804838] CR2: 0000000000000000
[   14.805288] ---[ end trace 3f1da0df5c8b7c37 ]---

Cc: syzkaller <syzkaller@googlegroups.com>
Reported-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-14 15:34:25 -04:00
Zhu Yanjun
8a18e911d0 IB: remove duplicate header files
In hfi.h, the header file opa_addr.h is included twice.
In vt.h, the header file mmap.h is included twice.

Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-13 16:46:03 -04:00
Yixian Liu
86188a8810 RDMA/hns: Support cq record doorbell for kernel space
This patch updates to support cq record doorbell for
the kernel space.

Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-13 16:40:15 -04:00
Yixian Liu
472bc0fbd4 RDMA/hns: Support rq record doorbell for kernel space
This patch updates to support rq record doorbell for
the kernel space.

Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-13 16:40:15 -04:00
Yixian Liu
9b44703d0a RDMA/hns: Support cq record doorbell for the user space
This patch updates to support cq record doorbell for
the user space.

Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-13 16:40:15 -04:00
Yixian Liu
e088a685ea RDMA/hns: Support rq record doorbell for the user space
This patch adds interfaces and definitions to support the rq record
doorbell for the user space.

Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-13 16:40:15 -04:00
Boris Pismenny
c2b37f7648 IB/mlx5: Fix integer overflows in mlx5_ib_create_srq
This patch validates user provided input to prevent integer overflow due
to integer manipulation in the mlx5_ib_create_srq function.

Cc: syzkaller <syzkaller@googlegroups.com>
Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-13 16:31:21 -04:00
Boris Pismenny
2c292dbb39 IB/mlx5: Fix out-of-bounds read in create_raw_packet_qp_rq
Add a check for the length of the qpin structure to prevent out-of-bounds reads.

BUG: KASAN: slab-out-of-bounds in create_raw_packet_qp+0x114c/0x15e2
Read of size 8192 at addr ffff880066b99290 by task syz-executor3/549

CPU: 3 PID: 549 Comm: syz-executor3 Not tainted 4.15.0-rc2+ #27
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
Call Trace:
 dump_stack+0x8d/0xd4
 print_address_description+0x73/0x290
 kasan_report+0x25c/0x370
 ? create_raw_packet_qp+0x114c/0x15e2
 memcpy+0x1f/0x50
 create_raw_packet_qp+0x114c/0x15e2
 ? create_raw_packet_qp_tis.isra.28+0x13d/0x13d
 ? lock_acquire+0x370/0x370
 create_qp_common+0x2245/0x3b50
 ? destroy_qp_user.isra.47+0x100/0x100
 ? kasan_kmalloc+0x13d/0x170
 ? sched_clock_cpu+0x18/0x180
 ? fs_reclaim_acquire.part.15+0x5/0x30
 ? __lock_acquire+0xa11/0x1da0
 ? sched_clock_cpu+0x18/0x180
 ? kmem_cache_alloc_trace+0x17e/0x310
 ? mlx5_ib_create_qp+0x30e/0x17b0
 mlx5_ib_create_qp+0x33d/0x17b0
 ? sched_clock_cpu+0x18/0x180
 ? create_qp_common+0x3b50/0x3b50
 ? lock_acquire+0x370/0x370
 ? __radix_tree_lookup+0x180/0x220
 ? uverbs_try_lock_object+0x68/0xc0
 ? rdma_lookup_get_uobject+0x114/0x240
 create_qp.isra.5+0xce4/0x1e20
 ? ib_uverbs_ex_create_cq_cb+0xa0/0xa0
 ? copy_ah_attr_from_uverbs.isra.2+0xa00/0xa00
 ? ib_uverbs_cq_event_handler+0x160/0x160
 ? __might_fault+0x17c/0x1c0
 ib_uverbs_create_qp+0x21b/0x2a0
 ? ib_uverbs_destroy_cq+0x2e0/0x2e0
 ib_uverbs_write+0x55a/0xad0
 ? ib_uverbs_destroy_cq+0x2e0/0x2e0
 ? ib_uverbs_destroy_cq+0x2e0/0x2e0
 ? ib_uverbs_open+0x760/0x760
 ? futex_wake+0x147/0x410
 ? check_prev_add+0x1680/0x1680
 ? do_futex+0x3d3/0xa60
 ? sched_clock_cpu+0x18/0x180
 __vfs_write+0xf7/0x5c0
 ? ib_uverbs_open+0x760/0x760
 ? kernel_read+0x110/0x110
 ? lock_acquire+0x370/0x370
 ? __fget+0x264/0x3b0
 vfs_write+0x18a/0x460
 SyS_write+0xc7/0x1a0
 ? SyS_read+0x1a0/0x1a0
 ? trace_hardirqs_on_thunk+0x1a/0x1c
 entry_SYSCALL_64_fastpath+0x18/0x85
RIP: 0033:0x4477b9
RSP: 002b:00007f1822cadc18 EFLAGS: 00000292 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000005 RCX: 00000000004477b9
RDX: 0000000000000070 RSI: 000000002000a000 RDI: 0000000000000005
RBP: 0000000000708000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000292 R12: 00000000ffffffff
R13: 0000000000005d70 R14: 00000000006e6e30 R15: 0000000020010ff0

Allocated by task 549:
 __kmalloc+0x15e/0x340
 kvmalloc_node+0xa1/0xd0
 create_user_qp.isra.46+0xd42/0x1610
 create_qp_common+0x2e63/0x3b50
 mlx5_ib_create_qp+0x33d/0x17b0
 create_qp.isra.5+0xce4/0x1e20
 ib_uverbs_create_qp+0x21b/0x2a0
 ib_uverbs_write+0x55a/0xad0
 __vfs_write+0xf7/0x5c0
 vfs_write+0x18a/0x460
 SyS_write+0xc7/0x1a0
 entry_SYSCALL_64_fastpath+0x18/0x85

Freed by task 368:
 kfree+0xeb/0x2f0
 kernfs_fop_release+0x140/0x180
 __fput+0x266/0x700
 task_work_run+0x104/0x180
 exit_to_usermode_loop+0xf7/0x110
 syscall_return_slowpath+0x298/0x370
 entry_SYSCALL_64_fastpath+0x83/0x85

The buggy address belongs to the object at ffff880066b99180
 which belongs to the cache kmalloc-512 of size 512
The buggy address is located 272 bytes inside of
 512-byte region [ffff880066b99180, ffff880066b99380)
The buggy address belongs to the page:
page:000000006040eedd count:1 mapcount:0 mapping:          (null) index:0x0 compound_mapcount: 0
flags: 0x4000000000008100(slab|head)
raw: 4000000000008100 0000000000000000 0000000000000000 0000000180190019
raw: ffffea00019a7500 0000000b0000000b ffff88006c403080 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff880066b99180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffff880066b99200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff880066b99280: 00 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                         ^
 ffff880066b99300: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff880066b99380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc

Cc: syzkaller <syzkaller@googlegroups.com>
Fixes: 0fb2ed66a14c ("IB/mlx5: Add create and destroy functionality for Raw Packet QP")
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-13 16:30:21 -04:00
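The kind of length check described in 2c292dbb39 can be sketched in userspace as a bounded copy. copy_checked() is a hypothetical helper written for this example; the driver's actual check on the qpin structure is not reproduced here.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy `want` bytes out of a source buffer of `src_len` bytes.
 * Hypothetical helper: rejects the request instead of reading past the
 * end of the source or writing past the end of the destination.
 */
static int copy_checked(void *dst, size_t dst_len,
			const void *src, size_t src_len, size_t want)
{
	if (want > src_len || want > dst_len)
		return -EINVAL;
	memcpy(dst, src, want);
	return 0;
}
```

With the sizes from the KASAN report, a request to read 8192 bytes from a 512-byte object is refused before memcpy() runs.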
Bart Van Assche
036ef0a1a8 RDMA/bnxt_re: Remove an unused variable
This patch does not change any functionality.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Selvin Xavier <selvin.xavier@broadcom.com>
Cc: Devesh Sharma <devesh.sharma@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-13 16:21:42 -04:00
Bart Van Assche
8932ff803d IB/hfi1: Fix a kernel-doc warning
Avoid that building with W=1 causes the following warning to appear:

drivers/infiniband/hw/hfi1/qp.c:484: warning: Cannot understand * on line 484 - I thought it was a doc line

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-13 16:21:14 -04:00
Leon Romanovsky
28e9091e31 RDMA/mlx5: Fix integer overflow while resizing CQ
The user can provide a very large cqe_size, which causes an integer
overflow, as can be seen in the following UBSAN warning:

=======================================================================
UBSAN: Undefined behaviour in drivers/infiniband/hw/mlx5/cq.c:1192:53
signed integer overflow:
64870 * 65536 cannot be represented in type 'int'
CPU: 0 PID: 267 Comm: syzkaller605279 Not tainted 4.15.0+ #90
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
Call Trace:
 dump_stack+0xde/0x164
 ? dma_virt_map_sg+0x22c/0x22c
 ubsan_epilogue+0xe/0x81
 handle_overflow+0x1f3/0x251
 ? __ubsan_handle_negate_overflow+0x19b/0x19b
 ? lock_acquire+0x440/0x440
 mlx5_ib_resize_cq+0x17e7/0x1e40
 ? cyc2ns_read_end+0x10/0x10
 ? native_read_msr_safe+0x6c/0x9b
 ? cyc2ns_read_end+0x10/0x10
 ? mlx5_ib_modify_cq+0x220/0x220
 ? sched_clock_cpu+0x18/0x200
 ? lookup_get_idr_uobject+0x200/0x200
 ? rdma_lookup_get_uobject+0x145/0x2f0
 ib_uverbs_resize_cq+0x207/0x3e0
 ? ib_uverbs_ex_create_cq+0x250/0x250
 ib_uverbs_write+0x7f9/0xef0
 ? cyc2ns_read_end+0x10/0x10
 ? print_irqtrace_events+0x280/0x280
 ? ib_uverbs_ex_create_cq+0x250/0x250
 ? uverbs_devnode+0x110/0x110
 ? sched_clock_cpu+0x18/0x200
 ? do_raw_spin_trylock+0x100/0x100
 ? __lru_cache_add+0x16e/0x290
 __vfs_write+0x10d/0x700
 ? uverbs_devnode+0x110/0x110
 ? kernel_read+0x170/0x170
 ? sched_clock_cpu+0x18/0x200
 ? security_file_permission+0x93/0x260
 vfs_write+0x1b0/0x550
 SyS_write+0xc7/0x1a0
 ? SyS_read+0x1a0/0x1a0
 ? trace_hardirqs_on_thunk+0x1a/0x1c
 entry_SYSCALL_64_fastpath+0x1e/0x8b
RIP: 0033:0x433549
RSP: 002b:00007ffe63bd1ea8 EFLAGS: 00000217
=======================================================================

Cc: syzkaller <syzkaller@googlegroups.com>
Cc: <stable@vger.kernel.org> # 3.13
Fixes: bde51583f49b ("IB/mlx5: Add support for resize CQ")
Reported-by: Noa Osherovich <noaos@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-09 18:10:48 -05:00
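The overflow reported by UBSAN in 28e9091e31 can be avoided by validating the operands before multiplying. The sketch below uses the two operands from the report; cq_buf_bytes() and its parameter names are hypothetical and do not reproduce the driver's fix.

```c
#include <errno.h>
#include <limits.h>

/* Returns a * b, or -EINVAL if either operand is not positive or the
 * product would not fit in an int. */
static int cq_buf_bytes(int a, int b)
{
	if (a <= 0 || b <= 0)
		return -EINVAL;
	if (a > INT_MAX / b)
		return -EINVAL;
	return a * b;
}
```

The division happens before the multiplication, so the signed multiplication is only evaluated when its result is representable.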
Doug Ledford
212a0cbc56 Revert "RDMA/mlx5: Fix integer overflow while resizing CQ"
The original commit of this patch has a munged log message that is
missing several of the tags the original author intended to be on the
patch.  This was due to patchworks misinterpreting a cut-n-paste
separator line as an end of message line and munging the mbox that was
used to import the patch:

https://patchwork.kernel.org/patch/10264089/

The original patch will be reapplied with a fixed commit message so the
proper tags are applied.

This reverts commit aa0de36a40f446f5a21a7c1e677b98206e242edb.

Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-09 18:07:46 -05:00
Steve Wise
5292443431 mlx4_ib: zero out struct ib_pd when allocating
Zero out the fields of the struct ib_pd for user mode pds so that
users querying pds via nldev will not get garbage.  For simplicity,
use kzalloc() to allocate the mlx4_ib_pd struct.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-08 15:03:03 -05:00
Steve Wise
e6f0330106 mlx4_ib: set user mr attributes in struct ib_mr
Setting iova, length, and page_size allows this information to be
seen via NLDEV netlink queries, which can aid in user rdma debugging.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-08 15:03:03 -05:00
Steve Wise
750fb1656a iw_cxgb4: initialize ib_mr fields for user mrs
Some of the struct ib_mr fields weren't getting initialized.  This was
benign, but will cause problems when dumping the mr resource via
nldev/restrack.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-08 15:03:03 -05:00
Yishai Hadas
d50a8a96ee IB/mlx4: Move mlx4_uverbs_ex_query_device_resp to include/uapi/
This struct is involved in the user API for mlx4 and should not be hidden
inside a driver header file.

Fixes: 09d208b258a2 ("IB/mlx4: Add report for RSS capabilities by vendor channel")
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-07 16:10:07 -07:00
Doug Ledford
1abb791fcd Merge tag 'mlx5-updates-2018-02-28-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux into k.o/wip/dl-for-next
mlx5-updates-2018-02-28-1 (IPSec-1)

This series consists of some fixes and refactors for the mlx5 drivers,
especially around the FPGA and flow steering. Most of them are trivial
fixes and are the foundation of allowing IPSec acceleration from user-space.

We use flow steering abstraction in order to accelerate IPSec packets.
When a user creates a steering rule, [s]he states that we'll carry an
encrypt/decrypt flow action (using a specific configuration) for every
packet which conforms to a certain match. Since currently offloading these
packets is done via FPGA, we'll add another set of flow steering ops.
These ops will execute the required FPGA commands and then call the
standard steering ops.

In order to achieve this, the commands need to receive all the
required information. Therefore, we pass the fte object and embed the
flow_action struct inside the fte. In addition, we add the shim layer
that will later be used for alternating between the standard and the
FPGA steering commands.
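
The layering described above can be sketched with a table of function
pointers. This is a simplified userspace illustration only: struct
steering_ops, select_ops(), and the counters are hypothetical names and do
not come from the mlx5 sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; none of these names come from mlx5. */
struct flow_action { bool needs_crypto; };
struct fte { int id; struct flow_action action; };  /* action embedded in the fte */

struct steering_ops {
    int (*create_fte)(struct fte *fte);
};

static int std_calls, fpga_calls;  /* counters so the call path is observable */

static int std_create_fte(struct fte *fte)
{
    (void)fte;
    std_calls++;
    return 0;
}

static int fpga_create_fte(struct fte *fte)
{
    assert(fte != NULL);
    /* Run the (simulated) FPGA command first, then the standard op. */
    if (fte->action.needs_crypto)
        fpga_calls++;
    return std_create_fte(fte);
}

static const struct steering_ops std_ops  = { .create_fte = std_create_fte };
static const struct steering_ops fpga_ops = { .create_fte = fpga_create_fte };

/* Shim: choose the ops table in one place so callers stay unchanged. */
static const struct steering_ops *select_ops(bool fpga_present)
{
    return fpga_present ? &fpga_ops : &std_ops;
}
```

Because the fte carries its own flow action, the FPGA variant has everything
it needs without extra arguments.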

Some fixes, like "net/mlx5e: Wait for FPGA command responses with a timeout",
are very relevant for user-space applications, as these applications could
be killed, but we still want to wait for the FPGA and update the kernel's
database.

Regards,
Aviad and Matan

Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-07 15:56:39 -07:00
Leon Romanovsky
aa0de36a40 RDMA/mlx5: Fix integer overflow while resizing CQ
The user can provide a very large cqe_size, which will cause an integer
overflow, as can be seen in the following UBSAN warning:

Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-07 15:23:43 -05:00
Boris Pismenny
3346c48737 {net,IB}/mlx5: Add flow steering helpers
Add helper functions that check if a protocol is
part of a flow steering match criteria.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-06 22:20:14 -08:00
Matan Barak
a9db0ecf15 {net,IB}/mlx5: Add has_tag to mlx5_flow_act
The has_tag member will indicate whether a tag action was specified
in flow specification.

A flow tag 0 = MLX5_FS_DEFAULT_FLOW_TAG is assumed to be a valid flow tag
that is currently used by the mlx5 RDMA driver, whereas in HW flow_tag = 0
means that the user doesn't care about flow_tag.  HW always provides
a flow_tag = 0 if all flow tags requested on a specific flow are 0.

So we need a way (in the driver) to differentiate between a user really
requesting flow_tag = 0 and a user who does not care, in order to be
able to report conflicting flow tags on a specific flow.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-06 22:06:33 -08:00
Boris Pismenny
075572d4b7 IB/mlx5: Pass mlx5_flow_act struct instead of multiple arguments
Group all the action arguments of the parse_flow_attr call into one common
struct mlx5_flow_act and pass that struct instead. This allows us to scale
the number of actions without adding new arguments to the function.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 22:06:11 -08:00
Aviad Yehezkel
c33251a3c6 IB/mlx5: Removed not used parameters
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 22:05:36 -08:00
Selvin Xavier
942c9b6ca8 RDMA/bnxt_re: Avoid Hard lockup during error CQE processing
The following hard lockup is hit due to a race condition in
error CQE processing.

[26146.879798] bnxt_en 0000:04:00.0: QPLIB: FP: CQ Processed Req
[26146.886346] bnxt_en 0000:04:00.0: QPLIB: wr_id[1251] = 0x0 with status 0xa
[26156.350935] NMI watchdog: Watchdog detected hard LOCKUP on cpu 4
[26156.357470] Modules linked in: nfsd auth_rpcgss nfs_acl lockd grace
[26156.447957] CPU: 4 PID: 3413 Comm: kworker/4:1H Kdump: loaded
[26156.457994] Hardware name: Dell Inc. PowerEdge R430/0CN7X8,
[26156.466390] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
[26156.472639] Call Trace:
[26156.475379]  <NMI>  [<ffffffff98d0d722>] dump_stack+0x19/0x1b
[26156.481833]  [<ffffffff9873f775>] watchdog_overflow_callback+0x135/0x140
[26156.489341]  [<ffffffff9877f237>] __perf_event_overflow+0x57/0x100
[26156.496256]  [<ffffffff98787c24>] perf_event_overflow+0x14/0x20
[26156.502887]  [<ffffffff9860a580>] intel_pmu_handle_irq+0x220/0x510
[26156.509813]  [<ffffffff98d16031>] perf_event_nmi_handler+0x31/0x50
[26156.516738]  [<ffffffff98d1790c>] nmi_handle.isra.0+0x8c/0x150
[26156.523273]  [<ffffffff98d17be8>] do_nmi+0x218/0x460
[26156.528834]  [<ffffffff98d16d79>] end_repeat_nmi+0x1e/0x7e
[26156.534980]  [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200
[26156.543268]  [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200
[26156.551556]  [<ffffffff987089c0>] ? native_queued_spin_lock_slowpath+0x1d0/0x200
[26156.559842]  <EOE>  [<ffffffff98d083e4>] queued_spin_lock_slowpath+0xb/0xf
[26156.567555]  [<ffffffff98d15690>] _raw_spin_lock+0x20/0x30
[26156.573696]  [<ffffffffc08381a1>] bnxt_qplib_lock_buddy_cq+0x31/0x40 [bnxt_re]
[26156.581789]  [<ffffffffc083bbaa>] bnxt_qplib_poll_cq+0x43a/0xf10 [bnxt_re]
[26156.589493]  [<ffffffffc083239b>] bnxt_re_poll_cq+0x9b/0x760 [bnxt_re]

The issue happens when RQ poll_cq, SQ poll_cq, or an async error event tries
to put the error QP in the flush list. Since the SQ and RQ of each error QP are
added to two different flush lists, we need to protect them using the locks of
the corresponding CQs. A difference in the order of acquiring the locks in
SQ poll_cq and RQ poll_cq can cause a hard lockup.

Revisit the locking strategy and remove the usage of qplib_cq.hwq.lock.
Instead of this lock, introduce qplib_cq.flush_lock to handle
addition/deletion of QPs in the flush list. Also, always take the flush_lock
in order (SQ CQ lock first and then RQ CQ lock) to avoid any potential
deadlock.

Other than the poll_cq context, the movement of a QP to/from the flush list
can be done in modify_qp context or from an async error event from HW.
Synchronize these operations using the bnxt_re verbs layer CQ locks.
To achieve this, add a callback from the HW abstraction layer (qplib) to the
bnxt_re ib_verbs layer in case of an async error event. Also, remove the
buddy cq functions as they are no longer required.
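
The fixed acquisition order can be sketched in userspace C as follows. The
types and helper names (struct cq, struct qp, qp_flush_lock(), and so on) are
hypothetical stand-ins, not the bnxt_re structures, and a C11 atomic_flag
spinlock is used in place of a kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for a CQ and a QP; not the bnxt_re structures. */
struct cq { atomic_flag flush_lock; };
struct qp { struct cq *scq; struct cq *rcq; };

static void cq_lock(struct cq *cq)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&cq->flush_lock,
                                             memory_order_acquire))
        ;
}

static void cq_unlock(struct cq *cq)
{
    atomic_flag_clear_explicit(&cq->flush_lock, memory_order_release);
}

/* Always SQ CQ first, then RQ CQ; lock once if both queues share a CQ. */
static void qp_flush_lock(struct qp *qp)
{
    assert(qp != NULL && qp->scq != NULL && qp->rcq != NULL);
    cq_lock(qp->scq);
    if (qp->rcq != qp->scq)
        cq_lock(qp->rcq);
}

/* Release in the reverse order of acquisition. */
static void qp_flush_unlock(struct qp *qp)
{
    if (qp->rcq != qp->scq)
        cq_unlock(qp->rcq);
    cq_unlock(qp->scq);
}
```

The point of the sketch is only that every path touching both flush lists of
a QP goes through the same helper, so the two locks are never taken in
opposite orders for that QP.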

Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 20:08:39 -07:00
Dan Carpenter
5d414b178e IB/mlx5: Fix an error code in __mlx5_ib_modify_qp()
"err" is either zero or possibly uninitialized here.  It should be
-EINVAL.

Fixes: 427c1e7bcd7e ("{IB, net}/mlx5: Move the modify QP operation table to mlx5_ib")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 20:08:39 -07:00
Mark Bloch
210b1f7807 IB/mlx5: When not in dual port RoCE mode, use provided port as native
The series that introduced dual port RoCE mode assumed that no dual port
HCA uses the mlx5 driver, but this is not the case for Connect-IB HCAs.
This reasoning led to assigning 1 as the native port index, which causes
issues when the second port is used.

For example, query_pkey(), when called on the second port, will return the
values of the first port. Make sure that we assign the right port index as
the native port index.

Fixes: 32f69e4be269 ("{net, IB}/mlx5: Manage port association for multiport RoCE")
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 20:08:38 -07:00
Jack M
a18177925c IB/mlx4: Include GID type when deleting GIDs from HW table under RoCE
The commit cited below added a gid_type field (RoCEv1 or RoCEv2)
to GID properties.

When adding GIDs, this gid_type field was copied over to the
hardware gid table. However, when deleting GIDs, the gid_type field
was not copied over to the hardware gid table.

As a result, when running RoCEv2, all RoCEv2 gids in the
hardware gid table were set to type RoCEv1 when any gid was deleted.

This problem would persist until the next gid was added (which would again
restore the gid_type field for all the gids in the hardware gid table).

Fix this by copying over the gid_type field to the hardware gid table
when deleting gids, so that the gid_type of all remaining gids is
preserved when a gid is deleted.
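
A hedged sketch of the failure mode, assuming (as the description suggests)
that the whole hardware table is rewritten from a software copy on every
change. struct gid_entry, write_hw_table(), and del_gid() are hypothetical
names, and the enum values are illustrative.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_GIDS 4

enum gid_type { GID_ROCE_V1 = 0, GID_ROCE_V2 = 1 };  /* illustrative values */

struct gid_entry { uint8_t gid[16]; enum gid_type type; };

/* Rebuild the full "hardware" table from the software table. */
static void write_hw_table(struct gid_entry *hw, const struct gid_entry *sw)
{
    for (int i = 0; i < MAX_GIDS; i++) {
        memcpy(hw[i].gid, sw[i].gid, sizeof(hw[i].gid));
        hw[i].type = sw[i].type;  /* the copy that the delete path must not skip */
    }
}

/* Delete one entry, then rewrite the table so remaining types are preserved. */
static void del_gid(struct gid_entry *sw, struct gid_entry *hw, int idx)
{
    assert(idx >= 0 && idx < MAX_GIDS);
    memset(&sw[idx], 0, sizeof(sw[idx]));
    write_hw_table(hw, sw);
}
```

If the type assignment were omitted on the delete path, every surviving
entry would fall back to the zero value, which is RoCEv1 in this sketch.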

Fixes: b699a859d17b ("IB/mlx4: Add gid_type to GID properties")
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 20:08:38 -07:00
Jack Morgenstein
0077416a3d IB/mlx4: Fix corruption of RoCEv2 IPv4 GIDs
When using IPv4 addresses in RoCEv2, the GID format for the mapped
IPv4 address should be: ::ffff:<4-byte IPv4 address>.

In the cited commit, IPv4-mapped IPv6 addresses had the 3 upper dwords
zeroed out by memset, which resulted in deleting the ffff field.

However, since procedure ipv6_addr_v4mapped() already verifies that the
gid has format ::ffff:<ipv4 address>, no change is needed for the gid,
and the memset can simply be removed.
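
To make the byte layout concrete, here is a small userspace sketch.
is_v4mapped() is a stand-in written for this example that applies the same
test as the kernel's ipv6_addr_v4mapped() (first ten bytes zero, next two
0xff); make_v4mapped() is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Stand-in check: bytes 0-9 are zero and bytes 10-11 are 0xff. */
static bool is_v4mapped(const uint8_t gid[16])
{
    static const uint8_t prefix[12] = { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff };

    return memcmp(gid, prefix, sizeof(prefix)) == 0;
}

/* Build ::ffff:a.b.c.d from a 4-byte IPv4 address in network byte order. */
static void make_v4mapped(uint8_t gid[16], const uint8_t ipv4[4])
{
    assert(gid != NULL && ipv4 != NULL);
    memset(gid, 0, 16);
    gid[10] = 0xff;
    gid[11] = 0xff;
    memcpy(gid + 12, ipv4, 4);
}
```

The ffff marker sits in the third dword (bytes 8-11), so zeroing the three
upper dwords (bytes 0-11) erases it while leaving the IPv4 bytes intact.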

Fixes: 7e57b85c444c ("IB/mlx4: Add support for setting RoCEv2 gids in hardware")
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 20:08:38 -07:00
Kalderon, Michal
551e1c67b4 RDMA/qedr: Fix iWARP write and send with immediate
iWARP does not support RDMA WRITE with immediate or SEND with immediate.
The driver should check for this before submitting to FW and return an
immediate error.
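
A minimal sketch of such an early check, assuming a simplified opcode set.
enum wr_opcode, check_wr_opcode(), and the -EINVAL return are hypothetical;
the real verbs opcodes are defined in the kernel's RDMA headers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical opcode set for illustration only. */
enum wr_opcode {
    WR_SEND,
    WR_SEND_WITH_IMM,
    WR_RDMA_WRITE,
    WR_RDMA_WRITE_WITH_IMM,
};

/* Reject immediate-data opcodes on iWARP before anything is posted to FW. */
static int check_wr_opcode(bool is_iwarp, enum wr_opcode op)
{
    assert((int)op <= (int)WR_RDMA_WRITE_WITH_IMM);

    if (is_iwarp &&
        (op == WR_SEND_WITH_IMM || op == WR_RDMA_WRITE_WITH_IMM))
        return -EINVAL;  /* error code chosen for illustration */

    return 0;
}
```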

Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 19:57:37 -07:00