Commit Graph

8065 Commits

Author SHA1 Message Date
Leon Romanovsky
19b1f54099 RDMA/verbs: Simplify modify QP check
All callers to ib_modify_qp_is_ok() provides enum ib_qp_state
makes the checks of out-of-scope redundant. Let's remove them
together with updating function signature to return boolean result.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-14 15:34:25 -04:00
Leon Romanovsky
fbf1795c96 RDMA/pvrdma: Properly annotate QP states
QP states provided by core layer are converted to enum ib_qp_state
and better to use internal variable in that type instead of int.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-14 15:34:25 -04:00
Leon Romanovsky
88de869bbe RDMA/uverbs: Ensure validity of current QP state value
The QP state is internal enum which is checked at the driver
level by calling to ib_modify_qp_is_ok(). Move this check closer
to user and leave kernel users to be checked by compiler.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-14 15:34:25 -04:00
Leon Romanovsky
75a4598209 RDMA/mlx5: Fix NULL dereference while accessing XRC_TGT QPs
mlx5 modify_qp() relies on FW that the error will be thrown if wrong
state is supplied. The missing check in FW causes the following crash
while using XRC_TGT QPs.

[   14.769632] BUG: unable to handle kernel NULL pointer dereference at (null)
[   14.771085] IP: mlx5_ib_modify_qp+0xf60/0x13f0
[   14.771894] PGD 800000001472e067 P4D 800000001472e067 PUD 14529067 PMD 0
[   14.773126] Oops: 0002 [#1] SMP PTI
[   14.773763] CPU: 0 PID: 365 Comm: ubsan Not tainted 4.16.0-rc1-00038-g8151138c0793 #119
[   14.775192] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[   14.777522] RIP: 0010:mlx5_ib_modify_qp+0xf60/0x13f0
[   14.778417] RSP: 0018:ffffbf48001c7bd8 EFLAGS: 00010246
[   14.779346] RAX: 0000000000000000 RBX: ffff9a8f9447d400 RCX: 0000000000000000
[   14.780643] RDX: 0000000000000000 RSI: 000000000000000a RDI: 0000000000000000
[   14.781930] RBP: 0000000000000000 R08: 00000000000217b0 R09: ffffffffbc9c1504
[   14.783214] R10: fffff4a180519480 R11: ffff9a8f94523600 R12: ffff9a8f9493e240
[   14.784507] R13: ffff9a8f9447d738 R14: 000000000000050a R15: 0000000000000000
[   14.785800] FS:  00007f545b466700(0000) GS:ffff9a8f9fc00000(0000) knlGS:0000000000000000
[   14.787073] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   14.787792] CR2: 0000000000000000 CR3: 00000000144be000 CR4: 00000000000006b0
[   14.788689] Call Trace:
[   14.789007]  _ib_modify_qp+0x71/0x120
[   14.789475]  modify_qp.isra.20+0x207/0x2f0
[   14.790010]  ib_uverbs_modify_qp+0x90/0xe0
[   14.790532]  ib_uverbs_write+0x1d2/0x3c0
[   14.791049]  ? __handle_mm_fault+0x93c/0xe40
[   14.791644]  __vfs_write+0x36/0x180
[   14.792096]  ? handle_mm_fault+0xc1/0x210
[   14.792601]  vfs_write+0xad/0x1e0
[   14.793018]  SyS_write+0x52/0xc0
[   14.793422]  do_syscall_64+0x75/0x180
[   14.793888]  entry_SYSCALL_64_after_hwframe+0x21/0x86
[   14.794527] RIP: 0033:0x7f545ad76099
[   14.794975] RSP: 002b:00007ffd78787468 EFLAGS: 00000287 ORIG_RAX: 0000000000000001
[   14.795958] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f545ad76099
[   14.797075] RDX: 0000000000000078 RSI: 0000000020009000 RDI: 0000000000000003
[   14.798140] RBP: 00007ffd78787470 R08: 00007ffd78787480 R09: 00007ffd78787480
[   14.799207] R10: 00007ffd78787480 R11: 0000000000000287 R12: 00005599ada98760
[   14.800277] R13: 00007ffd78787560 R14: 0000000000000000 R15: 0000000000000000
[   14.801341] Code: 4c 8b 1c 24 48 8b 83 70 02 00 00 48 c7 83 cc 02 00
00 00 00 00 00 48 c7 83 24 03 00 00 00 00 00 00 c7 83 2c 03 00 00 00 00
00 00 <c7> 00 00 00 00 00 48 8b 83 70 02 00 00 c7 40 04 00 00 00 00 4c
[   14.804012] RIP: mlx5_ib_modify_qp+0xf60/0x13f0 RSP: ffffbf48001c7bd8
[   14.804838] CR2: 0000000000000000
[   14.805288] ---[ end trace 3f1da0df5c8b7c37 ]---

Cc: syzkaller <syzkaller@googlegroups.com>
Reported-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-14 15:34:25 -04:00
Zhu Yanjun
8a18e911d0 IB: remove duplicate header files
In hfi.h, the header file opa_addr.h is included twice.
In vt.h, the header file mmap.h is included twice.

Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-13 16:46:03 -04:00
Yixian Liu
86188a8810 RDMA/hns: Support cq record doorbell for kernel space
This patch updates to support cq record doorbell for
the kernel space.

Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-13 16:40:15 -04:00
Yixian Liu
472bc0fbd4 RDMA/hns: Support rq record doorbell for kernel space
This patch updates to support rq record doorbell for
the kernel space.

Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-13 16:40:15 -04:00
Yixian Liu
9b44703d0a RDMA/hns: Support cq record doorbell for the user space
This patch updates to support cq record doorbell for
the user space.

Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-13 16:40:15 -04:00
Yixian Liu
e088a685ea RDMA/hns: Support rq record doorbell for the user space
This patch adds interfaces and definitions to support the rq record
doorbell for the user space.

Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Shaobo Xu <xushaobo2@huawei.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-13 16:40:15 -04:00
Bart Van Assche
036ef0a1a8 RDMA/bnxt_re: Remove an unused variable
This patch does not change any functionality.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Selvin Xavier <selvin.xavier@broadcom.com>
Cc: Devesh Sharma <devesh.sharma@broadcom.com>
Cc: Somnath Kotur <somnath.kotur@broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-13 16:21:42 -04:00
Bart Van Assche
8932ff803d IB/hfi1: Fix a kernel-doc warning
Avoid that building with W=1 causes the following warning to appear:

drivers/infiniband/hw/hfi1/qp.c:484: warning: Cannot understand * on line 484 - I thought it was a doc line

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-13 16:21:14 -04:00
Steve Wise
29cf1351d4 RDMA/nldev: provide detailed PD information
Implement the RDMA nldev netlink interface for dumping detailed PD
information.

Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-08 15:03:03 -05:00
Steve Wise
5292443431 mlx4_ib: zero out struct ib_pd when allocating
Zero out the fields of the struct ib_pd for user mode pds so that
users querying pds via nldev will not get garbage.  For simplicity,
use kzalloc() to allocate the mlx4_ib_pd struct.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-08 15:03:03 -05:00
Steve Wise
fccec5b89a RDMA/nldev: provide detailed MR information
Implement the RDMA nldev netlink interface for dumping detailed
MR information.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-08 15:03:03 -05:00
Steve Wise
e6f0330106 mlx4_ib: set user mr attributes in struct ib_mr
Setting iova, length, and page_size allows this information to be
seen via NLDEV netlink queries, which can aid in user rdma debugging.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-08 15:03:03 -05:00
Steve Wise
750fb1656a iw_cxgb4: initialize ib_mr fields for user mrs
Some of the struct ib_mr fields weren't getting initialized.  This was
benign, but will cause problems when dumping the mr resource via
nldev/restrack.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-08 15:03:03 -05:00
Steve Wise
a34fc0893e RDMA/nldev: provide detailed CQ information
Implement the RDMA nldev netlink interface for dumping detailed
CQ information.

Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-08 15:03:03 -05:00
Steve Wise
00313983cd RDMA/nldev: provide detailed CM_ID information
Implement RDMA nldev netlink interface to get detailed CM_ID information.

Because cm_id's are attached to rdma devices in various work queue
contexts, the pid and task information at restrak_add() time is sometimes
not useful.  For example, an nvme/f host connection cm_id ends up being
bound to a device in a work queue context and the resulting pid at attach
time no longer exists after connection setup.  So instead we mark all
cm_id's created via the rdma_ucm as "user", and all others as "kernel".
This required tweaking the restrack code a little.  It also required
wrapping some rdma_cm functions to allow passing the module name string.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-08 15:03:03 -05:00
Steve Wise
a3b641af72 RDMA/CM: move rdma_id_private to cma_priv.h
Move struct rdma_id_private to a new header cma_priv.h so the resource
tracking services in core/nldev.c can read useful information about cm_ids.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-08 15:03:03 -05:00
Steve Wise
d12ff62482 RDMA/nldev: common resource dumpit function
Create a common dumpit function that can be used by all common resource
types.  This reduces code replication and simplifies the code as we add
more resource types.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-08 15:03:03 -05:00
Steve Wise
88831a2cfe RDMA/restrack: clean up res_to_dev()
Simplify res_to_dev() to make it easier to read/maintain.

Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-08 15:03:03 -05:00
Yishai Hadas
d50a8a96ee IB/mlx4: Move mlx4_uverbs_ex_query_device_resp to include/uapi/
This struct is involved in the user API for mlx4 and should not be hidden
inside a driver header file.

Fixes: 09d208b258 ("IB/mlx4: Add report for RSS capabilities by vendor channel")
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-07 16:10:07 -07:00
Doug Ledford
1abb791fcd Merge tag 'mlx5-updates-2018-02-28-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux into k.o/wip/dl-for-next
mlx5-updates-2018-02-28-1 (IPSec-1)

This series consists of some fixes and refactors for the mlx5 drivers,
especially around the FPGA and flow steering. Most of them are trivial
fixes and are the foundation of allowing IPSec acceleration from user-space.

We use flow steering abstraction in order to accelerate IPSec packets.
When a user creates a steering rule, [s]he states that we'll carry an
encrypt/decrypt flow action (using a specific configuration) for every
packet which conforms to a certain match. Since currently offloading these
packets is done via FPGA, we'll add another set of flow steering ops.
These ops will execute the required FPGA commands and then call the
standard steering ops.

In order to achieve this, we need that the commands will get all the
required information. Therefore, we pass the fte object and embed the
flow_action struct inside the fte. In addition, we add the shim layer
that will later be used for alternating between the standard and the
FPGA steering commands.

Some fixes, like " net/mlx5e: Wait for FPGA command responses with a timeout"
are very relevant for user-space applications, as these applications could
be killed, but we still want to wait for the FPGA and update the kernel's
database.

Regards,
Aviad and Matan

Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-07 15:56:39 -07:00
Zhu Yanjun
befd8d98f2 IB/rxe: change the function rxe_init_device_param type
The function rxe_init_device_param always return 0. So the function
type is changed to void.

CC: Srinivas Eeda <srinivas.eeda@oracle.com>
CC: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-07 15:56:15 -07:00
Zhu Yanjun
31f1bd14cb IB/rxe: remove unnecessary rxe in rxe_send
In the function rxe_send, the variable rxe is not used in it.
So it should be removed.

CC: Srinivas Eeda <srinivas.eeda@oracle.com>
CC: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-07 15:56:14 -07:00
Zhu Yanjun
86af617641 IB/rxe: remove unnecessary skb_clone
In send_atomic_ack function, it is not necessary to make a
skb_clone. To gain better performance (high throughput and
low latency), this skb_clone is removed.

The following tests are made.

 server                       client
---------                    ---------
|1.1.1.1|<----rxe-channel--->|1.1.1.2|
---------                    ---------

On server: rping -s -a 1.1.1.1 -v -C 1000 -S 512
On client: rping -c -a 1.1.1.1 -v -C 1000 -S 512

The kernel config CONFIG_DEBUG_KMEMLEAK is enabled on both server
and client.

This test runs for several hours. There is no memory leak and the whole
system can work well.

Based on the above network, the following tests are made.

Server: ibv_rc_pingpong -d rxe0 -g 1
Client: ibv_rc_pingpong -d rxe0 -g 1 1.1.1.1

The test results on Server(10 tests are made).
Before:
Throughput is 137.07 Mbit/sec
Latency is 517.76 usec/iter

After:
Throughput is 148.85 Mbit/sec
Latency is 476.64 usec/iter

The throughput is enhanced and the latency is reduced.

CC: Srinivas Eeda <srinivas.eeda@oracle.com>
CC: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>

Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-07 15:56:14 -07:00
Bart Van Assche
63cf1a902c IB/srpt: Add RDMA/CM support
Add a parameter for configuring the port on which the ib_srpt driver
listens for incoming RDMA/CM connections, namely
/sys/kernel/config/target/srpt/discovery_auth/rdma_cm_port. The default
value for this parameter is 0 which means "do not listen for incoming
RDMA/CM connections". Add RDMA/CM support to all code that handles
connection state changes. Modify srpt_init_nodeacl() such that ACLs can
be configured for IPv4 and IPv6 addresses.

Note: incoming connection requests are only accepted for ports that
have been enabled. See also the "if (!sport->enabled)" code in the
connection request handler. See also the following configfs attribute:
/sys/kernel/config/target/srpt/$port/$port/enable.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-07 15:56:14 -07:00
Boris Pismenny
3346c48737 {net,IB}/mlx5: Add flow steering helpers
Add helper functions that check if a protocol is
part of a flow steering match criteria.

Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-06 22:20:14 -08:00
Matan Barak
a9db0ecf15 {net,IB}/mlx5: Add has_tag to mlx5_flow_act
The has_tag member will indicate whether a tag action was specified
in flow specification.

A flow tag 0 = MLX5_FS_DEFAULT_FLOW_TAG is assumed a valid flow tag
that is currently used by mlx5 RDMA driver, whereas in HW flow_tag = 0
means that the user doesn't care about flow_tag.  HW always provide
a flow_tag = 0 if all flow tags requested on a specific flow are 0.

So we need a way (in the driver) to differentiate between a user really
requesting flow_tag = 0 and a user who does not care, in order to be
able to report conflicting flow tags on a specific flow.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Reviewed-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-03-06 22:06:33 -08:00
Boris Pismenny
075572d4b7 IB/mlx5: Pass mlx5_flow_act struct instead of multiple arguments
Group and pass all function arguments of parse_flow_attr call in one
common struct mlx5_flow_act.

This patch passes all the action arguments of parse_flow_attr in one common
struct mlx5_flow_act. It allows us to scale the number of actions without adding
new arguments to the function.

Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Boris Pismenny <borisp@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 22:06:11 -08:00
Aviad Yehezkel
c33251a3c6 IB/mlx5: Removed not used parameters
Signed-off-by: Aviad Yehezkel <aviadye@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Acked-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 22:05:36 -08:00
Gustavo A. R. Silva
63231585a6 RDMA/bnxt_re/qplib_sp: Use true and false for boolean values
Assign true or false to boolean variables instead of an integer value.

This issue was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 16:00:51 -07:00
Sergey Gorenko
fbd36818ee IB/srp: Use the IB_DEVICE_SG_GAPS_REG HCA feature if supported
If a HCA supports the SG_GAPS_REG feature then fewer memory regions
are required per command. This patch reduces the number of memory
regions that is allocated per SRP session.

Signed-off-by: Sergey Gorenko <sergeygo@mellanox.com>
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Acked-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 16:00:51 -07:00
Bart Van Assche
4190443947 IB/hfi1: Add a missing rcu_read_unlock()
This patch avoids that sparse reports the following:

drivers/infiniband/hw/hfi1/driver.c:251:13: warning: context imbalance in 'rcv_hdrerr' - different lock contexts for basic block

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 16:00:51 -07:00
Arushi
666fe24bbe infiniband: hw: Drop unnecessary continue
Continue at the bottom of a loop are removed.
Issue found using drop_continue.cocci Coccinelle script.

Signed-off-by: Arushi Singhal <arushisinghal19971997@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 16:00:51 -07:00
Shiraz Saleem
7e952b19eb i40iw: Implement get_vector_affinity API
Storage ULPs (like NVMEoF) benefit from exposing affinity mapping
per completion vector to find the optimal multi-queue affinity
assignments. The ULPs call the verbs API ib_get_vector_affinity
introduced in commit c66cd353bb ("RDMA/core: expose affinity mappings per
completion vector") to get the underlying devices affinity mappings.

Add support in driver to expose the affinity masks per MSI-X
completion vector.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 16:00:51 -07:00
Shiraz Saleem
7de8b3576a i40iw: Improve CM node lookup time on connection setup
Currently all CM nodes involved in a connection are
maintained in a connected_node list per dev. During
connection setup, we need to search this every time
we receive a packet on the iWARP LAN Queue (ILQ) and
this can be pretty inefficient for large number of
connections.

Fix this by organizing the CM nodes in two lists -
accelerated list and non-accelerated list. The search
on ILQ receive would be limited to only non accelerated
nodes. When a node moves to RTS, it is added to the
accelerated list.

Benchmarking ucmatose 16k connections shows a 20%
improvement in test completion time.

Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 16:00:51 -07:00
Mustafa Ismail
6b0c549fc6 i40iw: Refactor handling of txpend list
Currently the TX pending lists for IEQ and ILQ are
handled separately. The handling of both can be
consolidated in i40iw_poll_completion.

Signed-off-by: Mustafa Ismail <mustafa.ismail@intel.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 16:00:51 -07:00
Bart Van Assche
2a78cb4db4 IB/srpt: Fix an out-of-bounds stack access in srpt_zerolength_write()
Avoid triggering an out-of-bounds stack access by changing the type
of 'wr' from ib_send_wr into ib_rdma_wr.

This patch fixes the following KASAN bug report:

BUG: KASAN: stack-out-of-bounds in rxe_post_send+0x7a9/0x9a0 [rdma_rxe]
Read of size 8 at addr ffff880068197a48 by task kworker/2:1/44

Workqueue: ib_cm cm_work_handler [ib_cm]
Call Trace:
 dump_stack+0x8e/0xcd
 print_address_description+0x6f/0x280
 kasan_report+0x25a/0x380
 __asan_load8+0x54/0x90
 rxe_post_send+0x7a9/0x9a0 [rdma_rxe]
 srpt_zerolength_write+0xf0/0x180 [ib_srpt]
 srpt_cm_rtu_recv+0x68/0x110 [ib_srpt]
 srpt_rdma_cm_handler+0xbb/0x15b [ib_srpt]
 cma_ib_handler+0x1aa/0x4a0 [rdma_cm]
 cm_process_work+0x30/0x100 [ib_cm]
 cm_work_handler+0xa86/0x351b [ib_cm]
 process_one_work+0x475/0x9f0
 worker_thread+0x69/0x690
 kthread+0x1ad/0x1d0
 ret_from_fork+0x3a/0x50

Fixes: aaf45bd83e ("IB/srpt: Detect session shutdown reliably")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 16:00:51 -07:00
Bart Van Assche
a6544a624c RDMA/rxe: Fix an out-of-bounds read
This patch avoids that KASAN reports the following when the SRP initiator
calls srp_post_send():

==================================================================
BUG: KASAN: stack-out-of-bounds in rxe_post_send+0x5c4/0x980 [rdma_rxe]
Read of size 8 at addr ffff880066606e30 by task 02-mq/1074

CPU: 2 PID: 1074 Comm: 02-mq Not tainted 4.16.0-rc3-dbg+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
Call Trace:
dump_stack+0x85/0xc7
print_address_description+0x65/0x270
kasan_report+0x231/0x350
rxe_post_send+0x5c4/0x980 [rdma_rxe]
srp_post_send.isra.16+0x149/0x190 [ib_srp]
srp_queuecommand+0x94d/0x1670 [ib_srp]
scsi_dispatch_cmd+0x1c2/0x550 [scsi_mod]
scsi_queue_rq+0x843/0xa70 [scsi_mod]
blk_mq_dispatch_rq_list+0x143/0xac0
blk_mq_do_dispatch_ctx+0x1c5/0x260
blk_mq_sched_dispatch_requests+0x2bf/0x2f0
__blk_mq_run_hw_queue+0xdb/0x160
__blk_mq_delay_run_hw_queue+0xba/0x100
blk_mq_run_hw_queue+0xf2/0x190
blk_mq_sched_insert_request+0x163/0x2f0
blk_execute_rq+0xb0/0x130
scsi_execute+0x14e/0x260 [scsi_mod]
scsi_probe_and_add_lun+0x366/0x13d0 [scsi_mod]
__scsi_scan_target+0x18a/0x810 [scsi_mod]
scsi_scan_target+0x11e/0x130 [scsi_mod]
srp_create_target+0x1522/0x19e0 [ib_srp]
kernfs_fop_write+0x180/0x210
__vfs_write+0xb1/0x2e0
vfs_write+0xf6/0x250
SyS_write+0x99/0x110
do_syscall_64+0xee/0x2b0
entry_SYSCALL_64_after_hwframe+0x42/0xb7

The buggy address belongs to the page:
page:ffffea0001998180 count:0 mapcount:0 mapping:0000000000000000 index:0x0
flags: 0x4000000000000000()
raw: 4000000000000000 0000000000000000 0000000000000000 00000000ffffffff
raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff880066606d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1
ffff880066606d80: f1 00 f2 f2 f2 f2 f2 f2 f2 00 00 f2 f2 f2 f2 f2
>ffff880066606e00: f2 00 00 00 00 00 f2 f2 f2 f3 f3 f3 f3 00 00 00
                                    ^
ffff880066606e80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff880066606f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================

Fixes: 8700e3e7c4 ("Soft RoCE driver")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Moni Shoua <monis@mellanox.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 16:00:51 -07:00
Bart Van Assche
a1ae7d0345 RDMA/core: Avoid that ib_drain_qp() triggers an out-of-bounds stack access
This patch fixes the following KASAN complaint:

==================================================================
BUG: KASAN: stack-out-of-bounds in rxe_post_send+0x77d/0x9b0 [rdma_rxe]
Read of size 8 at addr ffff880061aef860 by task 01/1080

CPU: 2 PID: 1080 Comm: 01 Not tainted 4.16.0-rc3-dbg+ #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
Call Trace:
dump_stack+0x85/0xc7
print_address_description+0x65/0x270
kasan_report+0x231/0x350
rxe_post_send+0x77d/0x9b0 [rdma_rxe]
__ib_drain_sq+0x1ad/0x250 [ib_core]
ib_drain_qp+0x9/0x30 [ib_core]
srp_destroy_qp+0x51/0x70 [ib_srp]
srp_free_ch_ib+0xfc/0x380 [ib_srp]
srp_create_target+0x1071/0x19e0 [ib_srp]
kernfs_fop_write+0x180/0x210
__vfs_write+0xb1/0x2e0
vfs_write+0xf6/0x250
SyS_write+0x99/0x110
do_syscall_64+0xee/0x2b0
entry_SYSCALL_64_after_hwframe+0x42/0xb7

The buggy address belongs to the page:
page:ffffea000186bbc0 count:0 mapcount:0 mapping:0000000000000000 index:0x0
flags: 0x4000000000000000()
raw: 4000000000000000 0000000000000000 0000000000000000 00000000ffffffff
raw: 0000000000000000 ffffea000186bbe0 0000000000000000 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff880061aef700: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff880061aef780: 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00
>ffff880061aef800: f2 f2 f2 f2 f2 f2 f2 00 00 00 00 00 f2 f2 f2 f2
                                                      ^
ffff880061aef880: f2 f2 f2 00 00 00 00 00 00 00 00 00 00 00 f2 f2
ffff880061aef900: f2 f2 f2 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================

Fixes: 765d67748b ("IB: new common API for draining queues")
Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: Steve Wise <swise@opengridcomputing.com>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: stable@vger.kernel.org
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 16:00:51 -07:00
Colin Ian King
042932f7a3 infiniband: remove redundant assignment to pointer 'rdi'
The pointer rdi is being initialized with a value that is never read
and re-assigned immediately after, hence the initialization is redundant
and can be removed.

Cleans up clang warning:
drivers/infiniband/sw/rdmavt/vt.c:94:23: warning: Value stored to 'rdi'
during its initialization is never read

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-06 16:00:51 -07:00
Hernán Gonzalez
c33bab622d IB/rxe: Remove unused variable (char *rxe_qp_state_name[])
Note: This is compile only tested as I have no access to the hw.  This
variable was not used anywhere in the code. Removing it saves 24 bytes.

add/remove: 0/1 grow/shrink: 0/0 up/down: 0/-24 (-24)
Function                                     old     new   delta
rxe_qp_state_name                             24       -     -24
Total: Before=3348732, After=3348708, chg -0.00%

Signed-off-by: Hernán Gonzalez <hernan@vanguardiasur.com.ar>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-28 13:57:40 -07:00
Hernán Gonzalez
7f566a91b1 IB/qib: Move char *qib_sdma_state_names[] and constify while there.
Note: This is compile only tested as I have no access to the hw.
This variable was not used in qib_sdma.c but in qib_iba7322.c. Declaring it
there, as static, saves 56 bytes.

add/remove: 0/2 grow/shrink: 0/0 up/down: 0/-144 (-144)
Function                                     old     new   delta
qib_sdma_state_names                          56       -     -56
qib_sdma_event_names                          88       -     -88
Total: Before=2874565, After=2874421, chg -0.01%

Signed-off-by: Hernán Gonzalez <hernan@vanguardiasur.com.ar>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-28 13:57:40 -07:00
Hernán Gonzalez
4f1d583432 IB/qib: Remove unused variable (char *qib_sdma_event_names[])
Note: This is compile only tested as I have no access to the hw.

This variable was not used anywhere in the code. Removing it saves 88
bytes.

add/remove: 0/1 grow/shrink: 0/0 up/down: 0/-88 (-88)
Function                                     old     new   delta
qib_sdma_event_names                          88       -     -88
Total: Before=2874565, After=2874477, chg -0.00%

Signed-off-by: Hernán Gonzalez <hernan@vanguardiasur.com.ar>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-28 13:57:39 -07:00
Bart Van Assche
7da09af91d IB/srp: Use %pIS instead of inet_ntop()
Except for a minor log message change, this patch does not change
any functionality. For the introduction of %pIS, see also commit
1067964305 ("lib: vsprintf: add IPv4/v6 generic %p[Ii]S[pfs]
format specifier").

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-28 13:57:39 -07:00
Bart Van Assche
c74ff7501e Revert "IB/srp: Avoid that a cable pull can trigger a kernel crash"
The caller of srp_ib_lookup_path() is responsible for holding a reference
on the SCSI host. That means that commit 8a0d18c621 was not necessary.
Hence revert it.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-28 13:57:39 -07:00
Bart Van Assche
e68088e78d IB/srp: Fix srp_abort()
Before commit e494f6a728 ("[SCSI] improved eh timeout handler") it
did not really matter whether or not abort handlers like srp_abort()
called .scsi_done() when returning another value than SUCCESS. Since
that commit however this matters. Hence only call .scsi_done() when
returning SUCCESS.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-28 13:57:39 -07:00
Arnd Bergmann
a8ed748708 infiniband: bnxt_re: use BIT_ULL() for 64-bit bit masks
On 32-bit targets, we otherwise get a warning about an impossible constant
integer expression:

In file included from include/linux/kernel.h:11,
                 from include/linux/interrupt.h:6,
                 from drivers/infiniband/hw/bnxt_re/ib_verbs.c:39:
drivers/infiniband/hw/bnxt_re/ib_verbs.c: In function 'bnxt_re_query_device':
include/linux/bitops.h:7:24: error: left shift count >= width of type [-Werror=shift-count-overflow]
 #define BIT(nr)   (1UL << (nr))
                        ^~
drivers/infiniband/hw/bnxt_re/bnxt_re.h:61:34: note: in expansion of macro 'BIT'
 #define BNXT_RE_MAX_MR_SIZE_HIGH BIT(39)
                                  ^~~
drivers/infiniband/hw/bnxt_re/bnxt_re.h:62:30: note: in expansion of macro 'BNXT_RE_MAX_MR_SIZE_HIGH'
 #define BNXT_RE_MAX_MR_SIZE  BNXT_RE_MAX_MR_SIZE_HIGH
                              ^~~~~~~~~~~~~~~~~~~~~~~~
drivers/infiniband/hw/bnxt_re/ib_verbs.c:149:25: note: in expansion of macro 'BNXT_RE_MAX_MR_SIZE'
  ib_attr->max_mr_size = BNXT_RE_MAX_MR_SIZE;
                         ^~~~~~~~~~~~~~~~~~~

Fixes: 872f357824 ("RDMA/bnxt_re: Add support for MRs with Huge pages")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-28 13:57:39 -07:00
Arnd Bergmann
e5d6574ded infiniband: qplib_fp: fix pointer cast
Building for a 32-bit target results in a couple of warnings from casting
between a 32-bit pointer and a 64-bit integer:

drivers/infiniband/hw/bnxt_re/qplib_fp.c: In function 'bnxt_qplib_service_nq':
drivers/infiniband/hw/bnxt_re/qplib_fp.c:333:23: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
    bnxt_qplib_arm_srq((struct bnxt_qplib_srq *)q_handle,
                       ^
drivers/infiniband/hw/bnxt_re/qplib_fp.c:336:12: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
            (struct bnxt_qplib_srq *)q_handle,
            ^
In file included from include/linux/byteorder/little_endian.h:5,
                 from arch/arm/include/uapi/asm/byteorder.h:22,
                 from include/asm-generic/bitops/le.h:6,
                 from arch/arm/include/asm/bitops.h:342,
                 from include/linux/bitops.h:38,
                 from include/linux/kernel.h:11,
                 from include/linux/interrupt.h:6,
                 from drivers/infiniband/hw/bnxt_re/qplib_fp.c:39:
drivers/infiniband/hw/bnxt_re/qplib_fp.c: In function 'bnxt_qplib_create_srq':
include/uapi/linux/byteorder/little_endian.h:31:43: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
 #define __cpu_to_le64(x) ((__force __le64)(__u64)(x))
                                           ^
include/linux/byteorder/generic.h:86:21: note: in expansion of macro '__cpu_to_le64'
 #define cpu_to_le64 __cpu_to_le64
                     ^~~~~~~~~~~~~
drivers/infiniband/hw/bnxt_re/qplib_fp.c:569:19: note: in expansion of macro 'cpu_to_le64'
  req.srq_handle = cpu_to_le64(srq);

Using a uintptr_t as an intermediate works on all architectures.

Fixes: 37cb11acf1 ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-02-28 13:57:39 -07:00