Commit Graph

8867 Commits

Author SHA1 Message Date
David S. Miller
2e2d6f0342 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
net/sched/cls_api.c has overlapping changes to a call to
nlmsg_parse(), one (from 'net') added rtm_tca_policy instead of NULL
to the 5th argument, and another (from 'net-next') added cb->extack
instead of NULL to the 6th argument.

net/ipv4/ipmr_base.c is a case of a bug fix in 'net' being done to
code which moved (to mr_table_dump)) in 'net-next'.  Thanks to David
Ahern for the heads up.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-19 11:03:06 -07:00
Greg Kroah-Hartman
eb6d938ffa Really final for-rc pull request for 4.19
Ok, so last week I thought we had sent our final pull request for 4.19.
 Well, wouldn't ya know someone went and found a couple Spectre v1 fixes
 were needed :-/.  So, a couple *very* small specter patches for this
 (hopefully) final -rc week.
 
 Signed-off-by: Doug Ledford <dledford@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbyR/5AAoJELgmozMOVy/djDQP/2XYToI8u4n35IpylwLnd9OC
 p+GogThE7MoqxxizZbSdXqVmkPvhMBiSsv1cKXEhuyt2KnF8xwvJwjgv6X6bJv3E
 yesuG2odbiyNQ5K0avgWQHey20HIbH8My+u2S5OyphjI36bY1oiKiKmb+rJ9AT6B
 2fumq7jiEB732lBuMnPniwKZQO+LUMSD9EVRKU4mprN7MlRWt9mLJzdu1GZ8eQJZ
 bE1ncPCrjkJTr9osGrDbkHhuBkFVI1kU6duZki8wkkmwwxzyuibbQ6SQsWBDvMRL
 qIdbRz4V17cG0YTjaaD6zb226o80zFT79e1MY9dk1/yJ2msnE1N7qSf25a0xEiWj
 XQaETabfNBCVclldSL2EenM9S4NkKpnsiUnxH0TxAW3tOY/Hi37Ouv8EtM0Iq/U6
 Ns60U4Cy1TDNY+L9sE4yjrfL0mnwBYf5N3qhktFpo7lk76s+4KmiiQjeTuwXQleV
 cYXzkJCTV9d2YJLUCVJieZIZx5xMtPAHkrokjjgztVLPmm9peM+xet1JugoGG6ss
 DFhVD7vHt23a1u/R2ETmvAxUHDm8acKaaJTYIhh8889N6pH+vnVvZL0WQUPoiQrr
 wOTrT/R7Dm0LZERM69T/PhOaFbhRkBHppaaZaQE6Lype2H6A7i+hLDTZCg9IWV9f
 gHlBSApUr8vutAyiieXE
 =ZHZX
 -----END PGP SIGNATURE-----

Merge tag 'for-gkh' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Doug writes:
  "Really final for-rc pull request for 4.19

   Ok, so last week I thought we had sent our final pull request for
   4.19.  Well, wouldn't ya know someone went and found a couple Spectre
   v1 fixes were needed :-/.  So, a couple *very* small specter patches
   for this (hopefully) final -rc week."

* tag 'for-gkh' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/ucma: Fix Spectre v1 vulnerability
  IB/ucm: Fix Spectre v1 vulnerability
2018-10-19 08:30:35 +02:00
Tariq Toukan
4972e6fa3a net/mlx5: Refactor fragmented buffer struct fields and init flow
Take struct mlx5_frag_buf out of mlx5_frag_buf_ctrl, as it is not
needed to manage and control the datapath of the fragmented buffers API.

struct mlx5_frag_buf contains control info to manage the allocation
and de-allocation of the fragmented buffer.
Its fields are not relevant for datapath, so here I take them out of the
struct mlx5_frag_buf_ctrl, except for the fragments array itself.

In addition, modified mlx5_fill_fbc to initialise the frags pointers
as well. This implies that the buffer must be allocated before the
function is called.

A set of type-specific *_get_byte_size() functions are replaced by
a generic one.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-18 13:13:31 -07:00
Paul Blakey
d5634fee24 net/mlx5: Add a no-append flow insertion mode
If no-append flag is set, we will add a new FTE, instead of appending
the actions of the inserted rule when the same match already exists.

While here, move the has_flow_tag boolean indicator to be a flag too.

This patch doesn't change any functionality.

Signed-off-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanmox.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-17 14:18:50 -07:00
Mark Bloch
171c7625be net/mlx5: Use flow counter IDs and not the wrapping cache object
Currently, when a flow rule is created using the FS core layer, the caller
has to pass the entire flow counter object and not just the counter HW
handle (ID). This requires both the FS core and the caller to have
knowledge about the inner implementation of the FS layer flow counters
cache and limits the possible users.

Move to use the counter ID across the place when dealing with flows.

Doing this decoupling, now can we privatize the inner implementation
of the flow counters.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-17 14:15:48 -07:00
Saeed Mahameed
186daf0c20 Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux into net-next
mlx5 updates for both net-next and rdma-next

* 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: (21 commits)
  net/mlx5: Expose DC scatter to CQE capability bit
  net/mlx5: Update mlx5_ifc with DEVX UID bits
  net/mlx5: Set uid as part of DCT commands
  net/mlx5: Set uid as part of SRQ commands
  net/mlx5: Set uid as part of SQ commands
  net/mlx5: Set uid as part of RQ commands
  net/mlx5: Set uid as part of QP commands
  net/mlx5: Set uid as part of CQ commands
  net/mlx5: Rename incorrect naming in IFC file
  net/mlx5: Export packet reformat alloc/dealloc functions
  net/mlx5: Pass a namespace for packet reformat ID allocation
  net/mlx5: Expose new packet reformat capabilities
  {net, RDMA}/mlx5: Rename encap to reformat packet
  net/mlx5: Move header encap type to IFC header file
  net/mlx5: Break encap/decap into two separated flow table creation flags
  net/mlx5: Add support for more namespaces when allocating modify header
  net/mlx5: Export modify header alloc/dealloc functions
  net/mlx5: Add proper NIC TX steering flow tables support
  net/mlx5: Cleanup flow namespace getter switch logic
  net/mlx5: Add memic command opcode to command checker
  ...

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-17 14:13:36 -07:00
Gustavo A. R. Silva
a3671a4f97 RDMA/ucma: Fix Spectre v1 vulnerability
hdr.cmd can be indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.

This issue was detected with the help of Smatch:

drivers/infiniband/core/ucma.c:1686 ucma_write() warn: potential
spectre issue 'ucma_cmd_table' [r] (local cap)

Fix this by sanitizing hdr.cmd before using it to index
ucm_cmd_table.

Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].

[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2

Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-10-16 12:47:40 -04:00
Gustavo A. R. Silva
0295e39595 IB/ucm: Fix Spectre v1 vulnerability
hdr.cmd can be indirectly controlled by user-space, hence leading to
a potential exploitation of the Spectre variant 1 vulnerability.

This issue was detected with the help of Smatch:

drivers/infiniband/core/ucm.c:1127 ib_ucm_write() warn: potential
spectre issue 'ucm_cmd_table' [r] (local cap)

Fix this by sanitizing hdr.cmd before using it to index
ucm_cmd_table.

Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].

[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2

Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-10-16 11:32:40 -04:00
David S. Miller
1986647c2f mlx5e-updates-2018-10-10
IPoIB netlink support and mlx5e pre-allocated netdevice initialization
 
 IP link was broken due to the changes in IPoIB for the rdma_netdev
 support after commit cd565b4b51
 ("IB/IPoIB: Support acceleration options callbacks").
 
 This patchset fixes IPoIB pkey creation and removal using rtnetlink by
 adding support in both IPoIB ULP layer and mlx5 layer:
 
 From Jason and Denis:
 1) Introduces changes in the RDMA netdev code in order to
    allow allocation of the netdev to be done by the rtnl netdev code.
 2) Reworks IPoIB initialization to use the two step rdma_netdev
    creation.
 
 From Feras and Saeed, mlx5e netdev layer refactoring to allow accepting
 pre-allocated netdevs:
 3) Adds support to initialize/cleanup netdevs that are not created
    by mlx5 driver.
 4) Change mlx5e netdevice layer to accept the pre-allocated netdevice
    queue number.
 5) Initialize mlx5e generic structures in one place to be used for all
    netdevs types NIC/representors/IPoIB (both mlx5 allocated and
    pre-allocted).
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJbvqHhAAoJEEg/ir3gV/o+TRQH+wYWgvi5AbqYlGpT8xSho8Dg
 5tuFE9AbYDGLLWptZgtYXsBsTaltGVqKnwWy//dAuuSI7LvqtjeKqq0G1JKau70L
 /49+vv8NTQ6vov2yY95yzzEo5B1/zVcYEl9dmJl4y6Xlwc+YIteewReVqRj+tucR
 xO7ghLWc1o4Pl7gWBAcOULyOsZc+CtG8ElwEaBROXTtYRpsR7FKn7vN+WdWZ2VOG
 MjtCUaG0vZlK1vaF76J3P9bL2V3y6o6i6S8RZwg1kPxO4jnrzyGrkxWMOX6f32KB
 JAUcLVlJW5wckcRAKfTwLxsFMrc9vLBPHyj4XneXVWGKh8/dGUGY5gdGIJV5Jlc=
 =m0tM
 -----END PGP SIGNATURE-----

Merge tag 'mlx5e-updates-2018-10-10' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5e-updates-2018-10-10

IPoIB netlink support and mlx5e pre-allocated netdevice initialization

IP link was broken due to the changes in IPoIB for the rdma_netdev
support after commit cd565b4b51
("IB/IPoIB: Support acceleration options callbacks").

This patchset fixes IPoIB pkey creation and removal using rtnetlink by
adding support in both IPoIB ULP layer and mlx5 layer:

From Jason and Denis:
1) Introduces changes in the RDMA netdev code in order to
   allow allocation of the netdev to be done by the rtnl netdev code.
2) Reworks IPoIB initialization to use the two step rdma_netdev
   creation.

From Feras and Saeed, mlx5e netdev layer refactoring to allow accepting
pre-allocated netdevs:
3) Adds support to initialize/cleanup netdevs that are not created
   by mlx5 driver.
4) Change mlx5e netdevice layer to accept the pre-allocated netdevice
   queue number.
5) Initialize mlx5e generic structures in one place to be used for all
   netdevs types NIC/representors/IPoIB (both mlx5 allocated and
   pre-allocted).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-15 21:49:56 -07:00
David S. Miller
d864991b22 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Conflicts were easy to resolve using immediate context mostly,
except the cls_u32.c one where I simply too the entire HEAD
chunk.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-12 21:38:46 -07:00
Greg Kroah-Hartman
c789174bde Final for-rc pull request for 4.19
We only have one bug to submit this time around.  It fixes a DMA unmap
 issue where we unmapped the DMA address from the IOMMU before we did
 from the card, resulting in a DMAR error with IOMMU enabled, or possible
 crash without.
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbv508AAoJELgmozMOVy/dw4oP/2ibxsE6MOtGfpbYldqiK2Jt
 wmq1Sfdunx85acHv3gyF4MEprqGbFPR9RXhLA/0gn3ywPHElgVAG5/w4xgFJXDk7
 nlwBeMuPjWkSvfPgPAKl1NhQ/nmSjbeBl3B+Nh2T3tZwjKEBRDdoFacxfSuP7K1x
 yzGVkniSvyecskfgVV026qN9XiVboXeNotFWNq41MGmPrbLZrQRyu/uwnn1dbdKE
 CpHfsFUpIXlCwhJT8s/MVs4CpvKQ0cEPTc+mkoy0YBsVBxQXFMeHXRUAHmsy/Q6P
 AKsYkBA9hZjAMLcGeF3YOQyZ3/c4hXdQYshw+g566MtmpHA+pn0wZQT9YOEZwq7C
 OoSaxLM1arOJKZvYv3v1Q0hZAtZgAsZeRItEEAmIGdxn9V90omyMd7CaZ+7FsKIp
 X4e8ooR8v9qyiQu6uMZw6JIydwp0NM3BR9Ko1b165LRS8+Zaq/SjjUo6jmczhzZv
 6i619iBl57nC+S7HLwv7HgoZnWgs8Pfp9cu3JI6RUcjwT8LRr0MeASW3CEkS9Pq0
 T7poMHZ7X8ydGtKuRbCzSJvYRGAfVmYBUAZaobE28mNXVx1kCzN1ZT3JCEC6UgnT
 uydC9zYps7I50vyr0Ah4kahnupStId9mjGf6Aa8UlZzh5koZqk4tHyTpkoqMFp44
 0LyrVB1C1Ai8PIRJIj/L
 =hozJ
 -----END PGP SIGNATURE-----

Merge tag 'for-gkh' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Doug writes:
  "RDMA fixes:

   Final for-rc pull request for 4.19

   We only have one bug to submit this time around.  It fixes a DMA
   unmap issue where we unmapped the DMA address from the IOMMU before
   we did from the card, resulting in a DMAR error with IOMMU enabled,
   or possible crash without."

* tag 'for-gkh' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  IB/mlx5: Unmap DMA addr from HCA before IOMMU
2018-10-12 12:53:06 +02:00
Denis Drozdov
5d6b0cb336 RDMA/netdev: Fix netlink support in IPoIB
IPoIB netlink support was broken by the below commit since integrating
the rdma_netdev support relies on an allocation flow for netdevs that
was controlled by the ipoib driver while netdev's rtnl_newlink
implementation assumes that the netdev will be allocated by netlink.
Such situation leads to crash in __ipoib_device_add, once trying to
reuse netlink device.

This patch fixes the kernel oops for both mlx4 and mlx5
devices triggered by the following command:

Fixes: cd565b4b51 ("IB/IPoIB: Support acceleration options callbacks")
Signed-off-by: Denis Drozdov <denisd@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10 17:58:12 -07:00
Denis Drozdov
f6a8a19bb1 RDMA/netdev: Hoist alloc_netdev_mqs out of the driver
netdev has several interfaces that expect to call alloc_netdev_mqs from
the core code, with the driver only providing the arguments.  This is
incompatible with the rdma_netdev interface that returns the netdev
directly.

Thus re-organize the API used by ipoib so that the verbs core code calls
alloc_netdev_mqs for the driver. This is done by allowing the drivers to
provide the allocation parameters via a 'get_params' callback and then
initializing an allocated netdev as a second step.

Fixes: cd565b4b51 ("IB/IPoIB: Support acceleration options callbacks")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Denis Drozdov <denisd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-10-10 17:58:11 -07:00
Valentine Fatiev
dd9a403495 IB/mlx5: Unmap DMA addr from HCA before IOMMU
The function that puts back the MR in cache also removes the DMA address
from the HCA. Therefore we need to call this function before we remove
the DMA mapping from MMU. Otherwise the HCA may access a memory that
is no longer DMA mapped.

Call trace:
NMI: IOCK error (debug interrupt?) for reason 71 on CPU 0.
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.19.0-rc6+ #4
Hardware name: HP ProLiant DL360p Gen8, BIOS P71 08/20/2012
RIP: 0010:intel_idle+0x73/0x120
Code: 80 5c 01 00 0f ae 38 0f ae f0 31 d2 65 48 8b 04 25 80 5c 01 00 48 89 d1 0f 60 02
RSP: 0018:ffffffff9a403e38 EFLAGS: 00000046
RAX: 0000000000000030 RBX: 0000000000000005 RCX: 0000000000000001
RDX: 0000000000000000 RSI: ffffffff9a5790c0 RDI: 0000000000000000
RBP: 0000000000000030 R08: 0000000000000000 R09: 0000000000007cf9
R10: 000000000000030a R11: 0000000000000018 R12: 0000000000000000
R13: ffffffff9a5792b8 R14: ffffffff9a5790c0 R15: 0000002b48471e4d
FS:  0000000000000000(0000) GS:ffff9c6caf400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f5737185000 CR3: 0000000590c0a002 CR4: 00000000000606f0
Call Trace:
 cpuidle_enter_state+0x7e/0x2e0
 do_idle+0x1ed/0x290
 cpu_startup_entry+0x6f/0x80
 start_kernel+0x524/0x544
 ? set_init_arg+0x55/0x55
 secondary_startup_64+0xa4/0xb0
DMAR: DRHD: handling fault status reg 2
DMAR: [DMA Read] Request device [04:00.0] fault addr b34d2000 [fault reason 06] PTE Read access is not set
DMAR: [DMA Read] Request device [01:00.2] fault addr bff8b000 [fault reason 06] PTE Read access is not set

Fixes: f3f134f526 ("RDMA/mlx5: Fix crash while accessing garbage pointer and freed memory")
Signed-off-by: Valentine Fatiev <valentinef@mellanox.com>
Reviewed-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-10-10 14:52:43 -04:00
David S. Miller
6f41617bf2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Minor conflict in net/core/rtnetlink.c, David Ahern's bug fix in 'net'
overlapped the renaming of a netlink attribute in net-next.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-03 21:00:17 -07:00
Greg Kroah-Hartman
ad0371482b Second rc pull request
- Fix a long standing race bug when destroying comp_event file descriptors
 
 - srp, hfi1, bnxt_re: Various driver crashes from missing validation and
   other cases
 
 - Fixes for regressions in patches merged this window in the gid cache,
   devx, ucma and uapi.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAlutG7QACgkQOG33FX4g
 mxps3Q//fPL1o4Na/nx/aSmtfcEuPxjXwczUbB0MzlN7rKu0HjrES/DOXbGJZgS7
 5XGu7TQyOBWzpZHZM4Xs+EjTdVY9pOK5+Q+t76Ns1vZsblCATJPOCIYMVkr+zM5T
 zQxEfarlz8kEPWVz/TBAM7djoSkGXNLQQ1ttSVJBSAindhb25NzoiV1fsFGkXKmT
 xbZHW03K1o4dZMf0JJO77eu1m9tkRfX0iFXBg4efgNOZtl9MP4l5lRwPqyRdGhMO
 MvBNCJte/4dKlg5TeoVtp8B0sJvrO6QOj878VqYH4nL8uW2lZWFuzB7ZWxWGxMb/
 VgXwpmXyDuTUjX9gn3kQToG8L9FJqsMzPZGBePMjilJqpgpqoVZpQ8i674JBZznF
 as/bRSLdkA4PL5upFepBD/j9ep9LGIb6ojDcUSmwB+aJ99ZtlFUOhdf4ehQ2x9MU
 WETIcdSqlmFjhY2IoHFKEr5PP3NGIDtPlvZbAmJrrne3uTCwXChXeZgzX1S6AHkY
 djvDEaV8DrMmdJc0XTIOIQvxzMnHuoCo3oHdVaJMLudGkw1NKMN84SkIPb6U2Qjk
 QiPf3pRRUazO6tqJfEZr8Es5lQjYkZUK4wKagOJKdcLBAubEnMhrsvECXQuoFWT8
 sYhzK3AwyEaXp4Y3VbA+aqjZA/AhtPk/l3QKesU6u7hoLtiawoc=
 =cjxG
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Jason writes:
  "Second RDMA rc pull request

   - Fix a long standing race bug when destroying comp_event file descriptors

   - srp, hfi1, bnxt_re: Various driver crashes from missing validation
     and other cases

   - Fixes for regressions in patches merged this window in the gid
     cache, devx, ucma and uapi."

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/core: Set right entry state before releasing reference
  IB/mlx5: Destroy the DEVX object upon error flow
  IB/uverbs: Free uapi on destroy
  RDMA/bnxt_re: Fix system crash during RDMA resource initialization
  IB/hfi1: Fix destroy_qp hang after a link down
  IB/hfi1: Fix context recovery when PBC has an UnsupportedVL
  IB/hfi1: Invalid user input can result in crash
  IB/hfi1: Fix SL array bounds check
  RDMA/uverbs: Fix validity check for modify QP
  IB/srp: Avoid that sg_reset -d ${srp_device} triggers an infinite loop
  ucma: fix a use-after-free in ucma_resolve_ip()
  RDMA/uverbs: Atomically flush and mark closed the comp event queue
  cxgb4: fix abort_req_rss6 struct
2018-09-27 21:53:55 +02:00
Parav Pandit
5c5702e259 RDMA/core: Set right entry state before releasing reference
Currently add_modify_gid() for IB link layer has followong issue
in cache update path.

When GID update event occurs, core releases reference to the GID
table without updating its state and/or entry pointer.

CPU-0                              CPU-1
------                             -----
ib_cache_update()                    IPoIB ULP
   add_modify_gid()                   [..]
      put_gid_entry()
      refcnt = 0, but
      state = valid,
      entry is valid.
      (work item is not yet executed).
                                   ipoib_create_ah()
                                     rdma_create_ah()
                                        rdma_get_gid_attr() <--
                                   	Tries to acquire gid_attr
                                        which has refcnt = 0.
                                   	This is incorrect.

GID entry state and entry pointer is provides the accurate GID enty
state. Such fields must be updated with rwlock to protect against
readers and, such fields must be in sane state before refcount can drop
to zero. Otherwise above race condition can happen leading to
use-after-free situation.

Following backtrace has been observed when cache update for an IB port
is triggered while IPoIB ULP is creating an AH.

Therefore, when updating GID entry, first mark a valid entry as invalid
through state and set the barrier so that no callers can acquired
the GID entry, followed by release reference to it.

refcount_t: increment on 0; use-after-free.
WARNING: CPU: 4 PID: 29106 at lib/refcount.c:153 refcount_inc_checked+0x30/0x50
Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core]
RIP: 0010:refcount_inc_checked+0x30/0x50
RSP: 0018:ffff8802ad36f600 EFLAGS: 00010082
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000002 RSI: 0000000000000008 RDI: ffffffff86710100
RBP: ffff8802d6e60a30 R08: ffffed005d67bf8b R09: ffffed005d67bf8b
R10: 0000000000000001 R11: ffffed005d67bf8a R12: ffff88027620cee8
R13: ffff8802d6e60988 R14: ffff8802d6e60a78 R15: 0000000000000202
FS: 0000000000000000(0000) GS:ffff8802eb200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f3ab35e5c88 CR3: 00000002ce84a000 CR4: 00000000000006e0
IPv6: ADDRCONF(NETDEV_CHANGE): ib1: link becomes ready
Call Trace:
rdma_get_gid_attr+0x220/0x310 [ib_core]
? lock_acquire+0x145/0x3a0
rdma_fill_sgid_attr+0x32c/0x470 [ib_core]
rdma_create_ah+0x89/0x160 [ib_core]
? rdma_fill_sgid_attr+0x470/0x470 [ib_core]
? ipoib_create_ah+0x52/0x260 [ib_ipoib]
ipoib_create_ah+0xf5/0x260 [ib_ipoib]
ipoib_mcast_join_complete+0xbbe/0x2540 [ib_ipoib]

Fixes: b150c3862d ("IB/core: Introduce GID entry reference counts")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-25 15:01:09 -06:00
Yishai Hadas
e8ef090a61 IB/mlx5: Destroy the DEVX object upon error flow
Upon DEVX object creation the object must be destroyed upon a follows
error flow.

Fixes: 7efce3691d ("IB/mlx5: Add obj create and destroy functionality")
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-25 14:49:17 -06:00
Mark Bloch
a9360abd3d IB/uverbs: Free uapi on destroy
Make sure we free struct uverbs_api once we clean the radix tree. It was
allocated by uverbs_alloc_api().

Fixes: 9ed3e5f447 ("IB/uverbs: Build the specs into a radix tree at runtime")
Reported-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-25 14:47:33 -06:00
Selvin Xavier
de5c95d0f5 RDMA/bnxt_re: Fix system crash during RDMA resource initialization
bnxt_re_ib_reg acquires and releases the rtnl lock whenever it accesses
the L2 driver.

The following sequence can trigger a crash

Acquires the rtnl_lock ->
	Registers roce driver callback with L2 driver ->
		release the rtnl lock
bnxt_re acquires the rtnl_lock ->
	Request for MSIx vectors ->
		release the rtnl_lock

Issue happens when bnxt_re proceeds with remaining part of initialization
and L2 driver invokes bnxt_ulp_irq_stop as a part of bnxt_open_nic.

The crash is in bnxt_qplib_nq_stop_irq as the NQ structures are
not initialized yet,

<snip>
[ 3551.726647] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 3551.726656] IP: [<ffffffffc0840ee9>] bnxt_qplib_nq_stop_irq+0x59/0xb0 [bnxt_re]
[ 3551.726674] PGD 0
[ 3551.726679] Oops: 0002 1 SMP
...
[ 3551.726822] Hardware name: Dell Inc. PowerEdge R720/08RW36, BIOS 2.4.3 07/09/2014
[ 3551.726826] task: ffff97e30eec5ee0 ti: ffff97e3173bc000 task.ti: ffff97e3173bc000
[ 3551.726829] RIP: 0010:[<ffffffffc0840ee9>] [<ffffffffc0840ee9>]
bnxt_qplib_nq_stop_irq+0x59/0xb0 [bnxt_re]
...
[ 3551.726872] Call Trace:
[ 3551.726886] [<ffffffffc082cb9e>] bnxt_re_stop_irq+0x4e/0x70 [bnxt_re]
[ 3551.726899] [<ffffffffc07d6a53>] bnxt_ulp_irq_stop+0x43/0x70 [bnxt_en]
[ 3551.726908] [<ffffffffc07c82f4>] bnxt_reserve_rings+0x174/0x1e0 [bnxt_en]
[ 3551.726917] [<ffffffffc07cafd8>] __bnxt_open_nic+0x368/0x9a0 [bnxt_en]
[ 3551.726925] [<ffffffffc07cb62b>] bnxt_open_nic+0x1b/0x50 [bnxt_en]
[ 3551.726934] [<ffffffffc07cc62f>] bnxt_setup_mq_tc+0x11f/0x260 [bnxt_en]
[ 3551.726943] [<ffffffffc07d5f58>] bnxt_dcbnl_ieee_setets+0xb8/0x1f0 [bnxt_en]
[ 3551.726954] [<ffffffff890f983a>] dcbnl_ieee_set+0x9a/0x250
[ 3551.726966] [<ffffffff88fd6d21>] ? __alloc_skb+0xa1/0x2d0
[ 3551.726972] [<ffffffff890f72fa>] dcb_doit+0x13a/0x210
[ 3551.726981] [<ffffffff89003ff7>] rtnetlink_rcv_msg+0xa7/0x260
[ 3551.726989] [<ffffffff88ffdb00>] ? rtnl_unicast+0x20/0x30
[ 3551.726996] [<ffffffff88bf9dc8>] ? __kmalloc_node_track_caller+0x58/0x290
[ 3551.727002] [<ffffffff890f7326>] ? dcb_doit+0x166/0x210
[ 3551.727007] [<ffffffff88fd6d0d>] ? __alloc_skb+0x8d/0x2d0
[ 3551.727012] [<ffffffff89003f50>] ? rtnl_newlink+0x880/0x880
...
[ 3551.727104] [<ffffffff8911f7d5>] system_call_fastpath+0x1c/0x21
...
[ 3551.727164] RIP [<ffffffffc0840ee9>] bnxt_qplib_nq_stop_irq+0x59/0xb0 [bnxt_re]
[ 3551.727175] RSP <ffff97e3173bf788>
[ 3551.727177] CR2: 0000000000000000

Avoid this inconsistent state and  system crash by acquiring
the rtnl lock for the entire duration of device initialization.
Re-factor the code to remove the rtnl lock from the individual function
and acquire and release it from the caller.

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Fixes: 6e04b10356 ("RDMA/bnxt_re: Fix broken RoCE driver due to recent L2 driver changes")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-24 09:24:16 -06:00
Mark Bloch
5d773ff41a net/mlx5: Rename incorrect naming in IFC file
Remove a trailing underscore from the multicast/unicast names.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-09-22 00:38:39 +03:00
Michael J. Ruhl
b4a4957d3d IB/hfi1: Fix destroy_qp hang after a link down
rvt_destroy_qp() cannot complete until all in process packets have
been released from the underlying hardware.  If a link down event
occurs, an application can hang with a kernel stack similar to:

cat /proc/<app PID>/stack
 quiesce_qp+0x178/0x250 [hfi1]
 rvt_reset_qp+0x23d/0x400 [rdmavt]
 rvt_destroy_qp+0x69/0x210 [rdmavt]
 ib_destroy_qp+0xba/0x1c0 [ib_core]
 nvme_rdma_destroy_queue_ib+0x46/0x80 [nvme_rdma]
 nvme_rdma_free_queue+0x3c/0xd0 [nvme_rdma]
 nvme_rdma_destroy_io_queues+0x88/0xd0 [nvme_rdma]
 nvme_rdma_error_recovery_work+0x52/0xf0 [nvme_rdma]
 process_one_work+0x17a/0x440
 worker_thread+0x126/0x3c0
 kthread+0xcf/0xe0
 ret_from_fork+0x58/0x90
 0xffffffffffffffff

quiesce_qp() waits until all outstanding packets have been freed.
This wait should be momentary.  During a link down event, the cleanup
handling does not ensure that all packets caught by the link down are
flushed properly.

This is caused by the fact that the freeze path and the link down
event is handled the same.  This is not correct.  The freeze path
waits until the HFI is unfrozen and then restarts PIO.  A link down
is not a freeze event.  The link down path cannot restart the PIO
until link is restored.  If the PIO path is restarted before the link
comes up, the application (QP) using the PIO path will hang (until
link is restored).

Fix by separating the linkdown path from the freeze path and use the
link down path for link down events.

Close a race condition sc_disable() by acquiring both the progress
and release locks.

Close a race condition in sc_stop() by moving the setting of the flag
bits under the alloc lock.

Cc: <stable@vger.kernel.org> # 4.9.x+
Fixes: 7724105686 ("IB/hfi1: add driver files")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-20 19:24:51 -06:00
Michael J. Ruhl
d623500b3c IB/hfi1: Fix context recovery when PBC has an UnsupportedVL
If a packet stream uses an UnsupportedVL (virtual lane), the send
engine will not send the packet, and it will not indicate that an
error has occurred.  This will cause the packet stream to block.

HFI has 8 virtual lanes available for packet streams.  Each lane can
be enabled or disabled using the UnsupportedVL mask.  If a lane is
disabled, adding a packet to the send context must be disallowed.

The current mask for determining unsupported VLs defaults to 0 (allow
all).  This is incorrect.  Only the VLs that are defined should be
allowed.

Determine which VLs are disabled (mtu == 0), and set the appropriate
unsupported bit in the mask.  The correct mask will allow the send
engine to error on the invalid VL, and error recovery will work
correctly.

Cc: <stable@vger.kernel.org> # 4.9.x+
Fixes: 7724105686 ("IB/hfi1: add driver files")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-20 19:24:51 -06:00
Michael J. Ruhl
94694d18cf IB/hfi1: Invalid user input can result in crash
If the number of packets in a user sdma request does not match
the actual iovectors being sent, sdma_cleanup can be called on
an uninitialized request structure, resulting in a crash similar
to this:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: [<ffffffffc0ae8bb7>] __sdma_txclean+0x57/0x1e0 [hfi1]
PGD 8000001044f61067 PUD 1052706067 PMD 0
Oops: 0000 [#1] SMP
CPU: 30 PID: 69912 Comm: upsm Kdump: loaded Tainted: G           OE
------------   3.10.0-862.el7.x86_64 #1
Hardware name: Intel Corporation S2600KPR/S2600KPR, BIOS
SE5C610.86B.01.01.0019.101220160604 10/12/2016
task: ffff8b331c890000 ti: ffff8b2ed1f98000 task.ti: ffff8b2ed1f98000
RIP: 0010:[<ffffffffc0ae8bb7>]  [<ffffffffc0ae8bb7>] __sdma_txclean+0x57/0x1e0
[hfi1]
RSP: 0018:ffff8b2ed1f9bab0  EFLAGS: 00010286
RAX: 0000000000008b2b RBX: ffff8b2adf6e0000 RCX: 0000000000000000
RDX: 00000000000000a0 RSI: ffff8b2e9eedc540 RDI: ffff8b2adf6e0000
RBP: ffff8b2ed1f9bad8 R08: 0000000000000000 R09: ffffffffc0b04a06
R10: ffff8b331c890190 R11: ffffe6ed00bf1840 R12: ffff8b3315480000
R13: ffff8b33154800f0 R14: 00000000fffffff2 R15: ffff8b2e9eedc540
FS:  00007f035ac47740(0000) GS:ffff8b331e100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 0000000c03fe6000 CR4: 00000000001607e0
Call Trace:
 [<ffffffffc0b0570d>] user_sdma_send_pkts+0xdcd/0x1990 [hfi1]
 [<ffffffff9fe75fb0>] ? gup_pud_range+0x140/0x290
 [<ffffffffc0ad3105>] ? hfi1_mmu_rb_insert+0x155/0x1b0 [hfi1]
 [<ffffffffc0b0777b>] hfi1_user_sdma_process_request+0xc5b/0x11b0 [hfi1]
 [<ffffffffc0ac193a>] hfi1_aio_write+0xba/0x110 [hfi1]
 [<ffffffffa001a2bb>] do_sync_readv_writev+0x7b/0xd0
 [<ffffffffa001bede>] do_readv_writev+0xce/0x260
 [<ffffffffa022b089>] ? tty_ldisc_deref+0x19/0x20
 [<ffffffffa02268c0>] ? n_tty_ioctl+0xe0/0xe0
 [<ffffffffa001c105>] vfs_writev+0x35/0x60
 [<ffffffffa001c2bf>] SyS_writev+0x7f/0x110
 [<ffffffffa051f7d5>] system_call_fastpath+0x1c/0x21
Code: 06 49 c7 47 18 00 00 00 00 0f 87 89 01 00 00 5b 41 5c 41 5d 41 5e 41 5f
5d c3 66 2e 0f 1f 84 00 00 00 00 00 48 8b 4e 10 48 89 fb <48> 8b 51 08 49 89 d4
83 e2 0c 41 81 e4 00 e0 00 00 48 c1 ea 02
RIP  [<ffffffffc0ae8bb7>] __sdma_txclean+0x57/0x1e0 [hfi1]
 RSP <ffff8b2ed1f9bab0>
CR2: 0000000000000008

There are two exit points from user_sdma_send_pkts().  One (free_tx)
merely frees the slab entry and one (free_txreq) cleans the sdma_txreq
prior to freeing the slab entry.   The free_txreq variation can only be
called after one of the sdma_init*() variations has been called.

In the panic case, the slab entry had been allocated but not inited.

Fix the issue by exiting through free_tx thus avoiding sdma_clean().

Cc: <stable@vger.kernel.org> # 4.9.x+
Fixes: 7724105686 ("IB/hfi1: add driver files")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-20 19:24:51 -06:00
Ira Weiny
0dbfaa9f28 IB/hfi1: Fix SL array bounds check
The SL specified by a user needs to be a valid SL.

Add a range check to the user specified SL value which protects from
running off the end of the SL to SC table.

CC: stable@vger.kernel.org
Fixes: 7724105686 ("IB/hfi1: add driver files")
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-20 19:24:51 -06:00
Majd Dibbiny
4eeed36869 RDMA/uverbs: Fix validity check for modify QP
Uverbs shouldn't enforce QP state in the command unless the user set the QP
state bit in the attribute mask.

In addition, only copy qp attr fields which have the corresponding bit set
in the attribute mask over to the internal attr structure.

Fixes: 88de869bbe ("RDMA/uverbs: Ensure validity of current QP state value")
Fixes: bc38a6abdd ("[PATCH] IB uverbs: core implementation")
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-20 16:47:30 -06:00
Bart Van Assche
ee92efe41c IB/srp: Avoid that sg_reset -d ${srp_device} triggers an infinite loop
Use different loop variables for the inner and outer loop. This avoids
that an infinite loop occurs if there are more RDMA channels than
target->req_ring_size.

Fixes: d92c0da71a ("IB/srp: Add multichannel support")
Cc: <stable@vger.kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-19 15:28:24 -06:00
David S. Miller
e366fa4350 Merge ra.kernel.org:/pub/scm/linux/kernel/git/davem/net
Two new tls tests added in parallel in both net and net-next.

Used Stephen Rothwell's linux-next resolution.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-18 09:33:27 -07:00
Cong Wang
5fe23f262e ucma: fix a use-after-free in ucma_resolve_ip()
There is a race condition between ucma_close() and ucma_resolve_ip():

CPU0				CPU1
ucma_resolve_ip():		ucma_close():

ctx = ucma_get_ctx(file, cmd.id);

        list_for_each_entry_safe(ctx, tmp, &file->ctx_list, list) {
                mutex_lock(&mut);
                idr_remove(&ctx_idr, ctx->id);
                mutex_unlock(&mut);
		...
                mutex_lock(&mut);
                if (!ctx->closing) {
                        mutex_unlock(&mut);
                        rdma_destroy_id(ctx->cm_id);
		...
                ucma_free_ctx(ctx);

ret = rdma_resolve_addr();
ucma_put_ctx(ctx);

Before idr_remove(), ucma_get_ctx() could still find the ctx
and after rdma_destroy_id(), rdma_resolve_addr() may still
access id_priv pointer. Also, ucma_put_ctx() may use ctx after
ucma_free_ctx() too.

ucma_close() should call ucma_put_ctx() too which tests the
refcnt and waits for the last one releasing it. The similar
pattern is already used by ucma_destroy_id().

Reported-and-tested-by: syzbot+da2591e115d57a9cbb8b@syzkaller.appspotmail.com
Reported-by: syzbot+cfe3c1e8ef634ba8964b@syzkaller.appspotmail.com
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-13 13:04:13 -04:00
Linus Torvalds
54eda9df17 pci-v4.19-fixes-1
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAluZ1pkUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vxllw//R9UVckBvd7AUseb/8EMUt3GIqvhy
 xy9thJy9cygQF+6Ti9K9TA4aTTriyW7Ur4/lvlSnciffYiZhydaR6MJEgO9+C3wa
 v/Ev9dUHYANVT8z3Q7KVwu1KaNXcb3RBIs0rg0BAqJUzhMNZo4i5NJzxTC5DvnzB
 P8brxz8Oa+IiAdBkJyDDuJ7QK+yIiApcbYWdcnAEwuSxTh3vBSEFjr9hks6TPZUd
 Vp5k2kYOdaslBBKg2yRqK3NcUaWNVpoqXJkVnFpfd6OA3aDPQ79t20se5lkMbpK+
 CbbXWgE2LXnh0wfTDx1nexD9/PbkGHO+NxKX0LRHnedBbWGlOoQVoaXIe/7x79cX
 EWyDG+r0SC5QD5Paj/OtD+Is98PsP5AUiddxnZRUnyIZlJHe9Ja8vV0SuAus8Pjt
 E5AarnLVLrCZGU3XIqj4vy1Kh12TZp2ZaBdFnNkdW+cL7XcRHm9wTEtn4COJKKy5
 MvW2VnEMbIy+JmXZ16R67Ggc9/yU6W/qjzauw0+BkdJSHnrKWRDkfVcuSYf7LquD
 INcTpRekBEPyrgsc5IRvJc+UE7AXUYPpkJ6VkNNwXuB+AVB/3s7LjTEoboWYEp1Y
 QSC4BYClEaUVStgQQ7mTRzvIacIIoYQcEhTJOLVZ2R2o4bWJxZ8Aio2sT2MssZfL
 S6/6csccKXNLS+E=
 =oXm8
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.19-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:

 - Add Tyrel Datwyler as maintainer for PPC64 RPA hotplug (Tyrel
   Datwyler)

 - Add Gustavo Pimentel as DesignWare PCI maintainer (Joao Pinto)

 - Fix a Switchtec Spectre v1 vulnerability (Gustavo A. R. Silva)

 - Revert an unnecessary Intel 300 ACS quirk (Mika Westerberg)

 - Fix pciehp hot-add/powerfault detection that left indicators in wrong
   state (Keith Busch)

 - Fix pci_reset_bus() logic error (Dennis Dalessandro)

 - Revert IB/hfi1 PCI reset change that caused a deadlock (Dennis
   Dalessandro)

 - Allow enabling PASID on Root Complex Integrated Endpoints (Felix
   Kuehling)

* tag 'pci-v4.19-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: Fix enabling of PASID on RC integrated endpoints
  IB/hfi1,PCI: Allow bus reset while probing
  PCI: Fix faulty logic in pci_reset_bus()
  PCI: pciehp: Fix hot-add vs powerfault detection order
  switchtec: Fix Spectre v1 vulnerability
  Revert "PCI: Add ACS quirk for Intel 300 series"
  MAINTAINERS: Add Gustavo Pimentel as DesignWare PCI maintainer
  MAINTAINERS: Add entries for PPC64 RPA PCI hotplug drivers
2018-09-12 19:39:56 -10:00
David S. Miller
aaf9253025 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2018-09-12 22:22:42 -07:00
Steve Wise
67e3816842 RDMA/uverbs: Atomically flush and mark closed the comp event queue
Currently a uverbs completion event queue is flushed of events in
ib_uverbs_comp_event_close() with the queue spinlock held and then
released.  Yet setting ev_queue->is_closed is not set until later in
uverbs_hot_unplug_completion_event_file().

In between the time ib_uverbs_comp_event_close() releases the lock and
uverbs_hot_unplug_completion_event_file() acquires the lock, a completion
event can arrive and be inserted into the event queue by
ib_uverbs_comp_handler().

This can cause a "double add" list_add warning or crash depending on the
kernel configuration, or a memory leak because the event is never dequeued
since the queue is already closed down.

So add setting ev_queue->is_closed = 1 to ib_uverbs_comp_event_close().

Cc: stable@vger.kernel.org
Fixes: 1e7710f3f6 ("IB/core: Change completion channel to use the reworked objects schema")
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-12 15:43:15 -06:00
Dennis Dalessandro
bfc456060d IB/hfi1,PCI: Allow bus reset while probing
Calling into the new API to reset the secondary bus results in a deadlock.
This occurs because the device/bus is already locked at probe time.
Reverting back to the old behavior while the API is improved.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=200985
Fixes: c6a44ba950 ("PCI: Rename pci_try_reset_bus() to pci_reset_bus()")
Fixes: 409888e096 ("IB/hfi1: Use pci_try_reset_bus() for initiating PCI Secondary Bus Reset")
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Cc: Sinan Kaya <okaya@codeaurora.org>
2018-09-11 21:44:52 -05:00
David S. Miller
0c69198d81 infiniband: nes: Use skb_peek_next() and skb_queue_walk().
Instead of direct SKB list accesses.

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-09-10 10:06:53 -07:00
Leon Romanovsky
8f28b178f7 RDMA/mlx4: Ensure that maximal send/receive SGE less than supported by HW
In calculating the global maximum number of the Scatter/Gather elements
supported, the following four maximum parameters must be taken into
consideration: max_sg_rq, max_sg_sq, max_desc_sz_rq and max_desc_sz_sq.

However instead of bringing this complexity to query_device, which still
won't be sufficient anyway (the calculations are dependent on QP type),
the safer approach will be to restore old code, which will give us 32
SGEs.

Fixes: 33023fb85a ("IB/core: add max_send_sge and max_recv_sge attributes")
Reported-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06 13:16:12 -06:00
Parav Pandit
954a8e3aea RDMA/cma: Protect cma dev list with lock
When AF_IB addresses are used during rdma_resolve_addr() a lock is not
held. A cma device can get removed while list traversal is in progress
which may lead to crash. ie

        CPU0                                     CPU1
        ====                                     ====
rdma_resolve_addr()
 cma_resolve_ib_dev()
  list_for_each()                         cma_remove_one()
    cur_dev->device                        mutex_lock(&lock)
                                            list_del();
                                           mutex_unlock(&lock);
                                           cma_process_remove();


Therefore, hold a lock while traversing the list which avoids such
situation.

Cc: <stable@vger.kernel.org> # 3.10
Fixes: f17df3b0de ("RDMA/cma: Add support for AF_IB to rdma_resolve_addr()")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06 13:01:59 -06:00
Parav Pandit
08e74be103 RDMA/uverbs: Fix error cleanup path of ib_uverbs_add_one()
If ib_uverbs_create_uapi() fails, dev_num should be freed from the bitmap.

Fixes: 7d96c9b176 ("IB/uverbs: Have the core code create the uverbs_root_spec")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-05 16:15:52 -06:00
Somnath Kotur
f40f299bbe bnxt_re: Fix couple of memory leaks that could lead to IOMMU call traces
1. DMA-able memory allocated for Shadow QP was not being freed.
2. bnxt_qplib_alloc_qp_hdr_buf() had a bug wherein the SQ pointer was
   erroneously pointing to the RQ. But since the corresponding
   free_qp_hdr_buf() was correct, memory being free was less than what was
   allocated.

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-05 16:08:41 -06:00
Aaron Knister
816e846c2e IB/ipoib: Avoid a race condition between start_xmit and cm_rep_handler
Inside of start_xmit() the call to check if the connection is up and the
queueing of the packets for later transmission is not atomic which leaves
a window where cm_rep_handler can run, set the connection up, dequeue
pending packets and leave the subsequently queued packets by start_xmit()
sitting on neigh->queue until they're dropped when the connection is torn
down. This only applies to connected mode. These dropped packets can
really upset TCP, for example, and cause multi-minute delays in
transmission for open connections.

Here's the code in start_xmit where we check to see if the connection is
up:

       if (ipoib_cm_get(neigh)) {
               if (ipoib_cm_up(neigh)) {
                       ipoib_cm_send(dev, skb, ipoib_cm_get(neigh));
                       goto unref;
               }
       }

The race occurs if cm_rep_handler execution occurs after the above
connection check (specifically if it gets to the point where it acquires
priv->lock to dequeue pending skb's) but before the below code snippet in
start_xmit where packets are queued.

       if (skb_queue_len(&neigh->queue) < IPOIB_MAX_PATH_REC_QUEUE) {
               push_pseudo_header(skb, phdr->hwaddr);
               spin_lock_irqsave(&priv->lock, flags);
               __skb_queue_tail(&neigh->queue, skb);
               spin_unlock_irqrestore(&priv->lock, flags);
       } else {
               ++dev->stats.tx_dropped;
               dev_kfree_skb_any(skb);
       }

The patch acquires the netif tx lock in cm_rep_handler for the section
where it sets the connection up and dequeues and retransmits deferred
skb's.

Fixes: 839fcaba35 ("IPoIB: Connected mode experimental support")
Cc: stable@vger.kernel.org
Signed-off-by: Aaron Knister <aaron.s.knister@nasa.gov>
Tested-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-05 15:32:06 -06:00
Mark Bloch
60786f0987 {net, RDMA}/mlx5: Rename encap to reformat packet
Renames all encap mlx5_{core,ib} code to use the new naming of packet
reformat. This change doesn't introduce any function change and is
needed to properly reflect the operation being done by this action.
For example not only can we encapsulate a packet, but also decapsulate it.

Signed-off-by: Mark Bloch <markb@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-09-05 08:10:59 +03:00
Steve Wise
308aa2b8f7 iw_cxgb4: only allow 1 flush on user qps
Once the qp has been flushed, it cannot be flushed again.  The user qp
flush logic wasn't enforcing it however.  The bug can cause
touch-after-free crashes like:

Unable to handle kernel paging request for data at address 0x000001ec
Faulting instruction address: 0xc008000016069100
Oops: Kernel access of bad area, sig: 11 [#1]
...
NIP [c008000016069100] flush_qp+0x80/0x480 [iw_cxgb4]
LR [c00800001606cd6c] c4iw_modify_qp+0x71c/0x11d0 [iw_cxgb4]
Call Trace:
[c00800001606cd6c] c4iw_modify_qp+0x71c/0x11d0 [iw_cxgb4]
[c00800001606e868] c4iw_ib_modify_qp+0x118/0x200 [iw_cxgb4]
[c0080000119eae80] ib_security_modify_qp+0xd0/0x3d0 [ib_core]
[c0080000119c4e24] ib_modify_qp+0xc4/0x2c0 [ib_core]
[c008000011df0284] iwcm_modify_qp_err+0x44/0x70 [iw_cm]
[c008000011df0fec] destroy_cm_id+0xcc/0x370 [iw_cm]
[c008000011ed4358] rdma_destroy_id+0x3c8/0x520 [rdma_cm]
[c0080000134b0540] ucma_close+0x90/0x1b0 [rdma_ucm]
[c000000000444da4] __fput+0xe4/0x2f0

So fix flush_qp() to only flush the wq once.

Cc: stable@vger.kernel.org
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-04 15:07:56 -06:00
Artemy Kovalyov
e4ff3d22c1 IB/core: Release object lock if destroy failed
The object lock was supposed to always be released during destroy, but
when the destruction retry series was integrated with the destroy series
it created a failure path that missed the unlock.

Keep with convention, if destroy fails the caller must undo all locking.

Fixes: 87ad80abc7 ("IB/uverbs: Consolidate uobject destruction")
Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-04 15:07:55 -06:00
Jann Horn
0d23ba6034 RDMA/ucma: check fd type in ucma_migrate_id()
The current code grabs the private_data of whatever file descriptor
userspace has supplied and implicitly casts it to a `struct ucma_file *`,
potentially causing a type confusion.

This is probably fine in practice because the pointer is only used for
comparisons, it is never actually dereferenced; and even in the
comparisons, it is unlikely that a file from another filesystem would have
a ->private_data pointer that happens to also be valid in this context.
But ->private_data is not always guaranteed to be a valid pointer to an
object owned by the file's filesystem; for example, some filesystems just
cram numbers in there.

Check the type of the supplied file descriptor to be safe, analogous to how
other places in the kernel do it.

Fixes: 88314e4dda ("RDMA/cma: add support for rdma_migrate_id()")
Signed-off-by: Jann Horn <jannh@google.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-04 15:07:55 -06:00
Linus Torvalds
1290290c92 Second merge window update
- Switch SMC over to rdma_get_gid_attr and remove the compat
 
 - Fix a crash in HFI1 with some BIOS's
 
 - Fix a randconfig failure
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAlt9z6kACgkQOG33FX4g
 mxqLag//Zr3RrNSmswNjl+6PrWeg88GQstAm4gQps8mR03eI8Ha5fAS6MWuq1U8+
 LGP8bbBtE6fHSCkPSnN/owReZSTcTIbgWB021vBIUKUpytmxie+ugMVXIHwm6cIM
 E3pgPsRngTrqtuRxyOhyoaMKSjRCbSdMe43SWf42/S+9SxDkmXtBzymYT119bu/A
 eqU7z/HuRqcVxrgOinhPjcQkEFdRYUaRk+g6jzaQmZl8IjKHp/d3BSzmzbE6txMt
 txP5hu/msz4xkyusX/I6ZS3do+4WMIAMNhq9bVMo5pNu2RG1eo0LLmHgjsDsFI3u
 ZTaZQj6eYWlbwWRiOU+2Frf4kH4CGAyxlFVCr4yEvfe2VvEbZsLZbYuF1iasnqrP
 3Mrbh7Esj4MfDT/QbunnoX+49X/Rzma/a+tu6hqAhnGjvYsaDyy8WSNwHMs65f7m
 sii58arik9exJHVKMXZ5FJJydBYBcPt5IlkUdoTTZ1eLkoc2B9lG6lKZbWxUYNPx
 uS6XOyR4fcsmLfKfEvQuLZznWi3Jo4R17QKlL5ulqHrZUJV4PZ4Eiu5izScZ4ci/
 xcPGz1a4MT7Zg3V611qGSGt7zYDdD5+R+k/sDIOleCFIpueFzzh/I1oLkSIoXFIx
 DmwqCdRGeB5F7ZhfArFTezgiaMe55SPJKY5PAiThxzoostUGOfw=
 =6z1N
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull more rdma updates from Jason Gunthorpe:
 "This is the SMC cleanup promised, a randconfig regression fix, and
  kernel oops fix.

  Summary:

   - Switch SMC over to rdma_get_gid_attr and remove the compat

   - Fix a crash in HFI1 with some BIOS's

   - Fix a randconfig failure"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  IB/ucm: fix UCM link error
  IB/hfi1: Invalid NUMA node information can cause a divide by zero
  RDMA/smc: Replace ib_query_gid with rdma_get_gid_attr
2018-08-23 15:34:48 -07:00
Michal Hocko
93065ac753 mm, oom: distinguish blockable mode for mmu notifiers
There are several blockable mmu notifiers which might sleep in
mmu_notifier_invalidate_range_start and that is a problem for the
oom_reaper because it needs to guarantee a forward progress so it cannot
depend on any sleepable locks.

Currently we simply back off and mark an oom victim with blockable mmu
notifiers as done after a short sleep.  That can result in selecting a new
oom victim prematurely because the previous one still hasn't torn its
memory down yet.

We can do much better though.  Even if mmu notifiers use sleepable locks
there is no reason to automatically assume those locks are held.  Moreover
majority of notifiers only care about a portion of the address space and
there is absolutely zero reason to fail when we are unmapping an unrelated
range.  Many notifiers do really block and wait for HW which is harder to
handle and we have to bail out though.

This patch handles the low hanging fruit.
__mmu_notifier_invalidate_range_start gets a blockable flag and callbacks
are not allowed to sleep if the flag is set to false.  This is achieved by
using trylock instead of the sleepable lock for most callbacks and
continue as long as we do not block down the call chain.

I think we can improve that even further because there is a common pattern
to do a range lookup first and then do something about that.  The first
part can be done without a sleeping lock in most cases AFAICS.

The oom_reaper end then simply retries if there is at least one notifier
which couldn't make any progress in !blockable mode.  A retry loop is
already implemented to wait for the mmap_sem and this is basically the
same thing.

The simplest way for driver developers to test this code path is to wrap
userspace code which uses these notifiers into a memcg and set the hard
limit to hit the oom.  This can be done e.g.  after the test faults in all
the mmu notifier managed memory and set the hard limit to something really
small.  Then we are looking for a proper process tear down.

[akpm@linux-foundation.org: coding style fixes]
[akpm@linux-foundation.org: minor code simplification]
Link: http://lkml.kernel.org/r/20180716115058.5559-1-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Acked-by: Christian König <christian.koenig@amd.com> # AMD notifiers
Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx and umem_odp
Reported-by: David Rientjes <rientjes@google.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: "Jérôme Glisse" <jglisse@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-22 10:52:44 -07:00
Arnd Bergmann
845b397a77 IB/ucm: fix UCM link error
Building UCM with CONFIG_INFINIBAND_USER_ACCESS=m results in a
set of link errors including:

drivers/infiniband/core/ucm.o: In function `ib_ucm_event_handler':
ucm.c:(.text+0x6dc): undefined reference to `ib_copy_path_rec_to_user'
drivers/infiniband/core/ucma.o: In function `ucma_event_handler':
ucma.c:(.text+0xdc0): undefined reference to `ib_copy_ah_attr_to_user'

To get it to build-test again, this makes the option itself a
tristate, which lets Kconfig figure out the dependency correctly.

Fixes: 486edfb103 ("IB/ucm: Fix compiling ucm.c")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-21 16:56:32 -06:00
Michael J. Ruhl
c513de490f IB/hfi1: Invalid NUMA node information can cause a divide by zero
If the system BIOS does not supply NUMA node information to the
PCI devices, the NUMA node is selected by choosing the current
node.

This can lead to the following crash:

divide error: 0000 SMP
CPU: 0 PID: 4 Comm: kworker/0:0 Tainted: G          IOE
------------   3.10.0-693.21.1.el7.x86_64 #1
Hardware name: Intel Corporation S2600KP/S2600KP, BIOS
SE5C610.86B.01.01.0005.101720141054 10/17/2014
Workqueue: events work_for_cpu_fn
task: ffff880174480fd0 ti: ffff880174488000 task.ti: ffff880174488000
RIP: 0010: [<ffffffffc020ac69>] hfi1_dev_affinity_init+0x129/0x6a0 [hfi1]
RSP: 0018:ffff88017448bbf8  EFLAGS: 00010246
RAX: 0000000000000011 RBX: ffff88107ffba6c0 RCX: ffff88085c22e130
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880824ad0000
RBP: ffff88017448bc48 R08: 0000000000000011 R09: 0000000000000002
R10: ffff8808582b6ca0 R11: 0000000000003151 R12: ffff8808582b6ca0
R13: ffff8808582b6518 R14: ffff8808582b6010 R15: 0000000000000012
FS:  0000000000000000(0000) GS:ffff88085ec00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007efc707404f0 CR3: 0000000001a02000 CR4: 00000000001607f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
 hfi1_init_dd+0x14b3/0x27a0 [hfi1]
 ? pcie_capability_write_word+0x46/0x70
 ? hfi1_pcie_init+0xc0/0x200 [hfi1]
 do_init_one+0x153/0x4c0 [hfi1]
 ? sched_clock_cpu+0x85/0xc0
 init_one+0x1b5/0x260 [hfi1]
 local_pci_probe+0x4a/0xb0
 work_for_cpu_fn+0x1a/0x30
 process_one_work+0x17f/0x440
 worker_thread+0x278/0x3c0
 ? manage_workers.isra.24+0x2a0/0x2a0
 kthread+0xd1/0xe0
 ? insert_kthread_work+0x40/0x40
 ret_from_fork+0x77/0xb0
 ? insert_kthread_work+0x40/0x40

If the BIOS is not supplying NUMA information:
  - set the default table count to 1 for all possible nodes
  - select node 0 (instead of current NUMA) node to get consistent
    performance
  - generate an error indicating that the BIOS should be upgraded

Reviewed-by: Gary Leshner <gary.s.leshner@intel.com>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-20 10:45:48 -06:00
Jason Gunthorpe
0a3173a5f0 Merge branch 'linus/master' into rdma.git for-next
rdma.git merge resolution for the 4.19 merge window

Conflicts:
 drivers/infiniband/core/rdma_core.c
   - Use the rdma code and revise with the new spelling for
     atomic_fetch_add_unless
 drivers/nvme/host/rdma.c
   - Replace max_sge with max_send_sge in new blk code
 drivers/nvme/target/rdma.c
   - Use the blk code and revise to use NULL for ib_post_recv when
     appropriate
   - Replace max_sge with max_recv_sge in new blk code
 net/rds/ib_send.c
   - Use the net code and revise to use NULL for ib_post_recv when
     appropriate

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-16 14:21:29 -06:00
Jason Gunthorpe
89982f7cce Linux 4.18
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAltwm2geHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGITkH/iSzkVhT2OxHoir0
 mLVzTi7/Z17L0e/ELl7TvAC0iLFlWZKdlGR0g3b4/QpXLPmNK4HxiDRTQuWn8ke0
 qDZyDq89HqLt+mpeFZ43PCd9oqV8CH2xxK3iCWReqv6bNnowGnRpSStlks4rDqWn
 zURC/5sUh7TzEG4s997RrrpnyPeQWUlf/Mhtzg2/WvK2btoLWgu5qzjX1uFh3s7u
 vaF2NXVJ3X03gPktyxZzwtO1SwLFS1jhwUXWBZ5AnoJ99ywkghQnkqS/2YpekNTm
 wFk80/78sU+d91aAqO8kkhHj8VRrd+9SGnZ4mB2aZHwjZjGcics4RRtxukSfOQ+6
 L47IdXo=
 =sJkt
 -----END PGP SIGNATURE-----

Merge tag 'v4.18' into rdma.git for-next

Resolve merge conflicts from the -rc cycle against the rdma.git tree:

Conflicts:
 drivers/infiniband/core/uverbs_cmd.c
  - New ifs added to ib_uverbs_ex_create_flow in -rc and for-next
  - Merge removal of file->ucontext in for-next with new code in -rc
 drivers/infiniband/core/uverbs_main.c
  - for-next removed code from ib_uverbs_write() that was modified
    in for-rc

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-16 13:12:00 -06:00
Linus Torvalds
4e31843f68 pci-v4.19-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAlt1f9AUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vxbdhAArnhRvkwOk4m4/LCuKF6HpmlxbBNC
 TjnBCenNf+lFXzWskfDFGFl/Wif4UzGbRTSCNQrwMzj3Ww3f/6R2QIq9rEJvyNC4
 VdxQnaBEZSUgN87q5UGqgdjMTo3zFvlFH6fpb5XDiQ5IX/QZeXeYqoB64w+HvKPU
 M+IsoOvnA5gb7pMcpchrGUnSfS1e6AqQbbTt6tZflore6YCEA4cH5OnpGx8qiZIp
 ut+CMBvQjQB01fHeBc/wGrVte4NwXdONrXqpUb4sHF7HqRNfEh0QVyPhvebBi+k1
 kquqoBQfPFTqgcab31VOcQhg70dEx+1qGm5/YBAwmhCpHR/g2gioFXoROsr+iUOe
 BtF6LZr+Y8cySuhJnkCrJBqWvvBaKbJLg0KMbI+7p4o9MZpod2u7LS5LFrlRDyKW
 3nz3o+b1+v3tCCKVKIhKo0ljolgkweQtR1f6KIHvq93wBODHVQnAOt9NlPfHVyks
 ryGBnOhMjoU5hvfexgIWFk9Ph9MEVQSffkI+TeFPO/tyGBfGfQyGtESiXuEaMQaH
 FGdZHX2RLkY3pWHOtWeMzRHzOnr2XjpDFcAqL3HBGPdJ30K3Umv3WOgoFe2SaocG
 0gaddPjKSwwM4Sa/VP+O5cjGuzi7QnczSDdpYjxIGZzBav32hqx4/rsnLw7bHH8y
 XkEme7cYJc8MGsA=
 =2Dmn
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull pci updates from Bjorn Helgaas:

 - Decode AER errors with names similar to "lspci" (Tyler Baicar)

 - Expose AER statistics in sysfs (Rajat Jain)

 - Clear AER status bits selectively based on the type of recovery (Oza
   Pawandeep)

 - Honor "pcie_ports=native" even if HEST sets FIRMWARE_FIRST (Alexandru
   Gagniuc)

 - Don't clear AER status bits if we're using the "Firmware-First"
   strategy where firmware owns the registers (Alexandru Gagniuc)

 - Use sysfs_match_string() to simplify ASPM sysfs parsing (Andy
   Shevchenko)

 - Remove unnecessary includes of <linux/pci-aspm.h> (Bjorn Helgaas)

 - Defer DPC event handling to work queue (Keith Busch)

 - Use threaded IRQ for DPC bottom half (Keith Busch)

 - Print AER status while handling DPC events (Keith Busch)

 - Work around IDT switch ACS Source Validation erratum (James
   Puthukattukaran)

 - Emit diagnostics for all cases of PCIe Link downtraining (Links
   operating slower than they're capable of) (Alexandru Gagniuc)

 - Skip VFs when configuring Max Payload Size (Myron Stowe)

 - Reduce Root Port Max Payload Size if necessary when hot-adding a
   device below it (Myron Stowe)

 - Simplify SHPC existence/permission checks (Bjorn Helgaas)

 - Remove hotplug sample skeleton driver (Lukas Wunner)

 - Convert pciehp to threaded IRQ handling (Lukas Wunner)

 - Improve pciehp tolerance of missed events and initially unstable
   links (Lukas Wunner)

 - Clear spurious pciehp events on resume (Lukas Wunner)

 - Add pciehp runtime PM support, including for Thunderbolt controllers
   (Lukas Wunner)

 - Support interrupts from pciehp bridges in D3hot (Lukas Wunner)

 - Mark fall-through switch cases before enabling -Wimplicit-fallthrough
   (Gustavo A. R. Silva)

 - Move DMA-debug PCI init from arch code to PCI core (Christoph
   Hellwig)

 - Fix pci_request_irq() usage of IRQF_ONESHOT when no handler is
   supplied (Heiner Kallweit)

 - Unify PCI and DMA direction #defines (Shunyong Yang)

 - Add PCI_DEVICE_DATA() macro (Andy Shevchenko)

 - Check for VPD completion before checking for timeout (Bert Kenward)

 - Limit Netronome NFP5000 config space size to work around erratum
   (Jakub Kicinski)

 - Set IRQCHIP_ONESHOT_SAFE for PCI MSI irqchips (Heiner Kallweit)

 - Document ACPI description of PCI host bridges (Bjorn Helgaas)

 - Add "pci=disable_acs_redir=" parameter to disable ACS redirection for
   peer-to-peer DMA support (we don't have the peer-to-peer support yet;
   this is just one piece) (Logan Gunthorpe)

 - Clean up devm_of_pci_get_host_bridge_resources() resource allocation
   (Jan Kiszka)

 - Fixup resizable BARs after suspend/resume (Christian König)

 - Make "pci=earlydump" generic (Sinan Kaya)

 - Fix ROM BAR access routines to stay in bounds and check for signature
   correctly (Rex Zhu)

 - Add DMA alias quirk for Microsemi Switchtec NTB (Doug Meyer)

 - Expand documentation for pci_add_dma_alias() (Logan Gunthorpe)

 - To avoid bus errors, enable PASID only if entire path supports
   End-End TLP prefixes (Sinan Kaya)

 - Unify slot and bus reset functions and remove hotplug knowledge from
   callers (Sinan Kaya)

 - Add Function-Level Reset quirks for Intel and Samsung NVMe devices to
   fix guest reboot issues (Alex Williamson)

 - Add function 1 DMA alias quirk for Marvell 88SS9183 PCIe SSD
   Controller (Bjorn Helgaas)

 - Remove Xilinx AXI-PCIe host bridge arch dependency (Palmer Dabbelt)

 - Remove Aardvark outbound window configuration (Evan Wang)

 - Fix Aardvark bridge window sizing issue (Zachary Zhang)

 - Convert Aardvark to use pci_host_probe() to reduce code duplication
   (Thomas Petazzoni)

 - Correct the Cadence cdns_pcie_writel() signature (Alan Douglas)

 - Add Cadence support for optional generic PHYs (Alan Douglas)

 - Add Cadence power management ops (Alan Douglas)

 - Remove redundant variable from Cadence driver (Colin Ian King)

 - Add Kirin MSI support (Xiaowei Song)

 - Drop unnecessary root_bus_nr setting from exynos, imx6, keystone,
   armada8k, artpec6, designware-plat, histb, qcom, spear13xx (Shawn
   Guo)

 - Move link notification settings from DesignWare core to individual
   drivers (Gustavo Pimentel)

 - Add endpoint library MSI-X interfaces (Gustavo Pimentel)

 - Correct signature of endpoint library IRQ interfaces (Gustavo
   Pimentel)

 - Add DesignWare endpoint library MSI-X callbacks (Gustavo Pimentel)

 - Add endpoint library MSI-X test support (Gustavo Pimentel)

 - Remove unnecessary GFP_ATOMIC from Hyper-V "new child" allocation
   (Jia-Ju Bai)

 - Add more devices to Broadcom PAXC quirk (Ray Jui)

 - Work around corrupted Broadcom PAXC config space to enable SMMU and
   GICv3 ITS (Ray Jui)

 - Disable MSI parsing to work around broken Broadcom PAXC logic in some
   devices (Ray Jui)

 - Hide unconfigured functions to work around a Broadcom PAXC defect
   (Ray Jui)

 - Lower iproc log level to reduce console output during boot (Ray Jui)

 - Fix mobiveil iomem/phys_addr_t type usage (Lorenzo Pieralisi)

 - Fix mobiveil missing include file (Lorenzo Pieralisi)

 - Add mobiveil Kconfig/Makefile support (Lorenzo Pieralisi)

 - Fix mvebu I/O space remapping issues (Thomas Petazzoni)

 - Use generic pci_host_bridge in mvebu instead of ARM-specific API
   (Thomas Petazzoni)

 - Whitelist VMD devices with fast interrupt handlers to avoid sharing
   vectors with slow handlers (Keith Busch)

* tag 'pci-v4.19-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (153 commits)
  PCI/AER: Don't clear AER bits if error handling is Firmware-First
  PCI: Limit config space size for Netronome NFP5000
  PCI/MSI: Set IRQCHIP_ONESHOT_SAFE for PCI-MSI irqchips
  PCI/VPD: Check for VPD access completion before checking for timeout
  PCI: Add PCI_DEVICE_DATA() macro to fully describe device ID entry
  PCI: Match Root Port's MPS to endpoint's MPSS as necessary
  PCI: Skip MPS logic for Virtual Functions (VFs)
  PCI: Add function 1 DMA alias quirk for Marvell 88SS9183
  PCI: Check for PCIe Link downtraining
  PCI: Add ACS Redirect disable quirk for Intel Sunrise Point
  PCI: Add device-specific ACS Redirect disable infrastructure
  PCI: Convert device-specific ACS quirks from NULL termination to ARRAY_SIZE
  PCI: Add "pci=disable_acs_redir=" parameter for peer-to-peer support
  PCI: Allow specifying devices using a base bus and path of devfns
  PCI: Make specifying PCI devices in kernel parameters reusable
  PCI: Hide ACS quirk declarations inside PCI core
  PCI: Delay after FLR of Intel DC P3700 NVMe
  PCI: Disable Samsung SM961/PM961 NVMe before FLR
  PCI: Export pcie_has_flr()
  PCI: mvebu: Drop bogus comment above mvebu_pcie_map_registers()
  ...
2018-08-16 09:21:54 -07:00