IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
This buffer is currently allocated in hfi1_init():
if (reinit)
ret = init_after_reset(dd);
else
ret = loadtime_init(dd);
if (ret)
goto done;
/* allocate dummy tail memory for all receive contexts */
dd->rcvhdrtail_dummy_kvaddr = dma_alloc_coherent(&dd->pcidev->dev,
sizeof(u64),
&dd->rcvhdrtail_dummy_dma,
GFP_KERNEL);
if (!dd->rcvhdrtail_dummy_kvaddr) {
dd_dev_err(dd, "cannot allocate dummy tail memory\n");
ret = -ENOMEM;
goto done;
}
The reinit triggered path will overwrite the old allocation and leak it.
Fix by moving the allocation to hfi1_alloc_devdata() and the deallocation
to hfi1_free_devdata().
Link: https://lore.kernel.org/r/20211129192008.101968.91302.stgit@awfm-01.cornelisnetworks.com
Cc: stable@vger.kernel.org
Fixes: 46b010d3eeb8 ("staging/rdma/hfi1: Workaround to prevent corruption during packet delivery")
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The code tests the dma address which legitimately can be 0.
The code should test the kernel logical address to avoid leaking eager
buffer allocations that happen to map to a dma address of 0.
Fixes: 60368186fd85 ("IB/hfi1: Fix user-space buffers mapping with IOMMU enabled")
Link: https://lore.kernel.org/r/20211129191952.101968.17137.stgit@awfm-01.cornelisnetworks.com
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
When optional counter support was added the allocation of the memory
holding the counter descriptors was not cleared properly. This caused
WARN_ON()s in the IB/sysfs code to be hit.
This is because the uninitialized memory made some of the counters wrongly
look like optional counters. Use kzalloc.
While here change the sizeof() calls to use the pointer rather than the
name of the type.
WARNING: CPU: 0 PID: 32644 at drivers/infiniband/core/sysfs.c:1064 ib_setup_port_attrs+0x7e1/0x890 [ib_core]
CPU: 0 PID: 32644 Comm: kworker/0:2 Tainted: G S W 5.15.0+ #36
Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.0018.C4.072020161249 07/20/2016
Workqueue: events work_for_cpu_fn
RIP: 0010:ib_setup_port_attrs+0x7e1/0x890 [ib_core]
RSP: 0018:ffffc90006ea3c40 EFLAGS: 00010202
RAX: 0000000000000068 RBX: ffff888106ad8000 RCX: 0000000000000138
RDX: ffff888126c84c00 RSI: ffff888103c41000 RDI: 0000000000000124
RBP: ffff88810f63a801 R08: ffff888126c8a000 R09: 0000000000000001
R10: ffffffffa09acf20 R11: 0000000000000065 R12: ffff88810f63a800
R13: ffff88810f63a800 R14: ffff88810f63a8e0 R15: 0000000000000001
FS: 0000000000000000(0000) GS:ffff888667a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005590102cb078 CR3: 000000000240a003 CR4: 00000000001706f0
Call Trace:
ib_register_device.cold.44+0x23e/0x2d0 [ib_core]
rvt_register_device+0xfa/0x230 [rdmavt]
hfi1_register_ib_device+0x623/0x690 [hfi1]
init_one.cold.36+0x2d1/0x49b [hfi1]
local_pci_probe+0x45/0x80
work_for_cpu_fn+0x16/0x20
process_one_work+0x1b1/0x360
worker_thread+0x1d4/0x3a0
kthread+0x11a/0x140
ret_from_fork+0x22/0x30
Fixes: 5e2ddd1e5982 ("RDMA/counter: Add optional counter support")
Link: https://lore.kernel.org/r/20211115200913.124104.47770.stgit@awfm-01.cornelisnetworks.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmF/AjYeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG1hkIAJ6sFDbvb4M4LMwf
Slh2NVL9o5sLMBDzVwnVlyMSKDbMn1WBKreGssaLgZjGDc74lxsdSmw5l9MZm0JN
xlq95Q6XFiuu+0qDHPWwfDz3JFO4TqW2ZLLPWk9NnkNbRXqccSrlVRi1RpgE1t3/
NUtS8CQLu6A2BYMc6mkk3aV6IwSNKOkWbM5eBHSvU4j8B6lLbNQop0AfO/wyY1xB
U6LiVE1RpN/b7Yv+75ITtNzuHzVIBx6305FvSnOlKbMKKvIClt96Vd2OeuoEkK+6
wGU8JraB1+fc0GckAhynNrjWQWdvi0MAhFWWEJxjS20OGcV1rXDduNfkVNauO1Zn
+dNyJ3s=
=g9fz
-----END PGP SIGNATURE-----
Merge tag 'v5.15' into rdma.git for-next
Pull in the accepted for-rc patches as the next merge needs a newer base.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
netdev->dev_addr will become const soon, make sure drivers propagate the
qualifier.
Link: https://lore.kernel.org/r/20211019182604.1441387-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
sc_disable() after having disabled the send context wakes up any waiters
by calling hfi1_qp_wakeup() while holding the waitlock for the sc.
This is contrary to the model for all other calls to hfi1_qp_wakeup()
where the waitlock is dropped and a local is used to drive calls to
hfi1_qp_wakeup().
Fix by moving the sc->piowait into a local list and driving the wakeup
calls from the list.
Fixes: 099a884ba4c0 ("IB/hfi1: Handle wakeup of orphaned QPs for pio")
Link: https://lore.kernel.org/r/20211013141852.128104.2682.stgit@awfm-01.cornelisnetworks.com
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Add a counter statistic descriptor structure in rdma_hw_stats. In addition
to the counter name, more meta-information will be added. This code
extension is needed for optional-counter support in the following patches.
Link: https://lore.kernel.org/r/20211008122439.166063-4-markzhang@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmFaG98eHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGosAH/jqy5B2BIEE39O+8
QTr3vO54SyRRuY/d98wZ+O4SPjfqfpCHuyjKt9YJpEdmzH754NC9gSPOOBegnvHI
DfrWaivmJ5mdjN2h7+JVqjs58krUv98wWNa5xfvqUp5H7wF3WQg3AxsaMKS1PePD
kFHfeFbxsg2gYhyhPK6gHtwLn6dEsx9bGny2bKvCh6KuJQEiUXoEcgnFzjFgLNxp
T5zI1cNSCNUzwRIe+vqQRlfVR2JlSI4tiy0zNJWy9dQ5Z4HOSbFcEz5Df2N7qNYn
/MqruaASmyREgo9yLHpR1BSyzrea8MCckY04ycYqKZb7gDwcrpAe4QVw2I/Fuzu9
q//PV4I=
=+mYg
-----END PGP SIGNATURE-----
Merge tag 'v5.15-rc4' into rdma.get for-next
Merged due to dependencies in following patches.
Conflict in drivers/infiniband/hw/hfi1/ipoib_tx.c resolved by hand to take
the %p change and txq stats rename together.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Make use of the struct_size() and flex_array_size() helpers instead of
open-coded versions, in order to avoid any potential type mistakes or
integer overflows that, in the worse scenario, could lead to heap
overflows.
Link: https://lore.kernel.org/r/20210927225333.GA192634@embeddedor
Link: https://github.com/KSPP/linux/issues/160
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
These traces are used to debugging ring issues.
The ipoib_txreq needed to be moved to a header file to allow access from
the trace header file.
The trace changes include:
- new producer/consumer traces
- new allocation deallocation traces
- additional fidelity for SDMA engine prints
Fixes: 4bd00b55c978 ("IB/hfi1: Add AIP tx traces")
Link: https://lore.kernel.org/r/20210913132852.131370.9664.stgit@awfm-01.cornelisnetworks.com
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The atomic is not needed.
Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Link: https://lore.kernel.org/r/20210913132847.131370.54250.stgit@awfm-01.cornelisnetworks.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
This patch moves fields in the ring and creates a line for the producer
and the consumer.
The adds a consumer side variable that tracks the ring avail so that the
code doesn't have the read the other cacheline to get a count for every
packet. A read now only occurs when the avail is at 0.
Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Link: https://lore.kernel.org/r/20210913132842.131370.15636.stgit@awfm-01.cornelisnetworks.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The txq has the backpointer, so this is a micro optimization for the tx
path.
Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Link: https://lore.kernel.org/r/20210913132836.131370.89704.stgit@awfm-01.cornelisnetworks.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The pointer math in this statemet does a divide;
struct hfi1_ipoib_txq *txq = &priv->txqs[napi - priv->tx_napis];
Elminate the divide by embedding the struct napi_strut in the txq and
getting the txq with a container_of() using the newly embedded napi.
Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Link: https://lore.kernel.org/r/20210913132831.131370.3993.stgit@awfm-01.cornelisnetworks.com
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
This patch removes kmem cache allocation and deallocation in favor of
having the ipoib_txreq in the ring.
The consumer is now the packet sending side allocating tx descriptors from
ring and the producer is the napi interrupt handling freeing tx
descriptors.
The locks are now eliminated because the napi tx lock insures a single
consumer and the napi handling insures a single producer.
The napi poll is converted to memory poll looking for items that have been
marked completed.
Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Link: https://lore.kernel.org/r/20210913132826.131370.4397.stgit@awfm-01.cornelisnetworks.com
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Pointers should be printed with %p or %px rather than cast to 'unsigned
long long' and printed with %llx. Change %llx to %p to print the secured
pointer.
Fixes: 042a00f93aad ("IB/{ipoib,hfi1}: Add a timeout handler for rdma_netdev")
Link: https://lore.kernel.org/r/20210922134857.619602-1-qtxuning1999@sjtu.edu.cn
Signed-off-by: Guo Zhi <qtxuning1999@sjtu.edu.cn>
Acked-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
An important error case regression fixes in mlx5:
- Wrong size used when computing the error path smaller allocation request
leads to corruption
- Confusing but ultimately harmless alignment mis-calculation
- Static checker warnings:
Null pointer subtraction in qib
kcalloc in bnxt_re
Missing static on global variable in hfi1
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAmE4qagACgkQOG33FX4g
mxoueg//SfLuxirMifDHOpH2Db5vC2a/BwYW3AuT78zAXzlUmmEWYIUw1fbJA2vK
vt/fsjaxW2FBsluiIgV8J961Mlrj+/Y1KPxbV+76CKFoBkaYcjEJcIajGXkm71r5
bXviUuFEHV2810Nxs6QEitMPyEP3sODmcK5EoKIuifVfjhdcJ3XrLbJb0NA5PKk1
voD2zpAfhumiAzjZ2q3rJgpCv08WWqts30DzAKdii1asdG9tDF8WBS9BJuPC7MHF
B2AmYVTgwhsepCAUaYGathKxHblqakRXF3gjOIAB4TKuZEv0OwGVzH3flFbukg5L
icEw/Ai0H2GM7e3yT3BI3QDoXMg52DfCeChgwttkNsJJidjlc7cUCwaaBCJ1oVo/
860foW9YFA14ftAlBt2BtvfNfOaqYnmz4tmavVNxqW7NiZCqT/+uyV5BSDPD5b5M
Xtsa8NK/6u65rz92GtNpVYd8W4ZfjOCAc8cWj6+ELeSDsU92sFWBWalWsl59Hq+t
AdCUi25uwUAlLPiPKRlYxSXoCCOG80iDgrj8sntIGoVYT0nr4ITVv5VmmVWNQLPw
V5xfDhCmALrNlp1nCBRsmtBHA/Legqu82/lWpkl12WZGOytU/jTEb2gLWPDglwr+
BWmaq3Us/g36NX9h/GOvqD5tZCogzkjHvWLW0biMRebDyO1DW1Y=
=Sg6A
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
"I don't usually send a second PR in the merge window, but the fix to
mlx5 is significant enough that it should start going through the
process ASAP. Along with it comes some of the usual -rc stuff that
would normally wait for a -rc2 or so.
Summary:
Important error case regression fixes in mlx5:
- Wrong size used when computing the error path smaller allocation
request leads to corruption
- Confusing but ultimately harmless alignment mis-calculation
Static checker warning fixes:
- NULL pointer subtraction in qib
- kcalloc in bnxt_re
- Missing static on global variable in hfi1"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
IB/hfi1: make hist static
RDMA/bnxt_re: Prefer kcalloc over open coded arithmetic
IB/qib: Fix null pointer subtraction compiler warning
RDMA/mlx5: Fix xlt_chunk_align calculation
RDMA/mlx5: Fix number of allocated XLT entries
This symbol is not used outside of trace.c, so marks it static.
Fix the following sparse warning:
drivers/infiniband/hw/hfi1/trace.c:491:23: warning: symbol 'hist' was not declared. Should it be static?
Link: https://lore.kernel.org/r/1630921723-21545-1-git-send-email-jiapeng.chong@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: chongjiapeng <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
- Various cleanup and small features for rtrs
- kmap_local_page() conversions
- Driver updates and fixes for: efa, rxe, mlx5, hfi1, qed, hns
- Cache the IB subnet prefix
- Rework how CRC is calcuated in rxe
- Clean reference counting in iwpm's netlink
- Pull object allocation and lifecycle for user QPs to the uverbs core
code
- Several small hns features and continued general code cleanups
- Fix the scatterlist confusion of orig_nents/nents introduced in an
earlier patch creating the append operation
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAmEudRgACgkQOG33FX4g
mxraJA//c6bMxrrTVrzmrtrkyYD4tYWE8RDfgvoyZtleZnnEOJeunCQWakQrpJSv
ukSnOGCA3PtnmRMdV54f/11YJ/7otxOJodSO7jWsIoBrqG/lISAdX8mn2iHhrvJ0
dIaFEFPLy0WqoMLCJVIYIupR0IStVHb/mWx0uYL4XnnoYKyt7f7K5JMZpNWMhDN2
ieJw0jfrvEYm8pipWuxUvB16XARlzAWQrjqLpMRI+jFRpbDVBY21dz2/LJvOJPrA
LcQ+XXsV/F659ibOAGm6bU4BMda8fE6Lw90B/gmhSswJ205NrdziF5cNYHP0QxcN
oMjrjSWWHc9GEE7MTipC2AH8e36qob16Q7CK+zHEJ+ds7R6/O/8XmED1L8/KFpNA
FGqnjxnxsl1y27mUegfj1Hh8PfoDp2oVq0lmpEw0CYo4cfVzHSMRrbTR//XmW628
Ie/mJddpFK4oLk+QkSNjSLrnxOvdTkdA58PU0i84S5eUVMNm41jJDkxg2J7vp0Zn
sclZsclhUQ9oJ5Q2so81JMWxu4JDn7IByXL0ULBaa6xwQTiVEnyvSxSuPlflhLRW
0vI2ylATYKyWkQqyX7VyWecZJzwhwZj5gMMWmoGsij8bkZhQ/VaQMaesByzSth+h
NV5UAYax4GqyOQ/tg/tqT6e5nrI1zof87H64XdTCBpJ7kFyQ/oA=
=ZwOe
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"This is quite a small cycle, no major series stands out. The HNS and
rxe drivers saw the most activity this cycle, with rxe being broken
for a good chunk of time. The significant deleted line count is due to
a SPDX cleanup series.
Summary:
- Various cleanup and small features for rtrs
- kmap_local_page() conversions
- Driver updates and fixes for: efa, rxe, mlx5, hfi1, qed, hns
- Cache the IB subnet prefix
- Rework how CRC is calcuated in rxe
- Clean reference counting in iwpm's netlink
- Pull object allocation and lifecycle for user QPs to the uverbs
core code
- Several small hns features and continued general code cleanups
- Fix the scatterlist confusion of orig_nents/nents introduced in an
earlier patch creating the append operation"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (90 commits)
RDMA/mlx5: Relax DCS QP creation checks
RDMA/hns: Delete unnecessary blank lines.
RDMA/hns: Encapsulate the qp db as a function
RDMA/hns: Adjust the order in which irq are requested and enabled
RDMA/hns: Remove RST2RST error prints for hw v1
RDMA/hns: Remove dqpn filling when modify qp from Init to Init
RDMA/hns: Fix QP's resp incomplete assignment
RDMA/hns: Fix query destination qpn
RDMA/hfi1: Convert to SPDX identifier
IB/rdmavt: Convert to SPDX identifier
RDMA/hns: Bugfix for incorrect association between dip_idx and dgid
RDMA/hns: Bugfix for the missing assignment for dip_idx
RDMA/hns: Bugfix for data type of dip_idx
RDMA/hns: Fix incorrect lsn field
RDMA/irdma: Remove the repeated declaration
RDMA/core/sa_query: Retry SA queries
RDMA: Use the sg_table directly and remove the opencoded version from umem
lib/scatterlist: Fix wrong update of orig_nents
lib/scatterlist: Provide a dedicated function to support table append
RDMA/hns: Delete unused hns bitmap interface
...
Just use seq_write to copy the stats into the seq_file buffer instead of
poking holes into the seq_file abstraction.
Link: https://lore.kernel.org/r/20210810151711.1795374-1-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Tested-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
kmalloc_array() is called to allocate memory for tx->descp. If it fails,
the function __sdma_txclean() is called:
__sdma_txclean(dd, tx);
However, in the function __sdma_txclean(), tx-descp is dereferenced if
tx->num_desc is not zero:
sdma_unmap_desc(dd, &tx->descp[0]);
To fix this possible null-pointer dereference, assign the return value of
kmalloc_array() to a local variable descp, and then assign it to tx->descp
if it is not NULL. Otherwise, go to enomem.
Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/20210806133029.194964-1-islituo@gmail.com
Reported-by: TOTE Robot <oslab@tsinghua.edu.cn>
Signed-off-by: Tuo Li <islituo@gmail.com>
Tested-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Acked-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
refcount_t type and corresponding API can protect refcounters from
accidental underflow and overflow and further use-after-free situations.
Link: https://lore.kernel.org/r/1626674454-56075-1-git-send-email-xiyuyang19@fudan.edu.cn
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Tested-by: Josh Fisher <josh.fisher@cornelisnetworks.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
It is possible for the primary IPoIB network device associated with any
RDMA device to fail to join certain multicast groups preventing IPv6
neighbor discovery and possibly other network ULPs from working
correctly. The IPv4 broadcast group is not affected as the IPoIB network
device handles joining that multicast group directly.
This is because the primary IPoIB network device uses the pkey at ndex 0
in the associated RDMA device's pkey table. Anytime the pkey value of
index 0 changes, the primary IPoIB network device automatically modifies
it's broadcast address (i.e. /sys/class/net/[ib0]/broadcast), since the
broadcast address includes the pkey value, and then bounces carrier. This
includes initial pkey assignment, such as when the pkey at index 0
transitions from the opa default of invalid (0x0000) to some value such as
the OPA default pkey for Virtual Fabric 0: 0x8001 or when the fabric
manager is restarted with a configuration change causing the pkey at index
0 to change. Many network ULPs are not sensitive to the carrier bounce and
are not expecting the broadcast address to change including the linux IPv6
stack. This problem does not affect IPoIB child network devices as their
pkey value is constant for all time.
To mitigate this issue, change the default pkey in at index 0 to 0x8001 to
cover the predominant case and avoid issues as ipoib comes up and the FM
sweeps.
At some point, ipoib multicast support should automatically fix
non-broadcast addresses as it does with the primary broadcast address.
Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/20210715160445.142451.47651.stgit@awfm-01.cornelisnetworks.com
Suggested-by: Josh Collier <josh.d.collier@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
There is no counter for dmawait in AIP, which hampers debugging
performance issues.
Add the counter increment when the txq is queued.
Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets")
Link: https://lore.kernel.org/r/20210715160440.142451.8278.stgit@awfm-01.cornelisnetworks.com
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
- Added option for per CPU threads to the hwlat tracer
- Have hwlat tracer handle hotplug CPUs
- New tracer: osnoise, that detects latency caused by interrupts, softirqs
and scheduling of other tasks.
- Added timerlat tracer that creates a thread and measures in detail what
sources of latency it has for wake ups.
- Removed the "success" field of the sched_wakeup trace event.
This has been hardcoded as "1" since 2015, no tooling should be looking
at it now. If one exists, we can revert this commit, fix that tool and
try to remove it again in the future.
- tgid mapping fixed to handle more than PID_MAX_DEFAULT pids/tgids.
- New boot command line option "tp_printk_stop", as tp_printk causes trace
events to write to console. When user space starts, this can easily live
lock the system. Having a boot option to stop just after boot up is
useful to prevent that from happening.
- Have ftrace_dump_on_oops boot command line option take numbers that match
the numbers shown in /proc/sys/kernel/ftrace_dump_on_oops.
- Bootconfig clean ups, fixes and enhancements.
- New ktest script that tests bootconfig options.
- Add tracepoint_probe_register_may_exist() to register a tracepoint
without triggering a WARN*() if it already exists. BPF has a path from
user space that can do this. All other paths are considered a bug.
- Small clean ups and fixes
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCYN8YPhQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qhxLAP9Mo5hHv7Hg6W7Ddv77rThm+qclsMR/
yW0P+eJpMm4+xAD8Cq03oE1DimPK+9WZBKU5rSqAkqG6CjgDRw6NlIszzQQ=
=WEPR
-----END PGP SIGNATURE-----
Merge tag 'trace-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
- Added option for per CPU threads to the hwlat tracer
- Have hwlat tracer handle hotplug CPUs
- New tracer: osnoise, that detects latency caused by interrupts,
softirqs and scheduling of other tasks.
- Added timerlat tracer that creates a thread and measures in detail
what sources of latency it has for wake ups.
- Removed the "success" field of the sched_wakeup trace event. This has
been hardcoded as "1" since 2015, no tooling should be looking at it
now. If one exists, we can revert this commit, fix that tool and try
to remove it again in the future.
- tgid mapping fixed to handle more than PID_MAX_DEFAULT pids/tgids.
- New boot command line option "tp_printk_stop", as tp_printk causes
trace events to write to console. When user space starts, this can
easily live lock the system. Having a boot option to stop just after
boot up is useful to prevent that from happening.
- Have ftrace_dump_on_oops boot command line option take numbers that
match the numbers shown in /proc/sys/kernel/ftrace_dump_on_oops.
- Bootconfig clean ups, fixes and enhancements.
- New ktest script that tests bootconfig options.
- Add tracepoint_probe_register_may_exist() to register a tracepoint
without triggering a WARN*() if it already exists. BPF has a path
from user space that can do this. All other paths are considered a
bug.
- Small clean ups and fixes
* tag 'trace-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (49 commits)
tracing: Resize tgid_map to pid_max, not PID_MAX_DEFAULT
tracing: Simplify & fix saved_tgids logic
treewide: Add missing semicolons to __assign_str uses
tracing: Change variable type as bool for clean-up
trace/timerlat: Fix indentation on timerlat_main()
trace/osnoise: Make 'noise' variable s64 in run_osnoise()
tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing
tracing: Fix spelling in osnoise tracer "interferences" -> "interference"
Documentation: Fix a typo on trace/osnoise-tracer
trace/osnoise: Fix return value on osnoise_init_hotplug_support
trace/osnoise: Make interval u64 on osnoise_main
trace/osnoise: Fix 'no previous prototype' warnings
tracing: Have osnoise_main() add a quiescent state for task rcu
seq_buf: Make trace_seq_putmem_hex() support data longer than 8
seq_buf: Fix overflow in seq_buf_putmem_hex()
trace/osnoise: Support hotplug operations
trace/hwlat: Support hotplug operations
trace/hwlat: Protect kdata->kthread with get/put_online_cpus
trace: Add timerlat tracer
trace: Add osnoise tracer
...
The __assign_str macro has an unusual ending semicolon but the vast
majority of uses of the macro already have semicolon termination.
$ git grep -P '\b__assign_str\b' | wc -l
551
$ git grep -P '\b__assign_str\b.*;' | wc -l
480
Add semicolons to the __assign_str() uses without semicolon termination
and all the other uses without semicolon termination via additional defines
that are equivalent to __assign_str() with the eventual goal of removing
the semicolon from the __assign_str() macro definition.
Link: https://lore.kernel.org/lkml/1e068d21106bb6db05b735b4916bb420e6c9842a.camel@perches.com/
Link: https://lkml.kernel.org/r/48a056adabd8f70444475352f617914cef504a45.camel@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Compilation with W=1 produces warnings similar to the below.
drivers/infiniband/ulp/ipoib/ipoib_main.c:320: warning: This comment
starts with '/**', but isn't a kernel-doc comment. Refer
Documentation/doc-guide/kernel-doc.rst
All such occurrences were found with the following one line
git grep -A 1 "\/\*\*" drivers/infiniband/
Link: https://lore.kernel.org/r/e57d5f4ddd08b7a19934635b44d6d632841b9ba7.1623823612.git.leonro@nvidia.com
Reviewed-by: Jack Wang <jinpu.wang@ionos.com> #rtrs
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
init_port was only being used to register sysfs attributes against the
port kobject. Now that all users are creating static attribute_group's we
can simply set the attribute_group list in the ops and the core code can
just handle it directly.
This makes all the sysfs management quite straightforward and prevents any
driver from abusing the naked port kobject in future because no driver
code can access it.
Link: https://lore.kernel.org/r/114f68f3d921460eafe14cea5a80ca65d81729c3.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
hfi1 should not be creating a mess of kobjects to attach to the port
kobject - this is all attributes. The proper API is to create an
attribute_group list and create it against the port's kobject.
Link: https://lore.kernel.org/r/cbe0ccb6175dd22274359b6ad803a37435a70e91.1623427137.git.leonro@nvidia.com
Tested-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
This is being used to implement both the port and device global stats,
which is causing some confusion in the drivers. For instance EFA and i40iw
both seem to be misusing the device stats.
Split it into two ops so drivers that don't support one or the other can
leave the op NULL'd, making the calling code a little simpler to
understand.
Link: https://lore.kernel.org/r/1955c154197b2a159adc2dc97266ddc74afe420c.1623427137.git.leonro@nvidia.com
Tested-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Since ib_get_len() only has one caller, move it from a header file into a
.c file. Additionally, remove the superfluous u16 cast. That cast was
introduced by commit 7dafbab3753f ("IB/hfi1: Add functions to parse BTH/IB
headers").
Link: https://lore.kernel.org/r/20210524041211.9480-2-bvanassche@acm.org
Cc: Don Hiatt <don.hiatt@intel.com>
Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The result of an expression consisting of a single relational operator is
already of the bool type and does not need to be evaluated explicitly.
No functional change.
Link: https://lore.kernel.org/r/20210510120635.3636-1-thunder.leizhen@huawei.com
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
This is significantly bug fixes and general cleanups. The noteworthy new
features are fairly small:
- XRC support for HNS and improves RQ operations
- Bug fixes and updates for hns, mlx5, bnxt_re, hfi1, i40iw, rxe, siw and
qib
- Quite a few general cleanups on spelling, error handling, static checker
detections, etc
- Increase the number of device ports supported beyond 255. High port
count software switches now exist
- Several bug fixes for rtrs
- mlx5 Device Memory support for host controlled atomics
- Report SRQ tables through to rdma-tool
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAmCMMHEACgkQOG33FX4g
mxri3Q//RAgIExCGHebQ9xkptZHVyTLLJMpiMl2cqk3ZVRdDZ7QdiQjIqY2KqlUK
nxBj7EXJeX6rV5a1xqCcOO1gBetB28TSwnCNE2ZqrXP5B59ISW8D052IWza3UkUz
WmHLARxHQlyKBWA4+ZAgfoUGL0NmWA8QPf56t/RK/3/OsuYnGzcnWmmFbt8XKFcH
NtO3KC45mKWDqqG0A0XRrLbEQz/ElO3OuPBqlBKgB3ZgGPzgsOUTOGkm1tCcZ89L
/pvZGB7SklKZdCX8TxdpVGd9h0zHl8pqh1yEzvTA1ypNAYSUId2mvZXluU8J5yJl
FLk7E1IxE5050FNEc7T5uZdUVntulYiqL2558coRI34l5w26pKGjIMxw/nTB8hg8
4ZfBtKVemIG6yzW5Up6iBpK7qWYpvLWVShwYAWhbNsjN7JGzJuh1gJnjbmYgyz2P
RTMU9wjFPLL2wZxg4LDHACVJNBb82j6KKuE+kZWpk11ro7INw9+7YwRuTo7/ezxC
BwXKu8wF4igwSigV55jM+WnGXLhxdC3qmx/2cbtWyLM/PzdRL96tM0RWW5v8/Nv7
teFhkt+f3RVqcfYH5K1qCXy3UFrxG6bxFSvcHHSBx2bdIrqhuTY5FqszAYImeW2j
iHoyIsuSuGu79HQgOzAQZsEyksWi6OYDvA9Q9VBoPP4bJ3DOAa4=
=vsXA
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma updates from Jason Gunthorpe:
"This is significantly bug fixes and general cleanups. The noteworthy
new features are fairly small:
- XRC support for HNS and improves RQ operations
- Bug fixes and updates for hns, mlx5, bnxt_re, hfi1, i40iw, rxe, siw
and qib
- Quite a few general cleanups on spelling, error handling, static
checker detections, etc
- Increase the number of device ports supported beyond 255. High port
count software switches now exist
- Several bug fixes for rtrs
- mlx5 Device Memory support for host controlled atomics
- Report SRQ tables through to rdma-tool"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (145 commits)
IB/qib: Remove redundant assignment to ret
RDMA/nldev: Add copy-on-fork attribute to get sys command
RDMA/bnxt_re: Fix a double free in bnxt_qplib_alloc_res
RDMA/siw: Fix a use after free in siw_alloc_mr
IB/hfi1: Remove redundant variable rcd
RDMA/nldev: Add QP numbers to SRQ information
RDMA/nldev: Return SRQ information
RDMA/restrack: Add support to get resource tracking for SRQ
RDMA/nldev: Return context information
RDMA/core: Add CM to restrack after successful attachment to a device
RDMA/cma: Skip device which doesn't support CM
RDMA/rxe: Fix a bug in rxe_fill_ip_info()
RDMA/mlx5: Expose private query port
RDMA/mlx4: Remove an unused variable
RDMA/mlx5: Fix type assignment for ICM DM
IB/mlx5: Set right RoCE l3 type and roce version while deleting GID
RDMA/i40iw: Fix error unwinding when i40iw_hmc_sd_one fails
RDMA/cxgb4: add missing qpid increment
IB/ipoib: Remove unnecessary struct declaration
RDMA/bnxt_re: Get rid of custom module reference counting
...
The variable rcd is being assigned a value from a calculation however the
variable is never read, so this redundant variable can be removed.
Cleans up the following clang-analyzer warning:
drivers/infiniband/hw/hfi1/affinity.c:986:3: warning: Value stored to
'rcd' is never read [clang-analyzer-deadcode.DeadStores].
Link: https://lore.kernel.org/r/1619346696-46300-1-git-send-email-jiapeng.chong@linux.alibaba.com
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
All other users of the dummy netdevice embed the netdev in other
structures:
init_dummy_netdev(&mal->dummy_dev);
init_dummy_netdev(ð->dummy_dev);
init_dummy_netdev(&ar->napi_dev);
init_dummy_netdev(&irq_grp->napi_ndev);
init_dummy_netdev(&wil->napi_ndev);
init_dummy_netdev(&trans_pcie->napi_dev);
init_dummy_netdev(&dev->napi_dev);
init_dummy_netdev(&bus->mux_dev);
The AIP and VNIC implementation turns that model inside out and used a
kfree() to free what appears to be a netdev struct when in reality, it is
a struct that enbodies the rx state as well as the dummy netdev used to
support napi_poll across disparate receive contexts. The relationship is
infered by the odd allocation:
const int netdev_size = sizeof(*dd->dummy_netdev) +
sizeof(struct hfi1_netdev_priv);
<snip>
dd->dummy_netdev = kcalloc_node(1, netdev_size, GFP_KERNEL, dd->node);
Correct the issue by:
- Correctly naming the alloc and free functions
- Renaming hfi1_netdev_priv to hfi1_netdev_rx
- Replacing dd dummy_netdev with a netdev_rx pointer
- Embedding the net_device in hfi1_netdev_rx
- Moving the init_dummy_netdev to the alloc routine
- Adjusting wrappers to fit the new model
Fixes: 6991abcb993c ("IB/hfi1: Add functions to receive accelerated ipoib packets")
Link: https://lore.kernel.org/r/1617026056-50483-11-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Fix to return a negative error code from the error handling case instead
of 0, as done elsewhere in this function.
Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/20210408113140.103032-1-wangwensheng4@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Remove the unused function sdma_iowait_schedule().
Fixes: 7724105686e7 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/1617026791-89997-1-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
The code currently assumes that the mmu_notifier struct
embedded in mmu_rb_handler only contains two fields.
There are now extra fields:
struct mmu_notifier {
struct hlist_node hlist;
const struct mmu_notifier_ops *ops;
struct mm_struct *mm;
struct rcu_head rcu;
unsigned int users;
};
Given that there in no init for the mmu_notifier, a kzalloc() should
be used to insure that any newly added fields are given a predictable
initial value of zero.
Fixes: 06e0ffa69312 ("IB/hfi1: Re-factor MMU notification code")
Link: https://lore.kernel.org/r/1617026056-50483-9-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Adam Goldman <adam.goldman@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Add traces that were vital in isolating an issue with pq waitlist in
commit fa8dac396863 ("IB/hfi1: Fix another case where pq is left on
waitlist")
Link: https://lore.kernel.org/r/1617026056-50483-8-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
hfi1_ipoib_send() directly calls hfi1_ipoib_send_dma() with no value add.
Fix by renaming hfi1_ipoib_send_dma() to hfi1_ipoib_send().
Link: https://lore.kernel.org/r/1617026056-50483-6-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>