Commit Graph

634 Commits

Author SHA1 Message Date
Wen Gu
2153bd1e3d net/smc: Transfer remaining wait queue entries during fallback
The SMC fallback is incomplete currently. There may be some
wait queue entries remaining in smc socket->wq, which should
be removed to clcsocket->wq during the fallback.

For example, in nginx/wrk benchmark, this issue causes an
all-zeros test result:

server: nginx -g 'daemon off;'
client: smc_run wrk -c 1 -t 1 -d 5 http://11.200.15.93/index.html

  Running 5s test @ http://11.200.15.93/index.html
     1 threads and 1 connections
     Thread Stats   Avg      Stdev     Max   ± Stdev
     	Latency     0.00us    0.00us   0.00us    -nan%
	Req/Sec     0.00      0.00     0.00      -nan%
	0 requests in 5.00s, 0.00B read
     Requests/sec:      0.00
     Transfer/sec:       0.00B

The reason for this all-zeros result is that when wrk used SMC
to replace TCP, it added an eppoll_entry into smc socket->wq
and expected to be notified if epoll events like EPOLL_IN/
EPOLL_OUT occurred on the smc socket.

However, once a fallback occurred, wrk switches to use clcsocket.
Now it is clcsocket->wq instead of smc socket->wq which will
be woken up. The eppoll_entry remaining in smc socket->wq does
not work anymore and wrk stops the test.

This patch fixes this issue by removing remaining wait queue
entries from smc socket->wq to clcsocket->wq during the fallback.

Link: https://www.spinics.net/lists/netdev/msg779769.html
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-15 13:27:02 +00:00
Dust Li
e5d5aadcf3 net/smc: fix sk_refcnt underflow on linkdown and fallback
We got the following WARNING when running ab/nginx
test with RDMA link flapping (up-down-up).
The reason is when smc_sock fallback and at linkdown
happens simultaneously, we may got the following situation:

__smc_lgr_terminate()
 --> smc_conn_kill()
    --> smc_close_active_abort()
           smc_sock->sk_state = SMC_CLOSED
           sock_put(smc_sock)

smc_sock was set to SMC_CLOSED and sock_put() been called
when terminate the link group. But later application call
close() on the socket, then we got:

__smc_release():
    if (smc_sock->fallback)
        smc_sock->sk_state = SMC_CLOSED
        sock_put(smc_sock)

Again we set the smc_sock to CLOSED through it's already
in CLOSED state, and double put the refcnt, so the following
warning happens:

refcount_t: underflow; use-after-free.
WARNING: CPU: 5 PID: 860 at lib/refcount.c:28 refcount_warn_saturate+0x8d/0xf0
Modules linked in:
CPU: 5 PID: 860 Comm: nginx Not tainted 5.10.46+ #403
Hardware name: Alibaba Cloud Alibaba Cloud ECS, BIOS 8c24b4c 04/01/2014
RIP: 0010:refcount_warn_saturate+0x8d/0xf0
Code: 05 5c 1e b5 01 01 e8 52 25 bc ff 0f 0b c3 80 3d 4f 1e b5 01 00 75 ad 48

RSP: 0018:ffffc90000527e50 EFLAGS: 00010286
RAX: 0000000000000026 RBX: ffff8881300df2c0 RCX: 0000000000000027
RDX: 0000000000000000 RSI: ffff88813bd58040 RDI: ffff88813bd58048
RBP: 0000000000000000 R08: 0000000000000003 R09: 0000000000000001
R10: ffff8881300df2c0 R11: ffffc90000527c78 R12: ffff8881300df340
R13: ffff8881300df930 R14: ffff88810b3dad80 R15: ffff8881300df4f8
FS:  00007f739de8fb80(0000) GS:ffff88813bd40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000a01b008 CR3: 0000000111b64003 CR4: 00000000003706e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 smc_release+0x353/0x3f0
 __sock_release+0x3d/0xb0
 sock_close+0x11/0x20
 __fput+0x93/0x230
 task_work_run+0x65/0xa0
 exit_to_user_mode_prepare+0xf9/0x100
 syscall_exit_to_user_mode+0x27/0x190
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

This patch adds check in __smc_release() to make
sure we won't do an extra sock_put() and set the
socket to CLOSED when its already in CLOSED state.

Fixes: 51f1de79ad (net/smc: replace sock_put worker by socket refcounting)
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-10 14:43:59 +00:00
Tony Lu
af1877b6ca net/smc: Print function name in smcr_link_down tracepoint
This makes the output of smcr_link_down tracepoint easier to use and
understand without additional translating function's pointer address.

It prints the function name with offset:

  <idle>-0       [000] ..s.    69.087164: smcr_link_down: lnk=00000000dab41cdc lgr=000000007d5d8e24 state=0 rc=1 dev=mlx5_0 location=smc_wr_tx_tasklet_fn+0x5ef/0x6f0 [smc]

Link: https://lore.kernel.org/netdev/11f17a34-fd35-f2ec-3f20-dd0c34e55fde@linux.ibm.com/
Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-05 10:14:38 +00:00
Tony Lu
a3a0e81b6f net/smc: Introduce tracepoint for smcr link down
SMC-R link down event is important to help us find links' issues, we
should track this event, especially in the single nic mode, which means
upper layer connection would be shut down. Then find out the direct
link-down reason in time, not only increased the counter, also the
location of the code who triggered this event.

Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:39:14 +00:00
Tony Lu
aff3083f10 net/smc: Introduce tracepoints for tx and rx msg
This introduce two tracepoints for smc tx and rx msg to help us
diagnosis issues of data path. These two tracepoitns don't cover the
path of CORK or MSG_MORE in tx, just the top half of data path.

Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:39:14 +00:00
Tony Lu
4826260868 net/smc: Introduce tracepoint for fallback
This introduces tracepoint for smc fallback to TCP, so that we can track
which connection and why it fallbacks, and map the clcsocks' pointer with
/proc/net/tcp to find more details about TCP connections. Compared with
kprobe or other dynamic tracing, tracepoints are stable and easy to use.

Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-01 13:39:14 +00:00
Jakub Kicinski
7df621a3ee Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
include/net/sock.h
  7b50ecfcc6 ("net: Rename ->stream_memory_read to ->sock_is_readable")
  4c1e34c0db ("vsock: Enable y2038 safe timeval for timeout")

drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c
  0daa55d033 ("octeontx2-af: cn10k: debugfs for dumping LMTST map table")
  e77bcdd1f6 ("octeontx2-af: Display all enabled PF VF rsrc_alloc entries.")

Adjacent code addition in both cases, keep both.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-28 10:43:58 -07:00
Wen Gu
f3a3a0fe0b net/smc: Correct spelling mistake to TCPF_SYN_RECV
There should use TCPF_SYN_RECV instead of TCP_SYN_RECV.

Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Tony Lu <tonylu@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-28 13:04:28 +01:00
Tony Lu
c4a146c7cf net/smc: Fix smc_link->llc_testlink_time overflow
The value of llc_testlink_time is set to the value stored in
net->ipv4.sysctl_tcp_keepalive_time when linkgroup init. The value of
sysctl_tcp_keepalive_time is already jiffies, so we don't need to
multiply by HZ, which would cause smc_link->llc_testlink_time overflow,
and test_link send flood.

Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-28 13:04:28 +01:00
Karsten Graul
29397e34c7 net/smc: stop links when their GID is removed
With SMC-Rv2 the GID is an IP address that can be deleted from the
device. When an IB_EVENT_GID_CHANGE event is provided then iterate over
all active links and check if their GID is still defined. Otherwise
stop the affected link.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 14:58:13 +01:00
Karsten Graul
b0539f5edd net/smc: add netlink support for SMC-Rv2
Implement the netlink support for SMC-Rv2 related attributes that are
provided to user space.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 14:58:13 +01:00
Karsten Graul
b4ba4652b3 net/smc: extend LLC layer for SMC-Rv2
Add support for large v2 LLC control messages in smc_llc.c.
The new large work request buffer allows to combine control
messages into one packet that had to be spread over several
packets before.
Add handling of the new v2 LLC messages.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 14:58:13 +01:00
Karsten Graul
8799e310fb net/smc: add v2 support to the work request layer
In the work request layer define one large v2 buffer for each link group
that is used to transmit and receive large LLC control messages.
Add the completion queue handling for this buffer.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 14:58:13 +01:00
Karsten Graul
24fb68111d net/smc: retrieve v2 gid from IB device
In smc_ib.c, scan for RoCE devices that support UDP encapsulation.
Find an eligible device and check that there is a route to the
remote peer.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 14:58:13 +01:00
Karsten Graul
8ade200c26 net/smc: add v2 format of CLC decline message
The CLC decline message changed with SMC-Rv2 and supports up to
4 additional diagnosis codes.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 14:58:13 +01:00
Karsten Graul
e49300a6bf net/smc: add listen processing for SMC-Rv2
Implement the server side of the SMC-Rv2 processing. Process incoming
CLC messages, find eligible devices and check for a valid route to the
remote peer.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 14:58:13 +01:00
Karsten Graul
e5c4744cfb net/smc: add SMC-Rv2 connection establishment
Send a CLC proposal message, and the remote side process this type of
message and determine the target GID. Check for a valid route to this
GID, and complete the connection establishment.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 14:58:12 +01:00
Karsten Graul
42042dbbc2 net/smc: prepare for SMC-Rv2 connection
Prepare the connection establishment with SMC-Rv2. Detect eligible
RoCE cards and indicate all supported SMC modes for the connection.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 14:58:12 +01:00
Karsten Graul
ed990df29f net/smc: save stack space and allocate smc_init_info
The struct smc_init_info grew over time, its time to save space on stack
and allocate this struct dynamically.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-16 14:58:12 +01:00
Jakub Kicinski
e15f5972b8 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
tools/testing/selftests/net/ioam6.sh
  7b1700e009 ("selftests: net: modify IOAM tests for undef bits")
  bf77b1400a ("selftests: net: Test for the IOAM encapsulation with IPv6")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-14 16:50:14 -07:00
Karsten Graul
95f7f3e7dc net/smc: improved fix wait on already cleared link
Commit 8f3d65c166 ("net/smc: fix wait on already cleared link")
introduced link refcounting to avoid waits on already cleared links.
This patch extents and improves the refcounting to cover all
remaining possible cases for this kind of error situation.

Fixes: 15e1b99aad ("net/smc: no WR buffer wait for terminating link group")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-08 17:00:16 +01:00
Jakub Kicinski
2fcd14d0f7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
net/mptcp/protocol.c
  977d293e23 ("mptcp: ensure tx skbs always have the MPTCP ext")
  efe686ffce ("mptcp: ensure tx skbs always have the MPTCP ext")

same patch merged in both trees, keep net-next.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-09-23 11:19:49 -07:00
Karsten Graul
a18cee4791 net/smc: fix 'workqueue leaked lock' in smc_conn_abort_work
The abort_work is scheduled when a connection was detected to be
out-of-sync after a link failure. The work calls smc_conn_kill(),
which calls smc_close_active_abort() and that might end up calling
smc_close_cancel_work().
smc_close_cancel_work() cancels any pending close_work and tx_work but
needs to release the sock_lock before and acquires the sock_lock again
afterwards. So when the sock_lock was NOT acquired before then it may
be held after the abort_work completes. Thats why the sock_lock is
acquired before the call to smc_conn_kill() in __smc_lgr_terminate(),
but this is missing in smc_conn_abort_work().

Fix that by acquiring the sock_lock first and release it after the
call to smc_conn_kill().

Fixes: b286a0651e ("net/smc: handle incoming CDC validation message")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-21 10:54:16 +01:00
Karsten Graul
6c90731980 net/smc: add missing error check in smc_clc_prfx_set()
Coverity stumbled over a missing error check in smc_clc_prfx_set():

*** CID 1475954:  Error handling issues  (CHECKED_RETURN)
/net/smc/smc_clc.c: 233 in smc_clc_prfx_set()
>>>     CID 1475954:  Error handling issues  (CHECKED_RETURN)
>>>     Calling "kernel_getsockname" without checking return value (as is done elsewhere 8 out of 10 times).
233     	kernel_getsockname(clcsock, (struct sockaddr *)&addrs);

Add the return code check in smc_clc_prfx_set().

Fixes: c246d942ea ("net/smc: restructure netinfo for CLC proposal msgs")
Reported-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-21 10:54:16 +01:00
Karsten Graul
3c572145c2 net/smc: add generic netlink support for system EID
With SMC-Dv2 users can configure if the static system EID should be used
during CLC handshake, or if only user EIDs are allowed.
Add generic netlink support to enable and disable the system EID, and
to retrieve the system EID and its current enabled state.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Guvenc Gulce  <guvenc@linux.ibm.com>
Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-14 12:49:10 +01:00
Karsten Graul
11a26c59fc net/smc: keep static copy of system EID
The system EID is retrieved using an registered ISM device each time
when needed. This adds some unnecessary complexity at all places where
the system EID is needed, but no ISM device is at hand.
Simplify the code and save the system EID in a static variable in
smc_ism.c.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Guvenc Gulce  <guvenc@linux.ibm.com>
Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-14 12:49:10 +01:00
Karsten Graul
fa08666255 net/smc: add support for user defined EIDs
SMC-Dv2 allows users to define EIDs which allows to create separate
name spaces enabling users to cluster their SMC-Dv2 connections.
Add support for user defined EIDs and extent the generic netlink
interface so users can add, remove and dump EIDs.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Reviewed-by: Guvenc Gulce  <guvenc@linux.ibm.com>
Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-09-14 12:49:10 +01:00
Jakub Kicinski
f4083a752a Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts:

drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h
  9e26680733 ("bnxt_en: Update firmware call to retrieve TX PTP timestamp")
  9e518f2580 ("bnxt_en: 1PPS functions to configure TSIO pins")
  099fdeda65 ("bnxt_en: Event handler for PPS events")

kernel/bpf/helpers.c
include/linux/bpf-cgroup.h
  a2baf4e8bb ("bpf: Fix potentially incorrect results with bpf_get_local_storage()")
  c7603cfa04 ("bpf: Add ambient BPF runtime context stored in current")

drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
  5957cc557d ("net/mlx5: Set all field of mlx5_irq before inserting it to the xarray")
  2d0b41a376 ("net/mlx5: Refcount mlx5_irq with integer")

MAINTAINERS
  7b637cd52f ("MAINTAINERS: fix Microchip CAN BUS Analyzer Tool entry typo")
  7d901a1e87 ("net: phy: add Maxlinear GPY115/21x/24x driver")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-13 06:41:22 -07:00
Stefan Raspl
67161779a9 net/smc: Allow SMC-D 1MB DMB allocations
Commit a3fe3d01bd ("net/smc: introduce sg-logic for RMBs") introduced
a restriction for RMB allocations as used by SMC-R. However, SMC-D does
not use scatter-gather lists to back its DMBs, yet it was limited by
this restriction, still.
This patch exempts SMC, but limits allocations to the maximum RMB/DMB
size respectively.

Signed-off-by: Stefan Raspl <raspl@linux.ibm.com>
Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-09 10:57:04 +01:00
Guvenc Gulce
64513d269e net/smc: Correct smc link connection counter in case of smc client
SMC clients may be assigned to a different link after the initial
connection between two peers was established. In such a case,
the connection counter was not correctly set.

Update the connection counter correctly when a smc client connection
is assigned to a different smc link.

Fixes: 07d51580ff ("net/smc: Add connection counters for links")
Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Tested-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-09 10:46:59 +01:00
Karsten Graul
8f3d65c166 net/smc: fix wait on already cleared link
There can be a race between the waiters for a tx work request buffer
and the link down processing that finally clears the link. Although
all waiters are woken up before the link is cleared there might be
waiters which did not yet get back control and are still waiting.
This results in an access to a cleared wait queue head.

Fix this by introducing atomic reference counting around the wait calls,
and wait with the link clear processing until all waiters have finished.
Move the work request layer related calls into smc_wr.c and set the
link state to INACTIVE before calling smcr_link_clear() in
smc_llc_srv_add_link().

Fixes: 15e1b99aad ("net/smc: no WR buffer wait for terminating link group")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-09 10:46:59 +01:00
Yajun Deng
1160dfa178 net: Remove redundant if statements
The 'if (dev)' statement already move into dev_{put , hold}, so remove
redundant if statements.

Signed-off-by: Yajun Deng <yajun.deng@linux.dev>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-08-05 13:27:50 +01:00
Alexander Aring
e3ae2365ef net: sock: introduce sk_error_report
This patch introduces a function wrapper to call the sk_error_report
callback. That will prepare to add additional handling whenever
sk_error_report is called, for example to trace socket errors.

Signed-off-by: Alexander Aring <aahringo@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-29 11:28:21 -07:00
Guvenc Gulce
17081633e2 net/smc: Ensure correct state of the socket in send path
When smc_sendmsg() is called before the SMC socket initialization has
completed, smc_tx_sendmsg() will access un-initialized fields of the
SMC socket which results in a null-pointer dereference.
Fix this by checking the socket state first in smc_tx_sendmsg().

Fixes: e0e4b8fa53 ("net/smc: Add SMC statistics support")
Reported-by: syzbot+5dda108b672b54141857@syzkaller.appspotmail.com
Reviewed-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-25 11:53:51 -07:00
Dan Carpenter
1a1100d53f net/smc: Fix ENODATA tests in smc_nl_get_fback_stats()
These functions return negative ENODATA but the minus sign was left out
in the tests.

Fixes: f0dd7bf5e3 ("net/smc: Add netlink support for SMC fallback statistics")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-21 12:16:58 -07:00
Guvenc Gulce
194730a9be net/smc: Make SMC statistics network namespace aware
Make the gathered SMC statistics network namespace aware, for each
namespace collect an own set of statistic information.

Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 12:54:02 -07:00
Guvenc Gulce
f0dd7bf5e3 net/smc: Add netlink support for SMC fallback statistics
Add support to collect more detailed SMC fallback reason statistics and
provide these statistics to user space on the netlink interface.

Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 12:54:02 -07:00
Guvenc Gulce
8c40602b4b net/smc: Add netlink support for SMC statistics
Add the netlink function which collects the statistics information and
delivers it to the userspace.

Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 12:54:02 -07:00
Guvenc Gulce
e0e4b8fa53 net/smc: Add SMC statistics support
Add the ability to collect SMC statistics information. Per-cpu
variables are used to collect the statistic information for better
performance and for reducing concurrency pitfalls. The code that is
collecting statistic data is implemented in macros to increase code
reuse and readability.

Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-16 12:54:02 -07:00
Julian Wiedmann
5e4a43ceb2 net/smc: no need to flush smcd_dev's event_wq before destroying it
destroy_workqueue() already calls drain_workqueue(), which is a stronger
variant of flush_workqueue().

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-03 13:54:49 -07:00
Karsten Graul
f8e0a68bab net/smc: avoid possible duplicate dmb unregistration
smc_lgr_cleanup() calls smcd_unregister_all_dmbs() as part of the link
group termination process. This is a leftover from the times when
smc_lgr_cleanup() scheduled a worker to actually free the link group.
Nowadays smc_lgr_cleanup() directly calls smc_lgr_free() without any
delay so an earlier dmb unregistration is no longer needed.
So remove smcd_unregister_all_dmbs() and the call to it.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-06-03 13:54:49 -07:00
Linus Torvalds
d7c5303fbc Networking fixes for 5.13-rc4, including fixes from bpf, netfilter,
can and wireless trees. Notably including fixes for the recently
 announced "FragAttacks" WiFi vulnerabilities. Rather large batch,
 touching some core parts of the stack, too, but nothing hair-raising.
 
 Current release - regressions:
 
  - tipc: make node link identity publish thread safe
 
  - dsa: felix: re-enable TAS guard band mode
 
  - stmmac: correct clocks enabled in stmmac_vlan_rx_kill_vid()
 
  - stmmac: fix system hang if change mac address after interface ifdown
 
 Current release - new code bugs:
 
  - mptcp: avoid OOB access in setsockopt()
 
  - bpf: Fix nested bpf_bprintf_prepare with more per-cpu buffers
 
  - ethtool: stats: fix a copy-paste error - init correct array size
 
 Previous releases - regressions:
 
  - sched: fix packet stuck problem for lockless qdisc
 
  - net: really orphan skbs tied to closing sk
 
  - mlx4: fix EEPROM dump support
 
  - bpf: fix alu32 const subreg bound tracking on bitwise operations
 
  - bpf: fix mask direction swap upon off reg sign change
 
  - bpf, offload: reorder offload callback 'prepare' in verifier
 
  - stmmac: Fix MAC WoL not working if PHY does not support WoL
 
  - packetmmap: fix only tx timestamp on request
 
  - tipc: skb_linearize the head skb when reassembling msgs
 
 Previous releases - always broken:
 
  - mac80211: address recent "FragAttacks" vulnerabilities
 
  - mac80211: do not accept/forward invalid EAPOL frames
 
  - mptcp: avoid potential error message floods
 
  - bpf, ringbuf: deny reserve of buffers larger than ringbuf to prevent
                  out of buffer writes
 
  - bpf: forbid trampoline attach for functions with variable arguments
 
  - bpf: add deny list of functions to prevent inf recursion of tracing
         programs
 
  - tls splice: check SPLICE_F_NONBLOCK instead of MSG_DONTWAIT
 
  - can: isotp: prevent race between isotp_bind() and isotp_setsockopt()
 
  - netfilter: nft_set_pipapo_avx2: Add irq_fpu_usable() check,
               fallback to non-AVX2 version
 
 Misc:
 
  - bpf: add kconfig knob for disabling unpriv bpf by default
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmCuy2gACgkQMUZtbf5S
 IruE5BAAhihia5EaiV71Bz/Cqr/d+osv5u283riKT8kBft0bWFVFFnT3iweWyR0/
 5X+bB6zmr80Cuqh45ZeYyq+zJtiAAlsbD5hqBIGdMriSWLxciNKjVJRzuEjuqnek
 USMW/LqGyf4NhmLogmQKpx8XcKSG7VYuK7vPrsH8us1dL5vIssceIXn8R9Dzj9NN
 P77K5Z+Oka8XQJgetNLxR3tDAM/92RwIshotkhJbRwgiUvzb+wbnrnSOAZCIPgku
 ydJyOxOklln1Sx07SejgzEl33ri0CkioDPThBWpOn7Mu0JrYKukXPKludoZcRYuJ
 2jNLYfbH0ZS5EkOfk89h7j7MDoAJMUK72M+S1w5DEYz6eH2EjhAq9noZ6E1iQH+U
 9vfoIvQjPh6Zhyk5QeM4dpt0cvR7rSElXkLVxo/x0dSBAi2rIng1bKeCUtv2J689
 CsoD0oghtEzvUTYVxY6iNr15OFGl6KsZv4tVQ709gGA36sDlK8ozGbJH5WReobBl
 f8H2WJlj2tVW5V75yUoio8TumDw34yk/5xlJFzm9GOwkqBrUcqOraHtHdUIsa4qr
 KbELQQ9QVt4zYdLAiWy5BL/QLycp0ibmA1IB8W1bxEVSK1JXzREHzPxv85KOfZkn
 8+vzNHmk2PEZYYsExiEykc5jXKOCPs8L0rJ6p4OverlbpDZcwIg=
 =peMK
 -----END PGP SIGNATURE-----

Merge tag 'net-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Networking fixes for 5.13-rc4, including fixes from bpf, netfilter,
  can and wireless trees. Notably including fixes for the recently
  announced "FragAttacks" WiFi vulnerabilities. Rather large batch,
  touching some core parts of the stack, too, but nothing hair-raising.

  Current release - regressions:

   - tipc: make node link identity publish thread safe

   - dsa: felix: re-enable TAS guard band mode

   - stmmac: correct clocks enabled in stmmac_vlan_rx_kill_vid()

   - stmmac: fix system hang if change mac address after interface
     ifdown

  Current release - new code bugs:

   - mptcp: avoid OOB access in setsockopt()

   - bpf: Fix nested bpf_bprintf_prepare with more per-cpu buffers

   - ethtool: stats: fix a copy-paste error - init correct array size

  Previous releases - regressions:

   - sched: fix packet stuck problem for lockless qdisc

   - net: really orphan skbs tied to closing sk

   - mlx4: fix EEPROM dump support

   - bpf: fix alu32 const subreg bound tracking on bitwise operations

   - bpf: fix mask direction swap upon off reg sign change

   - bpf, offload: reorder offload callback 'prepare' in verifier

   - stmmac: Fix MAC WoL not working if PHY does not support WoL

   - packetmmap: fix only tx timestamp on request

   - tipc: skb_linearize the head skb when reassembling msgs

  Previous releases - always broken:

   - mac80211: address recent "FragAttacks" vulnerabilities

   - mac80211: do not accept/forward invalid EAPOL frames

   - mptcp: avoid potential error message floods

   - bpf, ringbuf: deny reserve of buffers larger than ringbuf to
     prevent out of buffer writes

   - bpf: forbid trampoline attach for functions with variable arguments

   - bpf: add deny list of functions to prevent inf recursion of tracing
     programs

   - tls splice: check SPLICE_F_NONBLOCK instead of MSG_DONTWAIT

   - can: isotp: prevent race between isotp_bind() and
     isotp_setsockopt()

   - netfilter: nft_set_pipapo_avx2: Add irq_fpu_usable() check,
     fallback to non-AVX2 version

  Misc:

   - bpf: add kconfig knob for disabling unpriv bpf by default"

* tag 'net-5.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (172 commits)
  net: phy: Document phydev::dev_flags bits allocation
  mptcp: validate 'id' when stopping the ADD_ADDR retransmit timer
  mptcp: avoid error message on infinite mapping
  mptcp: drop unconditional pr_warn on bad opt
  mptcp: avoid OOB access in setsockopt()
  nfp: update maintainer and mailing list addresses
  net: mvpp2: add buffer header handling in RX
  bnx2x: Fix missing error code in bnx2x_iov_init_one()
  net: zero-initialize tc skb extension on allocation
  net: hns: Fix kernel-doc
  sctp: fix the proc_handler for sysctl encap_port
  sctp: add the missing setting for asoc encap_port
  bpf, selftests: Adjust few selftest result_unpriv outcomes
  bpf: No need to simulate speculative domain for immediates
  bpf: Fix mask direction swap upon off reg sign change
  bpf: Wrap aux data inside bpf_sanitize_info container
  bpf: Fix BPF_LSM kconfig symbol dependency
  selftests/bpf: Add test for l3 use of bpf_redirect_peer
  bpftool: Add sock_release help info for cgroup attach/prog load command
  net: dsa: microchip: enable phy errata workaround on 9567
  ...
2021-05-26 17:44:49 -10:00
Julian Wiedmann
444d7be953 net/smc: remove device from smcd_dev_list after failed device_add()
If the device_add() for a smcd_dev fails, there's no cleanup step that
rolls back the earlier list_add(). The device subsequently gets freed,
and we end up with a corrupted list.

Add some error handling that removes the device from the list.

Fixes: c6ba7c9ba4 ("net/smc: add base infrastructure for SMC-D and ISM")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-17 15:27:22 -07:00
Anirudh Rayabharam
bbeb18f27a net/smc: properly handle workqueue allocation failure
In smcd_alloc_dev(), if alloc_ordered_workqueue() fails, properly catch
it, clean up and return NULL to let the caller know there was a failure.
Move the call to alloc_ordered_workqueue higher in the function in order
to abort earlier without needing to unwind the call to device_initialize().

Cc: Ursula Braun <ubraun@linux.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Anirudh Rayabharam <mail@anirudhrb.com>
Link: https://lore.kernel.org/r/20210503115736.2104747-18-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-13 17:30:36 +02:00
Greg Kroah-Hartman
5369ead83f Revert "net/smc: fix a NULL pointer dereference"
This reverts commit e183d4e414.

Because of recent interactions with developers from @umn.edu, all
commits from them have been recently re-reviewed to ensure if they were
correct or not.

Upon review, this commit was found to be incorrect for the reasons
below, so it must be reverted.  It will be fixed up "correctly" in a
later kernel change.

The original commit causes a memory leak and does not properly fix the
issue it claims to fix.  I will send a follow-on patch to resolve this
properly.

Cc: Kangjie Lu <kjlu@umn.edu>
Cc: Ursula Braun <ubraun@linux.ibm.com>
Cc: David S. Miller <davem@davemloft.net>
Link: https://lore.kernel.org/r/20210503115736.2104747-17-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-05-13 17:30:34 +02:00
Cong Wang
8621436671 smc: disallow TCP_ULP in smc_setsockopt()
syzbot is able to setup kTLS on an SMC socket which coincidentally
uses sk_user_data too. Later, kTLS treats it as psock so triggers a
refcnt warning. The root cause is that smc_setsockopt() simply calls
TCP setsockopt() which includes TCP_ULP. I do not think it makes
sense to setup kTLS on top of SMC sockets, so we should just disallow
this setup.

It is hard to find a commit to blame, but we can apply this patch
since the beginning of TCP_ULP.

Reported-and-tested-by: syzbot+b54a1ce86ba4a623b7f0@syzkaller.appspotmail.com
Fixes: 734942cc4e ("tcp: ULP infrastructure")
Cc: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-05-05 12:52:45 -07:00
Jiapeng Chong
6fd6c483e7 net/smc: Remove redundant assignment to rc
Variable rc is set to zero but this value is never read as it is
overwritten with a new value later on, hence it is a redundant
assignment and can be removed.

Cleans up the following clang-analyzer warning:

net/smc/af_smc.c:1079:3: warning: Value stored to 'rc' is never read
[clang-analyzer-deadcode.DeadStores].

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-27 14:17:50 -07:00
Wan Jiabing
ec7e48ca4b net: smc: Remove repeated struct declaration
struct smc_clc_msg_local is declared twice. One is declared at
301st line. The blew one is not needed. Remove the duplicate.

Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-04-01 15:52:38 -07:00
Guvenc Gulce
8a44653689 net/smc: use memcpy instead of snprintf to avoid out of bounds read
Using snprintf() to convert not null-terminated strings to null
terminated strings may cause out of bounds read in the source string.
Therefore use memcpy() and terminate the target string with a null
afterwards.

Fixes: a3db10efcc ("net/smc: Add support for obtaining SMCR device list")
Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-12 20:22:01 -08:00
Jakub Kicinski
25fe2c9c4c smc: fix out of bound access in smc_nl_get_sys_info()
smc_clc_get_hostname() sets the host pointer to a buffer
which is not NULL-terminated (see smc_clc_init()).

Reported-by: syzbot+f4708c391121cfc58396@syzkaller.appspotmail.com
Fixes: 099b990bd1 ("net/smc: Add support for obtaining system information")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-12 20:22:01 -08:00
Karsten Graul
995433b795 net/smc: fix access to parent of an ib device
The parent of an ib device is used to retrieve the PCI device
attributes. It turns out that there are possible cases when an ib device
has no parent set in the device structure, which may lead to page
faults when trying to access this memory.
Fix that by checking the parent pointer and consolidate the pci device
specific processing in a new function.

Fixes: a3db10efcc ("net/smc: Add support for obtaining SMCR device list")
Reported-by: syzbot+600fef7c414ee7e2d71b@syzkaller.appspotmail.com
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Link: https://lore.kernel.org/r/20201215091058.49354-2-kgraul@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-16 13:33:47 -08:00
Guvenc Gulce
a3db10efcc net/smc: Add support for obtaining SMCR device list
Deliver SMCR device information via netlink based
diagnostic interface.

Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01 17:56:13 -08:00
Guvenc Gulce
aaf95523d5 net/smc: Add support for obtaining SMCD device list
Deliver SMCD device information via netlink based
diagnostic interface.

Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01 17:56:13 -08:00
Guvenc Gulce
8f9dde4bf2 net/smc: Add SMC-D Linkgroup diagnostic support
Deliver SMCD Linkgroup information via netlink based
diagnostic interface.

Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01 17:56:13 -08:00
Guvenc Gulce
5a7e09d58f net/smc: Introduce SMCR get link command
Introduce get link command which loops through
all available links of all available link groups. It
uses the SMC-R linkgroup list as entry point, not
the socket list, which makes linkgroup diagnosis
possible, in case linkgroup does not contain active
connections anymore.

Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01 17:56:13 -08:00
Guvenc Gulce
e9b8c845cb net/smc: Introduce SMCR get linkgroup command
Introduce get linkgroup command which loops through
all available SMCR linkgroups. It uses the SMC-R linkgroup
list as entry point, not the socket list, which makes
linkgroup diagnosis possible, in case linkgroup does not
contain active connections anymore.

Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01 17:56:13 -08:00
Guvenc Gulce
099b990bd1 net/smc: Add support for obtaining system information
Add new netlink command to obtain system information
of the smc module.

Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01 17:56:13 -08:00
Guvenc Gulce
e8372d9d21 net/smc: Introduce generic netlink interface for diagnostic purposes
Introduce generic netlink interface infrastructure to expose
the diagnostic information regarding smc linkgroups, links and devices.

Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01 17:56:12 -08:00
Guvenc Gulce
49407ae2bc net/smc: Refactor smc ism v2 capability handling
Encapsulate the smc ism v2 capability boolean value
in a function for better information hiding.

Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01 17:56:12 -08:00
Guvenc Gulce
6443b2f60e net/smc: Add diagnostic information to link structure
During link creation add net-device ifindex and ib-device
name to link structure. This is needed for diagnostic purposes.

When diagnostic information is gathered, we need to traverse
device, linkgroup and link structures, to be able to do that
we need to hold a spinlock for the linkgroup list, without this
diagnostic information in link structure, another device list
mutex holding would be necessary to dereference the device
pointer in the link structure which would be impossible when
holding a spinlock already.

Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01 17:56:12 -08:00
Guvenc Gulce
3d453f53c7 net/smc: Add diagnostic information to smc ib-device
During smc ib-device creation, add network device ifindex to smc
ib-device structure. Register for netdevice changes and update ib-device
accordingly. This is needed for diagnostic purposes.

Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01 17:56:12 -08:00
Guvenc Gulce
ddc992866f net/smc: Add link counters for IB device ports
Add link counters to the structure of the smc ib device, one counter per
ib port. Increase/decrease the counters as needed in the corresponding
routines.

Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01 17:56:12 -08:00
Guvenc Gulce
07d51580ff net/smc: Add connection counters for links
Add connection counters to the structure of the link.
Increase/decrease the counters as needed in the corresponding
routines.

Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01 17:56:12 -08:00
Guvenc Gulce
8b2f0f44f0 net/smc: Use active link of the connection
Use active link of the connection directly and not
via linkgroup array structure when obtaining link
data of the connection.

Signed-off-by: Guvenc Gulce <guvenc@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01 17:56:12 -08:00
Karsten Graul
8cf3f3e423 net/smc: use helper smc_conn_abort() in listen processing
The helper smc_connect_abort() can be used by the listen processing
functions, too. And rename this helper to smc_conn_abort() to make the
purpose clearer.
No functional change.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01 17:56:12 -08:00
Jakub Kicinski
56495a2442 Merge https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-19 19:08:46 -08:00
Karsten Graul
41a0be3f8f net/smc: fix direct access to ib_gid_addr->ndev in smc_ib_determine_gid()
Sparse complaints 3 times about:
net/smc/smc_ib.c:203:52: warning: incorrect type in argument 1 (different address spaces)
net/smc/smc_ib.c:203:52:    expected struct net_device const *dev
net/smc/smc_ib.c:203:52:    got struct net_device [noderef] __rcu *const ndev

Fix that by using the existing and validated ndev variable instead of
accessing attr->ndev directly.

Fixes: 5102eca903 ("net/smc: Use rdma_read_gid_l2_fields to L2 fields")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-19 10:59:19 -08:00
Karsten Graul
0530bd6e6a net/smc: fix matching of existing link groups
With the multi-subnet support of SMC-Dv2 the match for existing link
groups should not include the vlanid of the network device.
Set ini->smcd_version accordingly before the call to smc_conn_create()
and use this value in smc_conn_create() to skip the vlanid check.

Fixes: 5c21c4ccaf ("net/smc: determine accepted ISM devices")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-19 10:59:19 -08:00
Allen Pais
fcb8e3a328 net: smc: convert tasklets to use new tasklet_setup() API
In preparation for unconditionally passing the
struct tasklet_struct pointer to all tasklet
callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.

Signed-off-by: Romain Perier <romain.perier@gmail.com>
Signed-off-by: Allen Pais <apais@linux.microsoft.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-07 10:41:15 -08:00
Jakub Kicinski
ae0d0bb29b Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-06 17:33:38 -08:00
Karsten Graul
3752404a68 net/smc: improve return codes for SMC-Dv2
To allow better problem diagnosis the return codes for SMC-Dv2 are
improved by this patch. A few more CLC DECLINE codes are defined and
sent to the peer when an SMC connection cannot be established.
There are now multiple SMC variations that are offered by the client and
the server may encounter problems to initialize all of them.
Because only one diagnosis code can be sent to the client the decision
was made to send the first code that was encountered. Because the server
tries the variations in the order of importance (SMC-Dv2, SMC-D, SMC-R)
this makes sure that the diagnosis code of the most important variation
is sent.

v2: initialize rc in smc_listen_v2_check().

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Link: https://lore.kernel.org/r/20201031181938.69903-1-kgraul@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-31 15:44:13 -07:00
Linus Torvalds
53760f9b74 flexible-array member conversion patches for 5.10-rc2
Hi Linus,
 
 Please, pull the following patches that replace zero-length arrays with
 flexible-array members.
 
 Thanks
 --
 Gustavo
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEkmRahXBSurMIg1YvRwW0y0cG2zEFAl+cjRUACgkQRwW0y0cG
 2zGWAhAAjUfTsAmXWhKNaWFSCYR0Q822puTUWOKfiBd+jjGaO04luTtr2gjv2Dkb
 Vgad8H4N8oZU79xfh5JZ5PUyScaso8wE6ZJTh2PLKXpKmNd213f5x/pIt78CCDTa
 Y1L/eR41mmveTL3VNS3sf6WaZpT9owxJKGIY8JgdiOmSjxJQpX5zdaC1KYso4eXr
 lIXIRo9VLEmVLhhHhZi+QmX6+aQ05E1D9K0ENe4/uEnRsV525W78iwZ4fYeLzr+A
 krEOdgx6sPgzajPYnHoayrrcKNKxD5YY1SWuVSm2tqYYIhlRoK3f5xgLOd10RiHE
 YMgx8aWzGmGJwoUhgp1bo/l9EZ7O8OWRqM/GOP4x6Wgjdhqw2x5jgskmhsKNGEXu
 /BlbS+qL5aUrMCxhvNbApuZW6xBiBbva76MH3vU9vFhZbVz1CHLQdGI0tfxggYWS
 jc2UPgoxL9OQlf3jSc+gK7RMFhBGNWn2Aiy8GQas3BxPYXuYPvwOj+irDOG/qZ9D
 VZ5swUw4+th+DsF5K53mEFeLv0fONMgL9Ka5bNR6+k6HG0WNLYYVOiet3xYUDo1f
 eZbMZthfc+QW7R8cwG0WuFk6rC6mLqE+A9nQuLZoJD+VMuJd4pwW9+6EW8nDX08w
 FS4/o92xUFJfOCgaLRS61FSAuSmFENieN+yoKMK/Uf6PJVdNMb4=
 =vyu3
 -----END PGP SIGNATURE-----

Merge tag 'flexible-array-conversions-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux

Pull more flexible-array member conversions from Gustavo A. R. Silva:
 "Replace zero-length arrays with flexible-array members"

* tag 'flexible-array-conversions-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
  printk: ringbuffer: Replace zero-length array with flexible-array member
  net/smc: Replace zero-length array with flexible-array member
  net/mlx5: Replace zero-length array with flexible-array member
  mei: hw: Replace zero-length array with flexible-array member
  gve: Replace zero-length array with flexible-array member
  Bluetooth: btintel: Replace zero-length array with flexible-array member
  scsi: target: tcmu: Replace zero-length array with flexible-array member
  ima: Replace zero-length array with flexible-array member
  enetc: Replace zero-length array with flexible-array member
  fs: Replace zero-length array with flexible-array member
  Bluetooth: Replace zero-length array with flexible-array member
  params: Replace zero-length array with flexible-array member
  tracepoint: Replace zero-length array with flexible-array member
  platform/chrome: cros_ec_proto: Replace zero-length array with flexible-array member
  platform/chrome: cros_ec_commands: Replace zero-length array with flexible-array member
  mailbox: zynqmp-ipi-message: Replace zero-length array with flexible-array member
  dmaengine: ti-cppi5: Replace zero-length array with flexible-array member
2020-10-31 14:31:28 -07:00
Gustavo A. R. Silva
7206d58a3a net/smc: Replace zero-length array with flexible-array member
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9/process/deprecated.html#zero-length-and-one-element-arrays

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-10-30 16:57:42 -05:00
Karsten Graul
96d6fded95 net/smc: fix suppressed return code
The patch that repaired the invalid return code in smcd_new_buf_create()
missed to take care of errno ENOSPC which has a special meaning that no
more DMBEs can be registered on the device. Fix that by keeping this
errno value during the translation of the return code.

Fixes: 6b1bbf94ab ("net/smc: fix invalid return code in smcd_new_buf_create()")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-26 16:29:14 -07:00
Karsten Graul
4a9baf45fd net/smc: fix null pointer dereference in smc_listen_decline()
smc_listen_work() calls smc_listen_decline() on label out_decl,
providing the ini pointer variable. But this pointer can still be null
when the label out_decl is reached.
Fix this by checking the ini variable in smc_listen_work() and call
smc_listen_decline() with the result directly.

Fixes: a7c9c5f4af ("net/smc: CLC accept / confirm V2")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-26 16:29:14 -07:00
Jakub Kicinski
2295cddf99 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Minor conflicts in net/mptcp/protocol.h and
tools/testing/selftests/net/Makefile.

In both cases code was added on both sides in the same place
so just keep both.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-15 12:43:21 -07:00
Karsten Graul
6b1bbf94ab net/smc: fix invalid return code in smcd_new_buf_create()
smc_ism_register_dmb() returns error codes set by the ISM driver which
are not guaranteed to be negative or in the errno range. Such values
would not be handled by ERR_PTR() and finally the return code will be
used as a memory address.
Fix that by using a valid negative errno value with ERR_PTR().

Fixes: 72b7f6c487 ("net/smc: unique reason code for exceeded max dmb count")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-15 09:54:43 -07:00
Karsten Graul
ef12ad4588 net/smc: fix valid DMBE buffer sizes
The SMCD_DMBE_SIZES should include all valid DMBE buffer sizes, so the
correct value is 6 which means 1MB. With 7 the registration of an ISM
buffer would always fail because of the invalid size requested.
Fix that and set the value to 6.

Fixes: c6ba7c9ba4 ("net/smc: add base infrastructure for SMC-D and ISM")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-15 09:54:43 -07:00
Karsten Graul
d535ca1367 net/smc: fix use-after-free of delayed events
When a delayed event is enqueued then the event worker will send this
event the next time it is running and no other flow is currently
active. The event handler is called for the delayed event, and the
pointer to the event keeps set in lgr->delayed_event. This pointer is
cleared later in the processing by smc_llc_flow_start().
This can lead to a use-after-free condition when the processing does not
reach smc_llc_flow_start(), but frees the event because of an error
situation. Then the delayed_event pointer is still set but the event is
freed.
Fix this by always clearing the delayed event pointer when the event is
provided to the event handler for processing, and remove the code to
clear it in smc_llc_flow_start().

Fixes: 555da9af82 ("net/smc: add event-based llc_flow framework")
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-15 09:54:43 -07:00
Pujin Shi
16cb365380 net: smc: fix missing brace warning for old compilers
For older versions of gcc, the array = {0}; will cause warnings:

net/smc/smc_llc.c: In function 'smc_llc_add_link_local':
net/smc/smc_llc.c:1212:9: warning: missing braces around initializer [-Wmissing-braces]
  struct smc_llc_msg_add_link add_llc = {0};
         ^
net/smc/smc_llc.c:1212:9: warning: (near initialization for 'add_llc.hd') [-Wmissing-braces]
net/smc/smc_llc.c: In function 'smc_llc_srv_delete_link_local':
net/smc/smc_llc.c:1245:9: warning: missing braces around initializer [-Wmissing-braces]
  struct smc_llc_msg_del_link del_llc = {0};
         ^
net/smc/smc_llc.c:1245:9: warning: (near initialization for 'del_llc.hd') [-Wmissing-braces]

2 warnings generated

Fixes: 4dadd151b2 ("net/smc: enqueue local LLC messages")
Signed-off-by: Pujin Shi <shipujin.t@gmail.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-10 10:38:40 -07:00
Pujin Shi
7e94e46c16 net: smc: fix missing brace warning for old compilers
For older versions of gcc, the array = {0}; will cause warnings:

net/smc/smc_llc.c: In function 'smc_llc_send_link_delete_all':
net/smc/smc_llc.c:1317:9: warning: missing braces around initializer [-Wmissing-braces]
  struct smc_llc_msg_del_link delllc = {0};
         ^
net/smc/smc_llc.c:1317:9: warning: (near initialization for 'delllc.hd') [-Wmissing-braces]

1 warnings generated

Fixes: f3811fd7bc ("net/smc: send DELETE_LINK, ALL message and wait for send to complete")
Signed-off-by: Pujin Shi <shipujin.t@gmail.com>
Acked-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-10 10:38:40 -07:00
Karsten Graul
f29fa00399 net/smc: restore smcd_version when all ISM V2 devices failed to init
Field ini->smcd_version is set to SMC_V2 before calling
smc_listen_ism_init(). This clears the V1 bit that may be set. When all
matching ISM V2 devices fail to initialize then the smcd_version field
needs to get restored to allow any possible V1 devices to initialize.
And be consistent, always go to the not_found label when no device was
found.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-09 18:15:47 -07:00
Karsten Graul
9047a617dc net/smc: cleanup buffer usage in smc_listen_work()
coccinelle informs about
net/smc/af_smc.c:1770:10-11: WARNING: opportunity for kzfree/kvfree_sensitive

Its not that kzfree() would help here, the memset() is done to prepare
the buffer for another socket receive.
Fix that warning message by reordering the calls, while at it eliminate
the unneeded variable cclc2 and use sizeof(*buf) as above in the same
function. No functional changes.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-09 18:15:47 -07:00
Karsten Graul
c60a2cefb3 net/smc: consolidate unlocking in same function
Static code checkers warn of inconsistent returns because the lgr mutex
is locked in one function and unlocked in a function called by the
locking function:
net/smc/af_smc.c:823 smc_connect_rdma() warn: inconsistent returns 'smc_client_lgr_pending'.
net/smc/af_smc.c:897 smc_connect_ism() warn: inconsistent returns 'smc_server_lgr_pending'.

Make the code consistent by doing the unlock in the same function that
fetches the lock. No functional changes.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-10-09 18:15:47 -07:00
Karsten Graul
fd6ebb6fb2 net/smc: use an array to check fields in system EID
The check for old hardware versions that did not have SMCDv2 support was
using suspicious pointer magic. Address the fields using an array.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-03 17:04:48 -07:00
Karsten Graul
839d696ffb net/smc: send ISM devices with unique chid in CLC proposal
When building a CLC proposal message then the list of ISM devices does
not need to contain multiple devices that have the same chid value,
all these devices use the same function at the end.
Improve smc_find_ism_v2_device_clnt() to collect only ISM devices that
have unique chid values.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-10-03 17:04:48 -07:00
Ursula Braun
e8d726c8e8 net/smc: CLC decline - V2 enhancements
This patch covers the small SMCD version 2 changes for CLC decline.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:19:03 -07:00
Ursula Braun
b81a5eb789 net/smc: introduce CLC first contact extension
SMC Version 2 defines a first contact extension for CLC accept
and CLC confirm. This patch covers sending and receiving of the
CLC first contact extension.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:19:03 -07:00
Ursula Braun
a7c9c5f4af net/smc: CLC accept / confirm V2
The new format of SMCD V2 CLC accept and confirm is introduced,
and building and checking of SMCD V2 CLC accepts / confirms is adapted
accordingly.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:19:03 -07:00
Ursula Braun
5c21c4ccaf net/smc: determine accepted ISM devices
SMCD Version 2 allows to propose up to 8 additional ISM devices
offered to the peer as candidates for SMCD communication.
This patch covers the server side, i.e. selection of an ISM device
matching one of the proposed ISM devices, that will be used for
CLC accept

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:19:03 -07:00
Ursula Braun
8c3dca341a net/smc: build and send V2 CLC proposal
The new format of an SMCD V2 CLC proposal is introduced, and
building and checking of SMCD V2 CLC proposals is adapted
accordingly.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:19:03 -07:00
Ursula Braun
d70bf4f7a9 net/smc: determine proposed ISM devices
SMCD Version 2 allows to propose up to 8 additional ISM devices
offered to the peer as candidates for SMCD communication.
This patch covers determination of the ISM devices to be proposed.
ISM devices without PNETID are preferred, since ISM devices with
PNETID are a V1 leftover and will disappear over the time.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:19:03 -07:00
Ursula Braun
e888a2e833 net/smc: introduce list of pnetids for Ethernet devices
SMCD version 2 allows usage of ISM devices with hardware PNETID
only, if an Ethernet net_device exists with the same hardware PNETID.
This requires to maintain a list of pnetids belonging to
Ethernet net_devices, which is covered by this patch.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:19:03 -07:00
Ursula Braun
8caaccf521 net/smc: introduce CHID callback for ISM devices
With SMCD version 2 the CHIDs of ISM devices are needed for the
CLC handshake.
This patch provides the new callback to retrieve the CHID of an
ISM device.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:19:03 -07:00
Ursula Braun
201091ebb2 net/smc: introduce System Enterprise ID (SEID)
SMCD version 2 defines a System Enterprise ID (short SEID).
This patch contains the SEID creation and adds the callback to
retrieve the created SEID.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:19:02 -07:00
Ursula Braun
3fc6493761 net/smc: prepare for more proposed ISM devices
SMCD Version 2 allows proposing of up to 8 ISM devices in addition
to the native ISM device of SMCD Version 1.
This patch prepares the struct smc_init_info to deal with these
additional 8 ISM devices.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:19:02 -07:00
Ursula Braun
e15c6c46de net/smc: split CLC confirm/accept data to be sent
When sending CLC confirm and CLC accept, separate the trailing
part of the message from the initial part (to be prepared for
future first contact extension).

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:19:02 -07:00
Ursula Braun
7affc80982 net/smc: separate find device functions
This patch provides better separation of device determinations
in function smc_listen_work(). No functional change.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:19:02 -07:00
Ursula Braun
f1eb02f952 net/smc: CLC header fields renaming
SMCD version 2 defines 2 more bits in the CLC header to specify
version 2 types. This patch prepares better naming of the CLC
header fields. No functional change.

Signed-off-by: Ursula Braun <ubraun@linux.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:19:02 -07:00
Karsten Graul
a304e29a24 net/smc: remove constant and introduce helper to check for a pnet id
Use the existing symbol _S instead of SMC_ASCII_BLANK, and introduce a
helper to check if a pnetid is set. No functional change.

Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28 15:19:02 -07:00