1187364 Commits

Author SHA1 Message Date
Yang Li
e7c5433c5a tools: ynl: Remove duplicated include in handshake-user.c
./tools/net/ynl/generated/handshake-user.c: stdlib.h is included more than once.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5464
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09 11:36:41 +01:00
David S. Miller
56f7783ba4 Merge branch 'broadcom-phy-led-brightness'
Florian Fainelli says:

====================
LED brightness support for Broadcom PHYs

This patch series adds support for controlling the LED brightness on
Broadcom PHYs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09 10:38:44 +01:00
Florian Fainelli
bd5736e146 net: phy: broadcom: Add support for setting LED brightness
Broadcom PHYs have two LEDs selector registers which allow us to control
the LED assignment, including how to turn them on/off.

Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09 10:38:43 +01:00
Florian Fainelli
57fd7d59b1 net: phy: broadcom: Rename LED registers
These registers are common to most PHYs and are not specific to the
BCM5482, renamed the constants accordingly, no functional change.

Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09 10:38:43 +01:00
David S. Miller
54a8c43f3b Merge branch 'net-ncsi-refactoring-for-GMA-cmd'
Ivan Mikhaylov says:

====================
net/ncsi: refactoring for GMA command

Make one GMA function for all manufacturers, change ndo_set_mac_address
to dev_set_mac_address for notifiying net layer about MAC change which
ndo_set_mac_address doesn't do.

Changes from v1:
	1. delete ftgmac100.txt changes about mac-address-increment
	2. add convert to yaml from ftgmac100.txt
	3. add mac-address-increment option for ethernet-controller.yaml

Changes from v2:
	1. remove DT changes from series, will be done in another one
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09 10:32:51 +01:00
Ivan Mikhaylov
790071347a net/ncsi: change from ndo_set_mac_address to dev_set_mac_address
Change ndo_set_mac_address to dev_set_mac_address because
dev_set_mac_address provides a way to notify network layer about MAC
change. In other case, services may not aware about MAC change and keep
using old one which set from network adapter driver.

As example, DHCP client from systemd do not update MAC address without
notification from net subsystem which leads to the problem with acquiring
the right address from DHCP server.

Fixes: cb10c7c0dfd9e ("net/ncsi: Add NCSI Broadcom OEM command")
Cc: stable@vger.kernel.org # v6.0+ 2f38e84 net/ncsi: make one oem_gma function for all mfr id
Signed-off-by: Paul Fertser <fercerpav@gmail.com>
Signed-off-by: Ivan Mikhaylov <fr0st61te@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09 10:32:51 +01:00
Ivan Mikhaylov
74b449b98d net/ncsi: make one oem_gma function for all mfr id
Make the one Get Mac Address function for all manufacturers and change
this call in handlers accordingly.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Ivan Mikhaylov <fr0st61te@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09 10:32:51 +01:00
Foster Snowhill
0c6e9d32ef usbnet: ipheth: update Kconfig description
This module has for a long time not been limited to iPhone <= 3GS.
Update description to match the actual state of the driver.

Remove dead link from 2010, instead reference an existing userspace
iOS device pairing implementation as part of libimobiledevice.

Signed-off-by: Foster Snowhill <forst@pen.gy>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09 10:26:57 +01:00
Foster Snowhill
a2d274c62e usbnet: ipheth: add CDC NCM support
Recent iOS releases support CDC NCM encapsulation on RX. This mode is
the default on macOS and Windows. In this mode, an iOS device may include
one or more Ethernet frames inside a single URB.

Freshly booted iOS devices start in legacy mode, but are put into
NCM mode by the official Apple driver. When reconnecting such a device
from a macOS/Windows machine to a Linux host, the device stays in
NCM mode, making it unusable with the legacy ipheth driver code.

To correctly support such a device, the driver has to either support
the NCM mode too, or put the device back into legacy mode.

To match the behaviour of the macOS/Windows driver, and since there
is no documented control command to revert to legacy mode, implement
NCM support. The device is attempted to be put into NCM mode by default,
and falls back to legacy mode if the attempt fails.

Signed-off-by: Foster Snowhill <forst@pen.gy>
Tested-by: Georgi Valkov <gvalkov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09 10:26:57 +01:00
Foster Snowhill
3e65efcca8 usbnet: ipheth: transmit URBs without trailing padding
The behaviour of the official iOS tethering driver on macOS is to not
transmit any trailing padding at the end of URBs. This is applicable
to both NCM and legacy modes, including older devices.

Adapt the driver to not include trailing padding in TX URBs, matching
the behaviour of the official macOS driver.

Signed-off-by: Foster Snowhill <forst@pen.gy>
Tested-by: Georgi Valkov <gvalkov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09 10:26:57 +01:00
Georgi Valkov
2203718c2f usbnet: ipheth: fix risk of NULL pointer deallocation
The cleanup precedure in ipheth_probe will attempt to free a
NULL pointer in dev->ctrl_buf if the memory allocation for
this buffer is not successful. While kfree ignores NULL pointers,
and the existing code is safe, it is a better design to rearrange
the goto labels and avoid this.

Signed-off-by: Georgi Valkov <gvalkov@gmail.com>
Signed-off-by: Foster Snowhill <forst@pen.gy>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-06-09 10:26:57 +01:00
Jakub Kicinski
fd5f4d7da2 Merge branch 'splice-net-rewrite-splice-to-socket-fix-splice_f_more-and-handle-msg_splice_pages-in-af_tls'
David Howells says:

====================
splice, net: Rewrite splice-to-socket, fix SPLICE_F_MORE and handle MSG_SPLICE_PAGES in AF_TLS

Here are patches to do the following:

 (1) Block MSG_SENDPAGE_* flags from leaking into ->sendmsg() from
     userspace, whilst allowing splice_to_socket() to pass them in.

 (2) Allow MSG_SPLICE_PAGES to be passed into tls_*_sendmsg().  Until
     support is added, it will be ignored and a splice-driven sendmsg()
     will be treated like a normal sendmsg().  TCP, UDP, AF_UNIX and
     Chelsio-TLS already handle the flag in net-next.

 (3) Replace a chain of functions to splice-to-sendpage with a single
     function to splice via sendmsg() with MSG_SPLICE_PAGES.  This allows a
     bunch of pages to be spliced from a pipe in a single call using a
     bio_vec[] and pushes the main processing loop down into the bowels of
     the protocol driver rather than repeatedly calling in with a page at a
     time.

 (4) Provide a ->splice_eof() op[2] that allows splice to signal to its
     output that the input observed a premature EOF and that the caller
     didn't flag SPLICE_F_MORE, thereby allowing a corked socket to be
     flushed.  This attempts to maintain the current behaviour.  It is also
     not called if we didn't manage to read any data and so didn't called
     the actor function.

     This needs routing though several layers to get it down to the network
     protocol.

     [!] Note that I chose not to pass in any flags - I'm not sure it's
     	 particularly useful to pass in the splice flags; I also elected
     	 not to return any error code - though we might actually want to do
     	 that.

 (5) Provide tls_{device,sw}_splice_eof() to flush a pending TLS record if
     there is one.

 (6) Provide splice_eof() for UDP, TCP, Chelsio-TLS and AF_KCM.  AF_UNIX
     doesn't seem to pay attention to the MSG_MORE or MSG_SENDPAGE_NOTLAST
     flags.

 (7) Alter the behaviour of sendfile() and fix SPLICE_F_MORE/MSG_MORE
     signalling[1] such SPLICE_F_MORE is always signalled until we have
     read sufficient data to finish the request.  If we get a zero-length
     before we've managed to splice sufficient data, we now leave the
     socket expecting more data and leave it to userspace to deal with it.

 (8) Make AF_TLS handle the MSG_SPLICE_PAGES internal sendmsg flag.
     MSG_SPLICE_PAGES is an internal hint that tells the protocol that it
     should splice the pages supplied if it can.  Its sendpage
     implementations are then turned into wrappers around that.

Link: https://lore.kernel.org/r/499791.1685485603@warthog.procyon.org.uk/ [1]
Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/ [2]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=51c78a4d532efe9543a4df019ff405f05c6157f6 # part 1
Link: https://lore.kernel.org/r/20230524153311.3625329-1-dhowells@redhat.com/ # v1
====================

Link: https://lore.kernel.org/r/20230607181920.2294972-1-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:33 -07:00
David Howells
3dc8976c7a tls/device: Convert tls_device_sendpage() to use MSG_SPLICE_PAGES
Convert tls_device_sendpage() to use sendmsg() with MSG_SPLICE_PAGES rather
than directly splicing in the pages itself.  With that, the tls_iter_offset
union is no longer necessary and can be replaced with an iov_iter pointer
and the zc_page argument to tls_push_data() can also be removed.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:31 -07:00
David Howells
24763c9c09 tls/device: Support MSG_SPLICE_PAGES
Make TLS's device sendmsg() support MSG_SPLICE_PAGES.  This causes pages to
be spliced from the source iterator if possible.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:31 -07:00
David Howells
45e5be844a tls/sw: Convert tls_sw_sendpage() to use MSG_SPLICE_PAGES
Convert tls_sw_sendpage() and tls_sw_sendpage_locked() to use sendmsg()
with MSG_SPLICE_PAGES rather than directly splicing in the pages itself.

[!] Note that tls_sw_sendpage_locked() appears to have the wrong locking
    upstream.  I think the caller will only hold the socket lock, but it
    should hold tls_ctx->tx_lock too.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:31 -07:00
David Howells
fe1e81d4f7 tls/sw: Support MSG_SPLICE_PAGES
Make TLS's sendmsg() support MSG_SPLICE_PAGES.  This causes pages to be
spliced from the source iterator if possible.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:31 -07:00
David Howells
219d92056b splice, net: Fix SPLICE_F_MORE signalling in splice_direct_to_actor()
splice_direct_to_actor() doesn't manage SPLICE_F_MORE correctly[1] - and,
as a result, it incorrectly signals/fails to signal MSG_MORE when splicing
to a socket.  The problem I'm seeing happens when a short splice occurs
because we got a short read due to hitting the EOF on a file: as the length
read (read_len) is less than the remaining size to be spliced (len),
SPLICE_F_MORE (and thus MSG_MORE) is set.

The issue is that, for the moment, we have no way to know *why* the short
read occurred and so can't make a good decision on whether we *should* keep
MSG_MORE set.

MSG_SENDPAGE_NOTLAST was added to work around this, but that is also set
incorrectly under some circumstances - for example if a short read fills a
single pipe_buffer, but the next read would return more (seqfile can do
this).

This was observed with the multi_chunk_sendfile tests in the tls kselftest
program.  Some of those tests would hang and time out when the last chunk
of file was less than the sendfile request size:

	build/kselftest/net/tls -r tls.12_aes_gcm.multi_chunk_sendfile

This has been observed before[2] and worked around in AF_TLS[3].

Fix this by making splice_direct_to_actor() always signal SPLICE_F_MORE if
we haven't yet hit the requested operation size.  SPLICE_F_MORE remains
signalled if the user passed it in to splice() but otherwise gets cleared
when we've read sufficient data to fulfill the request.

If, however, we get a premature EOF from ->splice_read(), have sent at
least one byte and SPLICE_F_MORE was not set by the caller, ->splice_eof()
will be invoked.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Linus Torvalds <torvalds@linux-foundation.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Matthew Wilcox <willy@infradead.org>
cc: Jan Kara <jack@suse.cz>
cc: Jeff Layton <jlayton@kernel.org>
cc: David Hildenbrand <david@redhat.com>
cc: Christian Brauner <brauner@kernel.org>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: linux-mm@kvack.org

Link: https://lore.kernel.org/r/499791.1685485603@warthog.procyon.org.uk/ [1]
Link: https://lore.kernel.org/r/1591392508-14592-1-git-send-email-pooja.trivedi@stackpath.com/ [2]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=d452d48b9f8b1a7f8152d33ef52cfd7fe1735b0a [3]
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:31 -07:00
David Howells
951ace9951 kcm: Use splice_eof() to flush
Allow splice to undo the effects of MSG_MORE after prematurely ending a
splice/sendfile due to getting an EOF condition (->splice_read() returned
0) after splice had called sendmsg() with MSG_MORE set when the user didn't
set MSG_MORE.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Tom Herbert <tom@herbertland.com>
cc: Tom Herbert <tom@quantonium.net>
cc: Cong Wang <cong.wang@bytedance.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:31 -07:00
David Howells
c289a1601a chelsio/chtls: Use splice_eof() to flush
Allow splice to end a Chelsio TLS record after prematurely ending a
splice/sendfile due to getting an EOF condition (->splice_read() returned
0) after splice had called sendmsg() with MSG_MORE set when the user didn't
set MSG_MORE.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Ayush Sawal <ayush.sawal@chelsio.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:31 -07:00
David Howells
1d7e4538a5 ipv4, ipv6: Use splice_eof() to flush
Allow splice to undo the effects of MSG_MORE after prematurely ending a
splice/sendfile due to getting an EOF condition (->splice_read() returned
0) after splice had called sendmsg() with MSG_MORE set when the user didn't
set MSG_MORE.

For UDP, a pending packet will not be emitted if the socket is closed
before it is flushed; with this change, it be flushed by ->splice_eof().

For TCP, it's not clear that MSG_MORE is actually effective.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Kuniyuki Iwashima <kuniyu@amazon.com>
cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
cc: David Ahern <dsahern@kernel.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:30 -07:00
David Howells
d4c1e80b0d tls/device: Use splice_eof() to flush
Allow splice to end a TLS record after prematurely ending a splice/sendfile
due to getting an EOF condition (->splice_read() returned 0) after splice
had called TLS with a sendmsg() with MSG_MORE set when the user didn't set
MSG_MORE.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:30 -07:00
David Howells
df720d288d tls/sw: Use splice_eof() to flush
Allow splice to end a TLS record after prematurely ending a splice/sendfile
due to getting an EOF condition (->splice_read() returned 0) after splice
had called TLS with a sendmsg() with MSG_MORE set when the user didn't set
MSG_MORE.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:30 -07:00
David Howells
2bfc668509 splice, net: Add a splice_eof op to file-ops and socket-ops
Add an optional method, ->splice_eof(), to allow splice to indicate the
premature termination of a splice to struct file_operations and struct
proto_ops.

This is called if sendfile() or splice() encounters all of the following
conditions inside splice_direct_to_actor():

 (1) the user did not set SPLICE_F_MORE (splice only), and

 (2) an EOF condition occurred (->splice_read() returned 0), and

 (3) we haven't read enough to fulfill the request (ie. len > 0 still), and

 (4) we have already spliced at least one byte.

A further patch will modify the behaviour of SPLICE_F_MORE to always be
passed to the actor if either the user set it or we haven't yet read
sufficient data to fulfill the request.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/CAHk-=wh=V579PDYvkpnTobCLGczbgxpMgGmmhqiTyE34Cpi5Gg@mail.gmail.com/
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Matthew Wilcox <willy@infradead.org>
cc: Jan Kara <jack@suse.cz>
cc: Jeff Layton <jlayton@kernel.org>
cc: David Hildenbrand <david@redhat.com>
cc: Christian Brauner <brauner@kernel.org>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: linux-mm@kvack.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:30 -07:00
David Howells
2dc334f1a6 splice, net: Use sendmsg(MSG_SPLICE_PAGES) rather than ->sendpage()
Replace generic_splice_sendpage() + splice_from_pipe + pipe_to_sendpage()
with a net-specific handler, splice_to_socket(), that calls sendmsg() with
MSG_SPLICE_PAGES set instead of calling ->sendpage().

MSG_MORE is used to indicate if the sendmsg() is expected to be followed
with more data.

This allows multiple pipe-buffer pages to be passed in a single call in a
BVEC iterator, allowing the processing to be pushed down to a loop in the
protocol driver.  This helps pave the way for passing multipage folios down
too.

Protocols that haven't been converted to handle MSG_SPLICE_PAGES yet should
just ignore it and do a normal sendmsg() for now - although that may be a
bit slower as it may copy everything.

Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:30 -07:00
David Howells
81840b3b91 tls: Allow MSG_SPLICE_PAGES but treat it as normal sendmsg
Allow MSG_SPLICE_PAGES to be specified to sendmsg() but treat it as normal
sendmsg for now.  This means the data will just be copied until
MSG_SPLICE_PAGES is handled.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:30 -07:00
David Howells
4fe38acdac net: Block MSG_SENDPAGE_* from being passed to sendmsg() by userspace
It is necessary to allow MSG_SENDPAGE_* to be passed into ->sendmsg() to
allow sendmsg(MSG_SPLICE_PAGES) to replace ->sendpage().  Unblocking them
in the network protocol, however, allows these flags to be passed in by
userspace too[1].

Fix this by marking MSG_SENDPAGE_NOPOLICY, MSG_SENDPAGE_NOTLAST and
MSG_SENDPAGE_DECRYPTED as internal flags, which causes sendmsg() to object
if they are passed to sendmsg() by userspace.  Network protocol ->sendmsg()
implementations can then allow them through.

Note that it should be possible to remove MSG_SENDPAGE_NOTLAST once
sendpage is removed as a whole slew of pages will be passed in in one go by
splice through sendmsg, with MSG_MORE being set if it has more data waiting
in the pipe.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
Link: https://lore.kernel.org/r/20230526181338.03a99016@kernel.org/ [1]
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:40:30 -07:00
Eric Dumazet
736013292e tcp: let tcp_mtu_probe() build headless packets
tcp_mtu_probe() is still copying payload from skbs in the write queue,
using skb_copy_bits(), ignoring potential errors.

Modern TCP stack wants to only deal with payload found in page frags,
as this is a prereq for TCPDirect (host stack might not have access
to the payload)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230607214113.1992947-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:31:06 -07:00
Jakub Kicinski
f84ad5cffd mlx5-updates-2023-06-06
1) Support 4 ports VF LAG, part 2/2
 2) Few extra trivial cleanup patches
 
 Shay Drory Says:
 ================
 
 Support 4 ports VF LAG, part 2/2
 
 This series continues the series[1] "Support 4 ports VF LAG, part1/2".
 This series adds support for 4 ports VF LAG (single FDB E-Switch).
 
 This series of patches refactoring LAG code that make assumptions
 about VF LAG supporting only two ports and then enable 4 ports VF LAG.
 
 Patch 1:
 - Fix for ib rep code
 Patches 2-5:
 - Refactors LAG layer.
 Patches 6-7:
 - Block LAG types which doesn't support 4 ports.
 Patch 8:
 - Enable 4 ports VF LAG.
 
 This series specifically allows HCAs with 4 ports to create a VF LAG
 with only 4 ports. It is not possible to create a VF LAG with 2 or 3
 ports using HCAs that have 4 ports.
 
 Currently, the Merged E-Switch feature only supports HCAs with 2 ports.
 However, upcoming patches will introduce support for HCAs with 4 ports.
 
 In order to activate VF LAG a user can execute:
 
 devlink dev eswitch set pci/0000:08:00.0 mode switchdev
 devlink dev eswitch set pci/0000:08:00.1 mode switchdev
 devlink dev eswitch set pci/0000:08:00.2 mode switchdev
 devlink dev eswitch set pci/0000:08:00.3 mode switchdev
 ip link add name bond0 type bond
 ip link set dev bond0 type bond mode 802.3ad
 ip link set dev eth2 master bond0
 ip link set dev eth3 master bond0
 ip link set dev eth4 master bond0
 ip link set dev eth5 master bond0
 
 Where eth2, eth3, eth4 and eth5 are net-interfaces of pci/0000:08:00.0
 pci/0000:08:00.1 pci/0000:08:00.2 pci/0000:08:00.3 respectively.
 
 User can verify LAG state and type via debugfs:
 /sys/kernel/debug/mlx5/0000\:08\:00.0/lag/state
 /sys/kernel/debug/mlx5/0000\:08\:00.0/lag/type
 
 [1]
 https://lore.kernel.org/netdev/20230601060118.154015-1-saeed@kernel.org/T/#mf1d2083780970ba277bfe721554d4925f03f36d1
 
 ================
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmSA7/0ACgkQSD+KveBX
 +j4faQgApm14Id8QTB0rSj9tO1tJFtSgCcpDN9DtyYWuq3B0rGW9CPC1rPdaFOlt
 xst7PtEaCiJu7a7dwlH/kFLSAlXpZHdUZA+VG8JF0aYV8qOV/0R0xQKZgP68kwkn
 vFZqZCzA1vR6egK3AweAjKAVaqDKSSKVlFGXJGzyNpGMWpGEEKodlZKCH7Jd580F
 UFhCqbyY8vccMUa3cvrLVjePUjdM1xxsLKHWmYXTaN2NkoLvOYXnXThElu7skm96
 Uqv8B9t2FoojZPBxgiJtoGKZ516+1dozORq7ioQug3oG9P/vpTY5QnTMKSZpJUTH
 5cdSCdqii4UDPqfe0PtdEdE1O2aWig==
 =A44p
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2023-06-06' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2023-06-06

1) Support 4 ports VF LAG, part 2/2
2) Few extra trivial cleanup patches

Shay Drory Says:
================

Support 4 ports VF LAG, part 2/2

This series continues the series[1] "Support 4 ports VF LAG, part1/2".
This series adds support for 4 ports VF LAG (single FDB E-Switch).

This series of patches refactoring LAG code that make assumptions
about VF LAG supporting only two ports and then enable 4 ports VF LAG.

Patch 1:
- Fix for ib rep code
Patches 2-5:
- Refactors LAG layer.
Patches 6-7:
- Block LAG types which doesn't support 4 ports.
Patch 8:
- Enable 4 ports VF LAG.

This series specifically allows HCAs with 4 ports to create a VF LAG
with only 4 ports. It is not possible to create a VF LAG with 2 or 3
ports using HCAs that have 4 ports.

Currently, the Merged E-Switch feature only supports HCAs with 2 ports.
However, upcoming patches will introduce support for HCAs with 4 ports.

In order to activate VF LAG a user can execute:

devlink dev eswitch set pci/0000:08:00.0 mode switchdev
devlink dev eswitch set pci/0000:08:00.1 mode switchdev
devlink dev eswitch set pci/0000:08:00.2 mode switchdev
devlink dev eswitch set pci/0000:08:00.3 mode switchdev
ip link add name bond0 type bond
ip link set dev bond0 type bond mode 802.3ad
ip link set dev eth2 master bond0
ip link set dev eth3 master bond0
ip link set dev eth4 master bond0
ip link set dev eth5 master bond0

Where eth2, eth3, eth4 and eth5 are net-interfaces of pci/0000:08:00.0
pci/0000:08:00.1 pci/0000:08:00.2 pci/0000:08:00.3 respectively.

User can verify LAG state and type via debugfs:
/sys/kernel/debug/mlx5/0000\:08\:00.0/lag/state
/sys/kernel/debug/mlx5/0000\:08\:00.0/lag/type

[1]
https://lore.kernel.org/netdev/20230601060118.154015-1-saeed@kernel.org/T/#mf1d2083780970ba277bfe721554d4925f03f36d1

================

* tag 'mlx5-updates-2023-06-06' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: simplify condition after napi budget handling change
  mlx5/core: E-Switch, Allocate ECPF vport if it's an eswitch manager
  net/mlx5: Skip inline mode check after mlx5_eswitch_enable_locked() failure
  net/mlx5e: TC, refactor access to hash key
  net/mlx5e: Remove RX page cache leftovers
  net/mlx5e: Expose catastrophic steering error counters
  net/mlx5: Enable 4 ports VF LAG
  net/mlx5: LAG, block multiport eswitch LAG in case ldev have more than 2 ports
  net/mlx5: LAG, block multipath LAG in case ldev have more than 2 ports
  net/mlx5: LAG, change mlx5_shared_fdb_supported() to static
  net/mlx5: LAG, generalize handling of shared FDB
  net/mlx5: LAG, check if all eswitches are paired for shared FDB
  {net/RDMA}/mlx5: introduce lag_for_each_peer
  RDMA/mlx5: Free second uplink ib port
====================

Link: https://lore.kernel.org/r/20230607210410.88209-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:28:21 -07:00
Justin Chen
55b24334c0 ethtool: ioctl: improve error checking for set_wol
The netlink version of set_wol checks for not supported wolopts and avoids
setting wol when the correct wolopt is already set. If we do the same with
the ioctl version then we can remove these checks from the driver layer.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Justin Chen <justin.chen@broadcom.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://lore.kernel.org/r/1686179653-29750-1-git-send-email-justin.chen@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:24:54 -07:00
Jakub Kicinski
68bd67b43f Merge branch 'complete-lynx-mdio-device-handling'
Russell King says:

====================
complete Lynx mdio device handling

This series completes the mdio device lifetime handling for Lynx PCS
users which do not create their own mdio device, but instead fetch
it using a firmware description - namely the DPAA2 and FMAN_MEMAC
drivers.

In a previous patch set, lynx_pcs_create() was modified to increase
the mdio device refcount, and lynx_pcs_destroy() to drop that
refcount.

The first two patches change these two drivers to put the reference
which they hold immediately after lynx_pcs_create(), effectively
handing the responsibility for maintaining the refcount to the Lynx
PCS driver.

A side effect of the first two patches is that lynx_get_mdio_device()
is no longer used, so patch 3 removes it.

Patch 4 adds a new helper - lynx_pcs_create_fwnode(), which creates
a Lynx PCS instance from the fwnode.

Patch 5 and 6 convert the two drivers to make use of this new helper,
which simply has to find the mdio device, and then create the Lynx
PCS from that.

With those conversions done, lynx_pcs_create() is no longer required
outside pcs-lynx.c, so remove it from public view.

Patch 8 we changes lynx_pcs_create() to return an error-pointer rather
than NULL to bring consistency to the return style, and means that we
can remove the NULL-to-error-pointer conversion from both
lynx_pcs_create_fwnode() and lynx_pcs_create_mdiodev().

Patch 9 adds a check for the fwnode being available, and returns an
-ENODEV error pointer if unavailable.

Patch 10 removes this check from DPAA2, detecting the error pointer
value to continue printing the helpful message.

Patch 11 removes this check from fman_memac, and in doing so fixes a
bug where if the node is unavailable, the reference count is not
dropped.
====================

Link: https://lore.kernel.org/r/ZIBwuw+IuGQo5yV8@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:19:52 -07:00
Russell King (Oracle)
32fc30353f net: fman_memac: use pcs-lynx's check for fwnode availability
Use pcs-lynx's check rather than our own when determining if the device
is available. This fixes a bug where the reference gained by
of_parse_phandle() is not dropped if the device is not available.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:19:50 -07:00
Russell King (Oracle)
8c1d0b339d net: dpaa2: use pcs-lynx's check for fwnode availability
Use pcs-lynx's check rather than our own when determining if the device
is available.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:19:50 -07:00
Russell King (Oracle)
d143898c6d net: pcs: lynx: check that the fwnode is available prior to use
Check that the fwnode is marked as available prior to trying to lookup
the PCS device, and return -ENODEV if unavailable. Document the return
codes from lynx_pcs_create_fwnode().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:19:50 -07:00
Russell King (Oracle)
05b606b884 net: pcs: lynx: change lynx_pcs_create() to return error-pointers
Change lynx_pcs_create() to return an error-pointer on failure to
allocate memory, rather than returning NULL. This allows the removal
of the conversion in lynx_pcs_create_fwnode() and
lynx_pcs_create_mdiodev().

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:19:50 -07:00
Russell King (Oracle)
84e476b876 net: pcs: lynx: make lynx_pcs_create() static
We no longer need to export lynx_pcs_create() for drivers to use as we
now have all the functionality we need in the two new creation helpers.
Remove the export and prototype, and make it static.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:19:50 -07:00
Russell King (Oracle)
929a629c21 net: fman_memac: use lynx_pcs_create_fwnode()
Use lynx_pcs_create_fwnode() to create a lynx PCS from a fwnode handle.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:19:50 -07:00
Russell King (Oracle)
595fa7634d net: dpaa2-mac: use lynx_pcs_create_fwnode()
Use lynx_pcs_create_fwnode() to create a lynx PCS from a fwnode handle.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:19:50 -07:00
Russell King (Oracle)
6e1a12821d net: pcs: lynx: add lynx_pcs_create_fwnode()
Add a helper to create a lynx PCS from a fwnode handle.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:19:50 -07:00
Russell King (Oracle)
b3b984dc0b net: pcs: lynx: remove lynx_get_mdio_device()
lynx_get_mdio_device() is no longer necessary, let's remove it so the
lynx PCS code is always managing the lifetime of the mdiodev.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:19:50 -07:00
Russell King (Oracle)
d7b6ea1a14 net: fman_memac: allow lynx PCS to handle mdiodev lifetime
Put the mdiodev after lynx_pcs_create() so that the Lynx PCS driver
can manage the lifetime of the mdiodev its using.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:19:50 -07:00
Russell King (Oracle)
6c79a9c8b1 net: dpaa2-mac: allow lynx PCS to manage mdiodev lifetime
Put the mdiodev after lynx_pcs_create() so that the Lynx PCS driver
can manage the lifetime of the mdiodev its using.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:19:50 -07:00
Jiaxun Yang
c8cc2ae229 net: pch_gbe: Allow build on MIPS_GENERIC kernel
MIPS Boston board, which is using MIPS_GENERIC kernel is using
EG20T PCH and thus need this driver.

Dependency of PCH_GBE, PTP_1588_CLOCK_PCH is also fixed for
MIPS_GENERIC.

Note that CONFIG_PCH_GBE is selected in arch/mips/configs/generic/
board-boston.config for a while, some how it's never wired up
in Kconfig.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230607055953.34110-1-jiaxun.yang@flygoat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 19:18:32 -07:00
Ido Schimmel
37ff78e977 mlxsw: spectrum_nve_vxlan: Fix unsupported flag regression
The recently added 'VXLAN_F_LOCALBYPASS' flag is set by default on VXLAN
devices and denotes a behavior that is irrelevant for the hardware data
path. Add it to the lists of IPv4 and IPv6 supported flags to avoid
rejecting offload of VXLAN devices which have this flag set.

Fixes: 69474a8a5837 ("net: vxlan: Add nolocalbypass option to vxlan.")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/5533e63643bf719bbe286fef60f749c9cad35005.1686139716.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 18:56:02 -07:00
Jakub Kicinski
392c108bce Merge branch 'tools-ynl-generate-code-for-the-devlink-family'
Jakub Kicinski says:

====================
tools: ynl: generate code for the devlink family

Another chunk of changes to support more capabilities in the YNL
code gen. Devlink brings in deep nesting and directional messages
(requests and responses have different IDs). We need a healthy
dose of codegen changes to support those (I wasn't planning to
support code gen for "directional" families initially, but
the importance of devlink and ethtool is undeniable).
====================

Link: https://lore.kernel.org/r/20230607202403.1089925-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 14:01:12 -07:00
Jakub Kicinski
fff8660b54 tools: ynl: add sample for devlink
Add a sample to show off how to issue basic devlink requests.
For added testing issue get requests while walking a dump.

$ ./devlink
netdevsim/netdevsim1:
    driver: netdevsim
    running fw:
        fw.mgmt: 10.20.30
    ...
netdevsim/netdevsim2:
    driver: netdevsim
    running fw:
        fw.mgmt: 10.20.30
    ...

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 14:01:10 -07:00
Jakub Kicinski
5d1a30eb98 tools: ynl: generate code for the devlink family
Admittedly the devlink.yaml spec is fairly limitted,
it only covers basic device get and info-get ops.
That's sufficient to be useful (monitoring FW versions
in the fleet). Plus it gives us a chance to exercise
deep nesting and directional messaging in YNL.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 14:01:10 -07:00
Jakub Kicinski
0a94712196 tools: ynl-gen: don't generate forward declarations for policies - regen
Renegerate code after dropping forward declarations for policies.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 14:01:10 -07:00
Jakub Kicinski
168dea20ec tools: ynl-gen: don't generate forward declarations for policies
Now that all nested types have structs and are sorted topologically
there should be no need to generate forward declarations for policies.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 14:01:10 -07:00
Jakub Kicinski
eae7af21bd tools: ynl-gen: walk nested types in depth
So far we had only created structures for nested types nested
directly in messages (second level of attrs so to speak).
Walk types in depth to support deeper nesting.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 14:01:10 -07:00
Jakub Kicinski
37487f93b1 tools: ynl-gen: inherit struct use info
We only render parse and netlink generation helpers as needed,
to avoid generating dead code. Propagate the information from
first- and second-layer attribute sets onto all children.
Otherwise devlink won't work, it has a lot more levels of nesting.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-06-08 14:01:10 -07:00