Add a new outer_vlan_ops member to the ice_vsi structure as outer VLAN
ops are only available when the device is in Double VLAN Mode (DVM).
Depending on the VSI type, the requirements for what operations to
use/allow differ.
By default all VSIs have unsupported inner and outer VSI VLAN ops. This
implementation was chosen to prevent unexpected crashes due to null
pointer dereferences. Instead, if a VSI calls an unsupported op, it will
just return -EOPNOTSUPP.
Add implementations to support modifying outer VLAN fields for VSI
context. This includes the ability to modify VLAN stripping, insertion,
and the port VLAN based on the outer VLAN handling fields of the VSI
context.
These functions should only ever be used if DVM is enabled because that
means the firmware supports the outer VLAN fields in the VSI context. If
the device is in DVM, then always use the outer_vlan_ops, else use the
vlan_ops since the device is in Single VLAN Mode (SVM).
Also, move adding the untagged VLAN 0 filter from ice_vsi_setup() to
ice_vsi_vlan_setup() as the latter function is specific to the PF and
all other VSI types that need an untagged VLAN 0 filter already do this
in their specific flows. Without this change, Flow Director is failing
to initialize because it does not implement any VSI VLAN ops.
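As a rough sketch of the idea (member and helper names here are illustrative, not the exact ice API), each VSI carries two op tables, callers pick one based on the VLAN mode, and unimplemented ops default to a stub:

/* Illustrative only: one "unsupported" stub per op signature keeps an
 * unexpected call from dereferencing a NULL pointer.
 */
static int ice_vlan_op_unsupported(struct ice_vsi *vsi)
{
    return -EOPNOTSUPP;
}

/* Pick the active op table: outer ops only exist in Double VLAN Mode. */
static struct ice_vsi_vlan_ops *ice_vsi_get_vlan_ops(struct ice_vsi *vsi)
{
    if (ice_is_dvm_ena(&vsi->back->hw))  /* assumed DVM query helper */
        return &vsi->outer_vlan_ops;
    return &vsi->vlan_ops;               /* SVM: inner/single VLAN ops */
}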
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Current operations act on inner VLAN fields. To support double VLAN, outer
VLAN operations and functions will be implemented. Add the "inner" naming
to existing VLAN operations to distinguish them from the upcoming outer
values and functions. Some spacing adjustments are made to align
values.
Note that "inner" does not refer to a tunneled VLAN, but to the second
VLAN tag in the packet. In SVM the driver uses the inner (i.e. single)
VLAN filtering and offloads, and in Double VLAN Mode the driver uses the
inner filtering and offloads for SR-IOV VFs in port VLANs, in order to
support offloading the guest VLAN while a port VLAN is configured.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Currently the proto argument is unused. This is because the driver only
supports 802.1Q VLAN filtering. This policy is enforced via netdev
features that the driver sets up when configuring the netdev, so the
proto argument won't ever be anything other than 802.1Q. However, this
will allow for future iterations of the driver to seamlessly support
802.1ad filtering. Begin using the proto argument and extend the related
structures to support its use.
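For illustration, a hedged sketch of how the filter path can carry the protocol (the function shape is an assumption, not the exact ice prototype); ETH_P_8021Q/ETH_P_8021AD are the standard kernel TPID constants:

/* Sketch: the VLAN protocol travels with the filter request. */
static int ice_vsi_add_vlan_filter(struct ice_vsi *vsi, __be16 proto, u16 vid)
{
    u16 tpid;

    switch (ntohs(proto)) {
    case ETH_P_8021Q:   /* the only value netdev features allow today */
        tpid = ETH_P_8021Q;
        break;
    case ETH_P_8021AD:  /* accepted so 802.1ad can slot in later */
        tpid = ETH_P_8021AD;
        break;
    default:
        return -EINVAL;
    }

    /* ... build the switch filter rule using both tpid and vid ... */
    return 0;
}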
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The current vf->port_vlan_info variable is a packed u16 that contains
the port VLAN ID and QoS/prio value. This is fine, but changes are
incoming that allow for an 802.1ad port VLAN. Add flexibility by
changing the vf->port_vlan_info member to be an ice_vlan structure.
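The old packed layout follows the 802.1Q TCI, so the conversion is mechanical; a hedged sketch (field names assumed, masks from linux/if_vlan.h), using the ice_vlan structure introduced by another patch in this series (shown further below):

/* Decompose the old packed u16: VID in bits 0-11, QoS/PCP in bits 13-15. */
struct ice_vlan pv = {
    .vid  = port_vlan_info & VLAN_VID_MASK,
    .prio = (port_vlan_info & VLAN_PRIO_MASK) >> VLAN_PRIO_SHIFT,
};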
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Add a new struct for VLAN related information. Currently this holds
VLAN ID and priority values, but will be expanded to hold TPID value.
This reduces the changes necessary if any other values are added in
the future. Remove the action argument from these calls as it's always
ICE_FWD_VSI.
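A minimal sketch of the structure and the simplified call, assuming member names that mirror the description above rather than the exact driver code:

struct ice_vlan {
    u16 vid;    /* VLAN ID */
    u8 prio;    /* priority / QoS (PCP) */
    /* a TPID member is added later for 802.1ad support */
};

/* Before: ice_vsi_add_vlan(vsi, vid, ICE_FWD_VSI);
 * After:  the action is implied and the VLAN details travel together.
 */
int ice_vsi_add_vlan(struct ice_vsi *vsi, struct ice_vlan *vlan);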
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Incoming changes to support 802.1Q and/or 802.1ad VLAN filtering and
offloads require more flexibility when configuring VLANs. The VSI VLAN
interface will allow flexibility for configuring VLANs for all VSI
types. Add new files to separate the VSI VLAN ops and move functions to
make the code more organized.
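As a hedged sketch of how the per-VSI-type split might look (helper names are illustrative), each VSI type installs its own implementations at init time and everything else keeps the unsupported defaults:

static void ice_vsi_init_vlan_ops(struct ice_vsi *vsi)
{
    switch (vsi->type) {
    case ICE_VSI_PF:
        ice_pf_vsi_init_vlan_ops(vsi);  /* stripping, insertion, filtering */
        break;
    case ICE_VSI_VF:
        ice_vf_vsi_init_vlan_ops(vsi);  /* honours port VLAN restrictions */
        break;
    default:
        break;  /* other VSI types keep the "unsupported" defaults */
    }
}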
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
There are multiple places where VLAN 0 is being added. Create a single
function to be called instead, to avoid duplicated code and to minimize
the changes needed as the implementation is expanded to support double VLAN.
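A minimal sketch of such a helper (name and callee assumed), reusing the ice_vlan structure from this series:

static int ice_vsi_add_vlan_zero(struct ice_vsi *vsi)
{
    struct ice_vlan vlan = { .vid = 0, .prio = 0 };

    /* every VSI that needs untagged traffic goes through this one path */
    return ice_vsi_add_vlan(vsi, &vlan);
}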
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Add functions to configure Tx VLAN antispoof based on iproute
configuration and/or VLAN mode and VF driver support. This is needed
later so the driver can control when it can be configured. Also, add
functions that can be used to enable and disable MAC and VLAN
spoofcheck. Move spoofchk configuration during VSI setup into the
SR-IOV initialization path and into the post VSI rebuild flow for VF
VSIs.
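A rough sketch only, with placeholder helper names rather than the real macros and functions, of how the enable/disable pair could be exposed:

static int ice_vsi_apply_spoofchk(struct ice_vsi *vsi, bool enable)
{
    if (enable)
        return ice_vsi_ena_spoofchk(vsi);  /* set MAC + VLAN anti-spoof bits */
    return ice_vsi_dis_spoofchk(vsi);      /* clear them */
}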
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Merge tag 'kvm-s390-kernel-access' from emailed bundle
Pull s390 kvm fix from Christian Borntraeger:
"Add missing check for the MEMOP ioctl
The SIDA MEMOPs must only be used for secure guests, otherwise
userspace can do unwanted memory accesses"
* tag 'kvm-s390-kernel-access' from emailed bundle:
KVM: s390: Return error on SIDA memop on normal guest
NFS_OFFSET_MAX was introduced way back in Linux v2.3.y before there
was a kernel-wide OFFSET_MAX value. As a cleanup, replace the last
few uses of it with its generic equivalent, and get rid of it.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
NFSv3 and NFSv4 use u64 offset values on the wire. Record these values
verbatim without the implicit type cast to loff_t.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Since, well, forever, the Linux NFS server's nfsd_commit() function
has returned nfserr_inval when the passed-in byte range arguments
were nonsensical.
However, according to RFC 1813 section 3.3.21, NFSv3 COMMIT requests
are permitted to return only the following non-zero status codes:
NFS3ERR_IO
NFS3ERR_STALE
NFS3ERR_BADHANDLE
NFS3ERR_SERVERFAULT
NFS3ERR_INVAL is not included in that list. Likewise, NFS4ERR_INVAL
is not listed in the COMMIT row of Table 6 in RFC 8881.
RFC 7530 does permit COMMIT to return NFS4ERR_INVAL, but does not
specify when it can or should be used.
Instead of dropping or failing a COMMIT request in a byte range that
is not supported, turn it into a valid request by treating one or
both arguments as zero. Offset zero means start-of-file, count zero
means until-end-of-file, so we only ever extend the commit range.
NFS servers are always allowed to commit more and sooner than
requested.
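A hedged sketch of the clamping idea (not the exact nfsd_commit() code), where offset and count are the decoded u64 wire values:

loff_t start = 0, end = OFFSET_MAX;

if (offset <= (u64)OFFSET_MAX)
    start = offset;            /* otherwise treat offset as 0: start-of-file */
if (count && count <= (u64)OFFSET_MAX - start)
    end = start + count;       /* otherwise: until end-of-file */
/* the VFS then commits [start, end]; committing more than asked is allowed */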
The range check is no longer bounded by NFS_OFFSET_MAX, but rather
by the value that is returned in the maxfilesize field of the NFSv3
FSINFO procedure or the NFSv4 maxfilesize file attribute.
Note that this change results in a new pynfs failure:
CMT4 st_commit.testCommitOverflow : RUNNING
CMT4 st_commit.testCommitOverflow : FAILURE
COMMIT with offset + count overflow should return
NFS4ERR_INVAL, instead got NFS4_OK
IMO the test is not correct as written: RFC 8881 does not allow the
COMMIT operation to return NFS4ERR_INVAL.
Reported-by: Dan Aloni <dan.aloni@vastdata.com>
Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Bruce Fields <bfields@fieldses.org>
Ensure that a client cannot specify a WRITE range that falls in a
byte range outside what the kernel's internal types (such as loff_t,
which is signed) can represent. The kiocb iterators, invoked in
nfsd_vfs_write(), should properly limit write operations to within
the underlying file system's s_maxbytes.
Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
iattr::ia_size is a loff_t, so these NFSv3 procedures must be
careful to deal with incoming client size values that are larger
than s64_max without corrupting the value.
Silently capping the value results in storing a different value
than the client passed in, which is unexpected behavior, so remove
the min_t() check in decode_sattr3().
Note that RFC 1813 permits only the WRITE procedure to return
NFS3ERR_FBIG. We believe that NFSv3 reference implementations
also return NFS3ERR_FBIG when ia_size is too large.
Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
iattr::ia_size is a loff_t, which is a signed 64-bit type. NFSv3 and
NFSv4 both define file size as an unsigned 64-bit type. Thus there
is a range of valid file size values an NFS client can send that is
already larger than Linux can handle.
Currently decode_fattr4() dumps a full u64 value into ia_size. If
that value happens to be larger than S64_MAX, then ia_size
underflows. I'm about to fix up the NFSv3 behavior as well, so let's
catch the underflow in the common code path: nfsd_setattr().
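A sketch of the common-path guard, assuming it lives in nfsd_setattr(): a wire size above S64_MAX shows up as a negative loff_t once stored, so it can be rejected there.

if ((iap->ia_valid & ATTR_SIZE) && iap->ia_size < 0)
    return nfserr_fbig;    /* maps to NFS3ERR_FBIG / NFS4ERR_FBIG */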
Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Dan Aloni reports:
> Due to commit 8cfb9015280d ("NFS: Always provide aligned buffers to
> the RPC read layers") on the client, a read of 0xfff is aligned up
> to server rsize of 0x1000.
>
> As a result, in a test where the server has a file of size
> 0x7fffffffffffffff, and the client tries to read from the offset
> 0x7ffffffffffff000, the read causes loff_t overflow in the server
> and it returns an NFS code of EINVAL to the client. The client as
> a result indefinitely retries the request.
The Linux NFS client does not handle NFS?ERR_INVAL, even though all
NFS specifications permit servers to return that status code for a
READ.
Instead of NFS?ERR_INVAL, have out-of-range READ requests succeed
and return a short result. Set the EOF flag in the result to prevent
the client from retrying the READ request. This behavior appears to
be consistent with Solaris NFS servers.
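A hedged sketch of the short-read behaviour (not the exact nfsd code), clamping against the file system limit and signalling EOF instead of erroring:

u64 maxbytes = (u64)file_inode(file)->i_sb->s_maxbytes;

if (offset >= maxbytes) {
    *count = 0;
    *eof = true;                    /* stop the client from retrying */
} else if (*count > maxbytes - offset) {
    *count = maxbytes - offset;     /* short result */
}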
Note that NFSv3 and NFSv4 use u64 offset values on the wire. These
must be converted to loff_t internally before use -- an implicit
type cast is not adequate for this purpose. Otherwise VFS checks
against sb->s_maxbytes do not work properly.
Reported-by: Dan Aloni <dan.aloni@vastdata.com>
Cc: stable@vger.kernel.org
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Xin Long says:
====================
vlan: fix a netdev refcnt leak for QinQ
This issue can be simply reproduced by:
# ip link add dummy0 type dummy
# ip link add link dummy0 name dummy0.1 type vlan id 1
# ip link add link dummy0.1 name dummy0.1.2 type vlan id 2
# rmmod 8021q
unregister_netdevice: waiting for dummy0.1 to become free. Usage count = 1
So as to fix it, adjust vlan_dev_uninit() in Patch 1/1 so that it won't
be called twice for the same device, then do the fix in vlan_dev_uninit()
in Patch 2/2.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Shuang Li reported a QinQ issue by simply doing:
# ip link add dummy0 type dummy
# ip link add link dummy0 name dummy0.1 type vlan id 1
# ip link add link dummy0.1 name dummy0.1.2 type vlan id 2
# rmmod 8021q
unregister_netdevice: waiting for dummy0.1 to become free. Usage count = 1
When 8021q is removed, all VLAN devices are deleted from their real_dev's
vlan grp and added into list_kill by unregister_vlan_dev(). dummy0.1 is
unregistered before dummy0.1.2, as __rtnl_kill_links() uses for_each_netdev().
When dummy0.1 is unregistered, dummy0.1.2 is not unregistered in the
NETDEV_UNREGISTER event handler, as it has already been deleted from
dummy0.1's vlan grp. However, because dummy0.1.2 still holds a reference
to dummy0.1, dummy0.1 keeps waiting in netdev_wait_allrefs(), while
dummy0.1.2 never gets unregistered and never releases dummy0.1, as the
dev_put() is delayed until dev->priv_destructor, vlan_dev_free(), is called.
This issue was introduced by Commit 563bcbae3ba2 ("net: vlan: fix a UAF in
vlan_dev_real_dev()"), and this patch is to fix it by moving dev_put() into
vlan_dev_uninit(), which is called after NETDEV_UNREGISTER event but before
netdev_wait_allrefs().
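Simplified sketch of the fix: drop the reference in .ndo_uninit, which runs after the NETDEV_UNREGISTER event but before netdev_wait_allrefs(), instead of in the priv destructor.

static void vlan_dev_uninit(struct net_device *dev)
{
    struct vlan_dev_priv *vlan = vlan_dev_priv(dev);

    vlan_dev_free_egress_priority(dev);  /* factored out in the companion patch */
    dev_put(vlan->real_dev);             /* previously done in vlan_dev_free() */
}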
Fixes: 563bcbae3ba2 ("net: vlan: fix a UAF in vlan_dev_real_dev()")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce vlan_dev_free_egress_priority() to free the egress priority
mappings of a VLAN device, and keep vlan_dev_uninit() static as
.ndo_uninit. This makes the code clearer and safer when new code is
added to vlan_dev_uninit() in the future.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On ppc64le architecture __s64 is long int and requires %ld. Cast to
ssize_t and use %zd to avoid architecture-specific specifiers.
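For example, the portable form looks like this (illustrative values):

__s64 size = -7;                                /* e.g. an error code from a helper */
printf("invalid size: %zd\n", (ssize_t)size);   /* %zd matches ssize_t on every arch */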
Fixes: 4172843ed4a3 ("libbpf: Fix signedness bug in btf_dump_array_data()")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220209063909.1268319-1-andrii@kernel.org
If we call ax25_bind() before ax25_kill_by_device(), ax25_kill_by_device()
will set s->ax25_dev = NULL and call ax25_disconnect() to change the
states of the ax25_cb and the sock. However, if we call ax25_bind() again
in the window between ax25_kill_by_device() and ax25_dev_device_down(),
the values and states changed by ax25_kill_by_device() will be reassigned.
Finally, ax25_dev_device_down() will deallocate net_device.
If we dereference net_device in syscall functions such as
ax25_release(), ax25_sendmsg(), ax25_getsockopt(), ax25_getname()
and ax25_info_show(), a UAF bug will occur.
One of the possible race conditions is shown below:
(USE) | (FREE)
ax25_bind() |
| ax25_kill_by_device()
ax25_bind() |
ax25_connect() | ...
| ax25_dev_device_down()
| ...
| dev_put_track(dev, ...) //FREE
ax25_release() | ...
ax25_send_control() |
alloc_skb() //USE |
The corresponding failure log is shown below:
===============================================================
BUG: KASAN: use-after-free in ax25_send_control+0x43/0x210
...
Call Trace:
...
ax25_send_control+0x43/0x210
ax25_release+0x2db/0x3b0
__sock_release+0x6d/0x120
sock_close+0xf/0x20
__fput+0x11f/0x420
...
Allocated by task 1283:
...
__kasan_kmalloc+0x81/0xa0
alloc_netdev_mqs+0x5a/0x680
mkiss_open+0x6c/0x380
tty_ldisc_open+0x55/0x90
...
Freed by task 1969:
...
kfree+0xa3/0x2c0
device_release+0x54/0xe0
kobject_put+0xa5/0x120
tty_ldisc_kill+0x3e/0x80
...
In order to fix these UAF bugs caused by rebinding operation,
this patch adds dev_hold_track() into ax25_bind() and
corresponding dev_put_track() into ax25_kill_by_device().
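Sketch of the pairing (the tracker member name is an assumption): take a tracked reference when the socket binds to the device and release it when the device is killed, so a rebind cannot race with the free.

/* in ax25_bind(): */
dev_hold_track(ax25->ax25_dev->dev, &ax25->dev_tracker, GFP_ATOMIC);

/* in ax25_kill_by_device(), before clearing s->ax25_dev: */
dev_put_track(s->ax25_dev->dev, &s->dev_tracker);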
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Provide generic selftest support. Tested with LAN9500 and LAN9512.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
do_div() does a 64-by-32 division.
When the divisor is a u64, do_div() truncates it to 32 bits; this means
a divisor that tests as non-zero can still be truncated to zero for the division.
fix do_div.cocci warning:
do_div() does a 64-by-32 division, please consider using div64_u64 instead.
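An example of the difference: do_div() modifies its first argument in place and takes only a 32-bit divisor, while div64_u64() keeps both operands at full width.

u64 n = 123456789ULL;
u64 divisor = 0x100000000ULL;   /* tests non-zero, but its low 32 bits are zero */

/* do_div(n, divisor) would truncate the divisor to 0; this does not: */
u64 q = div64_u64(n, divisor);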
Signed-off-by: Wang Qing <wangqing@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now we can use enetc_cbd_alloc_data_mem() to replace the complicated DMA
data allocation method and the basic CBDR memory setup.
Signed-off-by: Po Liu <po.liu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Split the CBDR data memory allocation into a standalone helper, so it can
conveniently be reused by other parts of the driver, for example the ENETC QoS code.
Reported-and-suggested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Po Liu <po.liu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace the dma_map_single() streaming DMA mapping with the simpler
coherent method, dma_alloc_coherent(). Tim Gardner found the
dma_map_single() usage improper; Claudiu Manoil and Jakub Kicinski
suggested using dma_alloc_coherent(). Discussion at:
https://lore.kernel.org/netdev/AM9PR04MB8397F300DECD3C44D2EBD07796BD9@AM9PR04MB8397.eurprd04.prod.outlook.com/t/
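For illustration, the two approaches side by side (sizes and direction are placeholders):

/* streaming mapping: explicit map/unmap and dma_sync_*() ownership hand-off */
dma = dma_map_single(dev, buf, size, DMA_FROM_DEVICE);
if (dma_mapping_error(dev, dma))
    return -ENOMEM;

/* coherent allocation: one call, buffer usable by CPU and device alike */
buf = dma_alloc_coherent(dev, size, &dma, GFP_KERNEL);
if (!buf)
    return -ENOMEM;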
Fixes: 888ae5a3952ba ("net: enetc: add tc flower psfp offload driver")
cc: Claudiu Manoil <claudiu.manoil@nxp.com>
Reported-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Po Liu <po.liu@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rafael reports that on a system with LX2160A and Marvell DSA switches,
if a reboot occurs while the DSA master (dpaa2-eth) is up, the following
panic can be seen:
systemd-shutdown[1]: Rebooting.
Unable to handle kernel paging request at virtual address 00a0000800000041
[00a0000800000041] address between user and kernel address ranges
Internal error: Oops: 96000004 [#1] PREEMPT SMP
CPU: 6 PID: 1 Comm: systemd-shutdow Not tainted 5.16.5-00042-g8f5585009b24 #32
pc : dsa_slave_netdevice_event+0x130/0x3e4
lr : raw_notifier_call_chain+0x50/0x6c
Call trace:
dsa_slave_netdevice_event+0x130/0x3e4
raw_notifier_call_chain+0x50/0x6c
call_netdevice_notifiers_info+0x54/0xa0
__dev_close_many+0x50/0x130
dev_close_many+0x84/0x120
unregister_netdevice_many+0x130/0x710
unregister_netdevice_queue+0x8c/0xd0
unregister_netdev+0x20/0x30
dpaa2_eth_remove+0x68/0x190
fsl_mc_driver_remove+0x20/0x5c
__device_release_driver+0x21c/0x220
device_release_driver_internal+0xac/0xb0
device_links_unbind_consumers+0xd4/0x100
__device_release_driver+0x94/0x220
device_release_driver+0x28/0x40
bus_remove_device+0x118/0x124
device_del+0x174/0x420
fsl_mc_device_remove+0x24/0x40
__fsl_mc_device_remove+0xc/0x20
device_for_each_child+0x58/0xa0
dprc_remove+0x90/0xb0
fsl_mc_driver_remove+0x20/0x5c
__device_release_driver+0x21c/0x220
device_release_driver+0x28/0x40
bus_remove_device+0x118/0x124
device_del+0x174/0x420
fsl_mc_bus_remove+0x80/0x100
fsl_mc_bus_shutdown+0xc/0x1c
platform_shutdown+0x20/0x30
device_shutdown+0x154/0x330
__do_sys_reboot+0x1cc/0x250
__arm64_sys_reboot+0x20/0x30
invoke_syscall.constprop.0+0x4c/0xe0
do_el0_svc+0x4c/0x150
el0_svc+0x24/0xb0
el0t_64_sync_handler+0xa8/0xb0
el0t_64_sync+0x178/0x17c
It can be seen from the stack trace that the problem is that the
deregistration of the master causes a dev_close(), which gets notified
as NETDEV_GOING_DOWN to dsa_slave_netdevice_event().
But dsa_switch_shutdown() has already run, and this has unregistered the
DSA slave interfaces, and yet, the NETDEV_GOING_DOWN handler attempts to
call dev_close_many() on those slave interfaces, leading to the problem.
The previous attempt to avoid the NETDEV_GOING_DOWN on the master after
dsa_switch_shutdown() was called seems improper. Unregistering the slave
interfaces is unnecessary and unhelpful. Instead, after the slaves have
stopped being uppers of the DSA master, we can now reset to NULL the
master->dsa_ptr pointer, which will make DSA start ignoring all future
notifier events on the master.
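A simplified sketch of the idea (not the full dsa_switch_shutdown() body): once the slaves are no longer uppers of the master, clear the master's DSA pointer so later notifier events are ignored.

rtnl_lock();
netdev_upper_dev_unlink(master, slave_dev);  /* repeated for each slave */
master->dsa_ptr = NULL;                      /* DSA now ignores this master */
rtnl_unlock();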
Fixes: 0650bf52b31f ("net: dsa: be compatible with masters which unregister on shutdown")
Reported-by: Rafael Richter <rafael.richter@gin.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ioana Ciornei says:
====================
dpaa2-eth: add support for software TSO
This series adds support for driver level TSO in the dpaa2-eth driver.
The first 5 patches lay the groundwork for the actual feature:
rearranging some variable declarations, cleaning up the interaction with
the S/G table buffer cache, etc.
The 6th patch adds the actual driver-level software TSO support by using
the usual tso_build_hdr()/tso_build_data() APIs and creating the S/G FDs.
With this patch set we can see the following improvement in a TCP flow
running on a single A72@2.2GHz of the LX2160A SoC:
before: 6.38Gbit/s
after: 8.48Gbit/s
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Once we added support in the dpaa2-eth for driver level software TSO we
observed the following situation: if the EQCR CI (consumer index) is
read from the cache-enabled area we sometimes end up with a computed
value of available enqueue entries bigger than the size of the ring.
This eventually leads to multiple enqueues of the same FD, which causes
the same FD to end up on the Tx confirmation path and the same skb to be
freed twice.
Just read the consumer index from the cache inhibited area so that we
avoid this situation.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for driver-level TSO in the dpaa2-eth driver
using the TSO API.
There is not much to say about this specific implementation. We are
using the usual tso_build_hdr(), tso_build_data() to create each data
segment, we create an array of S/G FDs where the first S/G entry is
referencing the header data and the remaining ones the data portion.
For the S/G Table buffer we use the same cache of buffers used on the
other non-GSO cases - dpaa2_eth_sgt_get() and dpaa2_eth_sgt_recycle().
We cannot keep a DMA coherent buffer for all the TSO headers because the
DPAA2 architecture does not work in a ring-based fashion, so we just
allocate a buffer each time.
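For reference, a generic sketch of the net/core/tso.c helpers this builds on; the DMA mapping and FD construction steps (get_tso_hdr_buf(), map_payload()) are hypothetical stand-ins for the driver-specific parts.

struct tso_t tso;
int mss = skb_shinfo(skb)->gso_size;
int hdr_len, total_len, data_left;

hdr_len = tso_start(skb, &tso);          /* parse headers, init the cursor */
total_len = skb->len - hdr_len;

while (total_len > 0) {
    char *hdr = get_tso_hdr_buf();       /* per-segment header buffer */
    int size = min(total_len, mss);

    total_len -= size;
    /* write IP/TCP headers for this segment; the last one keeps FIN/PSH */
    tso_build_hdr(skb, hdr, &tso, size, total_len == 0);

    data_left = size;
    while (data_left > 0) {
        int chunk = min(data_left, tso.size);

        map_payload(tso.data, chunk);     /* one S/G entry per chunk */
        tso_build_data(skb, &tso, chunk); /* advance the payload cursor */
        data_left -= chunk;
    }
    /* enqueue one FD: header entry + payload entries */
}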
Even with these limitations we get the following improvement in TCP
termination on the LX2160A SoC, on a single A72 core running at 2.2GHz.
before: 6.38Gbit/s
after: 8.48Gbit/s
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Up until now, the __dpaa2_eth_tx function used a single FD on the stack
to construct the structure to be enqueued. Since we are now preparing
the ground work to add support for TSO done in software at the driver
level, the same function needs to work with an array of FDs and enqueue
as many as the build_*_fd functions create.
Make the necessary adjustments in order to do this. These include:
keeping an array of FDs in a percpu structure, cleaning up the necessary
FDs before populating it, and then retrying the enqueue process until
all the generated FDs are enqueued or we reach the maximum number of
retries.
This patch does not change the fact that only a single FD will result
from a __dpaa2_eth_tx call but rather just creates the necessary changes
for the next patch.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of allocating memory for an S/G table each time a nonlinear skb
is processed, and then freeing it on the Tx confirmation path, use the
S/G table cache in order to reuse the memory.
For this to work we have to change the size of the cached buffers so
that it can hold the maximum number of scatterlist entries.
Other than that, each allocate/free call is replaced by a call to the
dpaa2_eth_sgt_get/dpaa2_eth_sgt_recycle functions, introduced in the
previous patch.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In certain circumstances, the dpaa2-eth driver uses a buffer cache for
the S/G tables needed for an S/G FD. At the moment, the interaction with
the cache is open-coded and cannot easily be reused.
Add two new functions - dpaa2_eth_sgt_get and dpaa2_eth_sgt_recycle -
which help with code reusability.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of allocating memory and then manually aligning it to the
desired value use napi_alloc_frag_align() directly to streamline the
process.
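For example (the "before" pattern is an assumption of what such code typically looks like):

/* before: over-allocate, then align the pointer by hand */
sgt_buf = napi_alloc_frag(size + align);
sgt_buf = PTR_ALIGN(sgt_buf, align);

/* after: the allocator returns a fragment with the requested alignment */
sgt_buf = napi_alloc_frag_align(size, align);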
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the next patches we'll be moving things around in the mentioned
function and also adding some new variable declarations. Before all this,
clean up the variable declaration order.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hariprasad Kelam says:
====================
Priority flow control support for RVU netdev
During network congestion, instead of pausing all traffic on a link,
PFC allows the user to selectively pause traffic according to its
class. This series of patches adds PFC support for the RVU netdev
drivers.
Patch 1 adds support to disable pause frames by default, as with
PFC the user can enable either PFC or 802.3x pause frames.
Patches 2 & 3 add resource management support for flow control
and configure the necessary registers for PFC.
Patch 4 adds dcb ops registration for the netdev drivers.
V2 changes:
Fix a compilation error by exporting the required symbol 'otx2_config_pause_frm'
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Data Center Bridging is designed to eliminate packet loss due to
queue overflow by adding enhancements to Ethernet networks, such as
priority flow control. This patch adds support for management
of Priority Flow Control (PFC) on Octeontx2 and CN10K interfaces.
To enable PFC for all priorities
dcb pfc set dev eth0 prio-pfc all:on/off
To enable PFC on selected priorities
dcb pfc set dev eth0 prio-pfc 0:on/off 1:on/off ..7:on/off
With the ntuple commands, the user can map a priority to receive queues.
On queue overflow, NIX will assert backpressure such that PFC pause frames
are generated with the mapped priority.
To map priority 7 to Queue 1
ethtool -U eth0 flow-type ether dst xx:xx:xx:xx:xx:xx vlan 0xe00a
m 0x1fff queue 1
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The CN10K MAC block (RPM) and the Octeontx2 MAC block (CGX) both support
PFC flow control and 802.3x flow control pause frames.
Each MAC block supports up to 4 LMACs, and the AF driver assigns the same
(MAC, LMAC) pair to a PF and its VFs. As a PF and its VFs share the same
(MAC, LMAC) pair, we need resource management to address the scenarios below:
1. Keep PFC and 802.3x pause frames mutually exclusive.
2. Reject a request to disable flow control if another PF or its VFs
have enabled it.
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Priority-based Flow Control (802.1Qbb) mechanism is similar to
Ethernet pause frames (802.3x): instead of pausing all traffic on a link,
PFC allows the user to selectively pause traffic according to its class.
The Octeontx2 MAC block (CGX) and the CN10K MAC block (RPM) both support
PFC. As the upper layer mbox handler is the same for both MACs, this
patch configures PFC by calling the appropriate callbacks.
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the current implementation, 802.3x pause frames are enabled by
default. As the CGX and RPM blocks also support PFC (priority flow
control), instead of the driver enabling one of the two, enable either
one only upon request from a PF or its VFs.
Also add support to disable pause frames on driver unbind.
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Originally we proposed a new hdmi-5v-supply regulator reference
for the CI20 device tree, but that was superseded by the better idea of
using the already defined "ddc-en-gpios" property of the "hdmi-connector".
Since "MIPS: DTS: CI20: Add DT nodes for HDMI setup" has already
been applied to v5.17-rc1, we add this on top.
Fixes: ae1b8d2c2de9 ("MIPS: DTS: CI20: Add DT nodes for HDMI setup")
Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com>
Reviewed-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Hardware interrupts are enabled during PCI probe; however,
they are not disabled during PCI removal.
Disable all hardware interrupts during PCI removal to avoid any
issues.
Fixes: e75377404726 ("amd-xgbe: Update PCI support to use new IRQ functions")
Suggested-by: Selwin Sebastian <Selwin.Sebastian@amd.com>
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It would be easy to craft a message containing an illegal binding table
update operation. This is handled correctly by the code, but the
corresponding warning printout is not rate limited as it should be.
We fix this now.
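For example, switching the printout to the rate-limited variant (the message text is illustrative):

pr_warn_ratelimited("Unknown name table message received\n");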
Fixes: b97bf3fd8f6a ("[TIPC] Initial merge")
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix loading of the driver when built as a module.
Fixes: f160e99462c6 ("net: phy: Add mdio-aspeed")
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'linux-can-fixes-for-5.17-20220209' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2022-02-09
this is a pull request of 2 patches for net/master.
Oliver Hartkopp contributes 2 fixes for the CAN ISOTP protocol.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeremy Kerr says:
====================
MCTP tag control interface
This series implements a small interface for userspace-controlled
message tag allocation for the MCTP protocol. Rather than leaving the
kernel to allocate per-message tag values, userspace can explicitly
allocate (and release) message tags through two new ioctls:
SIOCMCTPALLOCTAG and SIOCMCTPDROPTAG.
In order to do this, we first introduce some minor changes to the tag
handling, including a couple of new tests for the route input paths.
As always, any comments/queries/etc are most welcome.
v2:
- make mctp_lookup_prealloc_tag static
- minor checkpatch formatting fixes
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This change adds a couple of new ioctls for mctp sockets:
SIOCMCTPALLOCTAG and SIOCMCTPDROPTAG. These ioctls provide facilities
for explicit allocation / release of tags, overriding the automatic
allocate-on-send/release-on-reply and timeout behaviours. This allows
userspace more control over messages that may not fit a simple
request/response model.
In order to indicate a pre-allocated tag to the sendmsg() syscall, we
introduce a new flag to the struct sockaddr_mctp.smctp_tag value:
MCTP_TAG_PREALLOC.
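A userspace usage sketch; the struct and field names follow the uAPI as described here and are best treated as assumptions until checked against the merged header. sd is a connected MCTP socket and addr the struct sockaddr_mctp used for sendmsg().

struct mctp_ioc_tag_ctl ctl = {
    .peer_addr = 9,        /* remote EID we will talk to */
};

if (ioctl(sd, SIOCMCTPALLOCTAG, &ctl) == 0) {
    /* ctl.tag now holds the allocated tag; advertise it as preallocated */
    addr.smctp_tag = ctl.tag | MCTP_TAG_PREALLOC;
    /* ... sendmsg()/recvmsg() using addr ... */
    ioctl(sd, SIOCMCTPDROPTAG, &ctl);    /* release it explicitly */
}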
Additional changes from Jeremy Kerr <jk@codeconstruct.com.au>.
Contains a fix that was:
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, we require an exact match on an incoming packet's dest
address, and the key's local_addr field.
In a future change, we may want to set up a key before packets are
routed, meaning we have no local address to match on.
This change allows key lookups to match on local_addr = MCTP_ADDR_ANY.
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, we have a couple of paths that check that an EID matches, or
the match value is MCTP_ADDR_ANY.
Rather than open coding this, add a little helper.
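A minimal sketch of such a helper (name assumed):

static bool mctp_address_matches(mctp_eid_t match, mctp_eid_t eid)
{
    return match == eid || match == MCTP_ADDR_ANY;
}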
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>