1124527 Commits

Author SHA1 Message Date
Alex Elder
216b409d09 net: ipa: define more IPA endpoint register fields
Define the fields for the ENDP_INIT_MODE, ENDP_INIT_AGGR,
ENDP_INIT_HOL_BLOCK_EN, and ENDP_INIT_HOL_BLOCK_TIMER IPA
registers for all supported IPA versions.

Create enumerated types to identify fields for these IPA registers.
Use IPA_REG_STRIDE_FIELDS() to specify the field mask values defined
for these registers, for each supported version of IPA.

Change aggr_time_limit_encode() and hol_block_timer_encode() so they
take an ipa_reg pointer, and use those register's fields to compute
their encoded results.  Have aggr_time_limit_encode() take an IPA
pointer rather than version, to match hol_block_timer_encode().

Use ipa_reg_encode(), ipa_reg_bit(), and ipa_reg_field_max() to
manipulate values to be written to these registers, remove the
definitions of the various inline functions and *_FMASK symbols that
are now no longer used.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:51 -07:00
Alex Elder
4468a3448b net: ipa: define some IPA endpoint register fields
Define the fields for the ENDP_INIT_CTRL, ENDP_INIT_CFG, ENDP_INIT_NAT,
ENDP_INIT_HDR, and ENDP_INIT_HDR_EXT IPA registers for all supported
IPA versions.

Create enumerated types to identify fields for these IPA registers.
Use IPA_REG_STRIDE_FIELDS() to specify the field mask values defined
for these registers, for each supported version of IPA.

Move ipa_header_size_encoded() and ipa_metadata_offset_encoded() out
of "ipa_reg.h" and into "ipa_endpoint.c".  Change them so they take
an additional ipa_reg structure argument, and use ipa_reg_encode()
to encode the parts of the header size and offset prior to writing
to the register.  Change their names to be verbs rather than nouns.

Use ipa_reg_encode(), ipa_reg_bit, and ipa_reg_field_max() to
manipulate values to be written to these registers, remove the
definition of the no-longer-used *_FMASK symbols.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:51 -07:00
Alex Elder
1c418c4a92 net: ipa: define resource group/type IPA register fields
Define the fields for the {SRC,DST}_RSRC_GRP_{01,23,45,67}_RSRC_TYPE
IPA registers for all supported IPA versions.

Create enumerated types to identify fields for these IPA registers.
Use IPA_REG_STRIDE_FIELDS() to specify the field mask values defined
for these registers, for each supported version of IPA.

Use ipa_reg_encode() to build up the values to be written to these
registers.

Remove the definition of the no-longer-used *_FMASK symbols.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:51 -07:00
Alex Elder
9265a4f0f0 net: ipa: define even more IPA register fields
Define the fields for the FLAVOR_0, IDLE_INDICATION_CFG,
QTIME_TIMESTAMP_CFG, TIMERS_XO_CLK_DIV_CFG and TIMERS_PULSE_GRAN_CFG
IPA registers for all supported IPA versions.

Create enumerated types to identify fields for these IPA registers.
Use IPA_REG_FIELDS() to specify the field mask values defined for
these registers, for each supported version of IPA.

Use ipa_reg_bit() and ipa_reg_encode() to build up the values to be
written to these registers.  Use ipa_reg_decode() to extract field
values from the FLAVOR_0 register.

Remove the definition of the no-longer-used *_FMASK symbols.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:51 -07:00
Alex Elder
b5c35fa470 net: ipa: define more IPA register fields
Define the fields for the LOCAL_PKT_PROC_CNTXT, COUNTER_CFG, and
IPA_TX_CFG IPA registers for all supported IPA versions.

Create enumerated types to identify fields for these IPA registers.
Use IPA_REG_FIELDS() to specify the field mask values defined for
these registers, for each supported version of IPA.

Use ipa_reg_bit() and ipa_reg_encode() to build up the values to be
written to these registers.  Remove the definition of the *_FMASK
symbols as well as proc_cntxt_base_addr_encoded(), because they are
no longer needed.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:51 -07:00
Alex Elder
62b9c009a8 net: ipa: define some more IPA register fields
Define the fields for the SHARED_MEM_SIZE, QSB_MAX_WRITES,
QSB_MAX_READS, FILT_ROUT_HASH_EN, and FILT_ROUT_HASH_FLUSH IPA
registers for all supported IPA versions.

Create enumerated types to identify fields for these registers.  Use
IPA_REG_FIELDS() to specify the field mask values defined for these
registers, for each supported version of IPA.

Use ipa_reg_bit() and ipa_reg_encode() to build up the values to be
written to these registers rather than using the *_FMASK
preprocessor symbols.

Remove the definition of the now unused *_FMASK symbols.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:50 -07:00
Alex Elder
479deb3298 net: ipa: define CLKON_CFG and ROUTE IPA register fields
Create the ipa_reg_clkon_cfg_field_id enumerated type, which
identifies the fields for the CLKON_CFG IPA register.  Add "CLKON_"
to a few short names to try to avoid name conflicts.  Create the
ipa_reg_route_field_id enumerated type, which identifies the fields
for the ROUTE IPA register.

Use IPA_REG_FIELDS() to specify the field mask values defined for
these registers, for each supported version of IPA.

Use ipa_reg_bit() and ipa_reg_encode() to build up the values to be
written to these registers rather than using the *_FMASK
preprocessor symbols.

Remove the definition of the now unused *_FMASK symbols.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:50 -07:00
Alex Elder
12c7ea7dfd net: ipa: define COMP_CFG IPA register fields
Create the ipa_reg_comp_cfg_field_id enumerated type, which
identifies the fields for the COMP_CFG IPA register.

Use IPA_REG_FIELDS() to specify the field mask values defined for
this register, for each supported version of IPA.

Use ipa_reg_bit() to build up the value to be written to this
register rather than using the *_FMASK preprocessor symbols.

Remove the definition of the *_FMASK symbols, along with the inline
functions that were used to encode certain fields whose position
and/or width within the register was dependent on IPA version.

Take this opportunity to represent all one-bit fields using BIT(x)
rather than GENMASK(x, x).

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:50 -07:00
Alex Elder
a5ad8956f9 net: ipa: introduce ipa_reg field masks
Add register field descriptors to the ipa_reg structure.  A field in
a register is defined by a field mask, which is a 32-bit mask having
a single contiguous range of bits set.

For each register that has at least one field defined, an enumerated
type will identify the register's fields.  The ipa_reg structure for
that register will include an array fmask[] of field masks, indexed
by that enumerated type.  Each field mask defines the position and
bit width of a field.  An additional "fcount" records how many
fields (masks) are defined for a given register.

Introduce two macros to be used to define registers that have at
least one field.

Introduce a few new functions related to field masks.  The first
simply returns a field mask, given an IPA register pointer and field
mask ID.  A variant of that is meant to be used for the special case
of single-bit field masks.

Next, ipa_reg_encode(), identifies a field with an IPA register
pointer and a field ID, and takes a value to represent in that
field.  The result encodes the value in the appropriate place to be
stored in the register.  This is roughly modeled after the bitmask
operations (like u32_encode_bits()).

Another function (ipa_reg_decode()) similarly identifies a register
field, but the value supplied to it represents a full register
value.  The value encoded in the field is extracted from the value
and returned.  This is also roughly modeled after bitmask operations
(such as u32_get_bits()).

Finally, ipa_reg_field_max() returns the maximum value representable
by a field.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:50 -07:00
Alex Elder
6a244b75cf net: ipa: introduce ipa_reg()
Create a new function that returns a register descriptor given its
ID.  Change ipa_reg_offset() and ipa_reg_n_offset() so they take a
register descriptor argument rather than an IPA pointer and register
ID.  Have them accept null pointers (and return an invalid 0 offset),
to avoid the need for excessive error checking.  (A warning is issued
whenever ipa_reg() returns 0).

Call ipa_reg() or ipa_reg_n() to look up information about the
register before calls to ipa_reg_offset() and ipa_reg_n_offset().
Delay looking up offsets until they're needed to read or write
registers.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:50 -07:00
Alex Elder
82a0680745 net: ipa: use ipa_reg[] array for register offsets
Use the array of register descriptors assigned at initialization
time to determine the offset (and where used, stride) for IPA
registers.  Issue a warning if an offset is requested for a register
that's not valid for the current system.

Remove all IPE_REG_*_OFFSET macros, as well as inline static
functions that returned register offsets.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:50 -07:00
Alex Elder
07f120bcf7 net: ipa: add per-version IPA register definition files
Create a new subdirectory "reg", which contains a register
definition file for each supported version of IPA.  Each register
definition contains the register's offset, and for parameterized
registers, the stride (distance between consecutive instances of the
register).  Finally, it includes an all-caps printable register name.

In these files, each IPA version defines an array of IPA register
definition pointers, with unsupported registers defined with a null
pointer.  The array is indexed by the ipa_reg_id enumerated type.

At initialization time, the appropriate register definition array to
use is selected based on the IPA version, and assigned to a new
"regs" field in the IPA structure.

Extend ipa_reg_valid() so it fails if a valid register is not
defined.

This patch simply puts this infrastructure in place; the next will
use it.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:49 -07:00
Alex Elder
6bfb753850 net: ipa: use IPA register IDs to determine offsets
Expose two inline functions that return the offset for a register
whose ID is provided; one of them takes an additional argument
that's used for registers that are parameterized.  These both use
a common helper function __ipa_reg_offset(), which just uses the
offset symbols already defined.

Replace all references to the offset macros defined for IPA
registers with calls to ipa_reg_offset() or ipa_reg_n_offset().

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:49 -07:00
Alex Elder
98e2dd71a8 net: ipa: introduce IPA register IDs
Create a new ipa_reg_id enumerated type, which identifies each IPA
register with a symbolic identifier.  Use short names, but in some
cases (such as "BCR") add "IPA_" to the name to help avoid name
conflicts.

Create two functions that indicate register validity.  The first
concisely indicates whether a register is valid for a given version
of IPA, and if so, whether it is defined.  The second indicates
whether a register is valid for TX or RX endpoints.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:42:49 -07:00
Kees Cook
8f1e1658d3 s390/qeth: Split memcpy() of struct qeth_ipacmd_addr_change flexible array
To work around a misbehavior of the compiler's ability to see into
composite flexible array structs (as detailed in the coming memcpy()
hardening series[1]), split the memcpy() of the header and the payload
so no false positive run-time overflow warning will be generated.

[1] https://lore.kernel.org/linux-hardening/20220901065914.1417829-2-keescook@chromium.org/

Cc: Wenjia Zhang <wenjia@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Link: https://lore.kernel.org/r/20220927003953.1942442-1-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:13:44 -07:00
Donald Hunter
0d92efdee9 Add skb drop reasons to IPv6 UDP receive path
Enumerate the skb drop reasons in the receive path for IPv6 UDP packets.

Signed-off-by: Donald Hunter <donald.hunter@redhat.com>
Link: https://lore.kernel.org/r/20220926120350.14928-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 18:05:03 -07:00
Bo Liu
ab7ea1e735 ptp: Remove usage of the deprecated ida_simple_xxx API
Use ida_alloc_xxx()/ida_free() instead of
ida_simple_get()/ida_simple_remove().
The latter is deprecated and more verbose.

Signed-off-by: Bo Liu <liubo03@inspur.com>
Link: https://lore.kernel.org/r/20220926012744.3363-1-liubo03@inspur.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 17:33:11 -07:00
Taehee Yoo
62e56ef57c net: tls: Add ARIA-GCM algorithm
RFC 6209 describes ARIA for TLS 1.2.
ARIA-128-GCM and ARIA-256-GCM are defined in RFC 6209.

This patch would offer performance increment and an opportunity for
hardware offload.

Benchmark results:
iperf-ssl are used.
CPU: intel i3-12100.

  TLS(openssl-3.0-dev)
[  3]  0.0- 1.0 sec   185 MBytes  1.55 Gbits/sec
[  3]  1.0- 2.0 sec   186 MBytes  1.56 Gbits/sec
[  3]  2.0- 3.0 sec   186 MBytes  1.56 Gbits/sec
[  3]  3.0- 4.0 sec   186 MBytes  1.56 Gbits/sec
[  3]  4.0- 5.0 sec   186 MBytes  1.56 Gbits/sec
[  3]  0.0- 5.0 sec   927 MBytes  1.56 Gbits/sec
  kTLS(aria-generic)
[  3]  0.0- 1.0 sec   198 MBytes  1.66 Gbits/sec
[  3]  1.0- 2.0 sec   194 MBytes  1.62 Gbits/sec
[  3]  2.0- 3.0 sec   194 MBytes  1.63 Gbits/sec
[  3]  3.0- 4.0 sec   194 MBytes  1.63 Gbits/sec
[  3]  4.0- 5.0 sec   194 MBytes  1.62 Gbits/sec
[  3]  0.0- 5.0 sec   974 MBytes  1.63 Gbits/sec
  kTLS(aria-avx wirh GFNI)
[  3]  0.0- 1.0 sec   632 MBytes  5.30 Gbits/sec
[  3]  1.0- 2.0 sec   657 MBytes  5.51 Gbits/sec
[  3]  2.0- 3.0 sec   657 MBytes  5.51 Gbits/sec
[  3]  3.0- 4.0 sec   656 MBytes  5.50 Gbits/sec
[  3]  4.0- 5.0 sec   656 MBytes  5.50 Gbits/sec
[  3]  0.0- 5.0 sec  3.18 GBytes  5.47 Gbits/sec

Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Reviewed-by: Vadim Fedorenko <vfedorenko@novek.ru>
Link: https://lore.kernel.org/r/20220925150033.24615-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 17:29:09 -07:00
Bhupesh Sharma
c64655f32f net: stmmac: Minor spell fix related to 'stmmac_clk_csr_set()'
Minor spell fix related to 'stmmac_clk_csr_set()' inside a
comment used in the 'stmmac_probe_config_dt()' function.

Cc: Biao Huang <biao.huang@mediatek.com>
Signed-off-by: Bhupesh Sharma <bhupesh.sharma@linaro.org>
Link: https://lore.kernel.org/r/20220924104514.1666947-1-bhupesh.sharma@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 17:26:56 -07:00
Pavel Begunkov
7bcd9683e5 selftests/net: enable io_uring sendzc testing
d8b6171bd58a5 ("selftests/io_uring: test zerocopy send") added io_uring
zerocopy tests but forgot to enable it in make runs. Add missing
io_uring_zerocopy_tx.sh into TEST_PROGS.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Link: https://lore.kernel.org/r/28e743602cdd54ffc49f68bbcbcbafc59ba22dc2.1664142210.git.asml.silence@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 07:59:57 -07:00
Jakub Kicinski
9257f69273 Merge branch 'devlink-fix-order-of-port-and-netdev-register-in-drivers'
Jiri Pirko says:

====================
devlink: fix order of port and netdev register in drivers

Some of the drivers use wrong order in registering devlink port and
netdev, registering netdev first. That was not intended as the devlink
port is some sort of parent for the netdev. Fix the ordering.

Note that the follow-up patchset is going to make this ordering
mandatory.
====================

Link: https://lore.kernel.org/r/20220926110938.2800005-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 07:54:48 -07:00
Jiri Pirko
1fd7c08286 ionic: change order of devlink port register and netdev register
Make sure that devlink port is registered first and register netdev
after. Unregister netdev before devlnk port unregister.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 07:54:45 -07:00
Jiri Pirko
a286ba7387 ice: reorder PF/representor devlink port register/unregister flows
Make sure that netdevice is registered/unregistered while devlink port
is registered.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 07:54:45 -07:00
Jiri Pirko
dfe6094914 funeth: unregister devlink port after netdevice unregister
Fix the order of destroy_netdev() flow and unregister the devlink port
after calling unregister_netdev().

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 07:54:44 -07:00
Christophe JAILLET
73dfe93ea1 headers: Remove some left-over license text
Remove some left-over from commit e2be04c7f995 ("License cleanup: add SPDX
license identifier to uapi header files with a license")

When the SPDX-License-Identifier tag has been added, the corresponding
license text has not been removed.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Alexander Duyck <alexanderduyck@fb.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/88410cddd31197ea26840d7dd71612bece8c6acf.1663871981.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 07:48:01 -07:00
Kees Cook
de4feb4e3d NFC: hci: Split memcpy() of struct hcp_message flexible array
To work around a misbehavior of the compiler's ability to see into
composite flexible array structs (as detailed in the coming memcpy()
hardening series[1]), split the memcpy() of the header and the payload
so no false positive run-time overflow warning will be generated. This
split already existed for the "firstfrag" case, so just generalize the
logic further.

[1] https://lore.kernel.org/linux-hardening/20220901065914.1417829-2-keescook@chromium.org/

Cc: Eric Dumazet <edumazet@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Reported-by: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20220924040835.3364912-1-keescook@chromium.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 07:45:18 -07:00
Daniel Golle
454b20e193 net: ethernet: mtk_eth_soc: fix usage of foe_entry_size
As sizeof(hwe->data) can now longer be used as the actual size depends
on foe_entry_size, in commit 9d8cb4c096ab02
("net: ethernet: mtk_eth_soc: add foe_entry_size to mtk_eth_soc") the
use of sizeof(hwe->data) is hence replaced.

However, replacing it with ppe->eth->soc->foe_entry_size is wrong as
foe_entry_size represents the size of the whole descriptor and not just
the 'data' field.

Fix this by subtracing the size of the only other field in the struct
'ib1', so we actually end up with the correct size to be copied to the
data field.

Reported-by: Chen Minqiang <ptpt52@gmail.com>
Fixes: 9d8cb4c096ab02 ("net: ethernet: mtk_eth_soc: add foe_entry_size to mtk_eth_soc")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/YzBqPIgQR2gLrPoK@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 07:43:54 -07:00
Daniel Golle
fb7da771bc net: ethernet: mtk_eth_soc: fix wrong use of new helper function
In function mtk_foe_entry_set_vlan() the call to field accessor macro
FIELD_GET(MTK_FOE_IB1_BIND_VLAN_LAYER, entry->ib1)
has been wrongly replaced by
mtk_prep_ib1_vlan_layer(eth, entry->ib1)

Use correct helper function mtk_get_ib1_vlan_layer instead.

Reported-by: Chen Minqiang <ptpt52@gmail.com>
Fixes: 03a3180e5c09e1 ("net: ethernet: mtk_eth_soc: introduce flow offloading support for mt7986")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/YzBp+Kk04CFDys4L@makrotopia.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-27 07:43:54 -07:00
Paolo Abeni
5591b021e0 Merge branch 'net-openvswitch-metering-and-conntrack-in-userns'
Michael Weiß says:

====================
net: openvswitch: metering and conntrack in userns

Currently using openvswitch in a non-initial user namespace, e.g., an
unprivileged container, is possible but without metering and conntrack
support. This is due to the restriction of the corresponding Netlink
interfaces to the global CAP_NET_ADMIN.

This simple patches switch from GENL_ADMIN_PERM to GENL_UNS_ADMIN_PERM
in several cases to allow this also for the unprivileged container
use case.

We tested this for unprivileged containers created by the container
manager of GyroidOS (gyroidos.github.io). However, for other container
managers such as LXC or systemd which provide unprivileged containers
this should be apply equally.
====================

Link: https://lore.kernel.org/r/20220923133820.993725-1-michael.weiss@aisec.fraunhofer.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-27 11:31:54 +02:00
Michael Weiß
59cd737766 net: openvswitch: allow conntrack in non-initial user namespace
Similar to the previous commit, the Netlink interface of the OVS
conntrack module was restricted to global CAP_NET_ADMIN by using
GENL_ADMIN_PERM. This is changed to GENL_UNS_ADMIN_PERM to support
unprivileged containers in non-initial user namespace.

Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-27 11:31:36 +02:00
Michael Weiß
8039371847 net: openvswitch: allow metering in non-initial user namespace
The Netlink interface for metering was restricted to global CAP_NET_ADMIN
by using GENL_ADMIN_PERM. To allow metring in a non-inital user namespace,
e.g., a container, this is changed to GENL_UNS_ADMIN_PERM.

Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-27 11:31:36 +02:00
Tony Lu
6627a2074d net/smc: Support SO_REUSEPORT
This enables SO_REUSEPORT [1] for clcsock when it is set on smc socket,
so that some applications which uses it can be transparently replaced
with SMC. Also, this helps improve load distribution.

Here is a simple test of NGINX + wrk with SMC. The CPU usage is collected
on NGINX (server) side as below.

Disable SO_REUSEPORT:

05:15:33 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
05:15:34 PM  all    7.02    0.00   11.86    0.00    2.04    8.93    0.00    0.00    0.00   70.15
05:15:34 PM    0    0.00    0.00    0.00    0.00   16.00   70.00    0.00    0.00    0.00   14.00
05:15:34 PM    1   11.58    0.00   22.11    0.00    0.00    0.00    0.00    0.00    0.00   66.32
05:15:34 PM    2    1.00    0.00    1.00    0.00    0.00    0.00    0.00    0.00    0.00   98.00
05:15:34 PM    3   16.84    0.00   30.53    0.00    0.00    0.00    0.00    0.00    0.00   52.63
05:15:34 PM    4   28.72    0.00   44.68    0.00    0.00    0.00    0.00    0.00    0.00   26.60
05:15:34 PM    5    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
05:15:34 PM    6    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00
05:15:34 PM    7    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00    0.00  100.00

Enable SO_REUSEPORT:

05:15:20 PM  CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
05:15:21 PM  all    8.56    0.00   14.40    0.00    2.20    9.86    0.00    0.00    0.00   64.98
05:15:21 PM    0    0.00    0.00    4.08    0.00   14.29   76.53    0.00    0.00    0.00    5.10
05:15:21 PM    1    9.09    0.00   16.16    0.00    1.01    0.00    0.00    0.00    0.00   73.74
05:15:21 PM    2    9.38    0.00   16.67    0.00    1.04    0.00    0.00    0.00    0.00   72.92
05:15:21 PM    3   10.42    0.00   17.71    0.00    1.04    0.00    0.00    0.00    0.00   70.83
05:15:21 PM    4    9.57    0.00   15.96    0.00    0.00    0.00    0.00    0.00    0.00   74.47
05:15:21 PM    5    9.18    0.00   15.31    0.00    0.00    1.02    0.00    0.00    0.00   74.49
05:15:21 PM    6    8.60    0.00   15.05    0.00    0.00    0.00    0.00    0.00    0.00   76.34
05:15:21 PM    7   12.37    0.00   14.43    0.00    0.00    0.00    0.00    0.00    0.00   73.20

Using SO_REUSEPORT helps the load distribution of NGINX be more
balanced.

[1] https://man7.org/linux/man-pages/man7/socket.7.html

Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
Acked-by: Wenjia Zhang <wenjia@linux.ibm.com>
Link: https://lore.kernel.org/r/20220922121906.72406-1-tonylu@linux.alibaba.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-27 10:26:17 +02:00
Jakub Kicinski
5dcf41a8e9 Merge branch 'net-sunhme-cleanups-and-logging-improvements'
Sean Anderson says:

====================
net: sunhme: Cleanups and logging improvements

This series is a continuation of [1] with a focus on logging improvements (in
the style of commit b11e5f6a3a5c ("net: sunhme: output link status with a single
print.")). I have included several of Rolf's patches in the series where
appropriate (with slight modifications). After this series is applied, many more
messages from this driver will come with driver/device information.
Additionally, most messages (especially debug messages) have been condensed onto
one line (as KERN_CONT messages get split).

[1] https://lore.kernel.org/netdev/4686583.GXAFRqVoOG@eto.sf-tec.de/
====================

Link: https://lore.kernel.org/r/20220924015339.1816744-1-seanga2@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26 17:45:41 -07:00
Sean Anderson
77ceb3731e sunhme: Add myself as a maintainer
I have the hardware so at the very least I can test things.

Signed-off-by: Sean Anderson <seanga2@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26 17:45:38 -07:00
Sean Anderson
26657c70b9 sunhme: Use vdbg for spam-y prints
The SXD, TXD, and RXD macros are used only once (or twice). Just use the
vdbg print, which seems to have been devised for these sorts of very
verbose messages.

Signed-off-by: Sean Anderson <seanga2@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26 17:45:38 -07:00
Sean Anderson
24cddbc3ef sunhme: Combine continued messages
This driver seems to have been written under the assumption that messages
can be continued arbitrarily. I'm not when this changed (if ever), but such
ad-hoc continuations are liable to be rudely interrupted. Convert all such
instances to single prints. This loses a bit of timing information (such as
when a line was constructed piecemeal as the function executed), but it's
easy to add a few prints if necessary. This also adds newlines to the ends
of any prints without them.

Since (almost every) debug print included the name of the function, include
it automatically.

Signed-off-by: Sean Anderson <seanga2@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26 17:45:38 -07:00
Sean Anderson
8acf878f29 sunhme: Use (net)dev_foo wherever possible
Wherever possible, use the associated netdev (or device) when printing
errors or other messages. This makes it immediately clear what device
caused the error, and provides more information than just the device name.

Signed-off-by: Sean Anderson <seanga2@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26 17:45:38 -07:00
Sean Anderson
0bc1f45410 sunhme: Convert printk(KERN_FOO ...) to pr_foo(...)
This is a mostly-mechanical translation of the existing printks into
pr_foos. In several places, I have pasted messages which were broken over
several lines to allow for easier grepping.

Signed-off-by: Sean Anderson <seanga2@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26 17:45:37 -07:00
Sean Anderson
30931367ba sunhme: Clean up debug infrastructure
Remove all the single-use debug conditionals, and just collect the debug
defines at the top of the file. HMD seems like it is used for general debug
info, so just redefine it as pr_debug. Additionally, instead of using the
default loglevel, use the debug loglevel for debugging.

Signed-off-by: Sean Anderson <seanga2@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26 17:45:37 -07:00
Sean Anderson
03290907a5 sunhme: Convert FOO((...)) to FOO(...)
With the power of variadic macros, double parentheses are unnecessary.

Signed-off-by: Sean Anderson <seanga2@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26 17:45:37 -07:00
Rolf Eike Beer
914d9b2711 sunhme: switch to devres
This not only removes a lot of code, it also fixes the memleak of the DMA
memory when register_netdev() fails.

Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
[ rebased onto net-next/master; fixed error reporting ]
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26 17:45:37 -07:00
Sean Anderson
5b3dc6dda6 sunhme: Regularize probe errors
This fixes several error paths to ensure they return an appropriate error
(instead of ENODEV).

Signed-off-by: Sean Anderson <seanga2@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26 17:45:37 -07:00
Sean Anderson
d6f1e89bdb sunhme: Return an ERR_PTR from quattro_pci_find
In order to differentiate between a missing bridge and an OOM condition,
return ERR_PTRs from quattro_pci_find. This also does some general linting
in the area.

Signed-off-by: Sean Anderson <seanga2@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26 17:45:36 -07:00
Rolf Eike Beer
acb3f35f92 sunhme: forward the error code from pci_enable_device()
This already returns a proper error value, so pass it to the caller.

Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26 17:45:36 -07:00
Sean Anderson
6478c6e994 sunhme: Remove version
Module versions are not very useful:

> The basic problem is, the version string does not identify the sources
> with enough accuracy. It says nothing about back ported fixes in
> stable kernels. It tells you nothing about vendor patches to the
> network core, etc.

https://lore.kernel.org/all/Yf6mtvA1zO7cdzr7@lunn.ch/

While we're at it, inline the author and use the driver name a bit more.

Signed-off-by: Sean Anderson <seanga2@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26 17:45:36 -07:00
Rolf Eike Beer
8247ab50c2 sunhme: remove unused tx_dump_ring()
I can't find a reference to it in the entire git history.

Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26 17:45:36 -07:00
Jakub Kicinski
ebb410a03e Merge branch 'net-dsa-remove-unnecessary-i2c_set_clientdata'
Yang Yingliang says:

====================
net: dsa: remove unnecessary i2c_set_clientdata()

This patchset https://lore.kernel.org/all/20220921140524.3831101-8-yangyingliang@huawei.com/T/
removed all set_drvdata(NULL) in driver remove function.

i2c_set_clientdata() is another wrapper of set drvdata function, to follow
the same convention, remove i2c_set_clientdata() called in driver remove
function in drivers/net/dsa/.
====================

Link: https://lore.kernel.org/r/20220923143742.87093-1-yangyingliang@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26 17:44:31 -07:00
Yang Yingliang
6387bf7c39 net: dsa: xrs700x: remove unnecessary i2c_set_clientdata()
Remove unnecessary i2c_set_clientdata() in ->remove(), the driver_data
will be set to NULL in device_unbind_cleanup() after calling ->remove().

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26 17:44:28 -07:00
Yang Yingliang
008971adb9 net: dsa: microchip: ksz9477: remove unnecessary i2c_set_clientdata()
Remove unnecessary i2c_set_clientdata() in ->remove(), the driver_data
will be set to NULL in device_unbind_cleanup() after calling ->remove().

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26 17:44:28 -07:00
Yang Yingliang
db5d451c46 net: dsa: lan9303: remove unnecessary i2c_set_clientdata()
Remove unnecessary i2c_set_clientdata() in ->remove(), the driver_data
will be set to NULL in device_unbind_cleanup() after calling ->remove().

Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26 17:44:28 -07:00