1155686 Commits

Author SHA1 Message Date
Alex Elder
55c6eae70f net: ipa: add IPA v5.0 packet status support
Update ipa_status_extract() to support IPA v5.0 and beyond.  Because
the format of the IPA packet status depends on the version, pass an
IPA pointer to the function.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 11:16:29 +00:00
Alex Elder
ebd2a82ece net: ipa: introduce generalized status decoder
Stop assuming the IPA packet status has a fixed format (defined by
a C structure).  Instead, use a function to extract each field from
a block of data interpreted as an IPA packet status.  Define an
enumerated type that identifies the fields that can be extracted.
The current function extracts fields based on the existing
ipa_status structure format (which is no longer used).

Define IPA_STATUS_RULE_MISS, to replace the calls to field_max() to
represent that condition; those depended on knowing the width of
a filter or router rule in the IPA packet status structure.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 11:16:29 +00:00
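The field-extraction approach described in the commit above can be sketched as follows. This is a minimal illustration in Python rather than the driver's C, and every field name, offset, width, and the RULE_MISS value here are assumptions made for the example; the real IPA packet status layout is defined by the hardware and is not given in the commit message.

```python
from enum import Enum


class StatusField(Enum):
    """Fields that can be extracted (hypothetical subset, names assumed)."""
    OPCODE = "opcode"
    PKT_LEN = "pkt_len"
    RULE_ID = "rule_id"


# Hypothetical layout: field -> (byte offset, byte width, bit mask).
# The real offsets and widths differ and depend on the IPA version.
LAYOUT = {
    StatusField.OPCODE: (0, 1, 0xFF),
    StatusField.PKT_LEN: (2, 2, 0xFFFF),
    StatusField.RULE_ID: (4, 2, 0x03FF),
}

# Assumed "rule miss" marker: all ones in the hypothetical 10-bit rule field.
RULE_MISS = 0x03FF


def status_extract(data: bytes, field: StatusField) -> int:
    """Extract one field from a block of bytes treated as a packet status."""
    offset, width, mask = LAYOUT[field]
    word = int.from_bytes(data[offset:offset + width], "little")
    return word & mask


# Example status block: opcode 2, packet length 0x1234, rule field all ones.
status = bytes([0x02, 0x00, 0x34, 0x12, 0xFF, 0x03, 0x00, 0x00])
rule = status_extract(status, StatusField.RULE_ID)
```

Because callers name a field instead of dereferencing a structure member, a version-dependent layout table could be swapped in without touching them, which appears to be the point of passing an IPA pointer in the follow-up commit.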
Alex Elder
02c5077439 net: ipa: IPA status preparatory cleanups
The next patch reworks how the IPA packet status structure is
interpreted.  This patch does some preparatory work, to make it
easier to see the effect of that change:
  - Change a few functions that access fields in an IPA packet status
    structure to store field values in local variables with names
    related to the field.
  - Pass a void pointer rather than an (equivalent) status pointer
    to two functions called by ipa_endpoint_status_parse().
  - Use "rule" rather than "val" as the name of a variable that
    holds a routing rule ID.
  - Consistently use "IPA packet status" rather than "status
    element" when referring to this data structure.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 11:16:29 +00:00
Alex Elder
ec4c24f6a5 net: ipa: define remaining IPA status field values
Define the remaining values for opcode and exception fields in the
IPA packet status structure.  Most of these values are powers-of-2,
suggesting they are meant to be used as bitmasks, but that is not
the case.  Add comments to be clear about this, and express the
values in decimal format.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 11:16:29 +00:00
Alex Elder
cbea476117 net: ipa: rename the NAT enumerated type
Rename the ipa_nat_en enumerated type to be ipa_nat_type, and rename
its symbols accordingly.  Add a comment indicating those values are
also used in the IPA status nat_type field.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 11:16:29 +00:00
Alex Elder
8e71708bb2 net: ipa: define all IPA status mask bits
There is a 16 bit status mask defined in the IPA packet status
structure, of which only one (TAG_VALID) is currently used.

Define all other IPA status mask values in an enumerated type whose
numeric values are bit mask values (in CPU byte order) in the status
mask.  Use the TAG_VALID value from that type rather than defining a
separate field mask.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 11:16:29 +00:00
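A rough Python sketch of a status mask defined as an enumerated type whose values are bit masks, as the commit above describes. Only TAG_VALID is named in the commit message; the other member name and all bit positions below are assumptions for illustration, not the driver's definitions.

```python
from enum import IntFlag


class StatusMask(IntFlag):
    """Bit masks within a 16-bit status mask (bit positions assumed)."""
    FRAG_PROCESS = 1 << 0   # hypothetical member
    TAG_VALID = 1 << 4      # named in the commit; position assumed


def tag_valid(raw: bytes) -> bool:
    """Test TAG_VALID in a 2-byte little-endian status mask.

    The raw bytes are converted to a native integer first, so the
    enumerated values can be applied directly as bit masks.
    """
    mask = int.from_bytes(raw[:2], "little")
    return bool(mask & int(StatusMask.TAG_VALID))
```

Keeping the enumerated values in CPU byte order means the conversion happens once, when the mask is read, rather than at every test.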
Alex Elder
b8dc7d0eea net: ipa: stop using sizeof(status)
The IPA packet status structure changes in IPA v5.0 in ways that are
difficult to represent cleanly.  As a small step toward redefining
it as a parsed block of data, use a constant to define its size,
rather than the size of the IPA status structure type.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 11:16:29 +00:00
Alex Elder
63a560b528 net: ipa: refactor status buffer parsing
The packet length encoded in an IPA packet status buffer is computed
more than once in ipa_endpoint_status_parse().  It is also checked
again in ipa_endpoint_status_skip(), which that function calls.

Compute the length once, and use that computed value later rather
than recomputing it.  Check for it being zero in the parse function
rather than in ipa_endpoint_status_skip().

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27 11:16:28 +00:00
Vladimir Oltean
c8005511f3 net: dsa: ocelot: build felix.c into a dedicated kernel module
The build system currently complains:

scripts/Makefile.build:252: drivers/net/dsa/ocelot/Makefile:
felix.o is added to multiple modules: mscc_felix mscc_seville

Since felix.c holds the DSA glue layer, create a mscc_felix_dsa_lib.ko.
This is similar to how mscc_ocelot_switch_lib.ko holds a library for
configuring the hardware.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Colin Foster <colin.foster@in-advantage.com>
Link: https://lore.kernel.org/r/20230125145716.271355-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26 23:37:46 -08:00
Jakub Kicinski
82fe335b78 Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
virtchnl: update and refactor

Jesse Brandeburg says:

The virtchnl.h file is used by i40e/ice physical function (PF) drivers
and irdma when talking to the iavf driver. This series cleans up the
header file by removing unused elements, adding/cleaning some comments,
fixing the data structures so they are explicitly defined, including
padding, and finally does a long overdue rename of the IWARP members in
the structures to RDMA, since the ice driver and its associated Intel
Ethernet E800 series adapters support both RDMA and IWARP.

The whole series should result in no functional change, but hopefully
clearer code.

* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  virtchnl: i40e/iavf: rename iwarp to rdma
  virtchnl: do structure hardening
  virtchnl: update header and increase header clarity
  virtchnl: remove unused structure declaration
====================

Link: https://lore.kernel.org/r/20230125212441.4030014-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26 23:29:12 -08:00
Stanislav Fomichev
a5f3a3f7c1 selftests/bpf: Properly enable hwtstamp in xdp_hw_metadata
The existing timestamping_enable() is a no-op because it applies
to the socket-related path that we are not verifying here
anymore. (The code is left in place in the hope that the xdp->skb
path can be verified here as well.)

  poll: 1 (0)
  xsk_ring_cons__peek: 1
  0xf64788: rx_desc[0]->addr=100000000008000 addr=8100 comp_addr=8000
  rx_hash: 3697961069
  rx_timestamp:  1674657672142214773 (sec:1674657672.1422)
  XDP RX-time:   1674657709561774876 (sec:1674657709.5618) delta sec:37.4196
  AF_XDP time:   1674657709561871034 (sec:1674657709.5619) delta sec:0.0001 (96.158 usec)
  0xf64788: complete idx=8 addr=8000

Also, maybe something to archive here, see [0] for Jesper's note
about NIC vs host clock delta.

0: https://lore.kernel.org/bpf/f3a116dc-1b14-3432-ad20-a36179ef0608@redhat.com/

v2:
- Restore original value (Martin)

Fixes: 297a3f124155 ("selftests/bpf: Simple program to dump XDP RX metadata")
Reported-by: Jesper Dangaard Brouer <jbrouer@redhat.com>
Tested-by: Jesper Dangaard Brouer <jbrouer@redhat.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20230126225030.510629-1-sdf@google.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-01-26 22:10:31 -08:00
Jakub Kicinski
0313afe8b8 Merge branch 'tools-ynl-prevent-reorder-and-fix-flags'
Jakub Kicinski says:

====================
tools: ynl: prevent reorder and fix flags

Some codegen improvements for YAML specs.

First, Lorenzo discovered when switching the XDP feature family
to use flags instead of pure enum that the kdoc got garbled.
The support for enum and flags is therefore unified.

Second, when regenerating all families we discussed so far I noticed
that some netlink policies jumped around. We need to ensure we don't
render code based on their ordering in a hash.
====================

Link: https://lore.kernel.org/r/20230126000235.1085551-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26 16:32:43 -08:00
Jakub Kicinski
3a43ded081 tools: ynl: store ops in ordered dict to avoid random ordering
When rendering code we should walk the ops in the order in which
they are declared in the spec. This is both more intuitive and
prevents code from jumping around when hashing in the dict changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26 16:32:41 -08:00
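The ordering concern in the commit above can be illustrated with a small Python sketch. The spec structure and function names are hypothetical and are not taken from the ynl code generator; the sketch only shows why walking an ordered mapping keeps rendered output stable.

```python
from collections import OrderedDict


def load_ops(spec_ops):
    """Index operations by name while preserving declaration order."""
    ops = OrderedDict()
    for op in spec_ops:
        ops[op["name"]] = op
    return ops


def render(ops):
    """Emit one line per op, in the order the spec declares them."""
    return [f"/* op: {name} */" for name in ops]


# Hypothetical spec fragment: ops listed in declaration order.
spec = [{"name": "dev-get"}, {"name": "dev-add-ntf"}, {"name": "dev-del-ntf"}]
lines = render(load_ops(spec))
```

Iterating an unordered collection instead (a set of names, for instance) could yield a different order from run to run, which would make regenerated code "jump around" even when the spec is unchanged.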
Jakub Kicinski
b49c34e217 tools: ynl: rename ops_list -> msg_list
ops_list contains all the operations, but the main iteration use
case is to walk only ops which define attrs. Rename ops_list to
msg_list, because now it looks like the contents are the same,
just the format is different. While at it convert from tuple
to just keys, none of the users care about the name of the op.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26 16:32:41 -08:00
Jakub Kicinski
66fa34b9c2 tools: ynl: support kdocs for flags in code generation
Lorenzo reports that after switching from enum to flags netdev
family lost ability to render kdoc (and the enum contents got
generally garbled).

Combine the flags and enum handling in uAPI handling.

Reported-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26 16:32:41 -08:00
Jakub Kicinski
868c82f34c Merge branch 'convert-drivers-to-return-xfrm-configuration-errors-through-extack'
Leon Romanovsky says:

====================
Convert drivers to return XFRM configuration errors through extack

This series continues the effort started by Sabrina to return XFRM configuration
errors through extack. It allows the user space software stack to easily present
driver failure reasons to users.

As a note, Intel drivers have a path where extack is equal to NULL, and error
prints won't be available in the current patchset. If needed, this can be
changed by adding an Intel-specific macro that prints to dmesg in case of
extack == NULL.
====================

Link: https://lore.kernel.org/r/cover.1674560845.git.leon@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26 16:28:51 -08:00
Leon Romanovsky
8c284ea429 cxgb4: fill IPsec state validation failure reason
Rely on extack to return failure reason.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26 16:28:48 -08:00
Leon Romanovsky
3fe5798627 bonding: fill IPsec state validation failure reason
Rely on extack to return failure reason.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26 16:28:48 -08:00
Leon Romanovsky
505c500cfc ixgbe: fill IPsec state validation failure reason
Rely on extack to return failure reason.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26 16:28:48 -08:00
Leon Romanovsky
c068ec5c96 ixgbevf: fill IPsec state validation failure reason
Rely on extack to return failure reason.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26 16:28:48 -08:00
Leon Romanovsky
05ddf5f8cb nfp: fill IPsec state validation failure reason
Rely on extack to return failure reason.

Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26 16:28:48 -08:00
Leon Romanovsky
6c48697955 netdevsim: Fill IPsec state validation failure reason
Rely on extack to return failure reason.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26 16:28:48 -08:00
Leon Romanovsky
902812b816 net/mlx5e: Fill IPsec state validation failure reason
Rely on extack to return failure reason.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26 16:28:48 -08:00
Leon Romanovsky
7681a4f58f xfrm: extend add state callback to set failure reason
Almost all validation logic is in the drivers, but they are
missing a reliable way to convey the failure reason to userspace
applications.

Let's use extack to return this information to users.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26 16:28:48 -08:00
Leon Romanovsky
1bb70c5ab6 net/mlx5e: Fill IPsec policy validation failure reason
Rely on extack to return failure reason.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26 16:28:48 -08:00
Leon Romanovsky
3089386db0 xfrm: extend add policy callback to set failure reason
Almost all validation logic is in the drivers, but they are
missing a reliable way to convey the failure reason to userspace
applications.

Let's use extack to return this information to users.

Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-26 16:28:48 -08:00
Linus Torvalds
28b4387f0e Networking fixes for 6.2-rc6, including fixes from netfilter.
Merge tag 'net-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from netfilter.

  Current release - regressions:

   - sched: sch_taprio: do not schedule in taprio_reset()

  Previous releases - regressions:

   - core: fix UaF in netns ops registration error path

   - ipv4: prevent potential spectre v1 gadgets

   - ipv6: fix reachability confirmation with proxy_ndp

   - netfilter: fix for the set rbtree

   - eth: fec: use page_pool_put_full_page when freeing rx buffers

   - eth: iavf: fix temporary deadlock and failure to set MAC address

  Previous releases - always broken:

   - netlink: prevent potential spectre v1 gadgets

   - netfilter: fixes for SCTP connection tracking

   - mctp: struct sock lifetime fixes

   - eth: ravb: fix possible hang if RIS2_QFF1 happen

   - eth: tg3: resolve deadlock in tg3_reset_task() during EEH

  Misc:

   - Mat stepped out as MPTCP co-maintainer"

* tag 'net-6.2-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (40 commits)
  net: mdio-mux-meson-g12a: force internal PHY off on mux switch
  docs: networking: Fix bridge documentation URL
  tsnep: Fix TX queue stop/wake for multiple queues
  net/tg3: resolve deadlock in tg3_reset_task() during EEH
  net: mctp: mark socks as dead on unhash, prevent re-add
  net: mctp: hold key reference when looking up a general key
  net: mctp: move expiry timer delete to unhash
  net: mctp: add an explicit reference from a mctp_sk_key to sock
  net: ravb: Fix possible hang if RIS2_QFF1 happen
  net: ravb: Fix lack of register setting after system resumed for Gen3
  net/x25: Fix to not accept on connected socket
  ice: move devlink port creation/deletion
  sctp: fail if no bound addresses can be used for a given scope
  net/sched: sch_taprio: do not schedule in taprio_reset()
  Revert "Merge branch 'ethtool-mac-merge'"
  netrom: Fix use-after-free of a listening socket.
  netfilter: conntrack: unify established states for SCTP paths
  Revert "netfilter: conntrack: add sctp DATA_SENT state"
  netfilter: conntrack: fix bug in for_each_sctp_chunk
  netfilter: conntrack: fix vtag checks for ABORT/SHUTDOWN_COMPLETE
  ...
2023-01-26 10:20:12 -08:00
Linus Torvalds
262b42e02d treewide: fix up files incorrectly marked executable
I'm not exactly clear on what strange workflow causes people to do it,
but clearly occasionally some files end up being committed as executable
even though they clearly aren't.

This is a reprise of commit 90fda63fa115 ("treewide: fix up files
incorrectly marked executable"), just with a different set of files (but
with the same trivial shell scripting).

So apparently we need to re-do this every five years or so, and Joe
needs to just keep reminding me to do so ;)

Reported-by: Joe Perches <joe@perches.com>
Fixes: 523375c943e5 ("drm/vmwgfx: Port vmwgfx to arm64")
Fixes: 5c439937775d ("ASoC: codecs: add support for ES8326")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-01-26 10:05:39 -08:00
Vladimir Oltean
9179f5fe41 net: ethtool: provide shims for stats aggregation helpers when CONFIG_ETHTOOL_NETLINK=n
ethtool_aggregate_*_stats() are implemented in net/ethtool/stats.c, a
file which is compiled out when CONFIG_ETHTOOL_NETLINK=n. In order to
avoid adding Kbuild dependencies from drivers (which call these helpers)
on CONFIG_ETHTOOL_NETLINK, let's add some shim definitions which simply
make the helpers dead code.

This means the function prototypes should have been located in
include/linux/ethtool_netlink.h rather than include/linux/ethtool.h.

Fixes: 449c5459641a ("net: ethtool: add helpers for aggregate statistics")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230125110214.4127759-1-vladimir.oltean@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 15:28:25 +01:00
Paolo Abeni
97f7d3dd76 Merge branch 'mptcp-add-mixed-v4-v6-support-for-the-in-kernel-pm'
Matthieu Baerts says:

====================
mptcp: add mixed v4/v6 support for the in-kernel PM

Before these patches, the in-kernel Path-Manager would not allow, for
the same MPTCP connection, having a mix of subflows in v4 and v6.

MPTCP's RFC 8684 doesn't forbid that and it is even recommended to do so
as the paths in v4 and v6 are likely different. Some networks are also
v4 or v6 only; we cannot assume they all have both v4 and v6 support.

Patch 1 then removes this artificial constraint in the in-kernel PM
currently enforcing there are no mixed subflows in place, either in
address announcement or in subflow creation areas.

Patch 2 makes sure the sk_ipv6only attribute is also propagated to
subflows, just in case a new PM wouldn't respect it.

Some selftests have also been added for the in-kernel PM (patch 3).

Patches 4 to 8 are just some cleanups and small improvements in the
printed messages in the userspace PM. They are not linked to the rest but
were identified when working on a related patch modifying this selftest,
already in -net:

  commit 4656d72c1efa ("selftests: mptcp: userspace: validate v4-v6 subflows mix")
---
====================

Link: https://lore.kernel.org/r/20230123-upstream-net-next-pm-v4-v6-v1-0-43fac502bfbf@tessares.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 13:33:33 +01:00
Matthieu Baerts
8dbdf24f4e selftests: mptcp: userspace: avoid read errors
During the cleanup phase, the server pids were killed with a SIGTERM
directly, not using a SIGUSR1 first to quit safely. As a result, this
test was often ending with two error messages:

  read: Connection reset by peer

While at it, use a for-loop to terminate all the PIDs the same way.

Also the different files are now removed after having killed the PIDs
using them. It makes more sense to do that in this order.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 13:33:30 +01:00
Matthieu Baerts
10d4273411 selftests: mptcp: userspace: print error details if any
Before, only '[FAIL]' was printed in case of error during the validation
phase.

Now, in case of failure, the variable name, its value and expected one
are displayed to help understand what was wrong.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 13:33:30 +01:00
Matthieu Baerts
1c0b0ee264 selftests: mptcp: userspace: refactor asserts
Instead of having a long list of conditions to check, it is possible to
give a list of variable names to compare with their 'e_XXX' version.

This will ease the introduction of the following commit which will print
which condition has failed (if any).

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 13:33:30 +01:00
Matthieu Baerts
f790ae03db selftests: mptcp: userspace: print titles
This script runs a few tests after having set up the environment.

Printing titles helps understand what is being tested.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 13:33:30 +01:00
Matthieu Baerts
40c71f763f mptcp: userspace pm: use a single point of exit
Like in all other functions in this file, a single point of exit is used
when extra operations are needed: unlock, decrement refcount, etc.

There is no functional change for the moment but it is better to do the
same here to make sure all cleanups are done in case of intermediate
errors.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 13:33:30 +01:00
Paolo Abeni
ad3493746e selftests: mptcp: add test-cases for mixed v4/v6 subflows
Note that we can't guess the listener family anymore based on the client
target address: always use IPv6.

The fullmesh flag with endpoints from different families is also
validated here.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 13:33:30 +01:00
Matthieu Baerts
7e9740e0e8 mptcp: propagate sk_ipv6only to subflows
Usually, attributes are propagated to subflows as well.

Here, if subflows are created by means other than the MPTCP path-manager,
it is important to make sure they are in v6 if userspace asks for
it.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 13:33:30 +01:00
Paolo Abeni
b9d69db87f mptcp: let the in-kernel PM use mixed IPv4 and IPv6 addresses
Currently the in-kernel PM arbitrarily enforces that a created subflow's
family must match the main MPTCP socket, while the RFC allows mixing
IPv4 and IPv6 subflows.

This patch changes the in-kernel PM logic to create subflows matching
the currently selected source (or destination) address. IPv4 sockets
can pick only IPv4 addresses (and v4 mapped in v6), while IPv6 sockets
not restricted to V6ONLY can pick either IPv4 or IPv6 addresses as
long as the source and destination match.

A previously introduced helper is used to ease family matching checks,
taking care of IPv4 vs IPv4-mapped-IPv6 vs IPv6 only addresses.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/269
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 13:33:30 +01:00
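The family matching rules described in the commit above can be approximated in Python with the standard ipaddress module. This is one reading of the commit message, not the kernel helper; the function names and the exact handling of V6ONLY are assumptions made for the sketch.

```python
import ipaddress


def effective_family(addr: str) -> int:
    """Return 4 for IPv4 and IPv4-mapped-IPv6 addresses, else 6."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 6 and ip.ipv4_mapped is not None:
        return 4
    return ip.version


def can_pair(src: str, dst: str, sk_family: int = 6, v6only: bool = False) -> bool:
    """Decide whether a subflow may use this source/destination pair.

    Assumed rules: source and destination must be the same effective
    family; an IPv4 socket may pick only IPv4 (or v4-mapped) addresses;
    an IPv6 socket restricted to V6ONLY may pick only IPv6 addresses.
    """
    family = effective_family(src)
    if family != effective_family(dst):
        return False
    if sk_family == 4:
        return family == 4
    if v6only:
        return family == 6
    return True
```

For example, `can_pair("192.0.2.1", "::ffff:192.0.2.2")` treats the v4-mapped destination as IPv4 and accepts the pair, while a pair mixing 192.0.2.1 with 2001:db8::1 is rejected.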
Jamie Bainbridge
d0941130c9 icmp: Add counters for rate limits
There are multiple ICMP rate limiting mechanisms:

* Global limits: net.ipv4.icmp_msgs_burst/icmp_msgs_per_sec
* v4 per-host limits: net.ipv4.icmp_ratelimit/ratemask
* v6 per-host limits: net.ipv6.icmp_ratelimit/ratemask

However, when ICMP output is limited, there is no way to tell
which limit has been hit or even if the limits are responsible
for the lack of ICMP output.

Add counters for each of the cases above. As we are within
local_bh_disable(), use the __INC stats variant.

Example output:

 # nstat -sz "*RateLimit*"
 IcmpOutRateLimitGlobal          134                0.0
 IcmpOutRateLimitHost            770                0.0
 Icmp6OutRateLimitHost           84                 0.0

Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Suggested-by: Abhishek Rawal <rawal.abhishek92@gmail.com>
Link: https://lore.kernel.org/r/273b32241e6b7fdc5c609e6f5ebc68caf3994342.1674605770.git.jamie.bainbridge@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 10:52:18 +01:00
Paolo Abeni
9f92752788 Merge branch 'adding-sparx5-is0-vcap-support'
Steen Hegelund says:

====================
Adding Sparx5 IS0 VCAP support

This provides the Ingress Stage 0 (IS0) VCAP (Versatile Content-Aware
Processor) support for the Sparx5 platform.

The IS0 VCAP (also known in the datasheet as CLM) is a classifier VCAP that
mainly extracts frame information to metadata that follows the frame in the
Sparx5 processing flow all the way to the egress port.

The IS0 VCAP has 6 lookups and they are accessible with a TC chain id:

- chain 1000000: IS0 Lookup 0
- chain 1100000: IS0 Lookup 1
- chain 1200000: IS0 Lookup 2
- chain 1300000: IS0 Lookup 3
- chain 1400000: IS0 Lookup 4
- chain 1500000: IS0 Lookup 5

Each of these lookups has its own port keyset configuration that decides
which keys will be used for matching on which traffic type.

The IS0 VCAP has these traffic classifications:

- IPv4 frames
- IPv6 frames
- Unicast MPLS frames (ethertype = 0x8847)
- Multicast MPLS frames (ethertype = 0x8848)
- Other frame types than MPLS, IPv4 and IPv6

The IS0 VCAP has an action that allows setting the value of a PAG (Policy
Association Group) key field in the frame metadata, and this can be used
for matching in an IS2 VCAP rule.

This allows rules in the IS0 VCAP to be linked to rules in the IS2 VCAP.

The linking is exposed by using the TC "goto chain" action with an offset
from the IS2 chain ids.

As an example, a "goto chain 8000001" will use a PAG value of 1 to chain to
a rule in IS2 Lookup 0.
====================

Link: https://lore.kernel.org/r/20230124104511.293938-1-steen.hegelund@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 10:08:05 +01:00
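The chain id arithmetic in the cover letter above can be sketched in Python. The IS0 base and step come from the six chain ids listed there; the IS2 Lookup 0 base of 8000000 is inferred from the "goto chain 8000001" example and is an assumption, as are the function names. Other IS2 lookups are not covered because the text does not give their chain ids.

```python
IS0_BASE = 1_000_000      # chain id of IS0 Lookup 0 (from the list above)
LOOKUP_STEP = 100_000     # spacing between consecutive lookups
IS0_LOOKUPS = 6           # number of chain ids listed for IS0
IS2_L0_BASE = 8_000_000   # assumed, inferred from "goto chain 8000001"


def is0_chain(lookup: int) -> int:
    """Return the TC chain id for an IS0 lookup number."""
    if not 0 <= lookup < IS0_LOOKUPS:
        raise ValueError(f"IS0 lookup out of range: {lookup}")
    return IS0_BASE + lookup * LOOKUP_STEP


def decode_chain(chain: int):
    """Split a chain id into (vcap, lookup, offset).

    For an IS2 Lookup 0 target, the offset is the PAG value used to
    link an IS0 rule to an IS2 rule.
    """
    if IS0_BASE <= chain < IS0_BASE + IS0_LOOKUPS * LOOKUP_STEP:
        lookup, offset = divmod(chain - IS0_BASE, LOOKUP_STEP)
        return ("IS0", lookup, offset)
    if IS2_L0_BASE <= chain < IS2_L0_BASE + LOOKUP_STEP:
        return ("IS2", 0, chain - IS2_L0_BASE)
    raise ValueError(f"chain id not covered by this sketch: {chain}")
```

With these assumptions, `decode_chain(8000001)` yields `("IS2", 0, 1)`, matching the PAG value of 1 in the example.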
Steen Hegelund
52df82cc91 net: microchip: sparx5: Add support for IS0 VCAP CVLAN TC keys
This adds support for parsing and matching on the CVLAN tags in the Sparx5
IS0 VCAP.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 10:07:44 +01:00
Steen Hegelund
63e3564507 net: microchip: sparx5: Add support for IS0 VCAP ethernet protocol types
This allows the IS0 VCAP to have its own list of supported ethernet
protocol types matching what is supported by the VCAP's port lookup
classification.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 10:07:44 +01:00
Steen Hegelund
81e164c4ae net: microchip: sparx5: Add automatic selection of VCAP rule actionset
With more than one possible actionset in a VCAP instance, the VCAP API will
now use the actions in a VCAP rule to select the actionset that fits these
actions the best possible way.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 10:07:44 +01:00
Steen Hegelund
88bd9ea70b net: microchip: sparx5: Add TC filter chaining support for IS0 and IS2 VCAPs
This allows rules to be chained between VCAP instances, e.g. from IS0
Lookup 0 to IS0 Lookup 1, or from one of the IS0 Lookups to one of the IS2
Lookups.

Chaining from an IS2 Lookup to another IS2 Lookup is not supported in the
hardware.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 10:07:44 +01:00
Steen Hegelund
542e6e2c20 net: microchip: sparx5: Add TC support for IS0 VCAP
This enables the TC command to use the Sparx5 IS0 VCAP.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 10:07:44 +01:00
Steen Hegelund
7306fcd17c net: microchip: sparx5: Add actionset type id information to rule
This adds the actionset type id to the rule information.  This is needed as
we now have more than one actionset in a VCAP instance (IS0).

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 10:07:44 +01:00
Steen Hegelund
545609fd4e net: microchip: sparx5: Add IS0 VCAP keyset configuration for Sparx5
This adds the IS0 VCAP port keyset configuration for Sparx5 and also
updates the debugFS support to show the keyset configuration.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 10:07:44 +01:00
Steen Hegelund
f274a659fb net: microchip: sparx5: Add IS0 VCAP model and updated KUNIT VCAP model
This provides the IS0 (Ingress Stage 0) or CLM VCAP model for Sparx5.
This VCAP provides classification actions for Sparx5.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-01-26 10:07:44 +01:00
Jerome Brunet
7083df59ab net: mdio-mux-meson-g12a: force internal PHY off on mux switch
Force the internal PHY off then on when switching to the internal path.
This fixes problems where the PHY ID is not properly set.

Fixes: 7090425104db ("net: phy: add amlogic g12a mdio mux support")
Suggested-by: Qi Duan <qi.duan@amlogic.com>
Co-developed-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Link: https://lore.kernel.org/r/20230124101157.232234-1-jbrunet@baylibre.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-25 22:46:51 -08:00
Jakub Kicinski
3f17e16f38 Merge branch 'add-ip_local_port_range-socket-option'
Jakub Sitnicki says:

====================
Add IP_LOCAL_PORT_RANGE socket option

This patch set is a follow up to the "How to share IPv4 addresses by
partitioning the port space" talk given at LPC 2022 [1].

Please see patch #1 for the motivation & the use case description.
Patch #2 adds tests exercising the new option in various scenarios.

Documentation
-------------

Proposed update to the ip(7) man-page:

       IP_LOCAL_PORT_RANGE (since Linux X.Y)
              Set or get the per-socket default local  port  range.  This
              option  can  be  used  to  clamp down the global local port
              range, defined by the ip_local_port_range  /proc  interface
              described below, for a given socket.

              The  option  takes  a  uint32_t value with the high 16 bits
              set to the upper range bound, and the low 16  bits  set  to
              the  lower  range  bound.  Range  bounds are inclusive. The
              16-bit values should be in host byte order.

              The lower bound has to be less than the  upper  bound  when
              both  bounds  are  not  zero. Otherwise, setting the option
              fails with EINVAL.

              If either bound is outside of the global local port  range,
              or is zero, then that bound has no effect.

              To  reset  the setting, pass zero as both the upper and the
              lower bound.

Interaction with SELinux bind() hook
------------------------------------

SELinux bind() hook - selinux_socket_bind() - performs a permission check
if the requested local port number lies outside of the netns ephemeral port
range.

The proposed socket option cannot be used to change the ephemeral port range
to extend beyond the per-netns port range, as set by
net.ipv4.ip_local_port_range.

Hence, there is no interaction with SELinux, AFAICT.

RFC -> v1
RFC: https://lore.kernel.org/netdev/20220912225308.93659-1-jakub@cloudflare.com/

 * Allow either the high bound or the low bound, or both, to be zero
 * Add getsockopt support
 * Add selftests

Links:
------

[1]: https://lpc.events/event/16/contributions/1349/
====================

Link: https://lore.kernel.org/r/20221221-sockopt-port-range-v6-0-be255cc0e51f@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-25 22:45:02 -08:00
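The IP_LOCAL_PORT_RANGE value layout from the proposed ip(7) text above (upper bound in the high 16 bits, lower bound in the low 16 bits) can be sketched in Python. The helper names are hypothetical, and the sketch only builds and splits the 32-bit value; ordering checks such as the EINVAL case are left to the kernel.

```python
def pack_port_range(lo: int, hi: int) -> int:
    """Build the 32-bit option value from inclusive lower/upper bounds.

    A bound of zero means "no effect" for that bound, per the proposed
    man-page text; passing zero for both resets the setting.
    """
    for bound in (lo, hi):
        if not 0 <= bound <= 0xFFFF:
            raise ValueError(f"port bound out of 16-bit range: {bound}")
    return (hi << 16) | lo


def unpack_port_range(value: int):
    """Split a 32-bit option value back into (lower, upper) bounds."""
    return value & 0xFFFF, (value >> 16) & 0xFFFF


# Clamp a socket to local ports 40000..40999 (inclusive).
value = pack_port_range(40000, 40999)
```

The resulting integer is what would be handed to setsockopt() for this option, presumably at the IPPROTO_IP level since the text targets ip(7); the numeric option constant comes from the kernel headers and is not reproduced here.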