1138957 Commits

Author SHA1 Message Date
Julian Anastasov
f0be83d542 ipvs: add est_cpulist and est_nice sysctl vars
Allow the kthreads for stats to be configured for
specific cpulist (isolation) and niceness (scheduling
priority).

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Cc: yunhong-cgl jiang <xintian1976@gmail.com>
Cc: "dust.li" <dust.li@linux.alibaba.com>
Reviewed-by: Jiri Wiesner <jwiesner@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-12-10 22:44:43 +01:00
Julian Anastasov
705dd34440 ipvs: use kthreads for stats estimation
Estimating all entries in single list in timer context
by single CPU causes large latency with multiple IPVS rules
as reported in [1], [2], [3].

Spread the estimator structures in multiple chains and
use kthread(s) for the estimation. The chains are processed
in multiple (50) timer ticks to ensure the 2-second interval
between estimations with some accuracy. Every chain is
processed under RCU lock.

Every kthread works over its own data structure and all
such contexts are attached to array. The contexts can be
preserved while the kthread tasks are stopped or restarted.
When estimators are removed, unused kthread contexts are
released and the slots in array are left empty.

First kthread determines parameters to use, eg. maximum
number of estimators to process per kthread based on
chain's length (chain_max), allowing sub-100us cond_resched
rate and estimation taking up to 1/8 of the CPU capacity
to avoid any problems if chain_max is not correctly
calculated.

chain_max is calculated taking into account factors
such as CPU speed and memory/cache speed where the
cache_factor (4) is selected from real tests with
current generation of CPU/NUMA configurations to
correct the difference in CPU usage between
cached (during calc phase) and non-cached (working) state
of the estimated per-cpu data.

First kthread also plays the role of distributor of
added estimators to all kthreads, keeping low the
time to add estimators. The optimization is based on
the fact that newly added estimator should be estimated
after 2 seconds, so we have the time to offload the
adding to chain from controlling process to kthread 0.

The allocated kthread context may grow from 1 to 50
allocated structures for timer ticks which saves memory for
setups with small number of estimators.

We also add delayed work est_reload_work that will
make sure the kthread tasks are properly started/stopped.

ip_vs_start_estimator() is changed to report errors
which allows to safely store the estimators in
allocated structures.

Many thanks to Jiri Wiesner for his valuable comments
and for spending a lot of time reviewing and testing
the changes on different platforms with 48-256 CPUs and
1-8 NUMA nodes under different cpufreq governors.

[1] Report from Yunhong Jiang:
https://lore.kernel.org/netdev/D25792C1-1B89-45DE-9F10-EC350DC04ADC@gmail.com/
[2]
https://marc.info/?l=linux-virtual-server&m=159679809118027&w=2
[3] Report from Dust:
https://archive.linuxvirtualserver.org/html/lvs-devel/2020-12/msg00000.html

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Cc: yunhong-cgl jiang <xintian1976@gmail.com>
Cc: "dust.li" <dust.li@linux.alibaba.com>
Reviewed-by: Jiri Wiesner <jwiesner@suse.de>
Tested-by: Jiri Wiesner <jwiesner@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-12-10 22:44:43 +01:00
Julian Anastasov
1dbd8d9a82 ipvs: use u64_stats_t for the per-cpu counters
Use the provided u64_stats_t type to avoid
load/store tearing.

Fixes: 316580b69d0a ("u64_stats: provide u64_stats_t type")
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Cc: yunhong-cgl jiang <xintian1976@gmail.com>
Cc: "dust.li" <dust.li@linux.alibaba.com>
Reviewed-by: Jiri Wiesner <jwiesner@suse.de>
Tested-by: Jiri Wiesner <jwiesner@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-12-10 22:44:42 +01:00
Julian Anastasov
de39afb3d8 ipvs: use common functions for stats allocation
Move alloc_percpu/free_percpu logic in new functions

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Cc: yunhong-cgl jiang <xintian1976@gmail.com>
Cc: "dust.li" <dust.li@linux.alibaba.com>
Reviewed-by: Jiri Wiesner <jwiesner@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-12-10 22:44:42 +01:00
Julian Anastasov
5df7d714d8 ipvs: add rcu protection to stats
In preparation to using RCU locking for the list
with estimators, make sure the struct ip_vs_stats
are released after RCU grace period by using RCU
callbacks. This affects ipvs->tot_stats where we
can not use RCU callbacks for ipvs, so we use
allocated struct ip_vs_stats_rcu. For services
and dests we force RCU callbacks for all cases.

Signed-off-by: Julian Anastasov <ja@ssi.bg>
Cc: yunhong-cgl jiang <xintian1976@gmail.com>
Cc: "dust.li" <dust.li@linux.alibaba.com>
Reviewed-by: Jiri Wiesner <jwiesner@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-12-10 22:44:42 +01:00
Li Qiong
895fa59647 netfilter: flowtable: add a 'default' case to flowtable datapath
Add a 'default' case in case return a uninitialized value of ret, this
should not ever happen since the follow transmit path types:

- FLOW_OFFLOAD_XMIT_UNSPEC
- FLOW_OFFLOAD_XMIT_TC

are never observed from this path. Add this check for safety reasons.

Signed-off-by: Li Qiong <liqiong@nfschina.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-12-08 22:11:00 +01:00
Florian Westphal
7d7cfb48d8 netfilter: conntrack: set icmpv6 redirects as RELATED
icmp conntrack will set icmp redirects as RELATED, but icmpv6 will not
do this.

For icmpv6, only icmp errors (code <= 128) are examined for RELATED state.
ICMPV6 Redirects are part of neighbour discovery mechanism, those are
handled by marking a selected subset (e.g.  neighbour solicitations) as
UNTRACKED, but not REDIRECT -- they will thus be flagged as INVALID.

Add minimal support for REDIRECTs.  No parsing of neighbour options is
added for simplicity, so this will only check that we have the embeeded
original header (ND_OPT_REDIRECT_HDR), and then attempt to do a flow
lookup for this tuple.

Also extend the existing test case to cover redirects.

Fixes: 9fb9cbb1082d ("[NETFILTER]: Add nf_conntrack subsystem.")
Reported-by: Eric Garver <eric@garver.life>
Link: https://github.com/firewalld/firewalld/issues/1046
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Eric Garver <eric@garver.life>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-11-30 23:01:20 +01:00
Vishwanath Pai
e937452495 netfilter: ipset: Add support for new bitmask parameter
Add a new parameter to complement the existing 'netmask' option. The
main difference between netmask and bitmask is that bitmask takes any
arbitrary ip address as input, it does not have to be a valid netmask.

The name of the new parameter is 'bitmask'. This lets us mask out
arbitrary bits in the ip address, for example:
ipset create set1 hash:ip bitmask 255.128.255.0
ipset create set2 hash:ip,port family inet6 bitmask ffff::ff80

Signed-off-by: Vishwanath Pai <vpai@akamai.com>
Signed-off-by: Joshua Hunt <johunt@akamai.com>
Signed-off-by: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-11-30 18:55:36 +01:00
Florian Westphal
a70e483460 netfilter: conntrack: merge ipv4+ipv6 confirm functions
No need to have distinct functions.  After merge, ipv6 can avoid
protooff computation if the connection neither needs sequence adjustment
nor helper invocation -- this is the normal case.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-11-30 18:55:30 +01:00
Sriram Yagnaraman
bff3d05348 netfilter: conntrack: add sctp DATA_SENT state
SCTP conntrack currently assumes that the SCTP endpoints will
probe secondary paths using HEARTBEAT before sending traffic.

But, according to RFC 9260, SCTP endpoints can send any traffic
on any of the confirmed paths after SCTP association is up.
SCTP endpoints that sends INIT will confirm all peer addresses
that upper layer configures, and the SCTP endpoint that receives
COOKIE_ECHO will only confirm the address it sent the INIT_ACK to.

So, we can have a situation where the INIT sender can start to
use secondary paths without the need to send HEARTBEAT. This patch
allows DATA/SACK packets to create new connection tracking entry.

A new state has been added to indicate that a DATA/SACK chunk has
been seen in the original direction - SCTP_CONNTRACK_DATA_SENT.
State transitions mostly follows the HEARTBEAT_SENT, except on
receiving HEARTBEAT/HEARTBEAT_ACK/DATA/SACK in the reply direction.

State transitions in original direction:
- DATA_SENT behaves similar to HEARTBEAT_SENT for all chunks,
   except that it remains in DATA_SENT on receving HEARTBEAT,
   HEARTBEAT_ACK/DATA/SACK chunks
State transitions in reply direction:
- DATA_SENT behaves similar to HEARTBEAT_SENT for all chunks,
   except that it moves to HEARTBEAT_ACKED on receiving
   HEARTBEAT/HEARTBEAT_ACK/DATA/SACK chunks

Note: This patch still doesn't solve the problem when the SCTP
endpoint decides to use primary paths for association establishment
but uses a secondary path for association shutdown. We still have
to depend on timeout for connections to expire in such a case.

Signed-off-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-11-30 18:26:09 +01:00
Dan Carpenter
98cbc40e4f netfilter: nft_inner: fix IS_ERR() vs NULL check
The __nft_expr_type_get() function returns NULL on error.  It never
returns error pointers.

Fixes: 3a07327d10a0 ("netfilter: nft_inner: support for inner tunnel header matching")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-11-22 22:29:54 +01:00
Paolo Abeni
339e79dfb0 Merge branch 'cleanup-ocelot_stats-exposure'
Colin Foster says:

====================
cleanup ocelot_stats exposure

The ocelot_stats structures became redundant across all users. Replace
this redundancy with a static const struct. After doing this, several
definitions inside include/soc/mscc/ocelot.h no longer needed to be
shared. Patch 2 removes them.

Checkpatch throws an error for a complicated macro not in parentheses. I
understand the reason for OCELOT_COMMON_STATS was to allow expansion, but
interestingly this patch set is essentially reverting the ability for
expansion. I'm keeping the macro in this set, but am open to remove it,
since it doesn't _actually_ provide any immediate benefits anymore.
====================

Link: https://lore.kernel.org/r/20221119231406.3167852-1-colin.foster@in-advantage.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-22 15:36:46 +01:00
Colin Foster
877e7b7c3b net: mscc: ocelot: issue a warning if stats are incorrectly ordered
Ocelot uses regmap_bulk_read() operations to efficiently read stats
registers. Currently the implementation relies on the stats layout to be
ordered to be most efficient.

Issue a warning if any future implementations happen to break this pattern.

Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Co-developed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-22 15:36:44 +01:00
Colin Foster
a3bb8f521f net: mscc: ocelot: remove unnecessary exposure of stats structures
Since commit 4d1d157fb6a4 ("net: mscc: ocelot: share the common stat
definitions between all drivers") there is no longer a need to share the
stats structures to the world. Relocate these definitions to inside
ocelot_stats.c instead of a global include header.

Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-22 15:36:43 +01:00
Colin Foster
33d5eeb9a6 net: mscc: ocelot: remove redundant stats_layout pointers
Ever since commit 4d1d157fb6a4 ("net: mscc: ocelot: share the common stat
definitions between all drivers") the stats_layout entry in ocelot and
felix drivers have become redundant. Remove the unnecessary code.

Suggested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-22 15:36:43 +01:00
Björn Töpel
837a3d66d6 selftests: net: Add cross-compilation support for BPF programs
The selftests/net does not have proper cross-compilation support, and
does not properly state libbpf as a dependency. Mimic/copy the BPF
build from selftests/bpf, which has the nice side-effect that libbpf
is built as well.

Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Anders Roxell <anders.roxell@linaro.org>
Link: https://lore.kernel.org/r/20221119171841.2014936-1-bjorn@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-22 13:49:22 +01:00
Tiezhu Yang
6dcd6d0152 samples: pktgen: Use "grep -E" instead of "egrep"
The latest version of grep claims the egrep is now obsolete so the build
now contains warnings that look like:
	egrep: warning: egrep is obsolescent; using grep -E
fix this up by moving the related file to use "grep -E" instead.

  sed -i "s/egrep/grep -E/g" `grep egrep -rwl samples/pktgen`

Here are the steps to install the latest grep:

  wget http://ftp.gnu.org/gnu/grep/grep-3.8.tar.gz
  tar xf grep-3.8.tar.gz
  cd grep-3.8 && ./configure && make
  sudo make install
  export PATH=/usr/local/bin:$PATH

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: https://lore.kernel.org/r/1668826504-32162-1-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-22 13:23:48 +01:00
Suman Ghosh
674b3e1642 octeontx2-pf: Add additional checks while configuring ucast/bcast/mcast rules
1. If a profile does not support DMAC extraction then avoid installing NPC
flow rules for unicast. Similarly, if LXMB(L2 and L3) extraction is not
supported by the profile then avoid installing broadcast and multicast
rules.
2. Allow MCAM entry insertion for promiscuous mode.
3. For the profiles where DMAC is not extracted in MKEX key default
unicast entry installed by AF is not valid. Hence do not use action
from the AF installed default unicast entry for such cases.
4. Adjacent packet header fields in a packet like IP header source
and destination addresses or UDP/TCP header source port and destination
can be extracted together in MKEX profile. Therefore MKEX profile can be
configured to in two ways:
	a. Total of 4 bytes from start of UDP header(src port
	   + destination port)
	or
	b. Two bytes from start and two bytes from offset 2

Signed-off-by: Suman Ghosh <sumang@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Link: https://lore.kernel.org/r/20221118053329.2288486-1-sumang@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-22 09:39:25 +01:00
Florian Fainelli
696450c051 net: bcmgenet: Clear RGMII_LINK upon link down
Clear the RGMII_LINK bit upon detecting link down to be consistent with
setting the bit upon link up. We also move the clearing of the
out-of-band disable to the runtime initialization rather than for each
link up/down transition.

Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20221118213754.1383364-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-21 20:44:40 -08:00
Dan Carpenter
4e9a61394d net: microchip: sparx5: fix uninitialized variables
Smatch complains that "err" can be uninitialized on these paths.  Also
it's just nicer to "return 0;" instead of "return err;"

Fixes: 3a344f99bb55 ("net: microchip: sparx5: Add support for TC flower ARP dissector")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Link: https://lore.kernel.org/r/Y3eg9Ml/LmLR3L3C@kili
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-21 20:44:04 -08:00
Eric Dumazet
32634819ad net: fix __sock_gen_cookie()
I was mistaken how atomic64_try_cmpxchg(&sk_cookie, &res, new)
is working.

I was assuming @res would contain the final sk_cookie value,
regardless of the success of our cmpxchg()

We could do something like:

if (atomic64_try_cmpxchg(&sk_cookie, &res, new)
	res = new;

But we can avoid a conditional and read sk_cookie again.

atomic64_cmpxchg(&sk_cookie, res, new);
res = atomic64_read(&sk_cookie);

Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 1527347 ("Error handling issues")
Fixes: 4ebf802cf1c6 ("net: __sock_gen_cookie() cleanup")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20221118043843.3703186-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-21 20:36:30 -08:00
David S. Miller
2c45455ea1 Merge branch 'mptcp-netlink'
Mat Martineau says:

====================
mptcp: More specific netlink command errors

This series makes the error reporting for the MPTCP_PM_CMD_ADD_ADDR netlink
command more specific, since there are multiple reasons the command could
fail.

Note that patch 2 adds a GENL_SET_ERR_MSG_FMT() macro to genetlink.h,
which is outside the MPTCP subsystem.

Patch 1 refactors in-kernel listening socket and endpoint creation to
simplify the second patch.

Patch 2 updates the error values returned by the in-kernel path manager
when it fails to create a local endpoint.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 13:09:08 +00:00
Paolo Abeni
a3400e8746 mptcp: more detailed error reporting on endpoint creation
Endpoint creation can fail for a number of reasons; in case of failure
append the error number to the extended ack message, using a newly
introduced generic helper.

Additionally let mptcp_pm_nl_append_new_local_addr() report different
error reasons.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 13:09:07 +00:00
Paolo Abeni
976d302fb6 mptcp: deduplicate error paths on endpoint creation
When endpoint creation fails, we need to free the newly allocated
entry and eventually destroy the paired mptcp listener socket.

Consolidate such action in a single point let all the errors path
reach it.

Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 13:09:07 +00:00
Kuniyuki Iwashima
7a7160edf1 net: Return errno in sk->sk_prot->get_port().
We assume the correct errno is -EADDRINUSE when sk->sk_prot->get_port()
fails, so some ->get_port() functions return just 1 on failure and the
callers return -EADDRINUSE instead.

However, mptcp_get_port() can return -EINVAL.  Let's not ignore the error.

Note the only exception is inet_autobind(), all of whose callers return
-EAGAIN instead.

Fixes: cec37a6e41aa ("mptcp: Handle MP_CAPABLE options for outgoing connections")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 13:05:39 +00:00
Yoshihiro Shimoda
1cb5072632 net: ethernet: renesas: rswitch: Fix MAC address info
Smatch detected the following warning.

    drivers/net/ethernet/renesas/rswitch.c:1717 rswitch_init() warn:
    '%pM' cannot be followed by 'n'

The 'n' should be '\n'.

Reported-by: Dan Carpenter <error27@gmail.com>
Suggested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Fixes: 3590918b5d07 ("net: ethernet: renesas: Add support for "Ethernet Switch"")
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Reviewed-by: Saeed Mahameed <saeed@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 12:57:18 +00:00
David S. Miller
4dca1319a7 Merge branch 'sarx5-VCAP-debugfs'
netdev.vger.kernel.org archive mirror
Steen Hegelund says:

====================
net: Add support for VCAP debugFS in Sparx5

This provides support for getting VCAP instance, VCAP rule and VCAP port
keyset configuration information via the debug file system.

It builds on top of the initial IS2 VCAP support found in these series:

https://lore.kernel.org/all/20221020130904.1215072-1-steen.hegelund@microchip.com/
https://lore.kernel.org/all/20221109114116.3612477-1-steen.hegelund@microchip.com/
https://lore.kernel.org/all/20221111130519.1459549-1-steen.hegelund@microchip.com/

Functionality:
==============

The VCAP API exposes a /sys/kernel/debug/sparx5/vcaps folder containing
the following entries:

- raw_<vcap>_<instance>
    This is a raw dump of the VCAP instance with a line for each available
    VCAP rule.  This information is limited to the VCAP rule address, the
    rule size and the rule keyset name as this requires very little
    information from the VCAP cache.

    This can be used to detect if a valid rule is stored at the correct
    address.

- <vcap>_<instance>
    This dumps the VCAP instance configuration: address ranges, chain id
    ranges, word size of keys and actions etc, and for each VCAP rule the
    details of keys (values and masks) and actions are shown.

    This is useful when discovering if the expected rule is present and in
    which order it will be matched.

- <interface>
    This shows the keyset configuration per lookup and traffic type and the
    set of sticky bits (common for all interfaces). This is cleared when
    shown, so it is possible to sample over a period of time.

    It also shows if this port/lookup is enabled for matching in the VCAP.

    This can be used to find out which keyset the traffic being sent to a
    port, will be matched against, and if such traffic has been seen by one
    of the ports.

Delivery:
=========

This is current plan for delivering the full VCAP feature set of Sparx5:

- TC protocol all support for IS2 VCAP
- Sparx5 IS0 VCAP support
- TC policer and drop action support (depends on the Sparx5 QoS support
  upstreamed separately)
- Sparx5 ES0 VCAP support
- TC flower template support
- TC matchall filter support for mirroring and policing ports
- TC flower filter mirror action support
- Sparx5 ES2 VCAP support

Version History:
================
v2      Removed a 'support' folder (used for integration testing) that had
        been added in patch 6/8 by a mistake.
        Wrapped long lines.

v1      Initial version
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 11:33:02 +00:00
Steen Hegelund
552b7d131a net: microchip: sparx5: Add VCAP debugfs KUNIT test
This tests the functionality of the debugFS support:

- finding valid keyset on an address
- raw VCAP output
- full rule VCAP output

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 11:33:02 +00:00
Steen Hegelund
71c9de9952 net: microchip: sparx5: Add VCAP locking to protect rules
This ensures that the VCAP cache and the lists maintained in the VCAP
instance is protected when accessed by different clients.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 11:33:02 +00:00
Steen Hegelund
72d84dd609 net: microchip: sparx5: Add VCAP debugFS key/action support for the VCAP API
This add support for displaying the keys and actions in a rule.
The keys and action display format will be determined by the size and the
type of the key or action. The longer keys will typically be displayed as a
hexadecimal byte array.

The actionset is not decoded in full as the Sparx5 IS2 only has one
supported action, so this will be added later with other VCAP types.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 11:33:02 +00:00
Steen Hegelund
3a7921560d net: microchip: sparx5: Add VCAP rule debugFS support for the VCAP API
This add support to show all rules in a VCAP instance. The information
shown is:

 - rule id
 - address range
 - size
 - chain id
 - keyset name, subword size, register span
 - actionset name, subword size, register span
 - counter value
 - sticky bit (one bit width counter)

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 11:33:02 +00:00
Steen Hegelund
d4134d41e3 net: microchip: sparx5: Add raw VCAP debugFS support for the VCAP API
This adds support for decoding VCAP rules with a minimum number of
attributes: address, rule size and keyset.

This allows for a quick inspection of a VCAP instance to determine if the
rule are present and in the correct order.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 11:33:02 +00:00
Steen Hegelund
e0305cc1d1 net: microchip: sparx5: Add VCAP debugFS support
Add a debugFS root folder for Sparx5 and add a vcap folder underneath with
the VCAP instances and the ports

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 11:33:02 +00:00
Steen Hegelund
277e9179ef net: microchip: sparx5: Ensure VCAP last_used_addr is set back to default
This ensures that the last_used_addr in a VCAP instance is returned to the
default value when all rules have been deleted.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 11:33:02 +00:00
Steen Hegelund
bcddc196d4 net: microchip: sparx5: Ensure L3 protocol has a default value
This ensures that the l3_proto always have a valid value and that any
dissector parsing errors causes the flower rule to be discarded.

Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 11:33:02 +00:00
David S. Miller
418e0721d4 Merge branch 'gve-alternate-missed-completions'
Jeroen de Borst says:

====================
gve: Handle alternate miss-completions

Some versions of the virtual NIC present miss-completions in
an alternative way. Let the diver handle these alternate completions
and announce this capability to the device.

The capability is announced uing a new AdminQ command that sends
driver information to the device. The device can refuse a driver
if it is lacking support for a capability, or it can adopt it's
behavior to work around OS specific issues.

Changed in v5:
- Removed comments in fucntion calls
- Switched ENOTSUPP back to EOPNOTSUPP and made sure it gets passed
Changed in v4:
- Clarified new AdminQ command in cover letter
- Changed EOPNOTSUPP to ENOTSUPP to match device's response
Changed in v3:
- Rewording cover letter
- Added 'Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>'
Changes in v2:
- Changed the subject to include 'gve:'
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 10:52:14 +00:00
Jeroen de Borst
a5affbd8a7 gve: Handle alternate miss completions
The virtual NIC has 2 ways of indicating a miss-path
completion. This handles the alternate.

Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 10:52:14 +00:00
Jeroen de Borst
c2a0c3ed5b gve: Adding a new AdminQ command to verify driver
Check whether the driver is compatible with the device
presented.

Signed-off-by: Jeroen de Borst <jeroendb@google.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 10:52:14 +00:00
Dmitry Vyukov
d9e8da5585 NFC: nci: Extend virtual NCI deinit test
Extend the test to check the scenario when NCI core tries to send data
to already closed device to ensure that nothing bad happens.

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Cc: Bongsu Jeon <bongsu.jeon@samsung.com>
Cc: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 10:49:58 +00:00
David S. Miller
148b1da886 Merge branch 'axiennet-mdio-bus-freq'
Andy Chiu says:

====================
net: axienet: Use a DT property to configure frequency of the MDIO bus

Some FPGA platforms have to set frequency of the MDIO bus lower than 2.5
MHz. Thus, we use a DT property, which is "clock-frequency", to work
with it at boot time. The default 2.5 MHz would be set if the property
is not pressent. Also, factor out mdio enable/disable functions due to
the api change since 253761a0e61b7.

Changelog:
--- v5 ---
1. Make dt-binding patch prior to the implementation patch.
2. Disable mdio bus in error path.
3. Update description of some functions.
--- v4 ---
1. change MAX_MDIO_FREQ to DEFAULT_MDIO_FREQ as suggested by Andrew.
--- v3 RESEND ---
1. Repost the exact same patch again
--- v3 ---
1. Fix coding style, and make probing of the driver fail if MDC overflow
--- v2 ---
1. Use clock-frequency, as defined in mdio.yaml, to configure MDIO
   clock.
2. Only print out frequency if it is set to a non-standard value.
3. Reduce the scope of axienet_mdio_enable and remove
   axienet_mdio_disable because no one really uses it anymore.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 10:36:04 +00:00
Andy Chiu
2e1f2c1066 net: axienet: set mdio clock according to bus-frequency
Some FPGA platforms have 80KHz MDIO bus frequency constraint when
connecting Ethernet to its on-board external Marvell PHY. Thus, we may
have to set MDIO clock according to the DT. Otherwise, use the default
2.5 MHz, as specified by 802.3, if the entry is not present.

Also, change MAX_MDIO_FREQ to DEFAULT_MDIO_FREQ because we may actually
set MDIO bus frequency higher than 2.5MHz if undelying devices support
it. And properly disable the mdio bus clock in error path.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 10:36:03 +00:00
Andy Chiu
6830604ec0 dt-bindings: describe the support of "clock-frequency" in mdio
mdio bus frequency is going to be configurable at boottime by a property
in DT now, so add a description to it.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 10:36:03 +00:00
Andy Chiu
29f8eefba3 net: axienet: Unexport and remove unused mdio functions
Both axienet_mdio_{enable/disable} functions are no longer used in
xilinx_axienet_main.c due to 253761a0e61b7. And axienet_mdio_disable is
not even used in the mdio.c. So unexport and remove them.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Greentime Hu <greentime.hu@sifive.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 10:36:03 +00:00
Dan Carpenter
62a45b384a net: microchip: sparx5: prevent uninitialized variable
Smatch complains that:

    drivers/net/ethernet/microchip/sparx5/sparx5_dcb.c:112
    sparx5_dcb_apptrust_validate() error: uninitialized symbol 'match'.

This would only happen if the:

	if (sparx5_dcb_apptrust_policies[i].nselectors != nselectors)

condition is always true (they are not equal).  The "nselectors"
variable comes from dcbnl_ieee_set() and it is a number between 0-256.
This seems like a probably a real bug.

Fixes: 23f8382cd95d ("net: microchip: sparx5: add support for apptrust")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 10:06:47 +00:00
Lorenzo Bianconi
ef8c373bd9 net: ethernet: mtk_eth_soc: fix RSTCTRL_PPE{0,1} definitions
Fix RSTCTRL_PPE0 and RSTCTRL_PPE1 register mask definitions for
MTK_NETSYS_V2.
Remove duplicated definitions.

Fixes: 160d3a9b1929 ("net: ethernet: mtk_eth_soc: introduce MTK_NETSYS_V2 support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 09:48:12 +00:00
Horatiu Vultur
aa5ac4be8d net: microchip: sparx5: kunit test: Fix compile warnings.
When VCAP_KUNIT_TEST is enabled the following warnings are generated:

drivers/net/ethernet/microchip/vcap/vcap_api_kunit.c:257:34: warning: Using plain integer as NULL pointer
drivers/net/ethernet/microchip/vcap/vcap_api_kunit.c:258:41: warning: Using plain integer as NULL pointer
drivers/net/ethernet/microchip/vcap/vcap_api_kunit.c:342:23: warning: Using plain integer as NULL pointer
drivers/net/ethernet/microchip/vcap/vcap_api_kunit.c:359:23: warning: Using plain integer as NULL pointer
drivers/net/ethernet/microchip/vcap/vcap_api_kunit.c:1327:34: warning: Using plain integer as NULL pointer
drivers/net/ethernet/microchip/vcap/vcap_api_kunit.c:1328:41: warning: Using plain integer as NULL pointer

Therefore fix this.

Fixes: dccc30cc4906 ("net: microchip: sparx5: Add KUNIT test of counters and sorted rules")
Fixes: c956b9b318d9 ("net: microchip: sparx5: Adding KUNIT tests of key/action values in VCAP API")
Fixes: 67d637516fa9 ("net: microchip: sparx5: Adding KUNIT test for the VCAP API")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 09:47:18 +00:00
David S. Miller
dca508cd88 Merge branch 'nfp-ipsec-offload'
Simon Horman says:

====================
nfp: IPsec offload support

Huanhuan Wang says:

this series adds support for IPsec offload to the NFP driver.

It covers three enhancements:

1. Patches 1/3:
   - Extend the capability word and control word to to support
     new features.

2. Patch 2/3:
   - Add framework to support IPsec offloading for NFP driver,
     but IPsec offload control plane interface xfrm callbacks which
     interact with upper layer are not implemented in this patch.

3. Patch 3/3:
   - IPsec control plane interface xfrm callbacks are implemented
     in this patch.

Changes since v3
* Remove structure fields that describe firmware but
  are not used for Kernel offload
* Add WARN_ON(!xa_empty()) before call to xa_destroy()
* Added helpers for hash methods

Changes since v2
* OFFLOAD_HANDLE_ERROR macro and the associated code removed
* Unnecessary logging removed
* Hook function xdo_dev_state_free in struct xfrmdev_ops removed
* Use Xarray to maintain SA entries

Changes since v1
* Explicitly return failure when XFRM_STATE_ESN is set
* Fix the issue that AEAD algorithm is not correctly offloaded
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 08:51:36 +00:00
Huanhuan Wang
859a497fe8 nfp: implement xfrm callbacks and expose ipsec offload feature to upper layer
Xfrm callbacks are implemented to offload SA info into firmware
by mailbox. It supports 16K SA info in total.

Expose ipsec offload feature to upper layer, this feature will
signal the availability of the offload.

Based on initial work of Norm Bagley <norman.bagley@netronome.com>.

Signed-off-by: Huanhuan Wang <huanhuan.wang@corigine.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 08:51:36 +00:00
Huanhuan Wang
57f273adbc nfp: add framework to support ipsec offloading
A new metadata type and config structure are introduced to
interact with firmware to support ipsec offloading. This
feature relies on specific firmware that supports ipsec
encrypt/decrypt by advertising related capability bit.

The xfrm callbacks which interact with upper layer are
implemented in the following patch.

Based on initial work of Norm Bagley <norman.bagley@netronome.com>.

Signed-off-by: Huanhuan Wang <huanhuan.wang@corigine.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 08:51:36 +00:00
Yinjun Zhang
484963ce9f nfp: extend capability and control words
Currently the 32-bit capability word is almost exhausted, now
allocate some more words to support new features, and control
word is also extended accordingly. Packet-type offloading is
implemented in NIC application firmware, but it's not used in
kernel driver, so reserve this bit here in case it's redefined
for other use.

Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-21 08:51:36 +00:00