IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
To avoid duplicated code in different MPTCP selftests, we can add and use
helpers defined in mptcp_lib.sh.
The helper verify_listener_events() is defined both in mptcp_join.sh and
userspace_pm.sh, export it into mptcp_lib.sh and rename it with mptcp_lib_
prefix. Use this new helper in both scripts.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240308-upstream-net-next-20240308-selftests-mptcp-unification-v1-13-4f42c347b653@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
verify_listener_events() helper will be exported into mptcp_lib.sh as a
public function, but print_test() is invoked in it, which is a private
function in userspace_pm.sh only. So this patch moves print_test() out of
verify_listener_events().
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240308-upstream-net-next-20240308-selftests-mptcp-unification-v1-12-4f42c347b653@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Extract the main part of check_expected() in userspace_pm.sh to a new
function mptcp_lib_check_expected() in mptcp_lib.sh. It will be used
in both mptcp_john.sh and userspace_pm.sh. check_expected_one() is
moved into mptcp_lib.sh too as mptcp_lib_check_expected_one().
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240308-upstream-net-next-20240308-selftests-mptcp-unification-v1-11-4f42c347b653@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch modifies test_fail() to call mptcp_lib_pr_fail() only if there
are arguments (if [ ${#} -gt 0 ]) in userspace_pm.sh, add arguments
"unexpected type: ${type}" when calling test_fail() from test_remove().
Then mptcp_lib_pr_fail() can be used in check_expected_one() instead of
test_fail().
The same in mptcp_join.sh, calling fail_test() without argument, and adapt
this helper not to call print_fail() in this case.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240308-upstream-net-next-20240308-selftests-mptcp-unification-v1-10-4f42c347b653@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
To unify the output formats of all test scripts, this patch adds
four more helpers:
mptcp_lib_pr_ok()
mptcp_lib_pr_skip()
mptcp_lib_pr_fail()
mptcp_lib_pr_info()
to print out [ OK ], [SKIP], [FAIL] and 'INFO: ' with colors. Use them
in all scripts to print the "ok/skip/fail/info' using the same 'format'.
Having colors helps to quickly identify issues when looking at a long
list of output logs and results.
Note that now all print the same keywords, which was not the case
before, but it is good to uniform that.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240308-upstream-net-next-20240308-selftests-mptcp-unification-v1-9-4f42c347b653@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch uses addition assignment operator (+=) to append strings
instead of duplicating the variable name in mptcp_connect.sh and
mptcp_join.sh.
This can make the statements shorter.
Note: in mptcp_connect.sh, add a local variable extra in do_transfer to
save the various extra warning logs, using += to append it. And add a
new variable tc_info to save various tc info, also using += to append it.
This can make the code more readable and prepare for the next commit.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240308-upstream-net-next-20240308-selftests-mptcp-unification-v1-8-4f42c347b653@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch adds a new helper mptcp_lib_print_title(), a wrapper of
mptcp_lib_inc_test_counter() and mptcp_lib_pr_title_counter(), to
print out test counter in each test result and increase the counter.
Use this helper to print out test counters for every tests in diag.sh,
mptcp_connect.sh, mptcp_sockopt.sh, pm_netlink.sh, simult_flows.sh,
and userspace_pm.sh.
diag.sh:
01 no msk on netns creation [ ok ]
02 listen match for dport 10000 [ ok ]
03 listen match for sport 10000 [ ok ]
04 listen match for saddr and sport [ ok ]
05 all listen sockets [ ok ]
mptcp_connect.sh:
01 New MPTCP socket can be blocked via sysctl [ OK ]
02 Validating network environment with pings [ OK ]
INFO: Using loss of 0.85% delay 31 ms reorder .. with delay 7ms on ns3eth4
03 ns1 MPTCP -> ns1 (10.0.1.1:10000 ) MPTCP (duration 69ms) [ OK ]
04 ns1 MPTCP -> ns1 (10.0.1.1:10001 ) TCP (duration 20ms) [ OK ]
05 ns1 TCP -> ns1 (10.0.1.1:10002 ) MPTCP (duration 16ms) [ OK ]
mptcp_sockopt.sh:
01 Transfer v4 [ OK ]
02 Mark v4 [ OK ]
03 Transfer v6 [ OK ]
04 Mark v6 [ OK ]
05 SOL_MPTCP sockopt v4 [ OK ]
pm_netlink.sh:
01 defaults addr list [ OK ]
02 simple add/get addr [ OK ]
03 dump addrs [ OK ]
04 simple del addr [ OK ]
05 dump addrs after del [ OK ]
simult_flows.sh:
01 balanced bwidth 7391 max 8456 [ OK ]
02 balanced bwidth - reverse direction 7403 max 8456 [ OK ]
03 balanced bwidth with unbalanced delay 7429 max 8456 [ OK ]
04 balanced bwidth with unbalanced delay - reverse ... 7485 max 8456 [ OK ]
05 unbalanced bwidth 7549 max 8456 [ OK ]
userspace_pm.sh:
01 Created network namespaces ns1, ns2 [ OK ]
INFO: Make connections
02 Established IPv4 MPTCP Connection ns2 => ns1 [ OK ]
03 Established IPv6 MPTCP Connection ns2 => ns1 [ OK ]
INFO: Announce tests
04 ADD_ADDR 10.0.2.2 (ns2) => ns1, invalid token [ OK ]
05 ADD_ADDR id:67 10.0.2.2 (ns2) => ns1, reuse port [ OK ]
Having test counters helps to quickly identify issues when looking at a
long list of output logs and results.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240308-upstream-net-next-20240308-selftests-mptcp-unification-v1-7-4f42c347b653@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch adds a new variable MPTCP_LIB_TEST_FORMAT as the test title
printing format. Also add a helper mptcp_lib_print_title() to use this
format to print the test title with test counters. They are used in
mptcp_join.sh first.
Each MPTCP selftest is having subtests, and it helps to give them a
number to quickly identify them. This can be managed by mptcp_lib.sh,
reusing what has been done here. The following commit will use these
new helpers in the other tests.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240308-upstream-net-next-20240308-selftests-mptcp-unification-v1-6-4f42c347b653@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Variable TEST_COUNT are used in mptcp_connect.sh and mptcp_join.sh as
test counters, which are initialized to 0, while variable test_cnt are used
in diag.sh and simult_flows.sh, which are initialized to 1. To maintain
consistency, this patch renames them all as MPTCP_LIB_TEST_COUNTER,
initializes it to 1, and exports it into mptcp_lib.sh.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240308-upstream-net-next-20240308-selftests-mptcp-unification-v1-5-4f42c347b653@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Only total test results are printed out in mptcp_sockopt.sh:
PASS: all packets had packet mark set
PASS: SOL_MPTCP getsockopt has expected information
PASS: TCP_INQ cmsg/ioctl -t tcp
PASS: TCP_INQ cmsg/ioctl -6 -t tcp
PASS: TCP_INQ cmsg/ioctl -r tcp
PASS: TCP_INQ cmsg/ioctl -6 -r tcp
PASS: TCP_INQ cmsg/ioctl -r tcp -t tcp
They mismatch with the test results:
ok 1 - mptcp_sockopt: mark ipv4
ok 2 - mptcp_sockopt: transfer ipv4
ok 3 - mptcp_sockopt: mark ipv6
ok 4 - mptcp_sockopt: transfer ipv6
ok 5 - mptcp_sockopt: sockopt v4
ok 6 - mptcp_sockopt: sockopt v6
ok 7 - mptcp_sockopt: TCP_INQ: -t tcp
ok 8 - mptcp_sockopt: TCP_INQ: -6 -t tcp
ok 9 - mptcp_sockopt: TCP_INQ: -r tcp
ok 10 - mptcp_sockopt: TCP_INQ: -6 -r tcp
ok 11 - mptcp_sockopt: TCP_INQ: -r tcp -t tcp
'mptcp_sockopt.sh' now display more detailed results + why (what you had
in a former patch from v6, merged here). It no longer displays 'PASS:',
because it is duplicated info now that the detailed are displayed:
Transfer v4 [ OK ]
Mark v4 [ OK ]
Transfer v6 [ OK ]
Mark v6 [ OK ]
SOL_MPTCP sockopt v4 [ OK ]
SOL_MPTCP sockopt v6 [ OK ]
TCP_INQ cmsg/ioctl -t tcp [ OK ]
TCP_INQ cmsg/ioctl -6 -t tcp [ OK ]
TCP_INQ cmsg/ioctl -r tcp [ OK ]
TCP_INQ cmsg/ioctl -6 -r tcp [ OK ]
TCP_INQ cmsg/ioctl -r tcp -t tcp [ OK ]
Also fix the TAP output:
ok 1 - mptcp_sockopt: transfer ipv4
ok 2 - mptcp_sockopt: mark ipv4
ok 3 - mptcp_sockopt: transfer ipv6
ok 4 - mptcp_sockopt: mark ipv6
ok 5 - mptcp_sockopt: sockopt v4
ok 6 - mptcp_sockopt: sockopt v6
ok 7 - mptcp_sockopt: TCP_INQ: -t tcp
ok 8 - mptcp_sockopt: TCP_INQ: -6 -t tcp
ok 9 - mptcp_sockopt: TCP_INQ: -r tcp
ok 10 - mptcp_sockopt: TCP_INQ: -6 -r tcp
ok 11 - mptcp_sockopt: TCP_INQ: -r tcp -t tcp
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240308-upstream-net-next-20240308-selftests-mptcp-unification-v1-4-4f42c347b653@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The first [ OK ] in the output of mptcp_connect.sh misaligns with the
others:
New MPTCP socket can be blocked via sysctl [ OK ]
INFO: validating network environment with pings
INFO: Using loss of 0.85% delay 16 ms reorder 95% 70% with delay 4ms on
ns1 MPTCP -> ns1 (10.0.1.1:10000 ) MPTCP (duration 184ms) [ OK ]
ns1 MPTCP -> ns1 (10.0.1.1:10001 ) TCP (duration 50ms) [ OK ]
ns1 TCP -> ns1 (10.0.1.1:10002 ) MPTCP (duration 55ms) [ OK ]
This patch aligns them by using 69 chars to display the first two lines,
and 50 chars for the other. Since 19 chars are used to display duration
time. Also print out a [ OK ] at the end of the 2nd line for consistency.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240308-upstream-net-next-20240308-selftests-mptcp-unification-v1-3-4f42c347b653@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch adds a new dedicated counter 'port' instead of TEST_COUNT
to increase port numbers in mptcp_connect.sh.
This can avoid outputting discontinuous test counters.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240308-upstream-net-next-20240308-selftests-mptcp-unification-v1-2-4f42c347b653@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Some error messages are printed to stderr while the others are printed
to 'stdout'. As part of the unification, this patch drop "1>&2" to let
all errors messages are printed to 'stdout'.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240308-upstream-net-next-20240308-selftests-mptcp-unification-v1-1-4f42c347b653@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.
To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().
Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Herve Codina <herve.codina@bootlin.com>
Link: https://lore.kernel.org/r/9684419fd714cc489a3ef36d838d3717bb6aec6d.1709886922.git.u.kleine-koenig@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Petr Machata says:
====================
mlxsw: Support for nexthop group statistics
ECMP is a fundamental component in L3 designs. However, it's fragile. Many
factors influence whether an ECMP group will operate as intended: hash
policy (i.e. the set of fields that contribute to ECMP hash calculation),
neighbor validity, hash seed (which might lead to polarization) or the type
of ECMP group used (hash-threshold or resilient).
At the same time, collection of statistics that would help an operator
determine that the group performs as desired, is difficult.
Support for nexthop group statistics and their HW collection has been
introduced recently. In this patch set, add HW stats collection support
to mlxsw.
This patchset progresses as follows:
- Patches #1 and #2 add nexthop IDs to notifiers.
- Patches #3 and #4 are code-shaping.
- Patches #5, #6 and #7 adjust the flow counter code.
- Patches #8 and #9 add HW nexthop counters.
- Patch #10 adjusts the HW counter code to allow sharing the same counter
for several resilient group buckets with the same NH ID.
- Patch #11 adds a selftest.
====================
Link: https://lore.kernel.org/r/cover.1709901020.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add to lib.sh support for fetching NH stats, and a new library,
router_mpath_nh_lib.sh, with the common code for testing NH stats.
Use the latter from router_mpath_nh.sh and router_mpath_nh_res.sh.
The test works by sending traffic through a NH group, and checking that the
reported values correspond to what the link that ultimately receives the
traffic reports having seen.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Link: https://lore.kernel.org/r/2a424c54062a5f1efd13b9ec5b2b0e29c6af2574.1709901020.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For resilient groups, we can reuse the same counter for all the buckets
that share the same nexthop. Keep a reference count per counter, and keep
all these counters in a per-next hop group xarray, which serves as a
NHID->counter cache. If a counter is already present for a given NHID, just
take a reference and use the same counter.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/cdd00084533fc83ac5917562f54642f008205bf3.1709901020.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When hw_stats is set on a group, install nexthop counters on members of a
group.
Counter allocation request is moved from nexthop object initialization to
the update code. The previous placement made sense: when the counters are
enabled by dpipe, the counters are installed to all existing nexthops and
all nexthops created from then on get them. For the finer-grained nexthop
group statistics, this is unsuitable. The existing placement was kept for
the IPv4 and IPv6 nexthops.
Resilient group replacement emits a pre_replace notification, and then any
bucket_replace notifications if there were any replacements at all. If the
group is balanced and the nexthop composition of the replaced group didn't
change, there will be no such notifiers. Therefore hook to the pre_replace
notifier and mark all buckets for update, to un/install the counters.
When reporting deltas for resilient groups, use the nexthop ID that we
stored in a previous patch to look up to which nexthop a bucket
contributes.
Co-developed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Link: https://lore.kernel.org/r/87495a72f187df2e5d491d02729c550d235fcc85.1709901020.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The core interfaces for collecting per-NH statistics are built around
nexthops even for resilient groups. Because mlxsw models each bucket as a
nexthop, the core next hop that a given bucket contributes to needs to be
looked up. In order to be able to match the two up, we need to track
nexthop ID for members of group nexthop objects. For simplicity, do it for
all nexthop objects, not just group members.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/184ceb6b154e08f5bcf116a705b0fcb01c31895c.1709901020.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The next patch will add the ability to share nexthop counters among
mlxsw nexthops backed by the same core nexthop. To have a place to store
reference count, the counter should be kept in a dedicated structure. In
this patch, introduce the structure together with the related helpers, sans
the refcount, which comes in the next patch.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/61f23fa4f8c5d7879f68dacd793d8ab7425f33c0.1709901020.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
mlxsw_sp_nexthop_counter_disable() decays to a nop when called on a
disabled counter, but mlxsw_sp_nexthop_counter_enable() can't similarly
be called on an enabled counter. This would be useful in the following
patches. Add the missing condition.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/0cc9050e196366c1387ab5ee47f1cee8ecde9c86.1709901020.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
For the report_delta-like interface like a previous patch has added for
collection of NH group statistics, it's easiest to read the counter and
have the HW clear it right away. Thus, change mlxsw_sp_flow_counter_get()
to take a bool indicating whether this should be done.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/6a096ede8ee92d5041e3832242c3bbc137198aba.1709901020.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The function mlxsw_sp_nexthop_counter_alloc() doesn't directly allocate
anything, and mlxsw_sp_nexthop_counter_free() doesn't directly free. For
the following patches, we will need names for functions that actually do
those things. Therefore rename to mlxsw_sp_nexthop_counter_enable() and
mlxsw_sp_nexthop_counter_disable() to free up the namespace.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/f59272958697a718f090f59f892d32beabcd8972.1709901020.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
When sending the notifications to collect NH statistics for resilient
groups, the driver will need to know the nexthop IDs in individual buckets
to look up the right counter. To that end, move the nexthop ID from struct
nh_notifier_grp_entry_info to nh_notifier_single_info.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/8f964cd50b1a56d3606ce7ab4c50354ae019c43b.1709901020.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The NEXTHOP_EVENT_RES_TABLE_PRE_REPLACE notifier currently keeps the group
ID unset. That makes it impossible to look up the group for which the
notifier is intended. This is not an issue at the moment, because the only
client is netdevsim, and that just so that it veto replacements, which is a
static property not tied to a particular group. But for any practical use,
the ID is necessary. Set it.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/025fef095dcfb408042568bb5439da014d47239e.1709901020.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move gro_find_receive_by_type() and gro_find_complete_by_type()
to include/net/gro.h where they belong.
Also use _NET_GRO_H instead of _NET_IPV6_GRO_H to protect
include/net/gro.h from multiple inclusions.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240308102230.296224-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add a space (' ') prefix to every userdata line to match docs for
dev-kmsg. To account for this extra character in each userdata entry,
reduce userdata entry names (directory name) from 54 characters to 53.
According to the dev-kmsg docs, a space is used for subsequent lines to
mark them as continuation lines.
> A line starting with ' ', is a continuation line, adding
> key/value pairs to the log message, which provide the machine
> readable context of the message, for reliable processing in
> userspace.
Testing for this patch::
cd /sys/kernel/config/netconsole && mkdir cmdline0
cd cmdline0
mkdir userdata/test && echo "hello" > userdata/test/value
mkdir userdata/test2 && echo "hello2" > userdata/test2/value
echo "message" > /dev/kmsg
Outputs::
6.8.0-rc5-virtme,12,493,231373579,-;message
test=hello
test2=hello2
And I confirmed all testing works as expected from the original patchset
Fixes: df03f830d0 ("net: netconsole: cache userdata formatted string in netconsole_target")
Signed-off-by: Matthew Wood <thepacketgeek@gmail.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20240308002525.248672-1-thepacketgeek@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Switch to new function phy_support_eee. This allows to simplify
the code because data->tx_lpi_enabled is now populated by
phy_ethtool_get_eee().
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/92462328-5c9b-4d82-9ce4-ea974cda4900@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Handling case err == 0 in the other branch allows to simplify the
code. In addition I assume in "err & phydev->eee_cfg.tx_lpi_enabled"
it should have been a logical and operator. It works as expected also
with the bitwise and, but using a bitwise and with a bool value looks
ugly to me.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/de37bf30-61dd-49f9-b645-2d8ea11ddb5d@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Disable LEDs just before resetting the MT7530 to avoid
situations where the ESW_P4_LED_0 and ESW_P3_LED_0 pin
states may cause an unintended external crystal frequency
to be selected.
The HT_XTAL_FSEL (External Crystal Frequency Selection)
field of HWTRAP (the Hardware Trap register) stores a
2-bit value that represents the state of the ESW_P4_LED_0
and ESW_P4_LED_0 pins (seemingly) sampled just after the
MT7530 has been reset, as:
ESW_P4_LED_0 ESW_P3_LED_0 Frequency
-----------------------------------------
0 1 20MHz
1 0 40MHz
1 1 25MHz
The value of HT_XTAL_FSEL is bootstrapped by pulling
ESW_P4_LED_0 and ESW_P3_LED_0 up or down accordingly,
but:
if a 40MHz crystal has been selected and
the ESW_P3_LED_0 pin is high during reset,
or a 20MHz crystal has been selected and
the ESW_P4_LED_0 pin is high during reset,
then the value of HT_XTAL_FSEL will indicate
that a 25MHz crystal is present.
By default, the state of the LED pins is PHY controlled
to reflect the link state.
To illustrate, if a board has:
5 ports with active low LED control,
and HT_XTAL_FSEL bootstrapped for 40MHz.
When the MT7530 is powered up without any external
connection, only the LED associated with Port 3 is
illuminated as ESW_P3_LED_0 is low.
In this state, directly after mt7530_setup()'s reset
is performed, the HWTRAP register (0x7800) reflects
the intended HT_XTAL_FSEL (HWTRAP bits 10:9) of 40MHz:
mt7530-mdio mdio-bus:1f: mt7530_read: 00007800 == 00007dcf
>>> bin(0x7dcf >> 9 & 0b11)
'0b10'
But if a cable is connected to Port 3 and the link
is active before mt7530_setup()'s reset takes place,
then HT_XTAL_FSEL seems to be set for 25MHz:
mt7530-mdio mdio-bus:1f: mt7530_read: 00007800 == 00007fcf
>>> bin(0x7fcf >> 9 & 0b11)
'0b11'
Once HT_XTAL_FSEL reflects 25MHz, none of the ports
are functional until the MT7621 (or MT7530 itself)
is reset.
By disabling the LED pins just before reset, the chance
of an unintended HT_XTAL_FSEL value is reduced.
Signed-off-by: Justin Swartz <justin.swartz@risingedge.co.za>
Link: https://lore.kernel.org/r/20240305043952.21590-1-justin.swartz@risingedge.co.za
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Since commit 43a7206b09 ("driver core: class: make class_register() take
a const *"), the driver core allows for struct class to be in read-only
memory, so move the ptp_class structure to be declared at build time
placing it into read-only memory, instead of having to be dynamically
allocated at boot time.
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240305-ptp-v1-1-ed253eb33c20@marliere.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
ynl-gen-c.py supports check unterminated-ok, but the yaml schemas don't
have this key. Add this to the yaml files.
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240308081239.3281710-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Support using pre-defined values in checks so we don't need to use hard
code number for the string, binary length. e.g. we have a definition like
#define TEAM_STRING_MAX_LEN 32
Which defined in yaml like:
definitions:
-
name: string-max-len
type: const
value: 32
It can be used in the attribute-sets like
attribute-sets:
-
name: attr-option
name-prefix: team-attr-option-
attributes:
-
name: name
type: string
checks:
len: string-max-len
With this patch it will be converted to
[TEAM_ATTR_OPTION_NAME] = { .type = NLA_STRING, .len = TEAM_STRING_MAX_LEN, }
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240311140727.109562-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The check is duplicated in 2 places, factor it out into a common helper.
Signed-off-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Link: https://lore.kernel.org/r/20240308204500.1112858-1-almasrymina@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jason Xing says:
====================
annotate data-races around sysctl_tcp_wmem[0]
Adding simple READ_ONCE() can avoid reading the sysctl knob meanwhile
someone is trying to change it.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When reading wmem[0], it could be changed concurrently without
READ_ONCE() protection. So add one annotation here.
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's possible that writer and the reader can manipulate the same
sysctl knob concurrently. Using READ_ONCE() to prevent reading
an old value.
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Running the page-pool sample on production machines under moderate
networking load shows recycling rate higher than 100%:
$ page-pool
eth0[2] page pools: 14 (zombies: 0)
refs: 89088 bytes: 364904448 (refs: 0 bytes: 0)
recycling: 100.3% (alloc: 1392:2290247724 recycle: 469289484:1828235386)
Note that outstanding refs (89088) == slow alloc * cache size (1392 * 64)
which means this machine is recycling page pool pages perfectly, not
a single page has been released.
The extra 0.3% is because sample ignores allocations from the ptr_ring.
Treat those the same as alloc_fast, the ring vs cache alloc is
already captured accurately enough by recycling stats.
With the fix:
$ page-pool
eth0[2] page pools: 14 (zombies: 0)
refs: 89088 bytes: 364904448 (refs: 0 bytes: 0)
recycling: 100.0% (alloc: 1392:2331141604 recycle: 473625579:1857460661)
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commits ca065d0cf8 ("udp: no longer use SLAB_DESTROY_BY_RCU")
and 7ae215d23c ("bpf: Don't refcount LISTEN sockets in sk_assign()")
UDP early demux no longer need to grab a refcount on the UDP socket.
This save two atomic operations per incoming packet for connected
sockets.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Joe Stringer <joe@wand.net.nz>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Gavrilov Ilia says:
====================
fix incorrect parameter validation in the *_get_sockopt() functions
This v2 series fix incorrent parameter validation in *_get_sockopt()
functions in several places.
version 2 changes:
- reword the patch description
- add two patches for net/kcm and net/x25
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The 'len' variable can't be negative when assigned the result of
'min_t' because all 'min_t' parameters are cast to unsigned int,
and then the minimum one is chosen.
To fix the logic, check 'len' as read from 'optlen',
where the types of relevant variables are (signed) int.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Gavrilov Ilia <Ilia.Gavrilov@infotecs.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 'len' variable can't be negative when assigned the result of
'min_t' because all 'min_t' parameters are cast to unsigned int,
and then the minimum one is chosen.
To fix the logic, check 'len' as read from 'optlen',
where the types of relevant variables are (signed) int.
Fixes: ab7ac4eb98 ("kcm: Kernel Connection Multiplexor module")
Signed-off-by: Gavrilov Ilia <Ilia.Gavrilov@infotecs.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 'len' variable can't be negative when assigned the result of
'min_t' because all 'min_t' parameters are cast to unsigned int,
and then the minimum one is chosen.
To fix the logic, check 'len' as read from 'optlen',
where the types of relevant variables are (signed) int.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Gavrilov Ilia <Ilia.Gavrilov@infotecs.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 'len' variable can't be negative when assigned the result of
'min_t' because all 'min_t' parameters are cast to unsigned int,
and then the minimum one is chosen.
To fix the logic, check 'len' as read from 'optlen',
where the types of relevant variables are (signed) int.
Fixes: 3557baabf2 ("[L2TP]: PPP over L2TP driver core")
Reviewed-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: Gavrilov Ilia <Ilia.Gavrilov@infotecs.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 'olr' variable can't be negative when assigned the result of
'min_t' because all 'min_t' parameters are cast to unsigned int,
and then the minimum one is chosen.
To fix the logic, check 'olr' as read from 'optlen',
where the types of relevant variables are (signed) int.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Gavrilov Ilia <Ilia.Gavrilov@infotecs.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
The 'len' variable can't be negative when assigned the result of
'min_t' because all 'min_t' parameters are cast to unsigned int,
and then the minimum one is chosen.
To fix the logic, check 'len' as read from 'optlen',
where the types of relevant variables are (signed) int.
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Gavrilov Ilia <Ilia.Gavrilov@infotecs.ru>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herve Codina says:
====================
Add support for QMC HDLC
This series introduces the QMC HDLC support.
Patches were previously sent as part of a full feature series and were
previously reviewed in that context:
"Add support for QMC HDLC, framer infrastructure and PEF2256 framer" [1]
In order to ease the merge, the full feature series has been split and
needed parts were merged in v6.8-rc1:
- "Prepare the PowerQUICC QMC and TSA for the HDLC QMC driver" [2]
- "Add support for framer infrastructure and PEF2256 framer" [3]
This series contains patches related to the QMC HDLC part (QMC HDLC
driver):
- Introduce the QMC HDLC driver (patches 1 and 2)
- Add timeslots change support in QMC HDLC (patch 3)
- Add framer support as a framer consumer in QMC HDLC (patch 4)
Compare to the original full feature series, a modification was done on
patch 3 in order to use a coherent prefix in the commit title.
I kept the patches unsquashed as they were previously sent and reviewed.
Of course, I can squash them if needed.
Compared to the previous iteration:
https://lore.kernel.org/linux-kernel/20240306080726.167338-1-herve.codina@bootlin.com/
this v7 series mainly:
- Rename a variable.
- Fix reverse xmas tree declarations.
- Add 'Acked-by' tag.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>