IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Menglong Dong says:
====================
net: icmp: add skb drop reasons to icmp
In the commit c504e5c2f964 ("net: skb: introduce kfree_skb_reason()"),
we added the support of reporting the reasons of skb drops to kfree_skb
tracepoint. And in this series patches, reasons for skb drops are added
to ICMP protocol.
In order to report the reasons of skb drops in 'sock_queue_rcv_skb()',
the function 'sock_queue_rcv_skb_reason()' is introduced in the 1th
patch, which is used in the 3th patch.
As David Ahern suggested, the reasons for skb drops should be more
general and not be code based. Therefore, in the 2th patch,
SKB_DROP_REASON_PTYPE_ABSENT is renamed to
SKB_DROP_REASON_UNHANDLED_PROTO, which is used for the cases of no
L3 protocol handler, no L4 protocol handler, version extensions, etc.
In the 3th patch, we introduce the new function __ping_queue_rcv_skb()
to report drop reasons by its return value and keep the return value of
ping_queue_rcv_skb() still.
In the 4th patch, we make ICMP message handler functions return drop
reasons, which means we change the return type of 'handler()' in
'struct icmp_control' from 'bool' to 'enum skb_drop_reason'. This
changed its original intention, as 'false' means failure, but
'SKB_NOT_DROPPED_YET', which is 0, means success now. Therefore, we
have to change all usages of these handler. Following "handler"
functions are involved:
icmp_unreach()
icmp_redirect()
icmp_echo()
icmp_timestamp()
icmp_discard()
And following drop reasons are added(what they mean can be see
in the document for them):
SKB_DROP_REASON_ICMP_CSUM
SKB_DROP_REASON_INVALID_PROTO
The reason 'INVALID_PROTO' is introduced for the case that the packet
doesn't follow rfc 1122 and is dropped. I think this reason is different
from the 'UNHANDLED_PROTO', as the 'UNHANDLED_PROTO' means the packet is
fine, and it is just not supported. This is not a common case, and I
believe we can locate the problem from the data in the packet. For now,
this 'INVALID_PROTO' is used for the icmp broadcasts with wrong types.
Maybe there should be a document file for these reasons. For example,
list all the case that causes the 'INVALID_PROTO' drop reason. Therefore,
users can locate their problems according to the document.
Changes since v4:
- rename SKB_DROP_REASON_RFC_1122 to SKB_DROP_REASON_INVALID_PROTO
Changes since v3:
- rename SKB_DROP_REASON_PTYPE_ABSENT to SKB_DROP_REASON_UNHANDLED_PROTO
in the 2th patch
- fix the return value problem of ping_queue_rcv_skb() in the 3th patch
- remove SKB_DROP_REASON_ICMP_TYPE and SKB_DROP_REASON_ICMP_BROADCAST
and introduce the SKB_DROP_REASON_RFC_1122 in the 4th patch
Changes since v2:
- fix aliegnment problem in the 2th patch
Changes since v1:
- introduce __ping_queue_rcv_skb() instead of change the return value
of ping_queue_rcv_skb() in the 2th patch, as Paolo suggested
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace kfree_skb() used in icmp_rcv() and icmpv6_rcv() with
kfree_skb_reason().
In order to get the reasons of the skb drops after icmp message handle,
we change the return type of 'handler()' in 'struct icmp_control' from
'bool' to 'enum skb_drop_reason'. This may change its original
intention, as 'false' means failure, but 'SKB_NOT_DROPPED_YET' means
success now. Therefore, all 'handler' and the call of them need to be
handled. Following 'handler' functions are involved:
icmp_unreach()
icmp_redirect()
icmp_echo()
icmp_timestamp()
icmp_discard()
And following new drop reasons are added:
SKB_DROP_REASON_ICMP_CSUM
SKB_DROP_REASON_INVALID_PROTO
The reason 'INVALID_PROTO' is introduced for the case that the packet
doesn't follow rfc 1122 and is dropped. This is not a common case, and
I believe we can locate the problem from the data in the packet. For now,
this 'INVALID_PROTO' is used for the icmp broadcasts with wrong types.
Maybe there should be a document file for these reasons. For example,
list all the case that causes the 'UNHANDLED_PROTO' and 'INVALID_PROTO'
drop reason. Therefore, users can locate their problems according to the
document.
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to avoid to change the return value of ping_queue_rcv_skb(),
introduce the function __ping_queue_rcv_skb(), which is able to report
the reasons of skb drop as its return value, as Paolo suggested.
Meanwhile, make ping_queue_rcv_skb() a simple call to
__ping_queue_rcv_skb().
The kfree_skb() and sock_queue_rcv_skb() used in ping_queue_rcv_skb()
are replaced with kfree_skb_reason() and sock_queue_rcv_skb_reason()
now.
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
As David Ahern suggested, the reasons for skb drops should be more
general and not be code based.
Therefore, rename SKB_DROP_REASON_PTYPE_ABSENT to
SKB_DROP_REASON_UNHANDLED_PROTO, which is used for the cases of no
L3 protocol handler, no L4 protocol handler, version extensions, etc.
From previous discussion, now we have the aim to make these reasons
more abstract and users based, avoiding code based.
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to report the reasons of skb drops in 'sock_queue_rcv_skb()',
introduce the function 'sock_queue_rcv_skb_reason()'.
As the return value of 'sock_queue_rcv_skb()' is used as the error code,
we can't make it as drop reason and have to pass extra output argument.
'sock_queue_rcv_skb()' is used in many places, so we can't change it
directly.
Introduce the new function 'sock_queue_rcv_skb_reason()' and make
'sock_queue_rcv_skb()' an inline call to it.
Reviewed-by: Hao Peng <flyingpeng@tencent.com>
Reviewed-by: Jiang Biao <benbjiang@tencent.com>
Signed-off-by: Menglong Dong <imagedong@tencent.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski says:
====================
tls: rx: random refactoring part 2
TLS Rx refactoring. Part 2 of 3. This one focusing on the main loop.
A couple of features to follow.
====================
The current invese logic is harder to follow (and adds extra
tests to the fast path). We have to enumerate all cases which
need to keep the skb before consuming it. It's simpler to
jump out of the full record flow as we detect those cases.
This makes it clear that partial consumption and peek can
only reach end of the function thru the !zc case so move
the code up there.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Whatever we do in the loop the skb should not remain on as
ctx->recv_pkt afterwards. We can clear that pointer and
restart strparser earlier.
This adds overhead of extra linking and unlinking to rx_list
but that's not large (upcoming change will switch to unlocked
skb list operations).
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
tls_sw_advance_skb() always consumes the skb at the end of the loop.
To fall here the following must be true:
!async && !is_peek && !retain_skb
retain_skb => !zc && rxm->full_len > len
# but non-full record implies !zc, so above can be simplified as
retain_skb => rxm->full_len > len
!async && !is_peek && !(rxm->full_len > len)
!async && !is_peek && rxm->full_len <= len
tls_sw_advance_skb() returns false if len < rxm->full_len
which can't be true given conditions above.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Most of the conditions deciding if zero-copy can be used
do not change throughout the iterations, so pre-calculate
them.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We track both if the last record was handled by async crypto
and how many records were async. This is not necessary. We
implicitly assume once crypto goes async it will stay that
way, otherwise we'd reorder records. So just track if we're
in async mode, the exact number of records is not necessary.
This change also forces us into "async" mode more consistently
in case crypto ever decided to interleave async and sync.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
tls_sw_advance_skb() caters to the async case when skb argument
is NULL. In that case it simply unpauses the strparser.
These are surprising semantics to a person reading the code,
and result in higher LoC, so inline the __strp_unpause and
only call tls_sw_advance_skb() when we actually move past
an skb.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
cmsg can be filled in during rx_list processing or normal
receive. Consolidate the code.
We don't need to keep the boolean to track if the cmsg was
created. 0 is an invalid content type.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since we are protected from async completions by decrypt_compl_lock
we can drop the async_notify and reinit the completion before we
start waiting.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We pass zc as a pointer to bool a few functions down as an in/out
argument. This is error prone since C will happily evalue a pointer
as a boolean (IOW forgetting *zc and writing zc leads to loss of
developer time..). Wrap the arguments into a structure.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We plumb pointer to chunk all the way to the decryption method.
It's set to the length of the text when decrypt_skb_update()
returns.
I think the code is written this way because original TLS
implementation passed &chunk to zerocopy_from_iter() and this
was carried forward as the code gotten more complex, without
any refactoring.
The fix for peek() introduced a new variable - to_decrypt
which for all practical purposes is what chunk is going to
get set to. Spare ourselves the pointer passing, use to_decrypt.
Use this opportunity to clean things up a little further.
Note that chunk / to_decrypt was mostly needed for the async
path, since the sync path would access rxm->full_len (decryption
transforms full_len from record size to text size). Use the
right source of truth more explicitly.
We have three cases:
- async - it's TLS 1.2 only, so chunk == to_decrypt, but we
need the min() because to_decrypt is a whole record
and we don't want to underflow len. Note that we can't
handle partial record by falling back to sync as it
would introduce reordering against records in flight.
- zc - again, TLS 1.2 only for now, so chunk == to_decrypt,
we don't do zc if len < to_decrypt, no need to check again.
- normal - it already handles chunk > len, we can factor out the
assignment to rxm->full_len and share it with zc.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
sk is unused, remove it to make it clear the function
doesn't poke at the socket.
size_used is always 0 on input and @length on success.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Create a local device *dev in order to not dereference the platform_device
several times throughout the probe function.
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Daniel Borkmann says:
====================
pull-request: bpf-next 2022-04-09
We've added 63 non-merge commits during the last 9 day(s) which contain
a total of 68 files changed, 4852 insertions(+), 619 deletions(-).
The main changes are:
1) Add libbpf support for USDT (User Statically-Defined Tracing) probes.
USDTs are an abstraction built on top of uprobes, critical for tracing
and BPF, and widely used in production applications, from Andrii Nakryiko.
2) While Andrii was adding support for x86{-64}-specific logic of parsing
USDT argument specification, Ilya followed-up with USDT support for s390
architecture, from Ilya Leoshkevich.
3) Support name-based attaching for uprobe BPF programs in libbpf. The format
supported is `u[ret]probe/binary_path:[raw_offset|function[+offset]]`, e.g.
attaching to libc malloc can be done in BPF via SEC("uprobe/libc.so.6:malloc")
now, from Alan Maguire.
4) Various load/store optimizations for the arm64 JIT to shrink the image
size by using arm64 str/ldr immediate instructions. Also enable pointer
authentication to verify return address for JITed code, from Xu Kuohai.
5) BPF verifier fixes for write access checks to helper functions, e.g.
rd-only memory from bpf_*_cpu_ptr() must not be passed to helpers that
write into passed buffers, from Kumar Kartikeya Dwivedi.
6) Fix overly excessive stack map allocation for its base map structure and
buckets which slipped-in from cleanups during the rlimit accounting removal
back then, from Yuntao Wang.
7) Extend the unstable CT lookup helpers for XDP and tc/BPF to report netfilter
connection tracking tuple direction, from Lorenzo Bianconi.
8) Improve bpftool dump to show BPF program/link type names, Milan Landaverde.
9) Minor cleanups all over the place from various others.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (63 commits)
bpf: Fix excessive memory allocation in stack_map_alloc()
selftests/bpf: Fix return value checks in perf_event_stackmap test
selftests/bpf: Add CO-RE relos into linked_funcs selftests
libbpf: Use weak hidden modifier for USDT BPF-side API functions
libbpf: Don't error out on CO-RE relos for overriden weak subprogs
samples, bpf: Move routes monitor in xdp_router_ipv4 in a dedicated thread
libbpf: Allow WEAK and GLOBAL bindings during BTF fixup
libbpf: Use strlcpy() in path resolution fallback logic
libbpf: Add s390-specific USDT arg spec parsing logic
libbpf: Make BPF-side of USDT support work on big-endian machines
libbpf: Minor style improvements in USDT code
libbpf: Fix use #ifdef instead of #if to avoid compiler warning
libbpf: Potential NULL dereference in usdt_manager_attach_usdt()
selftests/bpf: Uprobe tests should verify param/return values
libbpf: Improve string parsing for uprobe auto-attach
libbpf: Improve library identification for uprobe binary path resolution
selftests/bpf: Test for writes to map key from BPF helpers
selftests/bpf: Test passing rdonly mem to global func
bpf: Reject writes for PTR_TO_MAP_KEY in check_helper_mem_access
bpf: Check PTR_TO_MEM | MEM_RDONLY in check_helper_mem_access
...
====================
Link: https://lore.kernel.org/r/20220408231741.19116-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The 'n_buckets * (value_size + sizeof(struct stack_map_bucket))' part of the
allocated memory for 'smap' is never used after the memlock accounting was
removed, thus get rid of it.
[ Note, Daniel:
Commit b936ca643ade ("bpf: rework memlock-based memory accounting for maps")
moved `cost += n_buckets * (value_size + sizeof(struct stack_map_bucket))`
up and therefore before the bpf_map_area_alloc() allocation, sigh. In a later
step commit c85d69135a91 ("bpf: move memory size checks to bpf_map_charge_init()"),
and the overflow checks of `cost >= U32_MAX - PAGE_SIZE` moved into
bpf_map_charge_init(). And then 370868107bf6 ("bpf: Eliminate rlimit-based
memory accounting for stackmap maps") finally removed the bpf_map_charge_init().
Anyway, the original code did the allocation same way as /after/ this fix. ]
Fixes: b936ca643ade ("bpf: rework memlock-based memory accounting for maps")
Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220407130423.798386-1-ytcoode@gmail.com
The 8000 series and newer NICs all get hardware timestamps from the MAC
and can provide timestamps on a normal TX queue, rather than via a slow
path through the MC. As such we can use this path for any packet where a
hardware timestamp is requested.
This also enables support for PTP over transports other than IPv4+UDP.
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: Edward Cree <ecree@xilinx.com>
Link: https://lore.kernel.org/r/510652dc-54b4-0e11-657e-e37ee3ca26a9@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add cable test support for Micrel KSZ9x31 PHYs.
Tested on i.MX8M Mini with KSZ9131RNX in 100/Full mode with pairs shuffled
before magnetics:
(note: Cable test started/completed messages are omitted)
mx8mm-ksz9131-a-d-connected$ ethtool --cable-test eth0
Pair A code OK
Pair B code Short within Pair
Pair B, fault length: 0.80m
Pair C code Short within Pair
Pair C, fault length: 0.80m
Pair D code OK
mx8mm-ksz9131-a-b-connected$ ethtool --cable-test eth0
Pair A code OK
Pair B code OK
Pair C code Short within Pair
Pair C, fault length: 0.00m
Pair D code Short within Pair
Pair D, fault length: 0.00m
Tested on R8A77951 Salvator-XS with KSZ9031RNX and all four pairs connected:
(note: Cable test started/completed messages are omitted)
r8a7795-ksz9031-all-connected$ ethtool --cable-test eth0
Pair A code OK
Pair B code OK
Pair C code OK
Pair D code OK
The CTRL1000 CTL1000_ENABLE_MASTER and CTL1000_AS_MASTER bits are not
restored by calling phy_init_hw(), they must be manually cached in
ksz9x31_cable_test_start() and restored at the end of
ksz9x31_cable_test_get_status().
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: Oleksij Rempel <linux@rempel-privat.de>
Cc: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Claudiu Beznea <claudiu.beznea@microchip.com>
Link: https://lore.kernel.org/r/20220407105534.85833-1-marex@denx.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The bpf_get_stackid() function may also return 0 on success as per UAPI BPF
helper documentation. Therefore, correct checks from 'val > 0' to 'val >= 0'
to ensure that they cover all possible success return values.
Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220408041452.933944-1-ytcoode@gmail.com
Add CO-RE relocations into __weak subprogs for multi-file linked_funcs
selftest to make sure libbpf handles such combination well.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220408181425.2287230-4-andrii@kernel.org
Use __weak __hidden for bpf_usdt_xxx() APIs instead of much more
confusing `static inline __noinline`. This was previously impossible due
to libbpf erroring out on CO-RE relocations pointing to eliminated weak
subprogs. Now that previous patch fixed this issue, switch back to
__weak __hidden as it's a more direct way of specifying the desired
behavior.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220408181425.2287230-3-andrii@kernel.org
During BPF static linking, all the ELF relocations and .BTF.ext
information (including CO-RE relocations) are preserved for __weak
subprograms that were logically overriden by either previous weak
subprogram instance or by corresponding "strong" (non-weak) subprogram.
This is just how native user-space linkers work, nothing new.
But libbpf is over-zealous when processing CO-RE relocation to error out
when CO-RE relocation belonging to such eliminated weak subprogram is
encountered. Instead of erroring out on this expected situation, log
debug-level message and skip the relocation.
Fixes: db2b8b06423c ("libbpf: Support CO-RE relocations for multi-prog sections")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220408181425.2287230-2-andrii@kernel.org
During BTF fix up for global variables, global variable can be global
weak and will have STB_WEAK binding in ELF. Support such global
variables in addition to non-weak ones.
This is not the problem when using BPF static linking, as BPF static
linker "fixes up" BTF during generation so that libbpf doesn't have to
do it anymore during bpf_object__open(), which led to this not being
noticed for a while, along with a pretty rare (currently) use of __weak
variables and maps.
Reported-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220407230446.3980075-2-andrii@kernel.org
Coverity static analyzer complains that strcpy() can cause buffer
overflow. Use libbpf_strlcpy() instead to be 100% sure this doesn't
happen.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220407230446.3980075-1-andrii@kernel.org
Ilya Leoshkevich says:
====================
This series adds USDT support for s390, making the "usdt" test pass
there. Patch 1 is a collection of minor cleanups, patch 2 adds
BPF-side support, patch 3 adds userspace-side support.
====================
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
The logic is superficially similar to that of x86, but the small
differences (no need for register table and dynamic allocation of
register names, no $ sign before constants) make maintaining a common
implementation too burdensome. Therefore simply add a s390x-specific
version of parse_usdt_arg().
Note that while bcc supports index registers, this patch does not. This
should not be a problem in most cases, since s390 uses a default value
"nor" for STAP_SDT_ARG_CONSTRAINT.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220407214411.257260-4-iii@linux.ibm.com
Ido Schimmel says:
====================
net/sched: Better error reporting for offload failures
This patchset improves error reporting to user space when offload fails
during the flow action setup phase. That is, when failures occur in the
actions themselves, even before calling device drivers. Requested /
reported in [1].
This is done by passing extack to the offload_act_setup() callback and
making use of it in the various actions.
Patches #1-#2 change matchall and flower to log error messages to user
space in accordance with the verbose flag.
Patch #3 passes extack to the offload_act_setup() callback from the
various call sites, including matchall and flower.
Patches #4-#11 make use of extack in the various actions to report
offload failures.
Patch #12 adds an error message when the action does not support offload
at all.
Patches #13-#14 change matchall and flower to stop overwriting more
specific error messages.
[1] https://lore.kernel.org/netdev/20220317185249.5mff5u2x624pjewv@skbuf/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The various error paths of tc_setup_offload_action() now report specific
error messages. Remove the generic messages to avoid overwriting the
more specific ones.
Before:
# tc filter add dev dummy0 ingress pref 1 proto ip flower skip_sw dst_ip 198.51.100.1 action police rate 100Mbit burst 10000
Error: cls_flower: Failed to setup flow action.
We have an error talking to the kernel
After:
# tc filter add dev dummy0 ingress pref 1 proto ip flower skip_sw dst_ip 198.51.100.1 action police rate 100Mbit burst 10000
Error: act_police: Offload not supported when conform/exceed action is "reclassify".
We have an error talking to the kernel
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The various error paths of tc_setup_offload_action() now report specific
error messages. Remove the generic messages to avoid overwriting the
more specific ones.
Before:
# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action police rate 100Mbit burst 10000
Error: cls_matchall: Failed to setup flow action.
We have an error talking to the kernel
After:
# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action police rate 100Mbit burst 10000
Error: act_police: Offload not supported when conform/exceed action is "reclassify".
We have an error talking to the kernel
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For better error reporting to user space, add an extack message when the
requested action does not support offload.
Example:
# echo 1 > /sys/kernel/tracing/events/netlink/netlink_extack/enable
# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action nat ingress 192.0.2.1 198.51.100.1
Error: cls_matchall: Failed to setup flow action.
We have an error talking to the kernel
# cat /sys/kernel/tracing/trace_pipe
tc-181 [000] b..1. 88.406093: netlink_extack: msg=Action does not support offload
tc-181 [000] ..... 88.406108: netlink_extack: msg=cls_matchall: Failed to setup flow action
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For better error reporting to user space, add an extack message when
vlan action offload fails.
Currently, the failure cannot be triggered, but add a message in case
the action is extended in the future to support more than the current
set of modes.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For better error reporting to user space, add an extack message when
tunnel_key action offload fails.
Currently, the failure cannot be triggered, but add a message in case
the action is extended in the future to support more than set/release
modes.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For better error reporting to user space, add extack messages when
skbedit action offload fails.
Example:
# echo 1 > /sys/kernel/tracing/events/netlink/netlink_extack/enable
# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action skbedit queue_mapping 1234
Error: cls_matchall: Failed to setup flow action.
We have an error talking to the kernel
# cat /sys/kernel/tracing/trace_pipe
tc-185 [002] b..1. 31.802414: netlink_extack: msg=act_skbedit: Offload not supported when "queue_mapping" option is used
tc-185 [002] ..... 31.802418: netlink_extack: msg=cls_matchall: Failed to setup flow action
# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action skbedit inheritdsfield
Error: cls_matchall: Failed to setup flow action.
We have an error talking to the kernel
# cat /sys/kernel/tracing/trace_pipe
tc-187 [002] b..1. 45.985145: netlink_extack: msg=act_skbedit: Offload not supported when "inheritdsfield" option is used
tc-187 [002] ..... 45.985160: netlink_extack: msg=cls_matchall: Failed to setup flow action
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For better error reporting to user space, add extack messages when
police action offload fails.
Example:
# echo 1 > /sys/kernel/tracing/events/netlink/netlink_extack/enable
# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action police rate 100Mbit burst 10000
Error: cls_matchall: Failed to setup flow action.
We have an error talking to the kernel
# cat /sys/kernel/tracing/trace_pipe
tc-182 [000] b..1. 21.592969: netlink_extack: msg=act_police: Offload not supported when conform/exceed action is "reclassify"
tc-182 [000] ..... 21.592982: netlink_extack: msg=cls_matchall: Failed to setup flow action
# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action police rate 100Mbit burst 10000 conform-exceed drop/continue
Error: cls_matchall: Failed to setup flow action.
We have an error talking to the kernel
# cat /sys/kernel/tracing/trace_pipe
tc-184 [000] b..1. 38.882579: netlink_extack: msg=act_police: Offload not supported when conform/exceed action is "continue"
tc-184 [000] ..... 38.882593: netlink_extack: msg=cls_matchall: Failed to setup flow action
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For better error reporting to user space, add an extack message when
pedit action offload fails.
Currently, the failure cannot be triggered, but add a message in case
the action is extended in the future to support more than set/add
commands.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For better error reporting to user space, add extack messages when mpls
action offload fails.
Example:
# echo 1 > /sys/kernel/tracing/events/netlink/netlink_extack/enable
# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action mpls dec_ttl
Error: cls_matchall: Failed to setup flow action.
We have an error talking to the kernel
# cat /sys/kernel/tracing/trace_pipe
tc-182 [000] b..1. 18.693915: netlink_extack: msg=act_mpls: Offload not supported when "dec_ttl" option is used
tc-182 [000] ..... 18.693921: netlink_extack: msg=cls_matchall: Failed to setup flow action
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For better error reporting to user space, add an extack message when
mirred action offload fails.
Currently, the failure cannot be triggered, but add a message in case
the action is extended in the future to support more than ingress/egress
mirror/redirect.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For better error reporting to user space, add extack messages when gact
action offload fails.
Example:
# echo 1 > /sys/kernel/tracing/events/netlink/netlink_extack/enable
# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action continue
Error: cls_matchall: Failed to setup flow action.
We have an error talking to the kernel
# cat /sys/kernel/tracing/trace_pipe
tc-181 [002] b..1. 105.493450: netlink_extack: msg=act_gact: Offload of "continue" action is not supported
tc-181 [002] ..... 105.493466: netlink_extack: msg=cls_matchall: Failed to setup flow action
# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action reclassify
Error: cls_matchall: Failed to setup flow action.
We have an error talking to the kernel
# cat /sys/kernel/tracing/trace_pipe
tc-183 [002] b..1. 124.126477: netlink_extack: msg=act_gact: Offload of "reclassify" action is not supported
tc-183 [002] ..... 124.126489: netlink_extack: msg=cls_matchall: Failed to setup flow action
# tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action pipe action drop
Error: cls_matchall: Failed to setup flow action.
We have an error talking to the kernel
# cat /sys/kernel/tracing/trace_pipe
tc-185 [002] b..1. 137.097791: netlink_extack: msg=act_gact: Offload of "pipe" action is not supported
tc-185 [002] ..... 137.097804: netlink_extack: msg=cls_matchall: Failed to setup flow action
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The callback is used by various actions to populate the flow action
structure prior to offload. Pass extack to this callback so that the
various actions will be able to report accurate error messages to user
space.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The verbose flag was added in commit 81c7288b170a ("sched: cls: enable
verbose logging") to avoid suppressing logging of error messages that
occur "when the rule is not to be exclusively executed by the hardware".
However, such error messages are currently suppressed when setup of flow
action fails. Take the verbose flag into account to avoid suppressing
error messages. This is done by using the extack pointer initialized by
tc_cls_common_offload_init(), which performs the necessary checks.
Before:
# tc filter add dev dummy0 ingress pref 1 proto ip flower dst_ip 198.51.100.1 action police rate 100Mbit burst 10000
# tc filter add dev dummy0 ingress pref 2 proto ip flower verbose dst_ip 198.51.100.1 action police rate 100Mbit burst 10000
After:
# tc filter add dev dummy0 ingress pref 1 proto ip flower dst_ip 198.51.100.1 action police rate 100Mbit burst 10000
# tc filter add dev dummy0 ingress pref 2 proto ip flower verbose dst_ip 198.51.100.1 action police rate 100Mbit burst 10000
Warning: cls_flower: Failed to setup flow action.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The verbose flag was added in commit 81c7288b170a ("sched: cls: enable
verbose logging") to avoid suppressing logging of error messages that
occur "when the rule is not to be exclusively executed by the hardware".
However, such error messages are currently suppressed when setup of flow
action fails. Take the verbose flag into account to avoid suppressing
error messages. This is done by using the extack pointer initialized by
tc_cls_common_offload_init(), which performs the necessary checks.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
t-queue
Tony Nguyen says:
====================
100GbE Intel Wired LAN Driver Updates 2022-04-07
Alexander Lobakin says:
This hunts down several places around packet templates/dummies for
switch rules which are either repetitive, fragile or just not
really readable code.
It's a common need to add new packet templates and to review such
changes as well, try to simplify both with the help of a pair
macros and aliases.
ice_find_dummy_packet() became very complex at this point with tons
of nested if-elses. It clearly showed this approach does not scale,
so convert its logics to the simple mask-match + static const array.
bloat-o-meter is happy about that (built w/ LLVM 13):
add/remove: 0/1 grow/shrink: 1/1 up/down: 2/-1058 (-1056)
Function old new delta
ice_fill_adv_dummy_packet 289 291 +2
ice_adv_add_update_vsi_list 201 - -201
ice_add_adv_rule 2950 2093 -857
Total: Before=414512, After=413456, chg -0.25%
add/remove: 53/52 grow/shrink: 0/0 up/down: 4660/-3988 (672)
RO Data old new delta
ice_dummy_pkt_profiles - 672 +672
Total: Before=37895, After=38567, chg +1.77%
Diffstat also looks nice, and adding new packet templates now takes
less lines.
We'll probably come out with dynamic template crafting in a while,
but for now let's improve what we have currently.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Potin Lai says:
====================
mdio: aspeed: Add Clause 45 support for Aspeed MDIO
This patch series add Clause 45 support for Aspeed MDIO driver, and
separate c22 and c45 implementation into different functions.
LINK: [v1] https://lore.kernel.org/all/20220329161949.19762-1-potin.lai@quantatw.com/
LINK: [v2] https://lore.kernel.org/all/20220406170055.28516-1-potin.lai@quantatw.com/
Changes v2 --> v3:
- sort local variable sequence in reverse Christmas tree format.
Changes v1 --> v2:
- add C45 to probe_capabilities
- break one patch into 3 small patches
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add Clause 45 support for Aspeed mdio driver.
Signed-off-by: Potin Lai <potin.lai@quantatw.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add following additional functions to move out the implementation from
aspeed_mdio_read() and aspeed_mdio_write().
c22:
- aspeed_mdio_read_c22()
- aspeed_mdio_write_c22()
c45:
- aspeed_mdio_read_c45()
- aspeed_mdio_write_c45()
Signed-off-by: Potin Lai <potin.lai@quantatw.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>