IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
We have switched to memcg-based memory accouting and thus the rlimit is
not needed any more. LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK was introduced in
libbpf for backward compatibility, so we can use it instead now.
This patch also removes the useless header sys/resource.h from many files
in samples/bpf.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220409125958.92629-2-laoar.shao@gmail.com
When running xdpsock for a fix duration of time before terminating
using --duration=<n>, there is a race condition that may cause xdpsock
to terminate immediately.
When running for a fixed duration of time the check to determine when to
terminate execution is in is_benchmark_done() and is being executed in
the context of the poller thread,
if (opt_duration > 0) {
unsigned long dt = (get_nsecs() - start_time);
if (dt >= opt_duration)
benchmark_done = true;
}
However start_time is only set after the poller thread have been
created. This leaves a small window when the poller thread is starting
and calls is_benchmark_done() for the first time that start_time is not
yet set. In that case start_time have its initial value of 0 and the
duration check fails as it do not correlate correctly for the
applications start time and immediately sets benchmark_done which in
turn terminates the xdpsock application.
Fix this by setting start_time before creating the poller thread.
Fixes: d3f11b018f ("samples/bpf: xdpsock: Add duration option to specify how long to run")
Signed-off-by: Niklas Söderlund <niklas.soderlund@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220315102948.466436-1-niklas.soderlund@corigine.com
It may be useful to add timestamp for Tx packets for continuous or cyclic
transmit operation. The timestamp and sequence ID of a Tx packet are
stored according to pktgen header format. To enable per-packet timestamp,
use -y|--tstamp option. If timestamp is off, pktgen header is not
included in the UDP payload. This means receiving side can use the magic
number for pktgen for differentiation.
The implementation supports both VLAN tagged and untagged option. By
default, the minimum packet size is set at 64B. However, if VLAN tagged
is on (-V), the minimum packet size is increased to 66B just so to fit
the pktgen_hdr size.
Added hex_dump() into the code path just for future cross-checking.
As before, simply change to "#define DEBUG_HEXDUMP 1" to inspect the
accuracy of TX packet.
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211230035447.523177-8-boon.leong.ong@intel.com
When user sets tx-pkt-count and in case where there are invalid Tx frame,
the complete_tx_only_all() process polls indefinitely. So, this patch
adds a time-out mechanism into the process so that the application
can terminate automatically after it retries 3*polling interval duration.
v1->v2:
Thanks to Jesper's and Song Liu's suggestion.
- clean-up git message to remove polling log
- make the Tx time-out retries configurable with 1s granularity
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211230035447.523177-7-boon.leong.ong@intel.com
By default, TX schedule policy is SCHED_OTHER (round-robin time-sharing).
To improve TX cyclic scheduling, we add SCHED_FIFO policy and its priority
by using -W FIFO or --policy=FIFO and -U <PRIO> or --schpri=<PRIO>.
A) From xdpsock --app-stats, for SCHED_OTHER policy:
$ xdpsock -i eth0 -t -N -z -T 1000 -b 16 -C 100000 -a
period min ave max cycle
Cyclic TX 1000000 53507 75334 712642 6250
B) For SCHED_FIFO policy and schpri=50:
$ xdpsock -i eth0 -t -N -z -T 1000 -b 16 -C 100000 -a -W FIFO -U 50
period min ave max cycle
Cyclic TX 1000000 3699 24859 54397 6250
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211230035447.523177-6-boon.leong.ong@intel.com
Tx cycle time is in micro-seconds unit. By combining the batch size (-b M)
and Tx cycle time (-T|--tx-cycle N), xdpsock now can transmit batch-size of
packets every N-us periodically. Cyclic TX operation is not applicable if
--poll mode is used.
To transmit 16 packets every 1ms cycle time for total of 100000 packets
silently:
$ xdpsock -i eth0 -T -N -z -T 1000 -b 16 -C 100000
To print cyclic TX schedule variance stats, use --app-stats|-a:
$ xdpsock -i eth0 -T -N -z -T 1000 -b 16 -C 100000 -a
sock0@eth0:0 txonly xdp-drv
pps pkts 0.00
rx 0 0
tx 0 100000
calls/s count
rx empty polls 0 0
fill fail polls 0 0
copy tx sendtos 0 0
tx wakeup sendtos 0 6254
opt polls 0 0
period min ave max cycle
Cyclic TX 1000000 53507 75334 712642 6250
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211230035447.523177-5-boon.leong.ong@intel.com
To set Dest MAC address (-G|--tx-dmac) only:
$ xdpsock -i eth0 -t -N -z -G aa:bb:cc:dd:ee:ff
To set Source MAC address (-H|--tx-smac) only:
$ xdpsock -i eth0 -t -N -z -H 11:22:33:44:55:66
To set both Dest and Source MAC address:
$ xdpsock -i eth0 -t -N -z -G aa:bb:cc:dd:ee:ff \
-H 11:22:33:44:55:66
The default Dest and Source MAC address remain the same as before.
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/20211230035447.523177-3-boon.leong.ong@intel.com
In multi-queue environment testing, the support for VLAN-tag based
steering is useful. So, this patch adds the capability to add
VLAN tag (VLAN ID and Priority) to the generated Tx frame.
To set the VLAN ID=10 and Priority=2 for Tx only through TxQ=3:
$ xdpsock -i eth0 -t -N -z -q 3 -V -J 10 -K 2
If VLAN ID (-J) and Priority (-K) is set, it default to
VLAN ID = 1
VLAN Priority = 0.
For example, VLAN-tagged Tx only, xdp copy mode through TxQ=1:
$ xdpsock -i eth0 -t -N -c -q 1 -V
Signed-off-by: Ong Boon Leong <boon.leong.ong@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211230035447.523177-2-boon.leong.ong@intel.com
The xdpsock sample application is a useful base for experiment's around
AF_XDP sockets. Compiling the sample outside of the kernel tree is made
harder then it has to be as the sample includes two headers and that are
not installed by 'make install_header' nor are usually part of
distributions kernel headers.
The first header asm/barrier.h is not used and can just be dropped.
The second linux/compiler.h are only needed for the decorator __force
and are only used in ip_fast_csum(), csum_fold() and
csum_tcpudp_nofold(). These functions are copied verbatim from
include/asm-generic/checksum.h and lib/checksum.c. While it's fine to
copy and use these functions in the sample application the decorator
brings no value and can be dropped together with the include.
With this change it's trivial to compile the xdpsock sample outside the
kernel tree from xdpsock_user.c and xdpsock.h.
$ gcc -o xdpsock xdpsock_user.c -lbpf -lpthread
Signed-off-by: Niklas Söderlund <niklas.soderlund@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Link: https://lore.kernel.org/bpf/20210806122855.26115-2-simon.horman@corigine.com
Execute the following command and exit, then execute it again, the following
error will be reported:
$ sudo ./samples/bpf/xdpsock -i ens4f2 -M
^C
$ sudo ./samples/bpf/xdpsock -i ens4f2 -M
libbpf: elf: skipping unrecognized data section(16) .eh_frame
libbpf: elf: skipping relo section(17) .rel.eh_frame for section(16) .eh_frame
libbpf: Kernel error message: XDP program already attached
ERROR: link set xdp fd failed
Commit c9d27c9e8d ("samples: bpf: Do not unload prog within xdpsock") removed
the unloading prog code because of the presence of bpf_link. This is fine if
XDP_SHARED_UMEM is disabled, but if it is enabled, unloading the prog is still
needed.
Fixes: c9d27c9e8d ("samples: bpf: Do not unload prog within xdpsock")
Signed-off-by: Wang Hai <wanghai38@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20210628091815.2373487-1-wanghai38@huawei.com
With the introduction of bpf_link in xsk's libbpf part, there's no
further need for explicit unload of prog on xdpsock's termination. When
process dies, the bpf_link's refcount will be decremented and resources
will be unloaded/freed under the hood in case when there are no more
active users.
While at it, don't dump stats on error path.
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20210329224316.17793-8-maciej.fijalkowski@intel.com
Fix a possible hang in xdpsock that can occur when using multiple
threads. In this case, one or more of the threads might get stuck in
the while-loop in tx_only after the user has signaled the main thread
to stop execution. In this case, no more Tx packets will be sent, so a
thread might get stuck in the aforementioned while-loop. Fix this by
introducing a test inside the while-loop to check if the benchmark has
been terminated. If so, return from the function.
Fixes: cd9e72b6f2 ("samples/bpf: xdpsock: Add option to specify batch size")
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20201210163407.22066-1-magnus.karlsson@gmail.com
Introduce a sample program to demonstrate the control and data
plane split. For the control plane part a new program called
xdpsock_ctrl_proc is introduced. For the data plane part, some code
was added to xdpsock_user.c to act as the data plane entity.
Application xdpsock_ctrl_proc works as control entity with sudo
privileges (CAP_SYS_ADMIN and CAP_NET_ADMIN are sufficient) and the
extended xdpsock as data plane entity with CAP_NET_RAW capability
only.
Usage example:
sudo ./samples/bpf/xdpsock_ctrl_proc -i <interface>
sudo ./samples/bpf/xdpsock -i <interface> -q <queue_id>
-n <interval> -N -l -R
Signed-off-by: Mariusz Dudek <mariuszx.dudek@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20201203090546.11976-3-mariuszx.dudek@intel.com
Increment the statistics over how many Tx packets have been sent at
the time of sending instead of at the time of completion. This as a
completion event means that the buffer has been sent AND returned to
user space. The packet always gets sent shortly after sendto() is
called. The kernel might, for performance reasons, decide to not
return every single buffer to user space immediately after sending,
for example, only after a batch of packets have been
transmitted. Incrementing the number of packets sent at completion,
will in that case be confusing as if you send a single packet, the
counter might show zero for a while even though the packet has been
transmitted.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/1605525167-14450-2-git-send-email-magnus.karlsson@gmail.com
Add an option to count the number of interrupts generated per second and
total number of interrupts during the lifetime of the application for a
given interface. This information is extracted from /proc/interrupts. Since
there is no naming convention across drivers, the user must provide the
string which is specific to their interface in the /proc/interrupts file on
the command line.
Usage:
./xdpsock ... -I <irq_str>
eg. for queue 0 of i40e device eth0:
./xdpsock ... -I i40e-eth0-TxRx-0
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20201002133612.31536-3-ciara.loftus@intel.com
Categorise and record syscalls issued in the xdpsock sample app. The
categories recorded are:
rx_empty_polls: polls when the rx ring is empty
fill_fail_polls: polls when failed to get addr from fill ring
copy_tx_sendtos: sendtos issued for tx when copy mode enabled
tx_wakeup_sendtos: sendtos issued when tx ring needs waking up
opt_polls: polls issued since the '-p' flag is set
Print the stats using '-a' on the xdpsock command line.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20201002133612.31536-2-ciara.loftus@intel.com
Fix a possible deadlock in the l2fwd application in xdpsock that can
occur when there is no space in the Tx ring. There are two ways to get
the kernel to consume entries in the Tx ring: calling sendto() to make
it send packets and freeing entries from the completion ring, as the
kernel will not send a packet if there is no space for it to add a
completion entry in the completion ring. The Tx loop in l2fwd only
used to call sendto(). This patches adds cleaning the completion ring
in that loop.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1599726666-8431-3-git-send-email-magnus.karlsson@gmail.com
Fix the sending of a single packet (or small burst) in xdpsock when
executing in copy mode. Currently, the l2fwd application in xdpsock
only transmits the packets after a batch of them has been received,
which might be confusing if you only send one packet and expect that
it is returned pronto. Fix this by calling sendto() more often and add
a comment in the code that states that this can be optimized if
needed.
Reported-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1599726666-8431-2-git-send-email-magnus.karlsson@gmail.com
Optimize the throughput performance of the l2fwd sub-app in the
xdpsock sample application by removing a duplicate syscall and
increasing the size of the fill ring.
The latter needs some further explanation. We recommend that you set
the fill ring size >= HW RX ring size + AF_XDP RX ring size. Make sure
you fill up the fill ring with buffers at regular intervals, and you
will with this setting avoid allocation failures in the driver. These
are usually quite expensive since drivers have not been written to
assume that allocation failures are common. For regular sockets,
kernel allocated memory is used that only runs out in OOM situations
that should be rare.
These two performance optimizations together lead to a 6% percent
improvement for the l2fwd app on my machine.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/1598619065-1944-1-git-send-email-magnus.karlsson@intel.com
Fix all files in samples/bpf to include libbpf header files with the bpf/
prefix, to be consistent with external users of the library. Also ensure
that all includes of exported libbpf header files (those that are exported
on 'make install' of the library) use bracketed includes instead of quoted.
To make sure no new files are introduced that doesn't include the bpf/
prefix in its include, remove tools/lib/bpf from the include path entirely,
and use tools/lib instead.
Fixes: 6910d7d386 ("selftests/bpf: Ensure bpf_helper_defs.h are taken from selftests dir")
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157952560911.1683545.8795966751309534150.stgit@toke.dk
New option '-s' or '--tx-pkt-size' has been added to specify the transmit
packet size. The packet size ranges from 47 to 4096 bytes. When this
option is not provided, it defaults to 64 byte packet.
The code uses struct ethhdr, struct iphdr and struct udphdr to form the
transmit packet. The MAC address, IP address and UDP ports are set to default
values.
The code calculates IP and UDP checksums before sending the packet.
Checksum calculation code in Linux kernel is used for this purpose.
The Ethernet FCS is not filled by the code.
Signed-off-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191220085530.4980-6-jay.jayatheerthan@intel.com
When attaching XDP programs, userspace can set flags to request the attach
mode (generic/SKB mode, driver mode or hw offloaded mode). If no such flags
are requested, the kernel will attempt to attach in driver mode, and then
silently fall back to SKB mode if this fails.
The silent fallback is a major source of user confusion, as users will try
to load a program on a device without XDP support, and instead of an error
they will get the silent fallback behaviour, not notice, and then wonder
why performance is not what they were expecting.
In an attempt to combat this, let's switch all the samples to default to
explicitly requesting driver-mode attach. As part of this, ensure that all
the userspace utilities have a switch to enable SKB mode. For those that
have a switch to request driver mode, keep it but turn it into a no-op.
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: David Ahern <dsahern@gmail.com>
Link: https://lore.kernel.org/bpf/20191216110742.364456-1-toke@redhat.com