Jesper Dangaard Brouer eff94154cc samples/bpf: xdp_redirect_cpu_user: Cpumap qsize set larger default
Experience from production shows queue size of 192 is too small, as
this caused packet drops during cpumap-enqueue on RX-CPU.  This can be
diagnosed with xdp_monitor sample program.

This bpftrace program was used to diagnose the problem in more detail:

 bpftrace -e '
  tracepoint:xdp:xdp_cpumap_kthread { @deq_bulk = lhist(args->processed,0,10,1); @drop_net = lhist(args->drops,0,10,1) }
  tracepoint:xdp:xdp_cpumap_enqueue { @enq_bulk = lhist(args->processed,0,10,1); @enq_drops = lhist(args->drops,0,10,1); }'

Watch out for the @enq_drops counter. The @drop_net counter can happen
when netstack gets invalid packets, so don't despair it can be
natural, and that counter will likely disappear in newer kernels as it
was a source of confusion (look at netstat info for reason of the
netstack @drop_net counters).

The production system was configured with CPU power-saving C6 state.
Learn more in this blogpost[1].

And wakeup latency in usec for the states are:

 # grep -H . /sys/devices/system/cpu/cpu0/cpuidle/*/latency
 /sys/devices/system/cpu/cpu0/cpuidle/state0/latency:0
 /sys/devices/system/cpu/cpu0/cpuidle/state1/latency:2
 /sys/devices/system/cpu/cpu0/cpuidle/state2/latency:10
 /sys/devices/system/cpu/cpu0/cpuidle/state3/latency:133

Deepest state take 133 usec to wakeup from (133/10^6). The link speed
is 25Gbit/s ((25*10^9/8) in bytes/sec). How many bytes can arrive with
in 133 usec at this speed: (25*10^9/8)*(133/10^6) = 415625 bytes. With
MTU size packets this is 275 packets, and with minimum Ethernet (incl
intergap overhead) 84 bytes it is 4948 packets. Clearly default queue
size is too small.

Setting default cpumap queue to 2048 as worst-case (small packet) at
10Gbit/s is 1979 packets with 133 usec wakeup time, +64 packet before
kthread wakeup call (due to xdp_do_flush) worst-case 2043 packets.

Thus, if a packet burst on RX-CPU will enqueue packets to a remote
cpumap CPU that is in deep-sleep state it can overrun the cpumap queue.

The production system was also configured to avoid deep-sleep via:
 tuned-adm profile network-latency

[1] https://jeremyeder.com/2013/08/30/oh-did-you-expect-the-cpu/

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/162523477604.786243.13372630844944530891.stgit@firesoul
2021-07-07 20:11:48 -07:00
..
2019-05-31 16:41:29 -07:00
2019-05-31 16:41:29 -07:00
2021-03-25 22:03:46 -07:00
2021-03-25 22:03:46 -07:00

eBPF sample programs
====================

This directory contains a test stubs, verifier test-suite and examples
for using eBPF. The examples use libbpf from tools/lib/bpf.

Build dependencies
==================

Compiling requires having installed:
 * clang >= version 3.4.0
 * llvm >= version 3.7.1

Note that LLVM's tool 'llc' must support target 'bpf', list version
and supported targets with command: ``llc --version``

Clean and configuration
-----------------------

It can be needed to clean tools, samples or kernel before trying new arch or
after some changes (on demand)::

 make -C tools clean
 make -C samples/bpf clean
 make clean

Configure kernel, defconfig for instance::

 make defconfig

Kernel headers
--------------

There are usually dependencies to header files of the current kernel.
To avoid installing devel kernel headers system wide, as a normal
user, simply call::

 make headers_install

This will creates a local "usr/include" directory in the git/build top
level directory, that the make system automatically pickup first.

Compiling
=========

For building the BPF samples, issue the below command from the kernel
top level directory::

 make M=samples/bpf

It is also possible to call make from this directory.  This will just
hide the invocation of make as above.

Manually compiling LLVM with 'bpf' support
------------------------------------------

Since version 3.7.0, LLVM adds a proper LLVM backend target for the
BPF bytecode architecture.

By default llvm will build all non-experimental backends including bpf.
To generate a smaller llc binary one can use::

 -DLLVM_TARGETS_TO_BUILD="BPF"

We recommend that developers who want the fastest incremental builds
use the Ninja build system, you can find it in your system's package
manager, usually the package is ninja or ninja-build.

Quick sniplet for manually compiling LLVM and clang
(build dependencies are ninja, cmake and gcc-c++)::

 $ git clone https://github.com/llvm/llvm-project.git
 $ mkdir -p llvm-project/llvm/build
 $ cd llvm-project/llvm/build
 $ cmake .. -G "Ninja" -DLLVM_TARGETS_TO_BUILD="BPF;X86" \
            -DLLVM_ENABLE_PROJECTS="clang"    \
            -DCMAKE_BUILD_TYPE=Release        \
            -DLLVM_BUILD_RUNTIME=OFF
 $ ninja

It is also possible to point make to the newly compiled 'llc' or
'clang' command via redefining LLC or CLANG on the make command line::

 make M=samples/bpf LLC=~/git/llvm-project/llvm/build/bin/llc CLANG=~/git/llvm-project/llvm/build/bin/clang

Cross compiling samples
-----------------------
In order to cross-compile, say for arm64 targets, export CROSS_COMPILE and ARCH
environment variables before calling make. But do this before clean,
cofiguration and header install steps described above. This will direct make to
build samples for the cross target::

 export ARCH=arm64
 export CROSS_COMPILE="aarch64-linux-gnu-"

Headers can be also installed on RFS of target board if need to keep them in
sync (not necessarily and it creates a local "usr/include" directory also)::

 make INSTALL_HDR_PATH=~/some_sysroot/usr headers_install

Pointing LLC and CLANG is not necessarily if it's installed on HOST and have
in its targets appropriate arm64 arch (usually it has several arches).
Build samples::

 make M=samples/bpf

Or build samples with SYSROOT if some header or library is absent in toolchain,
say libelf, providing address to file system containing headers and libs,
can be RFS of target board::

 make M=samples/bpf SYSROOT=~/some_sysroot