linux/tools/testing/selftests/bpf/progs/trigger_bench.c
Andrii Nakryiko d41bc48bfa selftests/bpf: Add uprobe triggering overhead benchmarks
Add a benchmark to measure the overhead of uprobes and uretprobes. Also
add a baseline (no uprobe attached) benchmark.

On my dev machine, the baseline benchmark can trigger 130M user_target()
invocations per second. With a uprobe attached, this falls to just 700K
per second; with a uretprobe, to 520K:

  $ sudo ./bench trig-uprobe-base -a
  Summary: hits  131.289 ± 2.872M/s

  # UPROBE
  $ sudo ./bench -a trig-uprobe-without-nop
  Summary: hits    0.729 ± 0.007M/s

  $ sudo ./bench -a trig-uprobe-with-nop
  Summary: hits    1.798 ± 0.017M/s

  # URETPROBE
  $ sudo ./bench -a trig-uretprobe-without-nop
  Summary: hits    0.508 ± 0.012M/s

  $ sudo ./bench -a trig-uretprobe-with-nop
  Summary: hits    0.883 ± 0.008M/s

So there is almost a 2.5x performance difference between probing a nop
vs a non-nop instruction for an entry uprobe, and a 1.7x difference for
a uretprobe.
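
The gap presumably comes from how the kernel handles the probed
instruction: a nop can be handled on uprobe's cheaper emulation path,
while other instructions fall back to single-stepping a copy
out-of-line. A minimal sketch of what the two user-space attach targets
could look like (function names and attributes here are illustrative
assumptions, not the benchmark's exact code):

  /* Hypothetical attach targets; 'weak' plus inline asm keep the
   * compiler from inlining them or optimizing the calls away. */
  __attribute__((weak)) void uprobe_target_with_nop(void)
  {
  	asm volatile ("nop"); /* probe lands on a nop: the cheap case */
  }

  __attribute__((weak)) void uprobe_target_without_nop(void)
  {
  	asm volatile (""); /* no nop: probe lands on a "real" instruction */
  }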

This means per-invocation overhead is around 1.4 microseconds for a
non-nop uprobe and 2 microseconds for a non-nop uretprobe.

For the nop variants, uprobe and uretprobe overhead drops to 0.556 and
1.13 microseconds, respectively.
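
These microsecond figures fall straight out of inverting the measured
rates; at ~131M/s the baseline costs only ~8ns per call, so it barely
moves the result. A minimal sketch of the arithmetic (rates hardcoded
from the runs above):

  #include <stdio.h>

  /* per-hit overhead = 1/attached_rate - 1/baseline_rate
   * (rates in M hits/s, result in microseconds) */
  int main(void)
  {
  	double base = 131.289;
  	double rate[] = { 0.729, 1.798, 0.508, 0.883 };
  	const char *name[] = {
  		"uprobe (non-nop)", "uprobe (nop)",
  		"uretprobe (non-nop)", "uretprobe (nop)",
  	};

  	/* prints roughly 1.36, 0.55, 1.96 and 1.12 us/hit */
  	for (int i = 0; i < 4; i++)
  		printf("%-20s %6.3f us/hit\n", name[i],
  		       1.0 / rate[i] - 1.0 / base);
  	return 0;
  }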

For comparison, just doing a very low-overhead syscall (with no BPF
programs attached anywhere) gives:

  $ sudo ./bench trig-base -a
  Summary: hits    4.830 ± 0.036M/s

So even the fastest, nop-based uprobe is about 2.7x slower than a pure
syscall, i.e. a bare kernel/user mode switch.
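
The trig-base producer is essentially a thread spinning on a cheap
syscall with nothing attached to it; a rough sketch of its shape (an
assumption about the harness, not its exact code):

  #include <unistd.h>
  #include <sys/syscall.h>

  /* tight loop of getpgid() syscalls; with no BPF program attached
   * this measures the raw syscall round trip */
  static void *syscall_producer(void *arg)
  {
  	(void)arg;
  	for (;;)
  		syscall(__NR_getpgid);
  	return NULL;
  }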

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211116013041.4072571-1-andrii@kernel.org
2021-11-16 14:46:49 +01:00

// SPDX-License-Identifier: GPL-2.0
// Copyright (c) 2020 Facebook
#include <linux/bpf.h>
#include <asm/unistd.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

char _license[] SEC("license") = "GPL";

/* Shared hit counter: every program below bumps it atomically so that
 * user space can compute a triggering rate. */
long hits = 0;

SEC("tp/syscalls/sys_enter_getpgid")
int bench_trigger_tp(void *ctx)
{
	__sync_add_and_fetch(&hits, 1);
	return 0;
}

SEC("raw_tp/sys_enter")
int BPF_PROG(bench_trigger_raw_tp, struct pt_regs *regs, long id)
{
	/* raw_tp fires for every syscall, so filter to getpgid only */
	if (id == __NR_getpgid)
		__sync_add_and_fetch(&hits, 1);
	return 0;
}

SEC("kprobe/__x64_sys_getpgid")
int bench_trigger_kprobe(void *ctx)
{
	__sync_add_and_fetch(&hits, 1);
	return 0;
}

SEC("fentry/__x64_sys_getpgid")
int bench_trigger_fentry(void *ctx)
{
	__sync_add_and_fetch(&hits, 1);
	return 0;
}

SEC("fentry.s/__x64_sys_getpgid")
int bench_trigger_fentry_sleep(void *ctx)
{
	__sync_add_and_fetch(&hits, 1);
	return 0;
}

SEC("fmod_ret/__x64_sys_getpgid")
int bench_trigger_fmodret(void *ctx)
{
	__sync_add_and_fetch(&hits, 1);
	return -22; /* -EINVAL: overrides getpgid()'s return value */
}

SEC("uprobe/self/uprobe_target")
int bench_trigger_uprobe(void *ctx)
{
	__sync_add_and_fetch(&hits, 1);
	return 0;
}
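
For the uprobe benchmarks, user space attaches bench_trigger_uprobe to
a target function inside its own executable. A hedged sketch of what
that attachment could look like with libbpf (the helper name and the
offset resolution are assumptions; the real logic lives in the bench
harness, not in this file):

  #include <stdbool.h>
  #include <stdio.h>
  #include <bpf/libbpf.h>

  /* Sketch: attach prog as an entry (or return) uprobe on a function in
   * our own binary; func_offset (the target's file offset) is assumed
   * to be resolved elsewhere, e.g. from the ELF symbol table. */
  static struct bpf_link *attach_self_uprobe(struct bpf_program *prog,
  					     bool retprobe, size_t func_offset)
  {
  	struct bpf_link *link;

  	link = bpf_program__attach_uprobe(prog, retprobe, -1 /* any PID */,
  					  "/proc/self/exe", func_offset);
  	if (!link)
  		fprintf(stderr, "failed to attach uprobe\n");
  	return link;
  }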