Merge branch 'bpf: Support calling kernel function'

Martin KaFai says:

====================

This series adds support to allow bpf program calling kernel function.

The use case included in this set is to allow bpf-tcp-cc to directly
call some tcp-cc helper functions (e.g. "tcp_cong_avoid_ai()").  Those
functions have already been used by some kernel tcp-cc implementations.

This set will also allow the bpf-tcp-cc program to directly call the
kernel tcp-cc implementation,  For example, a bpf_dctcp may only want to
implement its own dctcp_cwnd_event() and reuse other dctcp_*() directly
from the kernel tcp_dctcp.c instead of reimplementing (or
copy-and-pasting) them.

The tcp-cc kernel functions mentioned above will be white listed
for the struct_ops bpf-tcp-cc programs to use in a later patch.
The white listed functions are not bounded to a fixed ABI contract.
Those functions have already been used by the existing kernel tcp-cc.
If any of them has changed, both in-tree and out-of-tree kernel tcp-cc
implementations have to be changed.  The same goes for the struct_ops
bpf-tcp-cc programs which have to be adjusted accordingly.

Please see individual patch for details.

v2:
- Patch 2 in v1 is removed.  No need to support extern func in kernel.
  Changed libbpf to adjust the .ksyms datasec for extern func
  in patch 11. (Andrii)
- Name change: btf_check_func_arg_match() and btf_check_subprog_arg_match()
  in patch 2. (Andrii)
- Always set unreliable on any error in patch 2 since it does not
  matter. (Andrii)
- s/kern_func/kfunc/ and s/descriptor/desc/ in this set. (Andrii)
- Remove some unnecessary changes in disasm.h and disasm.c
  in patch 3.  In particular, no need to change the function
  signature in bpf_insn_revmap_call_t.  Also, removed the changes
  in print_bpf_insn().
- Fixed an issue in check_kfunc_call() when the calling kernel function
  returns a pointer in patch 3.  Added a selftest.
- Adjusted the verifier selftests due to the changes in the verifier log
  in patch 3.
- Fixed a comparison issue in kfunc_desc_cmp_by_imm() in patch 3. (Andrii)
- Name change: is_ldimm64_insn(),
  new helper: is_call_insn() in patch 10 (Andrii)
- Move btf_func_linkage() from btf.h to libbpf.c in patch 11. (Andrii)
- Fixed the linker error when CONFIG_BPF_SYSCALL is not defined.
  Moved the check_kfunc_call from filter.c to test_run.c in patch 14.
  (kernel test robot)
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
This commit is contained in:
Alexei Starovoitov 2021-03-26 20:29:06 -07:00
commit fddbf4b6dc
25 changed files with 1352 additions and 337 deletions

View File

@ -2346,3 +2346,8 @@ out:
tmp : orig_prog);
return prog;
}
bool bpf_jit_supports_kfunc_call(void)
{
return true;
}

View File

@ -1390,6 +1390,19 @@ static inline void emit_push_r64(const u8 src[], u8 **pprog)
*pprog = prog;
}
static void emit_push_r32(const u8 src[], u8 **pprog)
{
u8 *prog = *pprog;
int cnt = 0;
/* mov ecx,dword ptr [ebp+off] */
EMIT3(0x8B, add_2reg(0x40, IA32_EBP, IA32_ECX), STACK_VAR(src_lo));
/* push ecx */
EMIT1(0x51);
*pprog = prog;
}
static u8 get_cond_jmp_opcode(const u8 op, bool is_cmp_lo)
{
u8 jmp_cond;
@ -1459,6 +1472,174 @@ static u8 get_cond_jmp_opcode(const u8 op, bool is_cmp_lo)
return jmp_cond;
}
/* i386 kernel compiles with "-mregparm=3". From gcc document:
*
* ==== snippet ====
* regparm (number)
* On x86-32 targets, the regparm attribute causes the compiler
* to pass arguments number one to (number) if they are of integral
* type in registers EAX, EDX, and ECX instead of on the stack.
* Functions that take a variable number of arguments continue
* to be passed all of their arguments on the stack.
* ==== snippet ====
*
* The first three args of a function will be considered for
* putting into the 32bit register EAX, EDX, and ECX.
*
* Two 32bit registers are used to pass a 64bit arg.
*
* For example,
* void foo(u32 a, u32 b, u32 c, u32 d):
* u32 a: EAX
* u32 b: EDX
* u32 c: ECX
* u32 d: stack
*
* void foo(u64 a, u32 b, u32 c):
* u64 a: EAX (lo32) EDX (hi32)
* u32 b: ECX
* u32 c: stack
*
* void foo(u32 a, u64 b, u32 c):
* u32 a: EAX
* u64 b: EDX (lo32) ECX (hi32)
* u32 c: stack
*
* void foo(u32 a, u32 b, u64 c):
* u32 a: EAX
* u32 b: EDX
* u64 c: stack
*
* The return value will be stored in the EAX (and EDX for 64bit value).
*
* For example,
* u32 foo(u32 a, u32 b, u32 c):
* return value: EAX
*
* u64 foo(u32 a, u32 b, u32 c):
* return value: EAX (lo32) EDX (hi32)
*
* Notes:
* The verifier only accepts function having integer and pointers
* as its args and return value, so it does not have
* struct-by-value.
*
* emit_kfunc_call() finds out the btf_func_model by calling
* bpf_jit_find_kfunc_model(). A btf_func_model
* has the details about the number of args, size of each arg,
* and the size of the return value.
*
* It first decides how many args can be passed by EAX, EDX, and ECX.
* That will decide what args should be pushed to the stack:
* [first_stack_regno, last_stack_regno] are the bpf regnos
* that should be pushed to the stack.
*
* It will first push all args to the stack because the push
* will need to use ECX. Then, it moves
* [BPF_REG_1, first_stack_regno) to EAX, EDX, and ECX.
*
* When emitting a call (0xE8), it needs to figure out
* the jmp_offset relative to the jit-insn address immediately
* following the call (0xE8) instruction. At this point, it knows
* the end of the jit-insn address after completely translated the
* current (BPF_JMP | BPF_CALL) bpf-insn. It is passed as "end_addr"
* to the emit_kfunc_call(). Thus, it can learn the "immediate-follow-call"
* address by figuring out how many jit-insn is generated between
* the call (0xE8) and the end_addr:
* - 0-1 jit-insn (3 bytes each) to restore the esp pointer if there
* is arg pushed to the stack.
* - 0-2 jit-insns (3 bytes each) to handle the return value.
*/
static int emit_kfunc_call(const struct bpf_prog *bpf_prog, u8 *end_addr,
const struct bpf_insn *insn, u8 **pprog)
{
const u8 arg_regs[] = { IA32_EAX, IA32_EDX, IA32_ECX };
int i, cnt = 0, first_stack_regno, last_stack_regno;
int free_arg_regs = ARRAY_SIZE(arg_regs);
const struct btf_func_model *fm;
int bytes_in_stack = 0;
const u8 *cur_arg_reg;
u8 *prog = *pprog;
s64 jmp_offset;
fm = bpf_jit_find_kfunc_model(bpf_prog, insn);
if (!fm)
return -EINVAL;
first_stack_regno = BPF_REG_1;
for (i = 0; i < fm->nr_args; i++) {
int regs_needed = fm->arg_size[i] > sizeof(u32) ? 2 : 1;
if (regs_needed > free_arg_regs)
break;
free_arg_regs -= regs_needed;
first_stack_regno++;
}
/* Push the args to the stack */
last_stack_regno = BPF_REG_0 + fm->nr_args;
for (i = last_stack_regno; i >= first_stack_regno; i--) {
if (fm->arg_size[i - 1] > sizeof(u32)) {
emit_push_r64(bpf2ia32[i], &prog);
bytes_in_stack += 8;
} else {
emit_push_r32(bpf2ia32[i], &prog);
bytes_in_stack += 4;
}
}
cur_arg_reg = &arg_regs[0];
for (i = BPF_REG_1; i < first_stack_regno; i++) {
/* mov e[adc]x,dword ptr [ebp+off] */
EMIT3(0x8B, add_2reg(0x40, IA32_EBP, *cur_arg_reg++),
STACK_VAR(bpf2ia32[i][0]));
if (fm->arg_size[i - 1] > sizeof(u32))
/* mov e[adc]x,dword ptr [ebp+off] */
EMIT3(0x8B, add_2reg(0x40, IA32_EBP, *cur_arg_reg++),
STACK_VAR(bpf2ia32[i][1]));
}
if (bytes_in_stack)
/* add esp,"bytes_in_stack" */
end_addr -= 3;
/* mov dword ptr [ebp+off],edx */
if (fm->ret_size > sizeof(u32))
end_addr -= 3;
/* mov dword ptr [ebp+off],eax */
if (fm->ret_size)
end_addr -= 3;
jmp_offset = (u8 *)__bpf_call_base + insn->imm - end_addr;
if (!is_simm32(jmp_offset)) {
pr_err("unsupported BPF kernel function jmp_offset:%lld\n",
jmp_offset);
return -EINVAL;
}
EMIT1_off32(0xE8, jmp_offset);
if (fm->ret_size)
/* mov dword ptr [ebp+off],eax */
EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EAX),
STACK_VAR(bpf2ia32[BPF_REG_0][0]));
if (fm->ret_size > sizeof(u32))
/* mov dword ptr [ebp+off],edx */
EMIT3(0x89, add_2reg(0x40, IA32_EBP, IA32_EDX),
STACK_VAR(bpf2ia32[BPF_REG_0][1]));
if (bytes_in_stack)
/* add esp,"bytes_in_stack" */
EMIT3(0x83, add_1reg(0xC0, IA32_ESP), bytes_in_stack);
*pprog = prog;
return 0;
}
static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
int oldproglen, struct jit_context *ctx)
{
@ -1888,6 +2069,18 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
if (insn->src_reg == BPF_PSEUDO_CALL)
goto notyet;
if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
int err;
err = emit_kfunc_call(bpf_prog,
image + addrs[i],
insn, &prog);
if (err)
return err;
break;
}
func = (u8 *) __bpf_call_base + imm32;
jmp_offset = func - (image + addrs[i]);
@ -2393,3 +2586,8 @@ out:
tmp : orig_prog);
return prog;
}
bool bpf_jit_supports_kfunc_call(void)
{
return true;
}

View File

@ -427,6 +427,7 @@ enum bpf_reg_type {
PTR_TO_PERCPU_BTF_ID, /* reg points to a percpu kernel variable */
PTR_TO_FUNC, /* reg points to a bpf program function */
PTR_TO_MAP_KEY, /* reg points to a map element key */
__BPF_REG_TYPE_MAX,
};
/* The information passed from prog-specific *_is_valid_access
@ -480,6 +481,7 @@ struct bpf_verifier_ops {
const struct btf_type *t, int off, int size,
enum bpf_access_type atype,
u32 *next_btf_id);
bool (*check_kfunc_call)(u32 kfunc_btf_id);
};
struct bpf_prog_offload_ops {
@ -796,6 +798,8 @@ struct btf_mod_pair {
struct module *module;
};
struct bpf_kfunc_desc_tab;
struct bpf_prog_aux {
atomic64_t refcnt;
u32 used_map_cnt;
@ -832,6 +836,7 @@ struct bpf_prog_aux {
struct bpf_prog **func;
void *jit_data; /* JIT specific data. arch dependent */
struct bpf_jit_poke_descriptor *poke_tab;
struct bpf_kfunc_desc_tab *kfunc_tab;
u32 size_poke_tab;
struct bpf_ksym ksym;
const struct bpf_prog_ops *ops;
@ -1527,6 +1532,7 @@ int bpf_prog_test_run_raw_tp(struct bpf_prog *prog,
int bpf_prog_test_run_sk_lookup(struct bpf_prog *prog,
const union bpf_attr *kattr,
union bpf_attr __user *uattr);
bool bpf_prog_test_check_kfunc_call(u32 kfunc_id);
bool btf_ctx_access(int off, int size, enum bpf_access_type type,
const struct bpf_prog *prog,
struct bpf_insn_access_aux *info);
@ -1545,8 +1551,11 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
struct btf_func_model *m);
struct bpf_reg_state;
int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog,
struct bpf_reg_state *regs);
int btf_check_subprog_arg_match(struct bpf_verifier_env *env, int subprog,
struct bpf_reg_state *regs);
int btf_check_kfunc_arg_match(struct bpf_verifier_env *env,
const struct btf *btf, u32 func_id,
struct bpf_reg_state *regs);
int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog,
struct bpf_reg_state *reg);
int btf_check_type_match(struct bpf_verifier_log *log, const struct bpf_prog *prog,
@ -1557,6 +1566,10 @@ struct bpf_link *bpf_link_by_id(u32 id);
const struct bpf_func_proto *bpf_base_func_proto(enum bpf_func_id func_id);
void bpf_task_storage_free(struct task_struct *task);
bool bpf_prog_has_kfunc_call(const struct bpf_prog *prog);
const struct btf_func_model *
bpf_jit_find_kfunc_model(const struct bpf_prog *prog,
const struct bpf_insn *insn);
#else /* !CONFIG_BPF_SYSCALL */
static inline struct bpf_prog *bpf_prog_get(u32 ufd)
{
@ -1719,6 +1732,11 @@ static inline int bpf_prog_test_run_sk_lookup(struct bpf_prog *prog,
return -ENOTSUPP;
}
static inline bool bpf_prog_test_check_kfunc_call(u32 kfunc_id)
{
return false;
}
static inline void bpf_map_put(struct bpf_map *map)
{
}
@ -1737,6 +1755,18 @@ bpf_base_func_proto(enum bpf_func_id func_id)
static inline void bpf_task_storage_free(struct task_struct *task)
{
}
static inline bool bpf_prog_has_kfunc_call(const struct bpf_prog *prog)
{
return false;
}
static inline const struct btf_func_model *
bpf_jit_find_kfunc_model(const struct bpf_prog *prog,
const struct bpf_insn *insn)
{
return NULL;
}
#endif /* CONFIG_BPF_SYSCALL */
void __bpf_free_used_btfs(struct bpf_prog_aux *aux,

View File

@ -110,6 +110,7 @@ const struct btf_type *btf_type_resolve_func_ptr(const struct btf *btf,
const struct btf_type *
btf_resolve_size(const struct btf *btf, const struct btf_type *type,
u32 *type_size);
const char *btf_type_str(const struct btf_type *t);
#define for_each_member(i, struct_type, member) \
for (i = 0, member = btf_type_member(struct_type); \
@ -141,6 +142,11 @@ static inline bool btf_type_is_enum(const struct btf_type *t)
return BTF_INFO_KIND(t->info) == BTF_KIND_ENUM;
}
static inline bool btf_type_is_scalar(const struct btf_type *t)
{
return btf_type_is_int(t) || btf_type_is_enum(t);
}
static inline bool btf_type_is_typedef(const struct btf_type *t)
{
return BTF_INFO_KIND(t->info) == BTF_KIND_TYPEDEF;

View File

@ -877,8 +877,7 @@ void bpf_prog_free_linfo(struct bpf_prog *prog);
void bpf_prog_fill_jited_linfo(struct bpf_prog *prog,
const u32 *insn_to_jit_off);
int bpf_prog_alloc_jited_linfo(struct bpf_prog *prog);
void bpf_prog_free_jited_linfo(struct bpf_prog *prog);
void bpf_prog_free_unused_jited_linfo(struct bpf_prog *prog);
void bpf_prog_jit_attempt_done(struct bpf_prog *prog);
struct bpf_prog *bpf_prog_alloc(unsigned int size, gfp_t gfp_extra_flags);
struct bpf_prog *bpf_prog_alloc_no_stats(unsigned int size, gfp_t gfp_extra_flags);
@ -919,6 +918,7 @@ u64 __bpf_call_base(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog);
void bpf_jit_compile(struct bpf_prog *prog);
bool bpf_jit_needs_zext(void);
bool bpf_jit_supports_kfunc_call(void);
bool bpf_helper_changes_pkt_data(void *func);
static inline bool bpf_dump_raw_ok(const struct cred *cred)

View File

@ -1117,6 +1117,10 @@ enum bpf_link_type {
* offset to another bpf function
*/
#define BPF_PSEUDO_CALL 1
/* when bpf_call->src_reg == BPF_PSEUDO_KFUNC_CALL,
* bpf_call->imm == btf_id of a BTF_KIND_FUNC in the running kernel
*/
#define BPF_PSEUDO_KFUNC_CALL 2
/* flags for BPF_MAP_UPDATE_ELEM command */
enum {

View File

@ -283,7 +283,7 @@ static const char * const btf_kind_str[NR_BTF_KINDS] = {
[BTF_KIND_FLOAT] = "FLOAT",
};
static const char *btf_type_str(const struct btf_type *t)
const char *btf_type_str(const struct btf_type *t)
{
return btf_kind_str[BTF_INFO_KIND(t->info)];
}
@ -4377,7 +4377,7 @@ static u8 bpf_ctx_convert_map[] = {
#undef BPF_LINK_TYPE
static const struct btf_member *
btf_get_prog_ctx_type(struct bpf_verifier_log *log, struct btf *btf,
btf_get_prog_ctx_type(struct bpf_verifier_log *log, const struct btf *btf,
const struct btf_type *t, enum bpf_prog_type prog_type,
int arg)
{
@ -5362,6 +5362,147 @@ int btf_check_type_match(struct bpf_verifier_log *log, const struct bpf_prog *pr
return btf_check_func_type_match(log, btf1, t1, btf2, t2);
}
static u32 *reg2btf_ids[__BPF_REG_TYPE_MAX] = {
#ifdef CONFIG_NET
[PTR_TO_SOCKET] = &btf_sock_ids[BTF_SOCK_TYPE_SOCK],
[PTR_TO_SOCK_COMMON] = &btf_sock_ids[BTF_SOCK_TYPE_SOCK_COMMON],
[PTR_TO_TCP_SOCK] = &btf_sock_ids[BTF_SOCK_TYPE_TCP],
#endif
};
static int btf_check_func_arg_match(struct bpf_verifier_env *env,
const struct btf *btf, u32 func_id,
struct bpf_reg_state *regs,
bool ptr_to_mem_ok)
{
struct bpf_verifier_log *log = &env->log;
const char *func_name, *ref_tname;
const struct btf_type *t, *ref_t;
const struct btf_param *args;
u32 i, nargs, ref_id;
t = btf_type_by_id(btf, func_id);
if (!t || !btf_type_is_func(t)) {
/* These checks were already done by the verifier while loading
* struct bpf_func_info or in add_kfunc_call().
*/
bpf_log(log, "BTF of func_id %u doesn't point to KIND_FUNC\n",
func_id);
return -EFAULT;
}
func_name = btf_name_by_offset(btf, t->name_off);
t = btf_type_by_id(btf, t->type);
if (!t || !btf_type_is_func_proto(t)) {
bpf_log(log, "Invalid BTF of func %s\n", func_name);
return -EFAULT;
}
args = (const struct btf_param *)(t + 1);
nargs = btf_type_vlen(t);
if (nargs > MAX_BPF_FUNC_REG_ARGS) {
bpf_log(log, "Function %s has %d > %d args\n", func_name, nargs,
MAX_BPF_FUNC_REG_ARGS);
return -EINVAL;
}
/* check that BTF function arguments match actual types that the
* verifier sees.
*/
for (i = 0; i < nargs; i++) {
u32 regno = i + 1;
struct bpf_reg_state *reg = &regs[regno];
t = btf_type_skip_modifiers(btf, args[i].type, NULL);
if (btf_type_is_scalar(t)) {
if (reg->type == SCALAR_VALUE)
continue;
bpf_log(log, "R%d is not a scalar\n", regno);
return -EINVAL;
}
if (!btf_type_is_ptr(t)) {
bpf_log(log, "Unrecognized arg#%d type %s\n",
i, btf_type_str(t));
return -EINVAL;
}
ref_t = btf_type_skip_modifiers(btf, t->type, &ref_id);
ref_tname = btf_name_by_offset(btf, ref_t->name_off);
if (btf_is_kernel(btf)) {
const struct btf_type *reg_ref_t;
const struct btf *reg_btf;
const char *reg_ref_tname;
u32 reg_ref_id;
if (!btf_type_is_struct(ref_t)) {
bpf_log(log, "kernel function %s args#%d pointer type %s %s is not supported\n",
func_name, i, btf_type_str(ref_t),
ref_tname);
return -EINVAL;
}
if (reg->type == PTR_TO_BTF_ID) {
reg_btf = reg->btf;
reg_ref_id = reg->btf_id;
} else if (reg2btf_ids[reg->type]) {
reg_btf = btf_vmlinux;
reg_ref_id = *reg2btf_ids[reg->type];
} else {
bpf_log(log, "kernel function %s args#%d expected pointer to %s %s but R%d is not a pointer to btf_id\n",
func_name, i,
btf_type_str(ref_t), ref_tname, regno);
return -EINVAL;
}
reg_ref_t = btf_type_skip_modifiers(reg_btf, reg_ref_id,
&reg_ref_id);
reg_ref_tname = btf_name_by_offset(reg_btf,
reg_ref_t->name_off);
if (!btf_struct_ids_match(log, reg_btf, reg_ref_id,
reg->off, btf, ref_id)) {
bpf_log(log, "kernel function %s args#%d expected pointer to %s %s but R%d has a pointer to %s %s\n",
func_name, i,
btf_type_str(ref_t), ref_tname,
regno, btf_type_str(reg_ref_t),
reg_ref_tname);
return -EINVAL;
}
} else if (btf_get_prog_ctx_type(log, btf, t,
env->prog->type, i)) {
/* If function expects ctx type in BTF check that caller
* is passing PTR_TO_CTX.
*/
if (reg->type != PTR_TO_CTX) {
bpf_log(log,
"arg#%d expected pointer to ctx, but got %s\n",
i, btf_type_str(t));
return -EINVAL;
}
if (check_ctx_reg(env, reg, regno))
return -EINVAL;
} else if (ptr_to_mem_ok) {
const struct btf_type *resolve_ret;
u32 type_size;
resolve_ret = btf_resolve_size(btf, ref_t, &type_size);
if (IS_ERR(resolve_ret)) {
bpf_log(log,
"arg#%d reference type('%s %s') size cannot be determined: %ld\n",
i, btf_type_str(ref_t), ref_tname,
PTR_ERR(resolve_ret));
return -EINVAL;
}
if (check_mem_reg(env, reg, regno, type_size))
return -EINVAL;
} else {
return -EINVAL;
}
}
return 0;
}
/* Compare BTF of a function with given bpf_reg_state.
* Returns:
* EFAULT - there is a verifier bug. Abort verification.
@ -5369,17 +5510,14 @@ int btf_check_type_match(struct bpf_verifier_log *log, const struct bpf_prog *pr
* 0 - BTF matches with what bpf_reg_state expects.
* Only PTR_TO_CTX and SCALAR_VALUE states are recognized.
*/
int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog,
struct bpf_reg_state *regs)
int btf_check_subprog_arg_match(struct bpf_verifier_env *env, int subprog,
struct bpf_reg_state *regs)
{
struct bpf_verifier_log *log = &env->log;
struct bpf_prog *prog = env->prog;
struct btf *btf = prog->aux->btf;
const struct btf_param *args;
const struct btf_type *t, *ref_t;
u32 i, nargs, btf_id, type_size;
const char *tname;
bool is_global;
u32 btf_id;
int err;
if (!prog->aux->func_info)
return -EINVAL;
@ -5391,93 +5529,23 @@ int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog,
if (prog->aux->func_info_aux[subprog].unreliable)
return -EINVAL;
t = btf_type_by_id(btf, btf_id);
if (!t || !btf_type_is_func(t)) {
/* These checks were already done by the verifier while loading
* struct bpf_func_info
*/
bpf_log(log, "BTF of func#%d doesn't point to KIND_FUNC\n",
subprog);
return -EFAULT;
}
tname = btf_name_by_offset(btf, t->name_off);
t = btf_type_by_id(btf, t->type);
if (!t || !btf_type_is_func_proto(t)) {
bpf_log(log, "Invalid BTF of func %s\n", tname);
return -EFAULT;
}
args = (const struct btf_param *)(t + 1);
nargs = btf_type_vlen(t);
if (nargs > MAX_BPF_FUNC_REG_ARGS) {
bpf_log(log, "Function %s has %d > %d args\n", tname, nargs,
MAX_BPF_FUNC_REG_ARGS);
goto out;
}
is_global = prog->aux->func_info_aux[subprog].linkage == BTF_FUNC_GLOBAL;
/* check that BTF function arguments match actual types that the
* verifier sees.
*/
for (i = 0; i < nargs; i++) {
struct bpf_reg_state *reg = &regs[i + 1];
err = btf_check_func_arg_match(env, btf, btf_id, regs, is_global);
t = btf_type_by_id(btf, args[i].type);
while (btf_type_is_modifier(t))
t = btf_type_by_id(btf, t->type);
if (btf_type_is_int(t) || btf_type_is_enum(t)) {
if (reg->type == SCALAR_VALUE)
continue;
bpf_log(log, "R%d is not a scalar\n", i + 1);
goto out;
}
if (btf_type_is_ptr(t)) {
/* If function expects ctx type in BTF check that caller
* is passing PTR_TO_CTX.
*/
if (btf_get_prog_ctx_type(log, btf, t, prog->type, i)) {
if (reg->type != PTR_TO_CTX) {
bpf_log(log,
"arg#%d expected pointer to ctx, but got %s\n",
i, btf_kind_str[BTF_INFO_KIND(t->info)]);
goto out;
}
if (check_ctx_reg(env, reg, i + 1))
goto out;
continue;
}
if (!is_global)
goto out;
t = btf_type_skip_modifiers(btf, t->type, NULL);
ref_t = btf_resolve_size(btf, t, &type_size);
if (IS_ERR(ref_t)) {
bpf_log(log,
"arg#%d reference type('%s %s') size cannot be determined: %ld\n",
i, btf_type_str(t), btf_name_by_offset(btf, t->name_off),
PTR_ERR(ref_t));
goto out;
}
if (check_mem_reg(env, reg, i + 1, type_size))
goto out;
continue;
}
bpf_log(log, "Unrecognized arg#%d type %s\n",
i, btf_kind_str[BTF_INFO_KIND(t->info)]);
goto out;
}
return 0;
out:
/* Compiler optimizations can remove arguments from static functions
* or mismatched type can be passed into a global function.
* In such cases mark the function as unreliable from BTF point of view.
*/
prog->aux->func_info_aux[subprog].unreliable = true;
return -EINVAL;
if (err)
prog->aux->func_info_aux[subprog].unreliable = true;
return err;
}
int btf_check_kfunc_arg_match(struct bpf_verifier_env *env,
const struct btf *btf, u32 func_id,
struct bpf_reg_state *regs)
{
return btf_check_func_arg_match(env, btf, func_id, regs, false);
}
/* Convert BTF of a function into bpf_reg_state if possible

View File

@ -143,25 +143,25 @@ int bpf_prog_alloc_jited_linfo(struct bpf_prog *prog)
if (!prog->aux->nr_linfo || !prog->jit_requested)
return 0;
prog->aux->jited_linfo = kcalloc(prog->aux->nr_linfo,
sizeof(*prog->aux->jited_linfo),
GFP_KERNEL_ACCOUNT | __GFP_NOWARN);
prog->aux->jited_linfo = kvcalloc(prog->aux->nr_linfo,
sizeof(*prog->aux->jited_linfo),
GFP_KERNEL_ACCOUNT | __GFP_NOWARN);
if (!prog->aux->jited_linfo)
return -ENOMEM;
return 0;
}
void bpf_prog_free_jited_linfo(struct bpf_prog *prog)
void bpf_prog_jit_attempt_done(struct bpf_prog *prog)
{
kfree(prog->aux->jited_linfo);
prog->aux->jited_linfo = NULL;
}
if (prog->aux->jited_linfo &&
(!prog->jited || !prog->aux->jited_linfo[0])) {
kvfree(prog->aux->jited_linfo);
prog->aux->jited_linfo = NULL;
}
void bpf_prog_free_unused_jited_linfo(struct bpf_prog *prog)
{
if (prog->aux->jited_linfo && !prog->aux->jited_linfo[0])
bpf_prog_free_jited_linfo(prog);
kfree(prog->aux->kfunc_tab);
prog->aux->kfunc_tab = NULL;
}
/* The jit engine is responsible to provide an array
@ -217,12 +217,6 @@ void bpf_prog_fill_jited_linfo(struct bpf_prog *prog,
insn_to_jit_off[linfo[i].insn_off - insn_start - 1];
}
void bpf_prog_free_linfo(struct bpf_prog *prog)
{
bpf_prog_free_jited_linfo(prog);
kvfree(prog->aux->linfo);
}
struct bpf_prog *bpf_prog_realloc(struct bpf_prog *fp_old, unsigned int size,
gfp_t gfp_extra_flags)
{
@ -1849,9 +1843,15 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
/* In case of BPF to BPF calls, verifier did all the prep
* work with regards to JITing, etc.
*/
bool jit_needed = false;
if (fp->bpf_func)
goto finalize;
if (IS_ENABLED(CONFIG_BPF_JIT_ALWAYS_ON) ||
bpf_prog_has_kfunc_call(fp))
jit_needed = true;
bpf_prog_select_func(fp);
/* eBPF JITs can rewrite the program in case constant
@ -1866,14 +1866,10 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
return fp;
fp = bpf_int_jit_compile(fp);
if (!fp->jited) {
bpf_prog_free_jited_linfo(fp);
#ifdef CONFIG_BPF_JIT_ALWAYS_ON
bpf_prog_jit_attempt_done(fp);
if (!fp->jited && jit_needed) {
*err = -ENOTSUPP;
return fp;
#endif
} else {
bpf_prog_free_unused_jited_linfo(fp);
}
} else {
*err = bpf_prog_offload_compile(fp);
@ -2354,6 +2350,11 @@ bool __weak bpf_jit_needs_zext(void)
return false;
}
bool __weak bpf_jit_supports_kfunc_call(void)
{
return false;
}
/* To execute LD_ABS/LD_IND instructions __bpf_prog_run() may call
* skb_copy_bits(), so provide a weak definition of it for NET-less config.
*/

View File

@ -19,16 +19,23 @@ static const char *__func_get_name(const struct bpf_insn_cbs *cbs,
{
BUILD_BUG_ON(ARRAY_SIZE(func_id_str) != __BPF_FUNC_MAX_ID);
if (insn->src_reg != BPF_PSEUDO_CALL &&
if (!insn->src_reg &&
insn->imm >= 0 && insn->imm < __BPF_FUNC_MAX_ID &&
func_id_str[insn->imm])
return func_id_str[insn->imm];
if (cbs && cbs->cb_call)
return cbs->cb_call(cbs->private_data, insn);
if (cbs && cbs->cb_call) {
const char *res;
res = cbs->cb_call(cbs->private_data, insn);
if (res)
return res;
}
if (insn->src_reg == BPF_PSEUDO_CALL)
snprintf(buff, len, "%+d", insn->imm);
else if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL)
snprintf(buff, len, "kernel-function");
return buff;
}

View File

@ -1694,7 +1694,9 @@ static void __bpf_prog_put_noref(struct bpf_prog *prog, bool deferred)
{
bpf_prog_kallsyms_del_all(prog);
btf_put(prog->aux->btf);
bpf_prog_free_linfo(prog);
kvfree(prog->aux->jited_linfo);
kvfree(prog->aux->linfo);
kfree(prog->aux->kfunc_tab);
if (prog->aux->attach_btf)
btf_put(prog->aux->attach_btf);

View File

@ -234,6 +234,12 @@ static bool bpf_pseudo_call(const struct bpf_insn *insn)
insn->src_reg == BPF_PSEUDO_CALL;
}
static bool bpf_pseudo_kfunc_call(const struct bpf_insn *insn)
{
return insn->code == (BPF_JMP | BPF_CALL) &&
insn->src_reg == BPF_PSEUDO_KFUNC_CALL;
}
static bool bpf_pseudo_func(const struct bpf_insn *insn)
{
return insn->code == (BPF_LD | BPF_IMM | BPF_DW) &&
@ -1554,47 +1560,205 @@ static int add_subprog(struct bpf_verifier_env *env, int off)
verbose(env, "too many subprograms\n");
return -E2BIG;
}
/* determine subprog starts. The end is one before the next starts */
env->subprog_info[env->subprog_cnt++].start = off;
sort(env->subprog_info, env->subprog_cnt,
sizeof(env->subprog_info[0]), cmp_subprogs, NULL);
return env->subprog_cnt - 1;
}
static int check_subprogs(struct bpf_verifier_env *env)
struct bpf_kfunc_desc {
struct btf_func_model func_model;
u32 func_id;
s32 imm;
};
#define MAX_KFUNC_DESCS 256
struct bpf_kfunc_desc_tab {
struct bpf_kfunc_desc descs[MAX_KFUNC_DESCS];
u32 nr_descs;
};
static int kfunc_desc_cmp_by_id(const void *a, const void *b)
{
const struct bpf_kfunc_desc *d0 = a;
const struct bpf_kfunc_desc *d1 = b;
/* func_id is not greater than BTF_MAX_TYPE */
return d0->func_id - d1->func_id;
}
static const struct bpf_kfunc_desc *
find_kfunc_desc(const struct bpf_prog *prog, u32 func_id)
{
struct bpf_kfunc_desc desc = {
.func_id = func_id,
};
struct bpf_kfunc_desc_tab *tab;
tab = prog->aux->kfunc_tab;
return bsearch(&desc, tab->descs, tab->nr_descs,
sizeof(tab->descs[0]), kfunc_desc_cmp_by_id);
}
static int add_kfunc_call(struct bpf_verifier_env *env, u32 func_id)
{
const struct btf_type *func, *func_proto;
struct bpf_kfunc_desc_tab *tab;
struct bpf_prog_aux *prog_aux;
struct bpf_kfunc_desc *desc;
const char *func_name;
unsigned long addr;
int err;
prog_aux = env->prog->aux;
tab = prog_aux->kfunc_tab;
if (!tab) {
if (!btf_vmlinux) {
verbose(env, "calling kernel function is not supported without CONFIG_DEBUG_INFO_BTF\n");
return -ENOTSUPP;
}
if (!env->prog->jit_requested) {
verbose(env, "JIT is required for calling kernel function\n");
return -ENOTSUPP;
}
if (!bpf_jit_supports_kfunc_call()) {
verbose(env, "JIT does not support calling kernel function\n");
return -ENOTSUPP;
}
if (!env->prog->gpl_compatible) {
verbose(env, "cannot call kernel function from non-GPL compatible program\n");
return -EINVAL;
}
tab = kzalloc(sizeof(*tab), GFP_KERNEL);
if (!tab)
return -ENOMEM;
prog_aux->kfunc_tab = tab;
}
if (find_kfunc_desc(env->prog, func_id))
return 0;
if (tab->nr_descs == MAX_KFUNC_DESCS) {
verbose(env, "too many different kernel function calls\n");
return -E2BIG;
}
func = btf_type_by_id(btf_vmlinux, func_id);
if (!func || !btf_type_is_func(func)) {
verbose(env, "kernel btf_id %u is not a function\n",
func_id);
return -EINVAL;
}
func_proto = btf_type_by_id(btf_vmlinux, func->type);
if (!func_proto || !btf_type_is_func_proto(func_proto)) {
verbose(env, "kernel function btf_id %u does not have a valid func_proto\n",
func_id);
return -EINVAL;
}
func_name = btf_name_by_offset(btf_vmlinux, func->name_off);
addr = kallsyms_lookup_name(func_name);
if (!addr) {
verbose(env, "cannot find address for kernel function %s\n",
func_name);
return -EINVAL;
}
desc = &tab->descs[tab->nr_descs++];
desc->func_id = func_id;
desc->imm = BPF_CAST_CALL(addr) - __bpf_call_base;
err = btf_distill_func_proto(&env->log, btf_vmlinux,
func_proto, func_name,
&desc->func_model);
if (!err)
sort(tab->descs, tab->nr_descs, sizeof(tab->descs[0]),
kfunc_desc_cmp_by_id, NULL);
return err;
}
static int kfunc_desc_cmp_by_imm(const void *a, const void *b)
{
const struct bpf_kfunc_desc *d0 = a;
const struct bpf_kfunc_desc *d1 = b;
if (d0->imm > d1->imm)
return 1;
else if (d0->imm < d1->imm)
return -1;
return 0;
}
static void sort_kfunc_descs_by_imm(struct bpf_prog *prog)
{
struct bpf_kfunc_desc_tab *tab;
tab = prog->aux->kfunc_tab;
if (!tab)
return;
sort(tab->descs, tab->nr_descs, sizeof(tab->descs[0]),
kfunc_desc_cmp_by_imm, NULL);
}
bool bpf_prog_has_kfunc_call(const struct bpf_prog *prog)
{
return !!prog->aux->kfunc_tab;
}
const struct btf_func_model *
bpf_jit_find_kfunc_model(const struct bpf_prog *prog,
const struct bpf_insn *insn)
{
const struct bpf_kfunc_desc desc = {
.imm = insn->imm,
};
const struct bpf_kfunc_desc *res;
struct bpf_kfunc_desc_tab *tab;
tab = prog->aux->kfunc_tab;
res = bsearch(&desc, tab->descs, tab->nr_descs,
sizeof(tab->descs[0]), kfunc_desc_cmp_by_imm);
return res ? &res->func_model : NULL;
}
static int add_subprog_and_kfunc(struct bpf_verifier_env *env)
{
int i, ret, subprog_start, subprog_end, off, cur_subprog = 0;
struct bpf_subprog_info *subprog = env->subprog_info;
struct bpf_insn *insn = env->prog->insnsi;
int insn_cnt = env->prog->len;
int i, ret, insn_cnt = env->prog->len;
/* Add entry function. */
ret = add_subprog(env, 0);
if (ret < 0)
if (ret)
return ret;
/* determine subprog starts. The end is one before the next starts */
for (i = 0; i < insn_cnt; i++) {
if (bpf_pseudo_func(insn + i)) {
if (!env->bpf_capable) {
verbose(env,
"function pointers are allowed for CAP_BPF and CAP_SYS_ADMIN\n");
return -EPERM;
}
ret = add_subprog(env, i + insn[i].imm + 1);
if (ret < 0)
return ret;
/* remember subprog */
insn[i + 1].imm = ret;
continue;
}
if (!bpf_pseudo_call(insn + i))
for (i = 0; i < insn_cnt; i++, insn++) {
if (!bpf_pseudo_func(insn) && !bpf_pseudo_call(insn) &&
!bpf_pseudo_kfunc_call(insn))
continue;
if (!env->bpf_capable) {
verbose(env,
"function calls to other bpf functions are allowed for CAP_BPF and CAP_SYS_ADMIN\n");
verbose(env, "loading/calling other bpf or kernel functions are allowed for CAP_BPF and CAP_SYS_ADMIN\n");
return -EPERM;
}
ret = add_subprog(env, i + insn[i].imm + 1);
if (bpf_pseudo_func(insn)) {
ret = add_subprog(env, i + insn->imm + 1);
if (ret >= 0)
/* remember subprog */
insn[1].imm = ret;
} else if (bpf_pseudo_call(insn)) {
ret = add_subprog(env, i + insn->imm + 1);
} else {
ret = add_kfunc_call(env, insn->imm);
}
if (ret < 0)
return ret;
}
@ -1608,6 +1772,16 @@ static int check_subprogs(struct bpf_verifier_env *env)
for (i = 0; i < env->subprog_cnt; i++)
verbose(env, "func#%d @%d\n", i, subprog[i].start);
return 0;
}
static int check_subprogs(struct bpf_verifier_env *env)
{
int i, subprog_start, subprog_end, off, cur_subprog = 0;
struct bpf_subprog_info *subprog = env->subprog_info;
struct bpf_insn *insn = env->prog->insnsi;
int insn_cnt = env->prog->len;
/* now check that all jumps are within the same subprog */
subprog_start = subprog[cur_subprog].start;
subprog_end = subprog[cur_subprog + 1].start;
@ -1916,6 +2090,17 @@ static int get_prev_insn_idx(struct bpf_verifier_state *st, int i,
return i;
}
static const char *disasm_kfunc_name(void *data, const struct bpf_insn *insn)
{
const struct btf_type *func;
if (insn->src_reg != BPF_PSEUDO_KFUNC_CALL)
return NULL;
func = btf_type_by_id(btf_vmlinux, insn->imm);
return btf_name_by_offset(btf_vmlinux, func->name_off);
}
/* For given verifier state backtrack_insn() is called from the last insn to
* the first insn. Its purpose is to compute a bitmask of registers and
* stack slots that needs precision in the parent verifier state.
@ -1924,6 +2109,7 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx,
u32 *reg_mask, u64 *stack_mask)
{
const struct bpf_insn_cbs cbs = {
.cb_call = disasm_kfunc_name,
.cb_print = verbose,
.private_data = env,
};
@ -5365,7 +5551,7 @@ static int __check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn
func_info_aux = env->prog->aux->func_info_aux;
if (func_info_aux)
is_global = func_info_aux[subprog].linkage == BTF_FUNC_GLOBAL;
err = btf_check_func_arg_match(env, subprog, caller->regs);
err = btf_check_subprog_arg_match(env, subprog, caller->regs);
if (err == -EFAULT)
return err;
if (is_global) {
@ -5960,6 +6146,98 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
return 0;
}
/* mark_btf_func_reg_size() is used when the reg size is determined by
* the BTF func_proto's return value size and argument.
*/
static void mark_btf_func_reg_size(struct bpf_verifier_env *env, u32 regno,
size_t reg_size)
{
struct bpf_reg_state *reg = &cur_regs(env)[regno];
if (regno == BPF_REG_0) {
/* Function return value */
reg->live |= REG_LIVE_WRITTEN;
reg->subreg_def = reg_size == sizeof(u64) ?
DEF_NOT_SUBREG : env->insn_idx + 1;
} else {
/* Function argument */
if (reg_size == sizeof(u64)) {
mark_insn_zext(env, reg);
mark_reg_read(env, reg, reg->parent, REG_LIVE_READ64);
} else {
mark_reg_read(env, reg, reg->parent, REG_LIVE_READ32);
}
}
}
static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn)
{
const struct btf_type *t, *func, *func_proto, *ptr_type;
struct bpf_reg_state *regs = cur_regs(env);
const char *func_name, *ptr_type_name;
u32 i, nargs, func_id, ptr_type_id;
const struct btf_param *args;
int err;
func_id = insn->imm;
func = btf_type_by_id(btf_vmlinux, func_id);
func_name = btf_name_by_offset(btf_vmlinux, func->name_off);
func_proto = btf_type_by_id(btf_vmlinux, func->type);
if (!env->ops->check_kfunc_call ||
!env->ops->check_kfunc_call(func_id)) {
verbose(env, "calling kernel function %s is not allowed\n",
func_name);
return -EACCES;
}
/* Check the arguments */
err = btf_check_kfunc_arg_match(env, btf_vmlinux, func_id, regs);
if (err)
return err;
for (i = 0; i < CALLER_SAVED_REGS; i++)
mark_reg_not_init(env, regs, caller_saved[i]);
/* Check return type */
t = btf_type_skip_modifiers(btf_vmlinux, func_proto->type, NULL);
if (btf_type_is_scalar(t)) {
mark_reg_unknown(env, regs, BPF_REG_0);
mark_btf_func_reg_size(env, BPF_REG_0, t->size);
} else if (btf_type_is_ptr(t)) {
ptr_type = btf_type_skip_modifiers(btf_vmlinux, t->type,
&ptr_type_id);
if (!btf_type_is_struct(ptr_type)) {
ptr_type_name = btf_name_by_offset(btf_vmlinux,
ptr_type->name_off);
verbose(env, "kernel function %s returns pointer type %s %s is not supported\n",
func_name, btf_type_str(ptr_type),
ptr_type_name);
return -EINVAL;
}
mark_reg_known_zero(env, regs, BPF_REG_0);
regs[BPF_REG_0].btf = btf_vmlinux;
regs[BPF_REG_0].type = PTR_TO_BTF_ID;
regs[BPF_REG_0].btf_id = ptr_type_id;
mark_btf_func_reg_size(env, BPF_REG_0, sizeof(void *));
} /* else { add_kfunc_call() ensures it is btf_type_is_void(t) } */
nargs = btf_type_vlen(func_proto);
args = (const struct btf_param *)(func_proto + 1);
for (i = 0; i < nargs; i++) {
u32 regno = i + 1;
t = btf_type_skip_modifiers(btf_vmlinux, args[i].type, NULL);
if (btf_type_is_ptr(t))
mark_btf_func_reg_size(env, regno, sizeof(void *));
else
/* scalar. ensured by btf_check_kfunc_arg_match() */
mark_btf_func_reg_size(env, regno, t->size);
}
return 0;
}
static bool signed_add_overflows(s64 a, s64 b)
{
/* Do the add in u64, where overflow is well-defined */
@ -10162,6 +10440,7 @@ static int do_check(struct bpf_verifier_env *env)
if (env->log.level & BPF_LOG_LEVEL) {
const struct bpf_insn_cbs cbs = {
.cb_call = disasm_kfunc_name,
.cb_print = verbose,
.private_data = env,
};
@ -10309,7 +10588,8 @@ static int do_check(struct bpf_verifier_env *env)
if (BPF_SRC(insn->code) != BPF_K ||
insn->off != 0 ||
(insn->src_reg != BPF_REG_0 &&
insn->src_reg != BPF_PSEUDO_CALL) ||
insn->src_reg != BPF_PSEUDO_CALL &&
insn->src_reg != BPF_PSEUDO_KFUNC_CALL) ||
insn->dst_reg != BPF_REG_0 ||
class == BPF_JMP32) {
verbose(env, "BPF_CALL uses reserved fields\n");
@ -10324,6 +10604,8 @@ static int do_check(struct bpf_verifier_env *env)
}
if (insn->src_reg == BPF_PSEUDO_CALL)
err = check_func_call(env, insn, &env->insn_idx);
else if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL)
err = check_kfunc_call(env, insn);
else
err = check_helper_call(env, insn, &env->insn_idx);
if (err)
@ -11634,6 +11916,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
func[i]->aux->name[0] = 'F';
func[i]->aux->stack_depth = env->subprog_info[i].stack_depth;
func[i]->jit_requested = 1;
func[i]->aux->kfunc_tab = prog->aux->kfunc_tab;
func[i]->aux->linfo = prog->aux->linfo;
func[i]->aux->nr_linfo = prog->aux->nr_linfo;
func[i]->aux->jited_linfo = prog->aux->jited_linfo;
@ -11741,7 +12024,7 @@ static int jit_subprogs(struct bpf_verifier_env *env)
prog->bpf_func = func[0]->bpf_func;
prog->aux->func = func;
prog->aux->func_cnt = env->subprog_cnt;
bpf_prog_free_unused_jited_linfo(prog);
bpf_prog_jit_attempt_done(prog);
return 0;
out_free:
for (i = 0; i < env->subprog_cnt; i++) {
@ -11764,7 +12047,7 @@ out_undo_insn:
insn->off = 0;
insn->imm = env->insn_aux_data[i].call_imm;
}
bpf_prog_free_jited_linfo(prog);
bpf_prog_jit_attempt_done(prog);
return err;
}
@ -11773,6 +12056,7 @@ static int fixup_call_args(struct bpf_verifier_env *env)
#ifndef CONFIG_BPF_JIT_ALWAYS_ON
struct bpf_prog *prog = env->prog;
struct bpf_insn *insn = prog->insnsi;
bool has_kfunc_call = bpf_prog_has_kfunc_call(prog);
int i, depth;
#endif
int err = 0;
@ -11786,6 +12070,10 @@ static int fixup_call_args(struct bpf_verifier_env *env)
return err;
}
#ifndef CONFIG_BPF_JIT_ALWAYS_ON
if (has_kfunc_call) {
verbose(env, "calling kernel functions are not allowed in non-JITed programs\n");
return -EINVAL;
}
if (env->subprog_cnt > 1 && env->prog->aux->tail_call_reachable) {
/* When JIT fails the progs with bpf2bpf calls and tail_calls
* have to be rejected, since interpreter doesn't support them yet.
@ -11814,6 +12102,26 @@ static int fixup_call_args(struct bpf_verifier_env *env)
return err;
}
static int fixup_kfunc_call(struct bpf_verifier_env *env,
struct bpf_insn *insn)
{
const struct bpf_kfunc_desc *desc;
/* insn->imm has the btf func_id. Replace it with
* an address (relative to __bpf_base_call).
*/
desc = find_kfunc_desc(env->prog, insn->imm);
if (!desc) {
verbose(env, "verifier internal error: kernel function descriptor not found for func_id %u\n",
insn->imm);
return -EFAULT;
}
insn->imm = desc->imm;
return 0;
}
/* Do various post-verification rewrites in a single program pass.
* These rewrites simplify JIT and interpreter implementations.
*/
@ -11949,6 +12257,12 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
continue;
if (insn->src_reg == BPF_PSEUDO_CALL)
continue;
if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
ret = fixup_kfunc_call(env, insn);
if (ret)
return ret;
continue;
}
if (insn->imm == BPF_FUNC_get_route_realm)
prog->dst_needed = 1;
@ -12178,6 +12492,8 @@ patch_call_imm:
}
}
sort_kfunc_descs_by_imm(env->prog);
return 0;
}
@ -12288,7 +12604,7 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog)
/* 1st arg to a function */
regs[BPF_REG_1].type = PTR_TO_CTX;
mark_reg_known_zero(env, regs, BPF_REG_1);
ret = btf_check_func_arg_match(env, subprog, regs);
ret = btf_check_subprog_arg_match(env, subprog, regs);
if (ret == -EFAULT)
/* unlikely verifier bug. abort.
* ret == 0 and ret < 0 are sadly acceptable for
@ -12883,6 +13199,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
if (!env->explored_states)
goto skip_full_check;
ret = add_subprog_and_kfunc(env);
if (ret < 0)
goto skip_full_check;
ret = check_subprogs(env);
if (ret < 0)
goto skip_full_check;

View File

@ -2,6 +2,7 @@
/* Copyright (c) 2017 Facebook
*/
#include <linux/bpf.h>
#include <linux/btf_ids.h>
#include <linux/slab.h>
#include <linux/vmalloc.h>
#include <linux/etherdevice.h>
@ -213,10 +214,37 @@ int noinline bpf_modify_return_test(int a, int *b)
*b += 1;
return a + *b;
}
u64 noinline bpf_kfunc_call_test1(struct sock *sk, u32 a, u64 b, u32 c, u64 d)
{
return a + b + c + d;
}
int noinline bpf_kfunc_call_test2(struct sock *sk, u32 a, u32 b)
{
return a + b;
}
struct sock * noinline bpf_kfunc_call_test3(struct sock *sk)
{
return sk;
}
__diag_pop();
ALLOW_ERROR_INJECTION(bpf_modify_return_test, ERRNO);
BTF_SET_START(test_sk_kfunc_ids)
BTF_ID(func, bpf_kfunc_call_test1)
BTF_ID(func, bpf_kfunc_call_test2)
BTF_ID(func, bpf_kfunc_call_test3)
BTF_SET_END(test_sk_kfunc_ids)
bool bpf_prog_test_check_kfunc_call(u32 kfunc_id)
{
return btf_id_set_contains(&test_sk_kfunc_ids, kfunc_id);
}
static void *bpf_test_init(const union bpf_attr *kattr, u32 size,
u32 headroom, u32 tailroom)
{

View File

@ -9813,6 +9813,7 @@ const struct bpf_verifier_ops tc_cls_act_verifier_ops = {
.convert_ctx_access = tc_cls_act_convert_ctx_access,
.gen_prologue = tc_cls_act_prologue,
.gen_ld_abs = bpf_gen_ld_abs,
.check_kfunc_call = bpf_prog_test_check_kfunc_call,
};
const struct bpf_prog_ops tc_cls_act_prog_ops = {

View File

@ -5,6 +5,7 @@
#include <linux/bpf_verifier.h>
#include <linux/bpf.h>
#include <linux/btf.h>
#include <linux/btf_ids.h>
#include <linux/filter.h>
#include <net/tcp.h>
#include <net/bpf_sk_storage.h>
@ -178,10 +179,50 @@ bpf_tcp_ca_get_func_proto(enum bpf_func_id func_id,
}
}
BTF_SET_START(bpf_tcp_ca_kfunc_ids)
BTF_ID(func, tcp_reno_ssthresh)
BTF_ID(func, tcp_reno_cong_avoid)
BTF_ID(func, tcp_reno_undo_cwnd)
BTF_ID(func, tcp_slow_start)
BTF_ID(func, tcp_cong_avoid_ai)
#if IS_BUILTIN(CONFIG_TCP_CONG_CUBIC)
BTF_ID(func, cubictcp_init)
BTF_ID(func, cubictcp_recalc_ssthresh)
BTF_ID(func, cubictcp_cong_avoid)
BTF_ID(func, cubictcp_state)
BTF_ID(func, cubictcp_cwnd_event)
BTF_ID(func, cubictcp_acked)
#endif
#if IS_BUILTIN(CONFIG_TCP_CONG_DCTCP)
BTF_ID(func, dctcp_init)
BTF_ID(func, dctcp_update_alpha)
BTF_ID(func, dctcp_cwnd_event)
BTF_ID(func, dctcp_ssthresh)
BTF_ID(func, dctcp_cwnd_undo)
BTF_ID(func, dctcp_state)
#endif
#if IS_BUILTIN(CONFIG_TCP_CONG_BBR)
BTF_ID(func, bbr_init)
BTF_ID(func, bbr_main)
BTF_ID(func, bbr_sndbuf_expand)
BTF_ID(func, bbr_undo_cwnd)
BTF_ID(func, bbr_cwnd_even),
BTF_ID(func, bbr_ssthresh)
BTF_ID(func, bbr_min_tso_segs)
BTF_ID(func, bbr_set_state)
#endif
BTF_SET_END(bpf_tcp_ca_kfunc_ids)
static bool bpf_tcp_ca_check_kfunc_call(u32 kfunc_btf_id)
{
return btf_id_set_contains(&bpf_tcp_ca_kfunc_ids, kfunc_btf_id);
}
static const struct bpf_verifier_ops bpf_tcp_ca_verifier_ops = {
.get_func_proto = bpf_tcp_ca_get_func_proto,
.is_valid_access = bpf_tcp_ca_is_valid_access,
.btf_struct_access = bpf_tcp_ca_btf_struct_access,
.check_kfunc_call = bpf_tcp_ca_check_kfunc_call,
};
static int bpf_tcp_ca_init_member(const struct btf_type *t,

View File

@ -124,7 +124,7 @@ static inline void bictcp_hystart_reset(struct sock *sk)
ca->sample_cnt = 0;
}
static void bictcp_init(struct sock *sk)
static void cubictcp_init(struct sock *sk)
{
struct bictcp *ca = inet_csk_ca(sk);
@ -137,7 +137,7 @@ static void bictcp_init(struct sock *sk)
tcp_sk(sk)->snd_ssthresh = initial_ssthresh;
}
static void bictcp_cwnd_event(struct sock *sk, enum tcp_ca_event event)
static void cubictcp_cwnd_event(struct sock *sk, enum tcp_ca_event event)
{
if (event == CA_EVENT_TX_START) {
struct bictcp *ca = inet_csk_ca(sk);
@ -319,7 +319,7 @@ tcp_friendliness:
ca->cnt = max(ca->cnt, 2U);
}
static void bictcp_cong_avoid(struct sock *sk, u32 ack, u32 acked)
static void cubictcp_cong_avoid(struct sock *sk, u32 ack, u32 acked)
{
struct tcp_sock *tp = tcp_sk(sk);
struct bictcp *ca = inet_csk_ca(sk);
@ -338,7 +338,7 @@ static void bictcp_cong_avoid(struct sock *sk, u32 ack, u32 acked)
tcp_cong_avoid_ai(tp, ca->cnt, acked);
}
static u32 bictcp_recalc_ssthresh(struct sock *sk)
static u32 cubictcp_recalc_ssthresh(struct sock *sk)
{
const struct tcp_sock *tp = tcp_sk(sk);
struct bictcp *ca = inet_csk_ca(sk);
@ -355,7 +355,7 @@ static u32 bictcp_recalc_ssthresh(struct sock *sk)
return max((tp->snd_cwnd * beta) / BICTCP_BETA_SCALE, 2U);
}
static void bictcp_state(struct sock *sk, u8 new_state)
static void cubictcp_state(struct sock *sk, u8 new_state)
{
if (new_state == TCP_CA_Loss) {
bictcp_reset(inet_csk_ca(sk));
@ -442,7 +442,7 @@ static void hystart_update(struct sock *sk, u32 delay)
}
}
static void bictcp_acked(struct sock *sk, const struct ack_sample *sample)
static void cubictcp_acked(struct sock *sk, const struct ack_sample *sample)
{
const struct tcp_sock *tp = tcp_sk(sk);
struct bictcp *ca = inet_csk_ca(sk);
@ -471,13 +471,13 @@ static void bictcp_acked(struct sock *sk, const struct ack_sample *sample)
}
static struct tcp_congestion_ops cubictcp __read_mostly = {
.init = bictcp_init,
.ssthresh = bictcp_recalc_ssthresh,
.cong_avoid = bictcp_cong_avoid,
.set_state = bictcp_state,
.init = cubictcp_init,
.ssthresh = cubictcp_recalc_ssthresh,
.cong_avoid = cubictcp_cong_avoid,
.set_state = cubictcp_state,
.undo_cwnd = tcp_reno_undo_cwnd,
.cwnd_event = bictcp_cwnd_event,
.pkts_acked = bictcp_acked,
.cwnd_event = cubictcp_cwnd_event,
.pkts_acked = cubictcp_acked,
.owner = THIS_MODULE,
.name = "cubic",
};

View File

@ -1117,6 +1117,10 @@ enum bpf_link_type {
* offset to another bpf function
*/
#define BPF_PSEUDO_CALL 1
/* when bpf_call->src_reg == BPF_PSEUDO_KFUNC_CALL,
* bpf_call->imm == btf_id of a BTF_KIND_FUNC in the running kernel
*/
#define BPF_PSEUDO_KFUNC_CALL 2
/* flags for BPF_MAP_UPDATE_ELEM command */
enum {

View File

@ -185,7 +185,8 @@ enum reloc_type {
RELO_LD64,
RELO_CALL,
RELO_DATA,
RELO_EXTERN,
RELO_EXTERN_VAR,
RELO_EXTERN_FUNC,
RELO_SUBPROG_ADDR,
};
@ -573,14 +574,19 @@ static bool insn_is_subprog_call(const struct bpf_insn *insn)
insn->off == 0;
}
static bool is_ldimm64(struct bpf_insn *insn)
static bool is_ldimm64_insn(struct bpf_insn *insn)
{
return insn->code == (BPF_LD | BPF_IMM | BPF_DW);
}
static bool is_call_insn(const struct bpf_insn *insn)
{
return insn->code == (BPF_JMP | BPF_CALL);
}
static bool insn_is_pseudo_func(struct bpf_insn *insn)
{
return is_ldimm64(insn) && insn->src_reg == BPF_PSEUDO_FUNC;
return is_ldimm64_insn(insn) && insn->src_reg == BPF_PSEUDO_FUNC;
}
static int
@ -1921,9 +1927,9 @@ resolve_func_ptr(const struct btf *btf, __u32 id, __u32 *res_id)
return btf_is_func_proto(t) ? t : NULL;
}
static const char *btf_kind_str(const struct btf_type *t)
static const char *__btf_kind_str(__u16 kind)
{
switch (btf_kind(t)) {
switch (kind) {
case BTF_KIND_UNKN: return "void";
case BTF_KIND_INT: return "int";
case BTF_KIND_PTR: return "ptr";
@ -1945,6 +1951,16 @@ static const char *btf_kind_str(const struct btf_type *t)
}
}
static const char *btf_kind_str(const struct btf_type *t)
{
return __btf_kind_str(btf_kind(t));
}
static enum btf_func_linkage btf_func_linkage(const struct btf_type *t)
{
return (enum btf_func_linkage)BTF_INFO_VLEN(t->info);
}
/*
* Fetch integer attribute of BTF map definition. Such attributes are
* represented using a pointer to an array, in which dimensionality of array
@ -3009,7 +3025,7 @@ static bool sym_is_subprog(const GElf_Sym *sym, int text_shndx)
static int find_extern_btf_id(const struct btf *btf, const char *ext_name)
{
const struct btf_type *t;
const char *var_name;
const char *tname;
int i, n;
if (!btf)
@ -3019,14 +3035,18 @@ static int find_extern_btf_id(const struct btf *btf, const char *ext_name)
for (i = 1; i <= n; i++) {
t = btf__type_by_id(btf, i);
if (!btf_is_var(t))
if (!btf_is_var(t) && !btf_is_func(t))
continue;
var_name = btf__name_by_offset(btf, t->name_off);
if (strcmp(var_name, ext_name))
tname = btf__name_by_offset(btf, t->name_off);
if (strcmp(tname, ext_name))
continue;
if (btf_var(t)->linkage != BTF_VAR_GLOBAL_EXTERN)
if (btf_is_var(t) &&
btf_var(t)->linkage != BTF_VAR_GLOBAL_EXTERN)
return -EINVAL;
if (btf_is_func(t) && btf_func_linkage(t) != BTF_FUNC_EXTERN)
return -EINVAL;
return i;
@ -3139,12 +3159,48 @@ static int find_int_btf_id(const struct btf *btf)
return 0;
}
static int add_dummy_ksym_var(struct btf *btf)
{
int i, int_btf_id, sec_btf_id, dummy_var_btf_id;
const struct btf_var_secinfo *vs;
const struct btf_type *sec;
sec_btf_id = btf__find_by_name_kind(btf, KSYMS_SEC,
BTF_KIND_DATASEC);
if (sec_btf_id < 0)
return 0;
sec = btf__type_by_id(btf, sec_btf_id);
vs = btf_var_secinfos(sec);
for (i = 0; i < btf_vlen(sec); i++, vs++) {
const struct btf_type *vt;
vt = btf__type_by_id(btf, vs->type);
if (btf_is_func(vt))
break;
}
/* No func in ksyms sec. No need to add dummy var. */
if (i == btf_vlen(sec))
return 0;
int_btf_id = find_int_btf_id(btf);
dummy_var_btf_id = btf__add_var(btf,
"dummy_ksym",
BTF_VAR_GLOBAL_ALLOCATED,
int_btf_id);
if (dummy_var_btf_id < 0)
pr_warn("cannot create a dummy_ksym var\n");
return dummy_var_btf_id;
}
static int bpf_object__collect_externs(struct bpf_object *obj)
{
struct btf_type *sec, *kcfg_sec = NULL, *ksym_sec = NULL;
const struct btf_type *t;
struct extern_desc *ext;
int i, n, off;
int i, n, off, dummy_var_btf_id;
const char *ext_name, *sec_name;
Elf_Scn *scn;
GElf_Shdr sh;
@ -3156,6 +3212,10 @@ static int bpf_object__collect_externs(struct bpf_object *obj)
if (elf_sec_hdr(obj, scn, &sh))
return -LIBBPF_ERRNO__FORMAT;
dummy_var_btf_id = add_dummy_ksym_var(obj->btf);
if (dummy_var_btf_id < 0)
return dummy_var_btf_id;
n = sh.sh_size / sh.sh_entsize;
pr_debug("looking for externs among %d symbols...\n", n);
@ -3200,6 +3260,11 @@ static int bpf_object__collect_externs(struct bpf_object *obj)
sec_name = btf__name_by_offset(obj->btf, sec->name_off);
if (strcmp(sec_name, KCONFIG_SEC) == 0) {
if (btf_is_func(t)) {
pr_warn("extern function %s is unsupported under %s section\n",
ext->name, KCONFIG_SEC);
return -ENOTSUP;
}
kcfg_sec = sec;
ext->type = EXT_KCFG;
ext->kcfg.sz = btf__resolve_size(obj->btf, t->type);
@ -3221,6 +3286,11 @@ static int bpf_object__collect_externs(struct bpf_object *obj)
return -ENOTSUP;
}
} else if (strcmp(sec_name, KSYMS_SEC) == 0) {
if (btf_is_func(t) && ext->is_weak) {
pr_warn("extern weak function %s is unsupported\n",
ext->name);
return -ENOTSUP;
}
ksym_sec = sec;
ext->type = EXT_KSYM;
skip_mods_and_typedefs(obj->btf, t->type,
@ -3247,7 +3317,14 @@ static int bpf_object__collect_externs(struct bpf_object *obj)
* extern variables in DATASEC
*/
int int_btf_id = find_int_btf_id(obj->btf);
/* For extern function, a dummy_var added earlier
* will be used to replace the vs->type and
* its name string will be used to refill
* the missing param's name.
*/
const struct btf_type *dummy_var;
dummy_var = btf__type_by_id(obj->btf, dummy_var_btf_id);
for (i = 0; i < obj->nr_extern; i++) {
ext = &obj->externs[i];
if (ext->type != EXT_KSYM)
@ -3266,12 +3343,32 @@ static int bpf_object__collect_externs(struct bpf_object *obj)
ext_name = btf__name_by_offset(obj->btf, vt->name_off);
ext = find_extern_by_name(obj, ext_name);
if (!ext) {
pr_warn("failed to find extern definition for BTF var '%s'\n",
ext_name);
pr_warn("failed to find extern definition for BTF %s '%s'\n",
btf_kind_str(vt), ext_name);
return -ESRCH;
}
btf_var(vt)->linkage = BTF_VAR_GLOBAL_ALLOCATED;
vt->type = int_btf_id;
if (btf_is_func(vt)) {
const struct btf_type *func_proto;
struct btf_param *param;
int j;
func_proto = btf__type_by_id(obj->btf,
vt->type);
param = btf_params(func_proto);
/* Reuse the dummy_var string if the
* func proto does not have param name.
*/
for (j = 0; j < btf_vlen(func_proto); j++)
if (param[j].type && !param[j].name_off)
param[j].name_off =
dummy_var->name_off;
vs->type = dummy_var_btf_id;
vt->info &= ~0xffff;
vt->info |= BTF_FUNC_GLOBAL;
} else {
btf_var(vt)->linkage = BTF_VAR_GLOBAL_ALLOCATED;
vt->type = int_btf_id;
}
vs->offset = off;
vs->size = sizeof(int);
}
@ -3403,31 +3500,7 @@ static int bpf_program__record_reloc(struct bpf_program *prog,
reloc_desc->processed = false;
/* sub-program call relocation */
if (insn->code == (BPF_JMP | BPF_CALL)) {
if (insn->src_reg != BPF_PSEUDO_CALL) {
pr_warn("prog '%s': incorrect bpf_call opcode\n", prog->name);
return -LIBBPF_ERRNO__RELOC;
}
/* text_shndx can be 0, if no default "main" program exists */
if (!shdr_idx || shdr_idx != obj->efile.text_shndx) {
sym_sec_name = elf_sec_name(obj, elf_sec_by_idx(obj, shdr_idx));
pr_warn("prog '%s': bad call relo against '%s' in section '%s'\n",
prog->name, sym_name, sym_sec_name);
return -LIBBPF_ERRNO__RELOC;
}
if (sym->st_value % BPF_INSN_SZ) {
pr_warn("prog '%s': bad call relo against '%s' at offset %zu\n",
prog->name, sym_name, (size_t)sym->st_value);
return -LIBBPF_ERRNO__RELOC;
}
reloc_desc->type = RELO_CALL;
reloc_desc->insn_idx = insn_idx;
reloc_desc->sym_off = sym->st_value;
return 0;
}
if (!is_ldimm64(insn)) {
if (!is_call_insn(insn) && !is_ldimm64_insn(insn)) {
pr_warn("prog '%s': invalid relo against '%s' for insns[%d].code 0x%x\n",
prog->name, sym_name, insn_idx, insn->code);
return -LIBBPF_ERRNO__RELOC;
@ -3450,12 +3523,39 @@ static int bpf_program__record_reloc(struct bpf_program *prog,
}
pr_debug("prog '%s': found extern #%d '%s' (sym %d) for insn #%u\n",
prog->name, i, ext->name, ext->sym_idx, insn_idx);
reloc_desc->type = RELO_EXTERN;
if (insn->code == (BPF_JMP | BPF_CALL))
reloc_desc->type = RELO_EXTERN_FUNC;
else
reloc_desc->type = RELO_EXTERN_VAR;
reloc_desc->insn_idx = insn_idx;
reloc_desc->sym_off = i; /* sym_off stores extern index */
return 0;
}
/* sub-program call relocation */
if (is_call_insn(insn)) {
if (insn->src_reg != BPF_PSEUDO_CALL) {
pr_warn("prog '%s': incorrect bpf_call opcode\n", prog->name);
return -LIBBPF_ERRNO__RELOC;
}
/* text_shndx can be 0, if no default "main" program exists */
if (!shdr_idx || shdr_idx != obj->efile.text_shndx) {
sym_sec_name = elf_sec_name(obj, elf_sec_by_idx(obj, shdr_idx));
pr_warn("prog '%s': bad call relo against '%s' in section '%s'\n",
prog->name, sym_name, sym_sec_name);
return -LIBBPF_ERRNO__RELOC;
}
if (sym->st_value % BPF_INSN_SZ) {
pr_warn("prog '%s': bad call relo against '%s' at offset %zu\n",
prog->name, sym_name, (size_t)sym->st_value);
return -LIBBPF_ERRNO__RELOC;
}
reloc_desc->type = RELO_CALL;
reloc_desc->insn_idx = insn_idx;
reloc_desc->sym_off = sym->st_value;
return 0;
}
if (!shdr_idx || shdr_idx >= SHN_LORESERVE) {
pr_warn("prog '%s': invalid relo against '%s' in special section 0x%x; forgot to initialize global var?..\n",
prog->name, sym_name, shdr_idx);
@ -5695,7 +5795,7 @@ poison:
/* poison second part of ldimm64 to avoid confusing error from
* verifier about "unknown opcode 00"
*/
if (is_ldimm64(insn))
if (is_ldimm64_insn(insn))
bpf_core_poison_insn(prog, relo_idx, insn_idx + 1, insn + 1);
bpf_core_poison_insn(prog, relo_idx, insn_idx, insn);
return 0;
@ -5771,7 +5871,7 @@ poison:
case BPF_LD: {
__u64 imm;
if (!is_ldimm64(insn) ||
if (!is_ldimm64_insn(insn) ||
insn[0].src_reg != 0 || insn[0].off != 0 ||
insn_idx + 1 >= prog->insns_cnt ||
insn[1].code != 0 || insn[1].dst_reg != 0 ||
@ -6213,7 +6313,7 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog)
insn[0].imm = obj->maps[relo->map_idx].fd;
relo->processed = true;
break;
case RELO_EXTERN:
case RELO_EXTERN_VAR:
ext = &obj->externs[relo->sym_off];
if (ext->type == EXT_KCFG) {
insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;
@ -6231,6 +6331,12 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog)
}
relo->processed = true;
break;
case RELO_EXTERN_FUNC:
ext = &obj->externs[relo->sym_off];
insn[0].src_reg = BPF_PSEUDO_KFUNC_CALL;
insn[0].imm = ext->ksym.kernel_btf_id;
relo->processed = true;
break;
case RELO_SUBPROG_ADDR:
insn[0].src_reg = BPF_PSEUDO_FUNC;
/* will be handled as a follow up pass */
@ -7351,6 +7457,7 @@ static int bpf_object__read_kallsyms_file(struct bpf_object *obj)
{
char sym_type, sym_name[500];
unsigned long long sym_addr;
const struct btf_type *t;
struct extern_desc *ext;
int ret, err = 0;
FILE *f;
@ -7377,6 +7484,10 @@ static int bpf_object__read_kallsyms_file(struct bpf_object *obj)
if (!ext || ext->type != EXT_KSYM)
continue;
t = btf__type_by_id(obj->btf, ext->btf_id);
if (!btf_is_var(t))
continue;
if (ext->is_set && ext->ksym.addr != sym_addr) {
pr_warn("extern (ksym) '%s' resolution is ambiguous: 0x%llx or 0x%llx\n",
sym_name, ext->ksym.addr, sym_addr);
@ -7395,75 +7506,151 @@ out:
return err;
}
static int find_ksym_btf_id(struct bpf_object *obj, const char *ksym_name,
__u16 kind, struct btf **res_btf,
int *res_btf_fd)
{
int i, id, btf_fd, err;
struct btf *btf;
btf = obj->btf_vmlinux;
btf_fd = 0;
id = btf__find_by_name_kind(btf, ksym_name, kind);
if (id == -ENOENT) {
err = load_module_btfs(obj);
if (err)
return err;
for (i = 0; i < obj->btf_module_cnt; i++) {
btf = obj->btf_modules[i].btf;
/* we assume module BTF FD is always >0 */
btf_fd = obj->btf_modules[i].fd;
id = btf__find_by_name_kind(btf, ksym_name, kind);
if (id != -ENOENT)
break;
}
}
if (id <= 0) {
pr_warn("extern (%s ksym) '%s': failed to find BTF ID in kernel BTF(s).\n",
__btf_kind_str(kind), ksym_name);
return -ESRCH;
}
*res_btf = btf;
*res_btf_fd = btf_fd;
return id;
}
static int bpf_object__resolve_ksym_var_btf_id(struct bpf_object *obj,
struct extern_desc *ext)
{
const struct btf_type *targ_var, *targ_type;
__u32 targ_type_id, local_type_id;
const char *targ_var_name;
int id, btf_fd = 0, err;
struct btf *btf = NULL;
id = find_ksym_btf_id(obj, ext->name, BTF_KIND_VAR, &btf, &btf_fd);
if (id < 0)
return id;
/* find local type_id */
local_type_id = ext->ksym.type_id;
/* find target type_id */
targ_var = btf__type_by_id(btf, id);
targ_var_name = btf__name_by_offset(btf, targ_var->name_off);
targ_type = skip_mods_and_typedefs(btf, targ_var->type, &targ_type_id);
err = bpf_core_types_are_compat(obj->btf, local_type_id,
btf, targ_type_id);
if (err <= 0) {
const struct btf_type *local_type;
const char *targ_name, *local_name;
local_type = btf__type_by_id(obj->btf, local_type_id);
local_name = btf__name_by_offset(obj->btf, local_type->name_off);
targ_name = btf__name_by_offset(btf, targ_type->name_off);
pr_warn("extern (var ksym) '%s': incompatible types, expected [%d] %s %s, but kernel has [%d] %s %s\n",
ext->name, local_type_id,
btf_kind_str(local_type), local_name, targ_type_id,
btf_kind_str(targ_type), targ_name);
return -EINVAL;
}
ext->is_set = true;
ext->ksym.kernel_btf_obj_fd = btf_fd;
ext->ksym.kernel_btf_id = id;
pr_debug("extern (var ksym) '%s': resolved to [%d] %s %s\n",
ext->name, id, btf_kind_str(targ_var), targ_var_name);
return 0;
}
static int bpf_object__resolve_ksym_func_btf_id(struct bpf_object *obj,
struct extern_desc *ext)
{
int local_func_proto_id, kfunc_proto_id, kfunc_id;
const struct btf_type *kern_func;
struct btf *kern_btf = NULL;
int ret, kern_btf_fd = 0;
local_func_proto_id = ext->ksym.type_id;
kfunc_id = find_ksym_btf_id(obj, ext->name, BTF_KIND_FUNC,
&kern_btf, &kern_btf_fd);
if (kfunc_id < 0) {
pr_warn("extern (func ksym) '%s': not found in kernel BTF\n",
ext->name);
return kfunc_id;
}
if (kern_btf != obj->btf_vmlinux) {
pr_warn("extern (func ksym) '%s': function in kernel module is not supported\n",
ext->name);
return -ENOTSUP;
}
kern_func = btf__type_by_id(kern_btf, kfunc_id);
kfunc_proto_id = kern_func->type;
ret = bpf_core_types_are_compat(obj->btf, local_func_proto_id,
kern_btf, kfunc_proto_id);
if (ret <= 0) {
pr_warn("extern (func ksym) '%s': func_proto [%d] incompatible with kernel [%d]\n",
ext->name, local_func_proto_id, kfunc_proto_id);
return -EINVAL;
}
ext->is_set = true;
ext->ksym.kernel_btf_obj_fd = kern_btf_fd;
ext->ksym.kernel_btf_id = kfunc_id;
pr_debug("extern (func ksym) '%s': resolved to kernel [%d]\n",
ext->name, kfunc_id);
return 0;
}
static int bpf_object__resolve_ksyms_btf_id(struct bpf_object *obj)
{
const struct btf_type *t;
struct extern_desc *ext;
struct btf *btf;
int i, j, id, btf_fd, err;
int i, err;
for (i = 0; i < obj->nr_extern; i++) {
const struct btf_type *targ_var, *targ_type;
__u32 targ_type_id, local_type_id;
const char *targ_var_name;
int ret;
ext = &obj->externs[i];
if (ext->type != EXT_KSYM || !ext->ksym.type_id)
continue;
btf = obj->btf_vmlinux;
btf_fd = 0;
id = btf__find_by_name_kind(btf, ext->name, BTF_KIND_VAR);
if (id == -ENOENT) {
err = load_module_btfs(obj);
if (err)
return err;
for (j = 0; j < obj->btf_module_cnt; j++) {
btf = obj->btf_modules[j].btf;
/* we assume module BTF FD is always >0 */
btf_fd = obj->btf_modules[j].fd;
id = btf__find_by_name_kind(btf, ext->name, BTF_KIND_VAR);
if (id != -ENOENT)
break;
}
}
if (id <= 0) {
pr_warn("extern (ksym) '%s': failed to find BTF ID in kernel BTF(s).\n",
ext->name);
return -ESRCH;
}
/* find local type_id */
local_type_id = ext->ksym.type_id;
/* find target type_id */
targ_var = btf__type_by_id(btf, id);
targ_var_name = btf__name_by_offset(btf, targ_var->name_off);
targ_type = skip_mods_and_typedefs(btf, targ_var->type, &targ_type_id);
ret = bpf_core_types_are_compat(obj->btf, local_type_id,
btf, targ_type_id);
if (ret <= 0) {
const struct btf_type *local_type;
const char *targ_name, *local_name;
local_type = btf__type_by_id(obj->btf, local_type_id);
local_name = btf__name_by_offset(obj->btf, local_type->name_off);
targ_name = btf__name_by_offset(btf, targ_type->name_off);
pr_warn("extern (ksym) '%s': incompatible types, expected [%d] %s %s, but kernel has [%d] %s %s\n",
ext->name, local_type_id,
btf_kind_str(local_type), local_name, targ_type_id,
btf_kind_str(targ_type), targ_name);
return -EINVAL;
}
ext->is_set = true;
ext->ksym.kernel_btf_obj_fd = btf_fd;
ext->ksym.kernel_btf_id = id;
pr_debug("extern (ksym) '%s': resolved to [%d] %s %s\n",
ext->name, id, btf_kind_str(targ_var), targ_var_name);
t = btf__type_by_id(obj->btf, ext->btf_id);
if (btf_is_var(t))
err = bpf_object__resolve_ksym_var_btf_id(obj, ext);
else
err = bpf_object__resolve_ksym_func_btf_id(obj, ext);
if (err)
return err;
}
return 0;
}

View File

@ -187,16 +187,6 @@ struct tcp_congestion_ops {
typeof(y) __y = (y); \
__x == 0 ? __y : ((__y == 0) ? __x : min(__x, __y)); })
static __always_inline __u32 tcp_slow_start(struct tcp_sock *tp, __u32 acked)
{
__u32 cwnd = min(tp->snd_cwnd + acked, tp->snd_ssthresh);
acked -= cwnd - tp->snd_cwnd;
tp->snd_cwnd = min(cwnd, tp->snd_cwnd_clamp);
return acked;
}
static __always_inline bool tcp_in_slow_start(const struct tcp_sock *tp)
{
return tp->snd_cwnd < tp->snd_ssthresh;
@ -213,22 +203,7 @@ static __always_inline bool tcp_is_cwnd_limited(const struct sock *sk)
return !!BPF_CORE_READ_BITFIELD(tp, is_cwnd_limited);
}
static __always_inline void tcp_cong_avoid_ai(struct tcp_sock *tp, __u32 w, __u32 acked)
{
/* If credits accumulated at a higher w, apply them gently now. */
if (tp->snd_cwnd_cnt >= w) {
tp->snd_cwnd_cnt = 0;
tp->snd_cwnd++;
}
tp->snd_cwnd_cnt += acked;
if (tp->snd_cwnd_cnt >= w) {
__u32 delta = tp->snd_cwnd_cnt / w;
tp->snd_cwnd_cnt -= delta * w;
tp->snd_cwnd += delta;
}
tp->snd_cwnd = min(tp->snd_cwnd, tp->snd_cwnd_clamp);
}
extern __u32 tcp_slow_start(struct tcp_sock *tp, __u32 acked) __ksym;
extern void tcp_cong_avoid_ai(struct tcp_sock *tp, __u32 w, __u32 acked) __ksym;
#endif

View File

@ -0,0 +1,59 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2021 Facebook */
#include <test_progs.h>
#include <network_helpers.h>
#include "kfunc_call_test.skel.h"
#include "kfunc_call_test_subprog.skel.h"
static void test_main(void)
{
struct kfunc_call_test *skel;
int prog_fd, retval, err;
skel = kfunc_call_test__open_and_load();
if (!ASSERT_OK_PTR(skel, "skel"))
return;
prog_fd = bpf_program__fd(skel->progs.kfunc_call_test1);
err = bpf_prog_test_run(prog_fd, 1, &pkt_v4, sizeof(pkt_v4),
NULL, NULL, (__u32 *)&retval, NULL);
ASSERT_OK(err, "bpf_prog_test_run(test1)");
ASSERT_EQ(retval, 12, "test1-retval");
prog_fd = bpf_program__fd(skel->progs.kfunc_call_test2);
err = bpf_prog_test_run(prog_fd, 1, &pkt_v4, sizeof(pkt_v4),
NULL, NULL, (__u32 *)&retval, NULL);
ASSERT_OK(err, "bpf_prog_test_run(test2)");
ASSERT_EQ(retval, 3, "test2-retval");
kfunc_call_test__destroy(skel);
}
static void test_subprog(void)
{
struct kfunc_call_test_subprog *skel;
int prog_fd, retval, err;
skel = kfunc_call_test_subprog__open_and_load();
if (!ASSERT_OK_PTR(skel, "skel"))
return;
prog_fd = bpf_program__fd(skel->progs.kfunc_call_test1);
err = bpf_prog_test_run(prog_fd, 1, &pkt_v4, sizeof(pkt_v4),
NULL, NULL, (__u32 *)&retval, NULL);
ASSERT_OK(err, "bpf_prog_test_run(test1)");
ASSERT_EQ(retval, 10, "test1-retval");
ASSERT_NEQ(skel->data->active_res, -1, "active_res");
ASSERT_EQ(skel->data->sk_state, BPF_TCP_CLOSE, "sk_state");
kfunc_call_test_subprog__destroy(skel);
}
void test_kfunc_call(void)
{
if (test__start_subtest("main"))
test_main();
if (test__start_subtest("subprog"))
test_subprog();
}

View File

@ -174,8 +174,8 @@ static __always_inline void bictcp_hystart_reset(struct sock *sk)
* as long as it is used in one of the func ptr
* under SEC(".struct_ops").
*/
SEC("struct_ops/bictcp_init")
void BPF_PROG(bictcp_init, struct sock *sk)
SEC("struct_ops/bpf_cubic_init")
void BPF_PROG(bpf_cubic_init, struct sock *sk)
{
struct bictcp *ca = inet_csk_ca(sk);
@ -192,7 +192,7 @@ void BPF_PROG(bictcp_init, struct sock *sk)
* The remaining tcp-cubic functions have an easier way.
*/
SEC("no-sec-prefix-bictcp_cwnd_event")
void BPF_PROG(bictcp_cwnd_event, struct sock *sk, enum tcp_ca_event event)
void BPF_PROG(bpf_cubic_cwnd_event, struct sock *sk, enum tcp_ca_event event)
{
if (event == CA_EVENT_TX_START) {
struct bictcp *ca = inet_csk_ca(sk);
@ -384,7 +384,7 @@ tcp_friendliness:
}
/* Or simply use the BPF_STRUCT_OPS to avoid the SEC boiler plate. */
void BPF_STRUCT_OPS(bictcp_cong_avoid, struct sock *sk, __u32 ack, __u32 acked)
void BPF_STRUCT_OPS(bpf_cubic_cong_avoid, struct sock *sk, __u32 ack, __u32 acked)
{
struct tcp_sock *tp = tcp_sk(sk);
struct bictcp *ca = inet_csk_ca(sk);
@ -403,7 +403,7 @@ void BPF_STRUCT_OPS(bictcp_cong_avoid, struct sock *sk, __u32 ack, __u32 acked)
tcp_cong_avoid_ai(tp, ca->cnt, acked);
}
__u32 BPF_STRUCT_OPS(bictcp_recalc_ssthresh, struct sock *sk)
__u32 BPF_STRUCT_OPS(bpf_cubic_recalc_ssthresh, struct sock *sk)
{
const struct tcp_sock *tp = tcp_sk(sk);
struct bictcp *ca = inet_csk_ca(sk);
@ -420,7 +420,7 @@ __u32 BPF_STRUCT_OPS(bictcp_recalc_ssthresh, struct sock *sk)
return max((tp->snd_cwnd * beta) / BICTCP_BETA_SCALE, 2U);
}
void BPF_STRUCT_OPS(bictcp_state, struct sock *sk, __u8 new_state)
void BPF_STRUCT_OPS(bpf_cubic_state, struct sock *sk, __u8 new_state)
{
if (new_state == TCP_CA_Loss) {
bictcp_reset(inet_csk_ca(sk));
@ -496,7 +496,7 @@ static __always_inline void hystart_update(struct sock *sk, __u32 delay)
}
}
void BPF_STRUCT_OPS(bictcp_acked, struct sock *sk,
void BPF_STRUCT_OPS(bpf_cubic_acked, struct sock *sk,
const struct ack_sample *sample)
{
const struct tcp_sock *tp = tcp_sk(sk);
@ -525,21 +525,21 @@ void BPF_STRUCT_OPS(bictcp_acked, struct sock *sk,
hystart_update(sk, delay);
}
__u32 BPF_STRUCT_OPS(tcp_reno_undo_cwnd, struct sock *sk)
{
const struct tcp_sock *tp = tcp_sk(sk);
extern __u32 tcp_reno_undo_cwnd(struct sock *sk) __ksym;
return max(tp->snd_cwnd, tp->prior_cwnd);
__u32 BPF_STRUCT_OPS(bpf_cubic_undo_cwnd, struct sock *sk)
{
return tcp_reno_undo_cwnd(sk);
}
SEC(".struct_ops")
struct tcp_congestion_ops cubic = {
.init = (void *)bictcp_init,
.ssthresh = (void *)bictcp_recalc_ssthresh,
.cong_avoid = (void *)bictcp_cong_avoid,
.set_state = (void *)bictcp_state,
.undo_cwnd = (void *)tcp_reno_undo_cwnd,
.cwnd_event = (void *)bictcp_cwnd_event,
.pkts_acked = (void *)bictcp_acked,
.init = (void *)bpf_cubic_init,
.ssthresh = (void *)bpf_cubic_recalc_ssthresh,
.cong_avoid = (void *)bpf_cubic_cong_avoid,
.set_state = (void *)bpf_cubic_state,
.undo_cwnd = (void *)bpf_cubic_undo_cwnd,
.cwnd_event = (void *)bpf_cubic_cwnd_event,
.pkts_acked = (void *)bpf_cubic_acked,
.name = "bpf_cubic",
};

View File

@ -194,22 +194,12 @@ __u32 BPF_PROG(dctcp_cwnd_undo, struct sock *sk)
return max(tcp_sk(sk)->snd_cwnd, ca->loss_cwnd);
}
SEC("struct_ops/tcp_reno_cong_avoid")
void BPF_PROG(tcp_reno_cong_avoid, struct sock *sk, __u32 ack, __u32 acked)
extern void tcp_reno_cong_avoid(struct sock *sk, __u32 ack, __u32 acked) __ksym;
SEC("struct_ops/dctcp_reno_cong_avoid")
void BPF_PROG(dctcp_cong_avoid, struct sock *sk, __u32 ack, __u32 acked)
{
struct tcp_sock *tp = tcp_sk(sk);
if (!tcp_is_cwnd_limited(sk))
return;
/* In "safe" area, increase. */
if (tcp_in_slow_start(tp)) {
acked = tcp_slow_start(tp, acked);
if (!acked)
return;
}
/* In dangerous area, increase slowly. */
tcp_cong_avoid_ai(tp, tp->snd_cwnd, acked);
tcp_reno_cong_avoid(sk, ack, acked);
}
SEC(".struct_ops")
@ -226,7 +216,7 @@ struct tcp_congestion_ops dctcp = {
.in_ack_event = (void *)dctcp_update_alpha,
.cwnd_event = (void *)dctcp_cwnd_event,
.ssthresh = (void *)dctcp_ssthresh,
.cong_avoid = (void *)tcp_reno_cong_avoid,
.cong_avoid = (void *)dctcp_cong_avoid,
.undo_cwnd = (void *)dctcp_cwnd_undo,
.set_state = (void *)dctcp_state,
.flags = TCP_CONG_NEEDS_ECN,

View File

@ -0,0 +1,47 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2021 Facebook */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include "bpf_tcp_helpers.h"
extern int bpf_kfunc_call_test2(struct sock *sk, __u32 a, __u32 b) __ksym;
extern __u64 bpf_kfunc_call_test1(struct sock *sk, __u32 a, __u64 b,
__u32 c, __u64 d) __ksym;
SEC("classifier")
int kfunc_call_test2(struct __sk_buff *skb)
{
struct bpf_sock *sk = skb->sk;
if (!sk)
return -1;
sk = bpf_sk_fullsock(sk);
if (!sk)
return -1;
return bpf_kfunc_call_test2((struct sock *)sk, 1, 2);
}
SEC("classifier")
int kfunc_call_test1(struct __sk_buff *skb)
{
struct bpf_sock *sk = skb->sk;
__u64 a = 1ULL << 32;
__u32 ret;
if (!sk)
return -1;
sk = bpf_sk_fullsock(sk);
if (!sk)
return -1;
a = bpf_kfunc_call_test1((struct sock *)sk, 1, a | 2, 3, a | 4);
ret = a >> 32; /* ret should be 2 */
ret += (__u32)a; /* ret should be 12 */
return ret;
}
char _license[] SEC("license") = "GPL";

View File

@ -0,0 +1,42 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2021 Facebook */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include "bpf_tcp_helpers.h"
extern const int bpf_prog_active __ksym;
extern __u64 bpf_kfunc_call_test1(struct sock *sk, __u32 a, __u64 b,
__u32 c, __u64 d) __ksym;
extern struct sock *bpf_kfunc_call_test3(struct sock *sk) __ksym;
int active_res = -1;
int sk_state = -1;
int __noinline f1(struct __sk_buff *skb)
{
struct bpf_sock *sk = skb->sk;
int *active;
if (!sk)
return -1;
sk = bpf_sk_fullsock(sk);
if (!sk)
return -1;
active = (int *)bpf_per_cpu_ptr(&bpf_prog_active,
bpf_get_smp_processor_id());
if (active)
active_res = *active;
sk_state = bpf_kfunc_call_test3((struct sock *)sk)->__sk_common.skc_state;
return (__u32)bpf_kfunc_call_test1((struct sock *)sk, 1, 2, 3, 4);
}
SEC("classifier")
int kfunc_call_test1(struct __sk_buff *skb)
{
return f1(skb);
}
char _license[] SEC("license") = "GPL";

View File

@ -19,7 +19,7 @@
BPF_MOV64_IMM(BPF_REG_0, 2),
BPF_EXIT_INSN(),
},
.errstr_unpriv = "function calls to other bpf functions are allowed for",
.errstr_unpriv = "loading/calling other bpf or kernel functions are allowed for",
.result_unpriv = REJECT,
.result = ACCEPT,
.retval = 1,
@ -136,7 +136,7 @@
{
"calls: wrong src reg",
.insns = {
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 2, 0, 0),
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 3, 0, 0),
BPF_MOV64_IMM(BPF_REG_0, 1),
BPF_EXIT_INSN(),
},
@ -397,7 +397,7 @@
BPF_MOV64_IMM(BPF_REG_0, 1),
BPF_EXIT_INSN(),
},
.errstr_unpriv = "function calls to other bpf functions are allowed for",
.errstr_unpriv = "loading/calling other bpf or kernel functions are allowed for",
.fixup_map_hash_48b = { 3 },
.result_unpriv = REJECT,
.result = ACCEPT,
@ -1977,7 +1977,7 @@
BPF_EXIT_INSN(),
},
.prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
.errstr_unpriv = "function calls to other bpf functions are allowed for",
.errstr_unpriv = "loading/calling other bpf or kernel functions are allowed for",
.result_unpriv = REJECT,
.result = ACCEPT,
},
@ -2003,7 +2003,7 @@
BPF_EXIT_INSN(),
},
.prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
.errstr_unpriv = "function calls to other bpf functions are allowed for",
.errstr_unpriv = "loading/calling other bpf or kernel functions are allowed for",
.errstr = "!read_ok",
.result = REJECT,
},
@ -2028,7 +2028,7 @@
BPF_EXIT_INSN(),
},
.prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
.errstr_unpriv = "function calls to other bpf functions are allowed for",
.errstr_unpriv = "loading/calling other bpf or kernel functions are allowed for",
.errstr = "!read_ok",
.result = REJECT,
},

View File

@ -85,7 +85,7 @@
BPF_MOV64_IMM(BPF_REG_0, 12),
BPF_EXIT_INSN(),
},
.errstr_unpriv = "function calls to other bpf functions are allowed for",
.errstr_unpriv = "loading/calling other bpf or kernel functions are allowed for",
.result_unpriv = REJECT,
.result = ACCEPT,
.retval = 7,
@ -103,7 +103,7 @@
BPF_MOV64_IMM(BPF_REG_0, 12),
BPF_EXIT_INSN(),
},
.errstr_unpriv = "function calls to other bpf functions are allowed for",
.errstr_unpriv = "loading/calling other bpf or kernel functions are allowed for",
.result_unpriv = REJECT,
.result = ACCEPT,
.retval = 7,
@ -121,7 +121,7 @@
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 1, 0, -5),
BPF_EXIT_INSN(),
},
.errstr_unpriv = "function calls to other bpf functions are allowed for",
.errstr_unpriv = "loading/calling other bpf or kernel functions are allowed for",
.result_unpriv = REJECT,
.result = ACCEPT,
.retval = 7,
@ -137,7 +137,7 @@
BPF_MOV64_REG(BPF_REG_0, BPF_REG_1),
BPF_EXIT_INSN(),
},
.errstr_unpriv = "function calls to other bpf functions are allowed for",
.errstr_unpriv = "loading/calling other bpf or kernel functions are allowed for",
.result_unpriv = REJECT,
.result = ACCEPT,
.retval = 2,
@ -152,7 +152,7 @@
BPF_MOV64_REG(BPF_REG_0, BPF_REG_1),
BPF_EXIT_INSN(),
},
.errstr_unpriv = "function calls to other bpf functions are allowed for",
.errstr_unpriv = "loading/calling other bpf or kernel functions are allowed for",
.result_unpriv = REJECT,
.result = ACCEPT,
.retval = 2,