linux/kernel
YiFei Zhu 7d9c342789 bpf: Make cgroup storages shared between programs on the same cgroup
This change comes in several parts:

One, the restriction that the CGROUP_STORAGE map can only be used
by one program is removed. This results in the removal of the field
'aux' in struct bpf_cgroup_storage_map, and removal of relevant
code associated with the field, and removal of now-noop functions
bpf_free_cgroup_storage and bpf_cgroup_storage_release.

Second, we permit a key of type u64 as the key to the map.
Providing such a key type indicates that the map should ignore
attach type when comparing map keys. However, for simplicity newly
linked storage will still have the attach type at link time in
its key struct. cgroup_storage_check_btf is adapted to accept
u64 as the type of the key.

Third, because the storages are now shared, the storages cannot
be unconditionally freed on program detach. There could be two
ways to solve this issue:
* A. Reference count the usage of the storages, and free when the
     last program is detached.
* B. Free only when the storage is impossible to be referred to
     again, i.e. when either the cgroup_bpf it is attached to, or
     the map itself, is freed.
Option A has the side effect that, when the user detach and
reattach a program, whether the program gets a fresh storage
depends on whether there is another program attached using that
storage. This could trigger races if the user is multi-threaded,
and since nondeterminism in data races is evil, go with option B.

The both the map and the cgroup_bpf now tracks their associated
storages, and the storage unlink and free are removed from
cgroup_bpf_detach and added to cgroup_bpf_release and
cgroup_storage_map_free. The latter also new holds the cgroup_mutex
to prevent any races with the former.

Fourth, on attach, we reuse the old storage if the key already
exists in the map, via cgroup_storage_lookup. If the storage
does not exist yet, we create a new one, and publish it at the
last step in the attach process. This does not create a race
condition because for the whole attach the cgroup_mutex is held.
We keep track of an array of new storages that was allocated
and if the process fails only the new storages would get freed.

Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/d5401c6106728a00890401190db40020a1f84ff1.1595565795.git.zhuyifei@google.com
2020-07-25 20:16:35 -07:00
..
bpf bpf: Make cgroup storages shared between programs on the same cgroup 2020-07-25 20:16:35 -07:00
cgroup cgroup: fix cgroup_sk_alloc() for sk_clone_lock() 2020-07-07 13:34:11 -07:00
configs compiler: remove CONFIG_OPTIMIZE_INLINING entirely 2020-04-07 10:43:42 -07:00
debug kgdb: enable arch to support XML packet. 2020-07-09 20:09:28 -07:00
dma dma-pool: do not allocate pool memory from CMA 2020-07-14 15:46:32 +02:00
events bpf: Fail PERF_EVENT_IOC_SET_BPF when bpf_get_[stack|stackid] cannot work 2020-07-25 20:16:34 -07:00
gcov treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
irq genirq/affinity: Handle affinity setting on inactive interrupts correctly 2020-07-17 23:30:43 +02:00
kcsan kcsan: Support distinguishing volatile accesses 2020-06-11 20:04:01 +02:00
livepatch livepatch: Make klp_apply_object_relocs static 2020-05-11 00:31:38 +02:00
locking The X86 entry, exception and interrupt code rework 2020-06-13 10:05:47 -07:00
power treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
printk Revert "kernel/printk: add kmsg SEEK_CUR handling" 2020-06-21 20:47:20 -07:00
rcu A single fix for a printk format warning in RCU. 2020-07-05 12:21:28 -07:00
sched sched: Warn if garbage is passed to default_wake_function() 2020-07-24 14:40:25 +02:00
time timer: Fix wheel index calculation on last level 2020-07-17 21:44:05 +02:00
trace bpf: Separate bpf_get_[stack|stackid] for perf events BPF 2020-07-25 20:16:34 -07:00
.gitignore .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
acct.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
async.c
audit_fsnotify.c fsnotify: use helpers to access data by data_type 2020-03-23 18:19:06 +01:00
audit_tree.c
audit_watch.c \n 2020-04-06 08:58:42 -07:00
audit.c audit/stable-5.8 PR 20200601 2020-06-02 17:13:37 -07:00
audit.h audit: fix a net reference leak in audit_list_rules_send() 2020-04-22 15:23:10 -04:00
auditfilter.c audit: fix a net reference leak in audit_list_rules_send() 2020-04-22 15:23:10 -04:00
auditsc.c audit: add subj creds to NETFILTER_CFG record to 2020-05-20 18:09:19 -04:00
backtracetest.c
bounds.c
capability.c
compat.c uaccess: Selectively open read or write user access 2020-05-01 12:35:21 +10:00
configs.c proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
context_tracking.c context_tracking: Ensure that the critical path cannot be instrumented 2020-06-11 15:14:36 +02:00
cpu_pm.c kernel/cpu_pm: Fix uninitted local in cpu_pm 2020-05-15 11:44:34 -07:00
cpu.c The changes in this cycle are: 2020-06-03 13:06:42 -07:00
crash_core.c
crash_dump.c crash_dump: Remove no longer used saved_max_pfn 2020-04-15 11:21:54 +02:00
cred.c exec: Teach prepare_exec_creds how exec treats uids & gids 2020-05-20 14:44:21 -05:00
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c umd: Remove exit_umh 2020-07-07 11:58:59 -05:00
extable.c kernel/extable.c: use address-of operator on section symbols 2020-04-07 10:43:42 -07:00
fail_function.c
fork.c Merge branch 'usermode-driver-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace into bpf-next 2020-07-14 12:18:01 -07:00
freezer.c
futex.c mmap locking API: use coccinelle to convert mmap_sem rwsem call sites 2020-06-09 09:39:14 -07:00
gen_kheaders.sh kbuild: add variables for compression tools 2020-06-06 23:42:01 +09:00
groups.c mm: remove the pgprot argument to __vmalloc 2020-06-02 10:59:11 -07:00
hung_task.c kernel/hung_task.c: introduce sysctl to print all traces when a hung task is detected 2020-06-08 11:05:56 -07:00
iomem.c
irq_work.c irq_work, smp: Allow irq_work on call_single_queue 2020-05-28 10:54:15 +02:00
jump_label.c
kallsyms.c kallsyms: Refactor kallsyms_show_value() to take cred 2020-07-08 15:59:57 -07:00
kcmp.c kernel/kcmp.c: Use new infrastructure to fix deadlocks in execve 2020-03-25 10:04:01 -05:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks sched/rt, locking: Use CONFIG_PREEMPTION 2019-12-08 14:37:36 +01:00
Kconfig.preempt sched/Kconfig: Fix spelling mistake in user-visible help text 2019-11-12 11:35:32 +01:00
kcov.c kcov: check kcov_softirq in kcov_remote_stop() 2020-06-10 19:14:17 -07:00
kexec_core.c kexec: add machine_kexec_post_load() 2020-01-08 16:32:55 +00:00
kexec_elf.c
kexec_file.c kexec: do not verify the signature without the lockdown or mandatory signature 2020-06-26 00:27:36 -07:00
kexec_internal.h kexec: add machine_kexec_post_load() 2020-01-08 16:32:55 +00:00
kexec.c kexec: add machine_kexec_post_load() 2020-01-08 16:32:55 +00:00
kheaders.c
kmod.c kmod: make request_module() return an error when autoloading is disabled 2020-04-10 15:36:22 -07:00
kprobes.c kprobes: Do not expose probe addresses to non-CAP_SYSLOG 2020-07-08 16:00:22 -07:00
ksysfs.c
kthread.c maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault 2020-06-17 10:57:41 -07:00
latencytop.c sysctl: pass kernel pointers to ->proc_handler 2020-04-27 02:07:40 -04:00
Makefile umh: Separate the user mode driver and the user mode helper support 2020-07-04 09:34:32 -05:00
module_signature.c
module_signing.c
module-internal.h
module.c Refactor kallsyms_show_value() users for correct cred 2020-07-09 13:09:30 -07:00
notifier.c mm: remove vmalloc_sync_(un)mappings() 2020-06-02 10:59:12 -07:00
nsproxy.c nsproxy: restore EINVAL for non-namespace file descriptor 2020-06-17 00:33:12 +02:00
padata.c padata: upgrade smp_mb__after_atomic to smp_mb in padata_do_serial 2020-06-18 17:09:54 +10:00
panic.c bug: Annotate WARN/BUG/stackfail as noinstr safe 2020-06-11 15:14:36 +02:00
params.c
pid_namespace.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-06-03 16:27:18 -07:00
pid.c remove the no longer needed pid_alive() check in __task_pid_nr_ns() 2020-04-30 06:40:14 -05:00
profile.c proc: convert everything to "struct proc_ops" 2020-02-04 03:05:26 +00:00
ptrace.c ptrace: reintroduce usage of subjective credentials in ptrace_has_cap() 2020-01-18 13:51:39 +01:00
range.c
reboot.c printk: Collapse shutdown types into a single dump reason 2020-05-30 10:34:03 -07:00
relay.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
resource.c /dev/mem: Revoke mappings when a driver claims the region 2020-05-27 11:10:05 +02:00
rseq.c rseq: Reject unknown flags on rseq unregister 2019-12-25 10:41:20 +01:00
scs.c scs: Report SCS usage in bytes rather than number of entries 2020-06-04 16:14:56 +01:00
seccomp.c sysctl: pass kernel pointers to ->proc_handler 2020-04-27 02:07:40 -04:00
signal.c task_work: teach task_work_add() to do signal_wake_up() 2020-06-30 12:18:08 -06:00
smp.c smp, irq_work: Continue smp_call_function*() and irq_work*() integration 2020-06-28 17:01:20 +02:00
smpboot.c
smpboot.h
softirq.c x86/entry: Clarify irq_{enter,exit}_rcu() 2020-06-11 15:15:24 +02:00
stackleak.c
stacktrace.c stacktrace: Get rid of unneeded '!!' pattern 2019-11-11 10:30:59 +01:00
stop_machine.c stop_machine: Make stop_cpus() static 2020-01-17 10:19:21 +01:00
sys_ni.c y2038: allow disabling time32 system calls 2019-11-15 14:38:30 +01:00
sys.c Add additional LSM hooks for SafeSetID 2020-06-14 11:39:31 -07:00
sysctl_binary.c sysctl: Remove the sysctl system call 2019-11-26 13:03:56 -06:00
sysctl-test.c kunit: allow kunit tests to be loaded as a module 2020-01-09 16:42:29 -07:00
sysctl.c kernel/sysctl.c: ignore out-of-range taint bits introduced via kernel.tainted 2020-06-08 11:05:56 -07:00
task_work.c task_work: teach task_work_add() to do signal_wake_up() 2020-06-30 12:18:08 -06:00
taskstats.c taskstats: fix data-race 2019-12-04 15:18:39 +01:00
test_kprobes.c
torture.c CPU (hotplug) updates: 2020-03-30 18:06:39 -07:00
tracepoint.c
tsacct.c tsacct: add 64-bit btime field 2019-12-18 18:07:31 +01:00
ucount.c ucount: Make sure ucounts in /proc/sys/user don't regress again 2020-04-07 21:51:27 +02:00
uid16.c
uid16.h
umh.c umh: Stop calling do_execve_file 2020-07-04 09:35:36 -05:00
up.c smp/up: Make smp_call_function_single() match SMP semantics 2020-02-07 15:34:12 +01:00
user_namespace.c nsproxy: add struct nsset 2020-05-09 13:57:12 +02:00
user-return-notifier.c
user.c user.c: make uidhash_table static 2020-06-04 19:06:24 -07:00
usermode_driver.c umd: Stop using split_argv 2020-07-07 11:58:59 -05:00
utsname_sysctl.c sysctl: pass kernel pointers to ->proc_handler 2020-04-27 02:07:40 -04:00
utsname.c nsproxy: add struct nsset 2020-05-09 13:57:12 +02:00
watch_queue.c Notifications over pipes + Keyring notifications 2020-06-13 09:56:21 -07:00
watchdog_hld.c
watchdog.c kernel/watchdog.c: convert {soft/hard}lockup boot parameters to sysctl aliases 2020-06-08 11:05:56 -07:00
workqueue_internal.h
workqueue.c maccess: rename probe_kernel_{read,write} to copy_{from,to}_kernel_nofault 2020-06-17 10:57:41 -07:00