linux/kernel
Daniel Borkmann 50b045a8c0 bpf, lru: avoid messing with eviction heuristics upon syscall lookup
One of the biggest issues we face right now with picking LRU map over
regular hash table is that a map walk out of user space, for example,
to just dump the existing entries or to remove certain ones, will
completely mess up LRU eviction heuristics and wrong entries such
as just created ones will get evicted instead. The reason for this
is that we mark an entry as "in use" via bpf_lru_node_set_ref() from
system call lookup side as well. Thus upon walk, all entries are
being marked, so information of actual least recently used ones
are "lost".

In case of Cilium where it can be used (besides others) as a BPF
based connection tracker, this current behavior causes disruption
upon control plane changes that need to walk the map from user space
to evict certain entries. Discussion result from bpfconf [0] was that
we should simply just remove marking from system call side as no
good use case could be found where it's actually needed there.
Therefore this patch removes marking for regular LRU and per-CPU
flavor. If there ever should be a need in future, the behavior could
be selected via map creation flag, but due to mentioned reason we
avoid this here.

  [0] http://vger.kernel.org/bpfconf.html

Fixes: 29ba732acb ("bpf: Add BPF_MAP_TYPE_LRU_HASH")
Fixes: 8f8449384e ("bpf: Add BPF_MAP_TYPE_LRU_PERCPU_HASH")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2019-05-14 10:47:29 -07:00
..
bpf bpf, lru: avoid messing with eviction heuristics upon syscall lookup 2019-05-14 10:47:29 -07:00
cgroup Merge branch 'for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2019-05-09 13:52:12 -07:00
configs
debug kdb: use bool for binary state indicators 2018-12-30 08:31:52 +00:00
dma DMA mapping updates for 5.2 2019-05-09 08:40:55 -07:00
events Printk changes for 5.2 2019-05-07 09:18:12 -07:00
gcov kernel/gcov/gcc_3_4.c: use struct_size() in kzalloc() 2019-03-07 18:32:02 -08:00
irq Driver core/kobject patches for 5.2-rc1 2019-05-07 13:01:40 -07:00
livepatch Driver core/kobject patches for 5.2-rc1 2019-05-07 13:01:40 -07:00
locking Remove Mysterious Macro Intended to Obscure Weird Behaviours (mmiowb()) 2019-05-06 16:57:52 -07:00
power Power management updates for 5.2-rc1 2019-05-06 19:40:31 -07:00
printk fbdev changes for v5.1: 2019-03-15 14:22:59 -07:00
rcu Printk changes for 5.2 2019-05-07 09:18:12 -07:00
sched Driver core/kobject patches for 5.2-rc1 2019-05-07 13:01:40 -07:00
time Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2019-05-07 22:03:58 -07:00
trace bpf: mark bpf_event_notify and bpf_event_init as static 2019-05-14 01:27:18 +02:00
.gitignore Provide in-kernel headers to make extending kernel easier 2019-04-29 16:48:03 +02:00
acct.c acct_on(): don't mess with freeze protection 2019-04-04 21:04:13 -04:00
async.c treewide: Switch printk users from %pf and %pF to %ps and %pS, respectively 2019-04-09 14:19:06 +02:00
audit_fsnotify.c audit_compare_dname_path(): switch to const struct qstr * 2019-04-28 20:33:43 -04:00
audit_tree.c fsnotify: switch send_to_group() and ->handle_event to const struct qstr * 2019-04-26 13:51:03 -04:00
audit_watch.c audit_compare_dname_path(): switch to const struct qstr * 2019-04-28 20:33:43 -04:00
audit.c audit: connect LOGIN record to its syscall record 2019-03-20 20:57:48 -04:00
audit.h audit_compare_dname_path(): switch to const struct qstr * 2019-04-28 20:33:43 -04:00
auditfilter.c Merge branch 'work.dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-05-07 20:03:32 -07:00
auditsc.c Merge branch 'work.dcache' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-05-07 20:03:32 -07:00
backtracetest.c backtrace-test: Simplify stack trace handling 2019-04-29 12:37:47 +02:00
bounds.c
capability.c LSM: add SafeSetID module that gates setid calls 2019-01-25 11:22:43 -08:00
compat.c time: make adjtime compat handling available for 32 bit 2019-02-07 00:13:27 +01:00
configs.c kernel/configs: use .incbin directive to embed config_data.gz 2019-03-07 18:32:02 -08:00
context_tracking.c
cpu_pm.c
cpu.c Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-05-06 14:50:46 -07:00
crash_core.c kexec: export PG_offline to VMCOREINFO 2019-03-05 21:07:14 -08:00
crash_dump.c
cred.c SELinux: Remove cred security blob poisoning 2019-01-08 13:18:44 -08:00
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c Merge branch 'for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2019-03-07 10:11:41 -08:00
extable.c
fail_function.c treewide: Switch printk users from %pf and %pF to %ps and %pS, respectively 2019-04-09 14:19:06 +02:00
fork.c Merge branch 'for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2019-05-09 13:52:12 -07:00
freezer.c
futex.c locking/futex: Allow low-level atomic operations to return -EAGAIN 2019-04-26 13:57:31 +01:00
gen_ikh_data.sh Provide in-kernel headers to make extending kernel easier 2019-04-29 16:48:03 +02:00
groups.c
hung_task.c kernel/hung_task.c: Use continuously blocked time when reporting. 2019-03-07 18:31:59 -08:00
iomem.c mm/resource: Use resource_overlaps() to simplify region_intersects() 2019-04-19 12:59:36 +02:00
irq_work.c irq_work: Do not raise an IPI when queueing work on the local CPU 2019-04-18 14:07:52 +02:00
jump_label.c locking/static_key: Don't take sleeping locks in __static_key_slow_dec_deferred() 2019-04-29 08:29:21 +02:00
kallsyms.c bpf: Add module name [bpf] to ksymbols for bpf programs 2019-01-21 17:38:56 -03:00
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks Remove Mysterious Macro Intended to Obscure Weird Behaviours (mmiowb()) 2019-05-06 16:57:52 -07:00
Kconfig.preempt
kcov.c kcov: convert kcov.refcount to refcount_t 2019-03-07 18:32:02 -08:00
kexec_core.c power/suspend: Add function to disable secondaries for suspend 2019-05-03 19:42:41 +02:00
kexec_file.c crypto: shash - remove shash_desc::flags 2019-04-25 15:38:12 +08:00
kexec_internal.h
kexec.c
kheaders.c Provide in-kernel headers to make extending kernel easier 2019-04-29 16:48:03 +02:00
kmod.c
kprobes.c kprobes: Fix error check when reusing optimized probes 2019-04-16 09:38:16 +02:00
ksysfs.c
kthread.c Merge branch 'akpm' (patches from Andrew) 2019-03-06 10:31:36 -08:00
latencytop.c latency_top: Simplify stack trace handling 2019-04-29 12:37:48 +02:00
Makefile Driver core/kobject patches for 5.2-rc1 2019-05-07 13:01:40 -07:00
memremap.c mm/hmm: fix memremap.h, move dev_page_fault_t callback to hmm 2018-12-28 12:11:52 -08:00
module_signing.c
module-internal.h
module.c Merge branch 'next-integrity' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2019-05-09 12:54:40 -07:00
notifier.c
nsproxy.c
padata.c padata: Replace padata_attr_type default_attrs field with groups 2019-04-25 22:06:11 +02:00
panic.c s390: simplify disabled_wait 2019-05-02 13:54:11 +02:00
params.c
pid_namespace.c
pid.c Fix failure path in alloc_pid() 2018-12-28 12:42:30 -08:00
profile.c
ptrace.c ptrace: take into account saved_sigmask in PTRACE{GET,SET}SIGMASK 2019-03-29 10:01:37 -07:00
range.c
reboot.c
relay.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-03-12 13:27:20 -07:00
resource.c mm/resource: Use resource_overlaps() to simplify region_intersects() 2019-04-19 12:59:36 +02:00
rseq.c rseq: Remove superfluous rseq_len from task_struct 2019-04-19 12:39:32 +02:00
seccomp.c audit/stable-5.2 PR 20190507 2019-05-07 19:06:04 -07:00
signal.c Merge branch 'for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2019-05-09 13:52:12 -07:00
smp.c cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM 2019-01-30 19:27:00 +01:00
smpboot.c
smpboot.h
softirq.c softirq: Remove tasklet_hrtimer 2019-03-22 14:36:02 +01:00
stackleak.c
stacktrace.c stacktrace: Provide common infrastructure 2019-04-29 12:37:57 +02:00
stop_machine.c treewide: Switch printk users from %pf and %pF to %ps and %pS, respectively 2019-04-09 14:19:06 +02:00
sys_ni.c signal: support CLONE_PIDFD with pidfd_send_signal 2019-05-07 14:31:03 +02:00
sys.c Merge branch 'akpm' (patches from Andrew) 2019-03-07 19:25:37 -08:00
sysctl_binary.c kernel/sysctl: add panic_print into sysctl 2019-01-04 13:13:47 -08:00
sysctl.c ipv6: Add rate limit mask for ICMPv6 messages 2019-04-18 16:58:37 -07:00
task_work.c
taskstats.c genetlink: optionally validate strictly/dumps 2019-04-27 17:07:22 -04:00
test_kprobes.c
torture.c torture: Don't try to offline the last CPU 2019-03-26 14:42:53 -07:00
tracepoint.c
tsacct.c
ucount.c
uid16.c
uid16.h
umh.c umh: add exit routine for UMH process 2019-01-11 18:05:40 -08:00
up.c
user_namespace.c
user-return-notifier.c
user.c
utsname_sysctl.c
utsname.c
watchdog_hld.c kernel/watchdog_hld.c: hard lockup message should end with a newline 2019-04-19 09:46:05 -07:00
watchdog.c watchdog: Fix typo in comment 2019-04-18 14:05:51 +02:00
workqueue_internal.h sched/core, workqueues: Distangle worker accounting from rq lock 2019-04-16 16:55:15 +02:00
workqueue.c Merge branch 'for-5.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq 2019-05-09 13:48:52 -07:00