linux

iv/linux

Huaixin Chang 26a8b12747 sched/fair: Fix race between runtime distribution and assignment	2020-04-08 11:35:19 +02:00

Currently, there is a potential race between distribute_cfs_runtime() and assign_cfs_rq_runtime(). Race happens when cfs_b->runtime is read, distributes without holding lock and finds out there is not enough runtime to charge against after distribution. Because assign_cfs_rq_runtime() might be called during distribution, and use cfs_b->runtime at the same time.

Fibtest is the tool to test this race. Assume all gcfs_rq is throttled and cfs period timer runs, slow threads might run and sleep, returning unused cfs_rq runtime and keeping min_cfs_rq_runtime in their local pool. If all this happens sufficiently quickly, cfs_b->runtime will drop a lot. If runtime distributed is large too, over-use of runtime happens.

A runtime over-using by about 70 percent of quota is seen when we test fibtest on a 96-core machine. We run fibtest with 1 fast thread and 95 slow threads in test group, configure 10ms quota for this group and see the CPU usage of fibtest is 17.0%, which is far more than the expected 10%.

On a smaller machine with 32 cores, we also run fibtest with 96 threads. CPU usage is more than 12%, which is also more than expected 10%. This shows that on similar workloads, this race do affect CPU bandwidth control.

Solve this by holding lock inside distribute_cfs_runtime().

Fixes: `c06f04c704` ("sched: Fix potential near-infinite distribute_cfs_runtime() loop")
Reviewed-by: Ben Segall <bsegall@google.com>
Signed-off-by: Huaixin Chang <changhuaixin@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/lkml/20200325092602.22471-1-changhuaixin@linux.alibaba.com/
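
The fix described in the commit message (holding the lock while distribution reads and charges the shared pool) can be illustrated with a small user-space sketch. This is not the kernel code: `struct bw_pool`, `struct local_rq`, `pool_assign()`, `pool_distribute()` and `run_demo()` are hypothetical names invented for this example, and a pthread mutex stands in for the kernel lock. It only shows the general pattern, assuming one thread distributes runtime to throttled queues while another thread assigns runtime from the same pool.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared bandwidth pool (cfs_b in the commit). */
struct bw_pool {
    pthread_mutex_t lock;
    int64_t runtime;            /* runtime left in the shared pool */
};

/* Hypothetical stand-in for one throttled per-CPU queue. */
struct local_rq {
    int64_t runtime_remaining;  /* negative: the queue owes runtime */
};

/* Assignment path: take up to `want` from the pool, under the lock. */
static int64_t pool_assign(struct bw_pool *b, int64_t want)
{
    pthread_mutex_lock(&b->lock);
    int64_t got = want < b->runtime ? want : b->runtime;
    b->runtime -= got;
    pthread_mutex_unlock(&b->lock);
    return got;
}

/*
 * Distribution path: top each queue up to +1. The pool is read and charged
 * inside the same critical section, so a concurrent pool_assign() cannot
 * consume runtime that was already counted here.
 */
static void pool_distribute(struct bw_pool *b, struct local_rq *rqs, int n)
{
    for (int i = 0; i < n; i++) {
        int64_t need = 1 - rqs[i].runtime_remaining;
        if (need <= 0)
            continue;
        pthread_mutex_lock(&b->lock);
        if (need > b->runtime)
            need = b->runtime;
        b->runtime -= need;
        pthread_mutex_unlock(&b->lock);
        rqs[i].runtime_remaining += need;
        if (rqs[i].runtime_remaining <= 0)
            break;              /* pool ran dry, nothing left to hand out */
    }
}

struct demo_args {
    struct bw_pool *pool;
    int64_t taken;
};

static void *assign_worker(void *p)
{
    struct demo_args *a = p;
    for (int i = 0; i < 1000; i++)
        a->taken += pool_assign(a->pool, 5);
    return NULL;
}

/* Runs both paths concurrently and returns the total runtime handed out. */
static int64_t run_demo(int64_t quota)
{
    enum { NRQ = 8, DEBT = 100 };
    struct bw_pool pool = { .runtime = quota };
    struct local_rq rqs[NRQ];
    struct demo_args args = { &pool, 0 };
    pthread_t t;

    pthread_mutex_init(&pool.lock, NULL);
    for (int i = 0; i < NRQ; i++)
        rqs[i].runtime_remaining = -DEBT;

    pthread_create(&t, NULL, assign_worker, &args);
    pool_distribute(&pool, rqs, NRQ);
    pthread_join(t, NULL);

    int64_t handed_out = args.taken;
    for (int i = 0; i < NRQ; i++)
        handed_out += rqs[i].runtime_remaining + DEBT;

    /* Conservation: nothing is handed out beyond the configured quota. */
    assert(handed_out + pool.runtime == quota);
    pthread_mutex_destroy(&pool.lock);
    return handed_out;
}
```

Calling `run_demo(500)` from a test harness exercises both paths at once. Because every read-modify-write of `runtime` happens under the lock, the two paths together cannot hand out more than the quota, which is the property the commit message reports as violated before the fix.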
autogroup.c	sched/autogroup: Make autogroup_path() always available	2019-06-24 19:23:40 +02:00
autogroup.h	sched/headers: Simplify and clean up header usage in the scheduler	2018-03-04 12:39:29 +01:00
clock.c	sched/clock: Use static_branch_likely() with sched_clock_running	2019-11-29 08:10:54 +01:00
completion.c	completion: Use lockdep_assert_RT_in_threaded_ctx() in complete_all()	2020-03-23 18:40:25 +01:00
core.c	sched/fair: Align rq->avg_idle and rq->avg_scan_cost	2020-04-08 11:35:18 +02:00
cpuacct.c	sched/headers: Simplify and clean up header usage in the scheduler	2018-03-04 12:39:29 +01:00
cpudeadline.c	Linux 5.2-rc5	2019-06-17 12:12:27 +02:00
cpudeadline.h	sched/headers: Simplify and clean up header usage in the scheduler	2018-03-04 12:39:29 +01:00
cpufreq_schedutil.c	sched/uclamp: Rename uclamp_util_with() into uclamp_rq_util_with()	2019-12-25 10:42:08 +01:00
cpufreq.c	cpufreq: Avoid leaving stale IRQ work items during CPU offline	2019-12-12 17:59:43 +01:00
cpupri.c	sched/rt: cpupri_find: Trigger a full search as fallback	2020-03-20 13:06:20 +01:00
cpupri.h	sched/rt: Optimize cpupri_find() on non-heterogenous systems	2020-03-06 12:57:27 +01:00
cputime.c	sched/vtime: Prevent unstable evaluation of WARN(vtime->state)	2020-03-06 12:57:16 +01:00
deadline.c	sched/deadline: Make two functions static	2020-03-06 12:57:24 +01:00
debug.c	sched/pelt: Add a new runnable average signal	2020-02-24 11:36:36 +01:00
fair.c	sched/fair: Fix race between runtime distribution and assignment	2020-04-08 11:35:19 +02:00
features.h	sched/fair/util_est: Implement faster ramp-up EWMA on utilization increases	2019-10-29 10:01:07 +01:00
idle.c	idle: fix spelling mistake "iterrupts" -> "interrupts"	2020-01-17 10:19:22 +01:00
isolation.c	genirq, sched/isolation: Isolate from handling managed interrupts	2020-01-22 16:29:49 +01:00
loadavg.c	timers/nohz: Update NOHZ load in remote tick	2020-01-28 21:36:44 +01:00
Makefile	psi: pressure stall information for CPU, memory, and IO	2018-10-26 16:26:32 -07:00
membarrier.c	membarrier: Fix RCU locking bug caused by faulty merge	2019-10-01 21:27:50 +02:00
pelt.c	sched/pelt: Add support to track thermal pressure	2020-03-06 12:57:17 +01:00
pelt.h	sched/pelt: Add support to track thermal pressure	2020-03-06 12:57:17 +01:00
psi.c	psi: Move PF_MEMSTALL out of task->flags	2020-03-20 13:06:19 +01:00
rt.c	sched/rt: Remove unnecessary push for unfit tasks	2020-03-06 12:57:29 +01:00
sched-pelt.h	sched/fair: Fix "runnable_avg_yN_inv" not used warnings	2019-06-17 12:15:58 +02:00
sched.h	sched/fair: Align rq->avg_idle and rq->avg_scan_cost	2020-04-08 11:35:18 +02:00
stats.c	proc: introduce proc_create_seq{,_data}	2018-05-16 07:23:35 +02:00
stats.h	psi: Move PF_MEMSTALL out of task->flags	2020-03-20 13:06:19 +01:00
stop_task.c	sched/core: Further clarify sched_class::set_next_task()	2019-11-11 08:35:21 +01:00
swait.c	sched/swait: Prepare usage in completions	2020-03-21 16:00:23 +01:00
topology.c	sched/topology: Don't enable EAS on SMT systems	2020-03-06 12:57:23 +01:00
wait_bit.c	sched/wait: fix ___wait_var_event(exclusive)	2019-12-17 13:32:50 +01:00
wait.c	Add wake_up_interruptible_sync_poll_locked()	2019-10-31 15:12:23 +00:00