linux/kernel/sched
Mel Gorman 5d4cf996cf sched: Assign correct scheduling domain to 'sd_llc'
Commit 42eb088e (sched: Avoid NULL dereference on sd_busy) corrected a NULL
dereference on sd_busy but the fix also altered what scheduling domain it
used for the 'sd_llc' percpu variable.

One impact of this is that a task selecting a runqueue may consider
idle CPUs that are not cache siblings as candidates for running.
Tasks are then running on CPUs that are not cache hot.

This was found through bisection where ebizzy threads were not seeing equal
performance and it looked like a scheduling fairness issue. This patch
mitigates but does not completely fix the problem on all machines tested
implying there may be an additional bug or a common root cause. Here are
the average range of performance seen by individual ebizzy threads. It
was tested on top of candidate patches related to x86 TLB range flushing.

	4-core machine
			    3.13.0-rc3            3.13.0-rc3
			       vanilla            fixsd-v3r3
	Mean   1        0.00 (  0.00%)        0.00 (  0.00%)
	Mean   2        0.34 (  0.00%)        0.10 ( 70.59%)
	Mean   3        1.29 (  0.00%)        0.93 ( 27.91%)
	Mean   4        7.08 (  0.00%)        0.77 ( 89.12%)
	Mean   5      193.54 (  0.00%)        2.14 ( 98.89%)
	Mean   6      151.12 (  0.00%)        2.06 ( 98.64%)
	Mean   7      115.38 (  0.00%)        2.04 ( 98.23%)
	Mean   8      108.65 (  0.00%)        1.92 ( 98.23%)

	8-core machine
	Mean   1         0.00 (  0.00%)        0.00 (  0.00%)
	Mean   2         0.40 (  0.00%)        0.21 ( 47.50%)
	Mean   3        23.73 (  0.00%)        0.89 ( 96.25%)
	Mean   4        12.79 (  0.00%)        1.04 ( 91.87%)
	Mean   5        13.08 (  0.00%)        2.42 ( 81.50%)
	Mean   6        23.21 (  0.00%)       69.46 (-199.27%)
	Mean   7        15.85 (  0.00%)      101.72 (-541.77%)
	Mean   8       109.37 (  0.00%)       19.13 ( 82.51%)
	Mean   12      124.84 (  0.00%)       28.62 ( 77.07%)
	Mean   16      113.50 (  0.00%)       24.16 ( 78.71%)

It's eliminated for one machine and reduced for another.

Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alex Shi <alex.shi@linaro.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: H Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/20131217092124.GV11295@suse.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-12-17 15:08:43 +01:00
..
auto_group.c sched/autogroup: Fix race with task_groups list 2013-05-28 09:40:22 +02:00
auto_group.h Revert "sched/autogroup: Fix crash on reboot when autogroup is disabled" 2012-12-11 10:23:45 +01:00
clock.c sched_clock: Prevent 64bit inatomicity on 32bit systems 2013-04-08 11:50:44 +02:00
completion.c sched: Move completion code from core.c to completion.c 2013-11-06 07:49:19 +01:00
core.c sched: Assign correct scheduling domain to 'sd_llc' 2013-12-17 15:08:43 +01:00
cpuacct.c cgroup: pass around cgroup_subsys_state instead of cgroup in file methods 2013-08-08 20:11:24 -04:00
cpuacct.h sched/cpuacct: Initialize root cpuacct earlier 2013-04-10 13:54:20 +02:00
cpupri.c sched: Fix some kernel-doc warnings 2013-07-18 09:58:21 +02:00
cpupri.h
cputime.c Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2013-09-05 12:36:46 -07:00
debug.c sched: Avoid throttle_cfs_rq() racing with period_timer stopping 2013-10-29 12:02:32 +01:00
fair.c sched/fair: Rework sched_fair time accounting 2013-12-11 15:52:35 +01:00
features.h sched/numa: Resist moving tasks towards nodes with fewer hinting faults 2013-10-09 12:40:27 +02:00
idle_task.c sched/numa: Introduce migrate_swap() 2013-10-09 12:40:46 +02:00
Makefile sched: Move completion code from core.c to completion.c 2013-11-06 07:49:19 +01:00
proc.c sched: Change get_rq_runnable_load() to static and inline 2013-06-27 10:07:44 +02:00
rt.c sched/rt: Fix task_tick_rt() comment 2013-10-26 12:25:21 +02:00
sched.h sched: Remove unnecessary iteration over sched domains to update nr_busy_cpus 2013-11-06 12:37:55 +01:00
stats.c fix a leak in /proc/schedstats 2013-04-29 15:41:45 -04:00
stats.h sched: Micro-optimize by dropping unnecessary task_rq() calls 2013-09-25 13:51:06 +02:00
stop_task.c sched/numa: Introduce migrate_swap() 2013-10-09 12:40:46 +02:00
wait.c sched: Move wait code from core.c to wait.c 2013-11-06 07:49:18 +01:00