linux/kernel
Gilad Ben-Yossef 3fc498f165 smp: introduce a generic on_each_cpu_mask() function
We have lots of infrastructure in place to partition multi-core systems
such that we have a group of CPUs that are dedicated to specific task:
cgroups, scheduler and interrupt affinity, and cpuisol= boot parameter.
Still, kernel code will at times interrupt all CPUs in the system via IPIs
for various needs.  These IPIs are useful and cannot be avoided
altogether, but in certain cases it is possible to interrupt only specific
CPUs that have useful work to do and not the entire system.

This patch set, inspired by discussions with Peter Zijlstra and Frederic
Weisbecker when testing the nohz task patch set, is a first stab at trying
to explore doing this by locating the places where such global IPI calls
are being made and turning the global IPI into an IPI for a specific group
of CPUs.  The purpose of the patch set is to get feedback if this is the
right way to go for dealing with this issue and indeed, if the issue is
even worth dealing with at all.  Based on the feedback from this patch set
I plan to offer further patches that address similar issue in other code
paths.

This patch creates an on_each_cpu_mask() and on_each_cpu_cond()
infrastructure API (the former derived from existing arch specific
versions in Tile and Arm) and uses them to turn several global IPI
invocation to per CPU group invocations.

Core kernel:

on_each_cpu_mask() calls a function on processors specified by cpumask,
which may or may not include the local processor.

You must not call this function with disabled interrupts or from a
hardware interrupt handler or from a bottom half handler.

arch/arm:

Note that the generic version is a little different then the Arm one:

1. It has the mask as first parameter
2. It calls the function on the calling CPU with interrupts disabled,
   but this should be OK since the function is called on the other CPUs
   with interrupts disabled anyway.

arch/tile:

The API is the same as the tile private one, but the generic version
also calls the function on the with interrupts disabled in UP case

This is OK since the function is called on the other CPUs
with interrupts disabled.

Signed-off-by: Gilad Ben-Yossef <gilad@benyossef.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Matt Mackall <mpm@selenic.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Sasha Levin <levinsasha928@gmail.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Avi Kivity <avi@redhat.com>
Acked-by: Michal Nazarewicz <mina86@mina86.org>
Cc: Kosaki Motohiro <kosaki.motohiro@gmail.com>
Cc: Milton Miller <miltonm@bga.com>
Cc: Russell King <linux@arm.linux.org.uk>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-28 17:14:35 -07:00
..
debug Fixes: 2012-03-23 09:29:44 -07:00
events Merge branch 'for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2012-03-20 18:11:21 -07:00
gcov gcov: disable CONSTRUCTORS for UML 2011-07-26 16:49:45 -07:00
irq Generialize powerpc's irq_host as irq_domain 2012-03-21 10:27:19 -07:00
power Power management updates for 3.4 2012-03-21 10:15:51 -07:00
sched Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security 2012-03-21 13:25:04 -07:00
time Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2012-03-22 20:20:18 -07:00
trace Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-03-21 13:36:41 -07:00
.gitignore
acct.c Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2012-01-08 12:19:57 -08:00
async.c kernel/async: remove redundant declaration. 2012-01-13 09:32:18 +10:30
audit_tree.c audit_tree,rcu: Convert call_rcu(__put_tree) to kfree_rcu() 2011-07-20 14:10:11 -07:00
audit_watch.c
audit.c constify path argument of audit_log_d_path() 2012-03-20 21:29:40 -04:00
audit.h audit: remove AUDIT_SETUP_CONTEXT as it isn't used 2012-01-17 16:16:57 -05:00
auditfilter.c audit: allow interfield comparison in audit rules 2012-01-17 16:17:01 -05:00
auditsc.c kernel-doc: fix new warnings in auditsc.c 2012-01-23 08:44:53 -08:00
backtracetest.c
bounds.c
capability.c Revert "capabitlies: ns_capable can use the cap helpers rather than lsm call" 2012-01-17 10:19:41 -08:00
cgroup_freezer.c cgroup: remove cgroup_subsys argument from callbacks 2012-02-02 09:20:22 -08:00
cgroup.c Merge branch 'akpm' (Andrew's patch-bomb) 2012-03-22 09:04:48 -07:00
compat.c kernel: Fix files explicitly needing EXPORT_SYMBOL infrastructure 2011-10-31 19:30:05 -04:00
configs.c kernel/configs.c: include MODULE_*() when CONFIG_IKCONFIG_PROC=n 2011-07-25 20:57:15 -07:00
cpu_pm.c cpu_pm: call notifiers during suspend 2011-09-23 12:05:29 +05:30
cpu.c Merge branch 'pm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm 2012-01-08 13:10:57 -08:00
cpuset.c cpuset: mm: reduce large amounts of memory barrier related damage v3 2012-03-21 17:54:59 -07:00
crash_dump.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
cred.c security: trim security.h 2012-02-14 10:45:42 +11:00
delayacct.c KVM: Steal time implementation 2011-07-14 12:59:14 +03:00
dma.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
elfcore.c
exec_domain.c
exit.c kernel/exit.c: if init dies, log a signal which killed it, if any 2012-03-23 16:58:32 -07:00
extable.c extable, core_kernel_data(): Make sure all archs define _sdata 2011-05-20 08:56:56 +02:00
fork.c prctl: add PR_{SET,GET}_CHILD_SUBREAPER to allow simple process supervision 2012-03-23 16:58:32 -07:00
freezer.c PM / Freezer: Remove references to TIF_FREEZE in comments 2012-03-04 23:08:54 +01:00
futex_compat.c
futex.c Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-19 17:11:15 -07:00
groups.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
hrtimer.c Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2011-11-28 08:43:52 -08:00
hung_task.c hung_task: fix the broken rcu_lock_break() logic 2012-03-05 15:49:42 -08:00
irq_work.c kernel: fix two implicit header assumptions in irq_work.c 2011-10-31 09:20:12 -04:00
itimer.c [S390] cputime: add sparse checking and cleanup 2011-12-15 14:56:19 +01:00
jump_label.c static keys: Inline the static_key_enabled() function 2012-02-28 20:01:08 +01:00
kallsyms.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt sched: Isolate preempt counting in its own config option 2011-06-10 15:15:40 +02:00
kexec.c PM / Sleep: Introduce "late suspend" and "early resume" of devices 2012-01-29 20:38:29 +01:00
kfifo.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
kmod.c kmod: make __request_module() killable 2012-03-23 16:58:42 -07:00
kprobes.c kprobes: return proper error code from register_kprobe() 2012-03-05 15:49:42 -08:00
ksysfs.c kernel: ksysfs.c is implicitly using stat.h 2011-10-31 09:20:13 -04:00
kthread.c freezer: kill unused set_freezable_with_signal() 2011-11-23 09:28:17 -08:00
latencytop.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
lockdep_internals.h
lockdep_proc.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
lockdep_states.h
lockdep.c lockdep: Add CPU-idle/offline warning to lockdep-RCU splat 2012-02-21 09:06:06 -08:00
Makefile sysctl: Move the implementation into fs/proc/proc_sysctl.c 2012-01-24 16:37:54 -08:00
module.c error: implicit declaration of function 'module_flags_taint' 2012-01-15 16:21:07 -08:00
mutex-debug.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
mutex-debug.h
mutex.c sched/rt: Use schedule_preempt_disabled() 2012-03-01 10:28:03 +01:00
mutex.h
notifier.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
nsproxy.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
padata.c padata: Fix race on sequence number wrap 2012-03-14 17:25:56 +08:00
panic.c panic: don't print redundant backtraces on oops 2012-01-12 20:13:11 -08:00
params.c includecheck: delete any duplicate instances of module.h 2012-02-28 19:31:56 -05:00
pid_namespace.c signal: zap_pid_ns_processes: s/SEND_SIG_NOINFO/SEND_SIG_FORCED/ 2012-03-23 16:58:41 -07:00
pid.c vfs: fix panic in __d_lookup() with high dentry hashtable counts 2012-02-13 20:45:38 -05:00
posix-cpu-timers.c [S390] cputime: add sparse checking and cleanup 2011-12-15 14:56:19 +01:00
posix-timers.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
printk.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-20 10:31:44 -07:00
profile.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
ptrace.c ptrace: remove PTRACE_SEIZE_DEVEL bit 2012-03-23 16:58:41 -07:00
range.c range: fix bogus misuse of module.h to get printk() 2011-10-31 09:20:11 -04:00
rcu.h rcu: Allow nesting of rcu_idle_enter() and rcu_idle_exit() 2012-02-21 09:06:12 -08:00
rcupdate.c rcu: Check for illegal use of RCU from offlined CPUs 2012-02-21 09:06:03 -08:00
rcutiny_plugin.h rcu: Simplify unboosting checks 2012-02-21 09:03:43 -08:00
rcutiny.c rcu: Add RCU_NONIDLE() for idle-loop RCU read-side critical sections 2012-02-21 09:06:13 -08:00
rcutorture.c PTR_ERR should be called before its argument is cleared. 2012-02-21 09:06:10 -08:00
rcutree_plugin.h rcu: Hold off RCU_FAST_NO_HZ after timer posted 2012-02-21 09:42:30 -08:00
rcutree_trace.c rcu: Rework detection of use of RCU by offline CPUs 2012-02-21 09:06:07 -08:00
rcutree.c rcu: Stop spurious warnings from synchronize_sched_expedited 2012-02-21 15:33:34 -08:00
rcutree.h rcu: Rework detection of use of RCU by offline CPUs 2012-02-21 09:06:07 -08:00
relay.c relay: prevent integer overflow in relay_open() 2012-02-10 09:04:49 +01:00
res_counter.c net: introduce res_counter_charge_nofail() for socket allocations 2012-01-22 15:08:46 -05:00
resource.c kernel/resource.c: move EXPORT_SYMBOL right after definition 2012-02-03 23:37:07 +01:00
rtmutex_common.h
rtmutex-debug.c lockdep, rtmutex, bug: Show taint flags on error 2011-12-06 08:16:49 +01:00
rtmutex-debug.h
rtmutex-tester.c rtmutex-tester: convert sysdev_class to a regular subsystem 2011-12-14 14:54:22 -08:00
rtmutex.c Revert "rcu: Permit rt_mutex_unlock() with irqs disabled" 2011-12-11 10:33:18 -08:00
rtmutex.h
rwsem.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
seccomp.c seccomp: audit abnormal end to a process due to seccomp 2012-01-17 16:16:55 -05:00
semaphore.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
signal.c signal: cosmetic, s/from_ancestor_ns/force/ in prepare_signal() paths 2012-03-23 16:58:41 -07:00
smp.c smp: introduce a generic on_each_cpu_mask() function 2012-03-28 17:14:35 -07:00
softirq.c Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-03-20 10:32:09 -07:00
spinlock.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
srcu.c rcu: Call out dangers of expedited RCU primitives 2012-02-21 09:06:08 -08:00
stacktrace.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
stop_machine.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
sys_ni.c Cross Memory Attach 2011-10-31 17:30:44 -07:00
sys.c prctl: add PR_{SET,GET}_CHILD_SUBREAPER to allow simple process supervision 2012-03-23 16:58:32 -07:00
sysctl_binary.c binary_sysctl(): fix memory leak 2011-12-20 10:25:04 -08:00
sysctl.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl 2012-03-23 18:08:58 -07:00
taskstats.c Make TASKSTATS require root access 2011-09-19 17:04:37 -07:00
test_kprobes.c
time.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
timeconst.pl
timer.c Merge branch 'core-debugobjects-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-01-06 07:53:34 -08:00
tracepoint.c static keys: Introduce 'struct static_key', static_key_true()/false() and static_key_slow_[inc|dec]() 2012-02-24 10:05:59 +01:00
tsacct.c [S390] cputime: add sparse checking and cleanup 2011-12-15 14:56:19 +01:00
uid16.c
up.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
user_namespace.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
user-return-notifier.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
user.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
utsname_sysctl.c Merge branch 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux 2011-11-06 19:44:47 -08:00
utsname.c kernel: Map most files to use export.h instead of module.h 2011-10-31 09:20:12 -04:00
wait.c lockdep/waitqueues: Add better annotation 2011-12-21 10:07:39 +01:00
watchdog.c kernel/watchdog.c: add comment to watchdog() exit path 2012-03-23 16:58:32 -07:00
workqueue_sched.h
workqueue.c Merge branch 'for-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq 2012-03-20 18:13:22 -07:00