linux

iv/linux

Author	SHA1	Message	Date
Mike Galbraith	b5d9d734a5	sched: Ensure that a child can't gain time over it's parent after fork() A fork/exec load is usually "pass the baton", so the child should never be placed behind the parent. With START_DEBIT we make room for the new task, but with child_runs_first, that room comes out of the _parent's_ hide. There's nothing to say that the parent wasn't ahead of min_vruntime at fork() time, which means that the "baton carrier", who is essentially the parent in drag, can gain time and increase scheduling latencies for waiters. With NEW_FAIR_SLEEPERS + START_DEBIT + child_runs_first enabled, we essentially pass the sleeper fairness off to the child, which is fine, but if we don't base placement on the parent's updated vruntime, we can end up compounding latency woes if the child itself then does fork/exec. The debit incurred at fork doesn't hurt the parent who is then going to sleep and maybe exit, but the child who acquires the error harms all comers. This improves latencies of make -j<n> kernel build workloads. Reported-by: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: Mike Galbraith <efault@gmx.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-08 13:15:34 +02:00
Peter Zijlstra	71a29aa7b6	sched: Deal with low-load in wake_affine() wake_affine() would always fail under low-load situations where both prev and this were idle, because adding a single task will always be a significant imbalance, even if there's nothing around that could balance it. Deal with this by allowing imbalance when there's nothing you can do about it. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-07 20:39:06 +02:00
Peter Zijlstra	cdd2ab3de4	sched: Remove short cut from select_task_rq_fair() select_task_rq_fair() incorrectly skips the wake_affine() logic, remove this. When prev_cpu == this_cpu, the code jumps straight to the wake_idle() logic, this doesn't give the wake_affine() logic the chance to pin the task to this cpu. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-07 20:39:05 +02:00
Ingo Molnar	d7ea17a769	sched: Fix dynamic power-balancing crash This crash: [ 1774.088275] divide error: 0000 [#1] SMP [ 1774.100355] CPU 13 [ 1774.102498] Modules linked in: [ 1774.105631] Pid: 30881, comm: hackbench Not tainted 2.6.31-rc8-tip-01308-g484d664-dirty #1629 X8DTN [ 1774.114807] RIP: 0010:[<ffffffff81041c38>] [<ffffffff81041c38>] sched_balance_self+0x19b/0x2d4 Triggers because update_group_power() modifies the sd tree and does temporary calculations there - not considering that other CPUs could observe intermediate values, such as the zero initial value. Calculate it in a temporary variable instead. (we need no memory barrier as these are all statistical values anyway) Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <20090904092742.GA11014@elte.hu> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-04 11:52:52 +02:00
Peter Zijlstra	18a3885fc1	sched: Remove reciprocal for cpu_power Its a source of fail, also, now that cpu_power is dynamical, its a waste of time. before: <idle>-0 [000] 132.877936: find_busiest_group: avg_load: 0 group_load: 8241 power: 1 after: bash-1689 [001] 137.862151: find_busiest_group: avg_load: 10636288 group_load: 10387 power: 1 [ v2: build fix from From: Andreas Herrmann ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083826.425896304@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-04 10:09:56 +02:00
Gautham R Shenoy	d899a789c2	sched: Try to deal with low capacity, fix update_sd_power_savings_stats() sgs.group_capacity can now be 0, if for some reason group->__cpu_power happens to be less than SCHED_LOAD_SCALE/2. In that case, we need the following fix to make it work for update_sd_power_savings_stats(). That's because both sum_nr_running and group_capacity are unsigned longs. Cc: Gautham R Shenoy <ego@in.ibm.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-04 10:09:56 +02:00
Peter Zijlstra	bdb94aa5db	sched: Try to deal with low capacity When the capacity drops low, we want to migrate load away. Allow the load-balancer to remove all tasks when we hit rock bottom. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083826.342231003@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-04 10:09:55 +02:00
Peter Zijlstra	e9e9250bc7	sched: Scale down cpu_power due to RT tasks Keep an average on the amount of time spend on RT tasks and use that fraction to scale down the cpu_power for regular tasks. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083826.287778431@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-04 10:09:55 +02:00
Peter Zijlstra	ab29230e67	sched: Implement dynamic cpu_power Recompute the cpu_power for each cpu during load-balance. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083826.162033479@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-04 10:09:54 +02:00
Peter Zijlstra	a52bfd7358	sched: Add smt_gain The idea is that multi-threading a core yields more work capacity than a single thread, provide a way to express a static gain for threads. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083826.073345955@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-04 10:09:54 +02:00
Peter Zijlstra	cc9fba7d76	sched: Update the cpu_power sum during load-balance In order to prepare for a more dynamic cpu_power, update the group sum while walking the sched domains during load-balance. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083825.985050292@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-04 10:09:53 +02:00
Peter Zijlstra	b5d978e0c7	sched: Add SD_PREFER_SIBLING Do the placement thing using SD flags. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083825.897028974@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-04 10:09:53 +02:00
Peter Zijlstra	f93e65c186	sched: Restore __cpu_power to a straight sum of power cpu_power is supposed to be a representation of the process capacity of the cpu, not a value to randomly tweak in order to affect placement. Remove the placement hacks. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Andreas Herrmann <andreas.herrmann3@amd.com> Acked-by: Gautham R Shenoy <ego@in.ibm.com> Cc: Balbir Singh <balbir@in.ibm.com> LKML-Reference: <20090901083825.810860576@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-04 10:09:53 +02:00
Ingo Molnar	9aa55fbd01	Merge branches 'sched/domains' and 'sched/clock' into sched/core Merge reason: both topics are ready now, and we want to merge dependent changes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-04 10:08:47 +02:00
Peter Zijlstra	768d0c2722	sched: Add wait, sleep and iowait accounting tracepoints Add 3 schedstat tracepoints to help account for wait-time, sleep-time and iowait-time. They can also be used as a perf-counter source to profile tasks on these clocks. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Arjan van de Ven <arjan@linux.intel.com> LKML-Reference: <new-submission> [ build fix for the !CONFIG_SCHEDSTATS case ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-02 09:12:18 +02:00
Arjan van de Ven	8f0dfc34e9	sched: Provide iowait counters For counting how long an application has been waiting for (disk) IO, there currently is only the HZ sample driven information available, while for all other counters in this class, a high resolution version is available via CONFIG_SCHEDSTATS. In order to make an improved bootchart tool possible, we also need a higher resolution version of the iowait time. This patch below adds this scheduler statistic to the kernel. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4A64B813.1080506@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-02 08:44:08 +02:00
Ingo Molnar	f14eff1cc2	Merge commit 'v2.6.31-rc8' into sched/core Merge reason: bump from rc5 to rc8, but also pick up TP_perf_assign() API, a patch will be queued that depends on it. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-09-02 08:20:35 +02:00
Anirban Sinha	84e9dabf6e	sched: Rename init_cfs_rq => init_tg_cfs_rq ... so that it does not share a common name with a function within the same scope. Signed-off-by: Anirban Sinha <asinha@zeugmasystems.com> LKML-Reference: <DDFD17CC94A9BD49A82147DDF7D545C501EA98A6@exchange.ZeugmaSystems.local> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-29 15:51:16 +02:00
Peter Zijlstra	34d76c4155	sched: Fix division by zero - really When re-computing the shares for each task group's cpu representation we need the ratio of weight on each cpu vs the total weight of the sched domain. Since load-balancing is loosely (read not) synchronized, the weight of individual cpus can change between doing the sum and calculating the ratio. The previous patch dealt with only one of the race scenarios, this patch side steps them all by saving a snapshot of all the individual cpu weights, thereby always working on a consistent set. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: torvalds@linux-foundation.org Cc: jes@sgi.com Cc: jens.axboe@oracle.com Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Arjan van de Ven <arjan@infradead.org> Cc: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <1251371336.18584.77.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-28 08:26:49 +02:00
James Bottomley	1b364bf438	module: workaround duplicate section names The root cause is a duplicate section name (.text); is this legal? [ Amerigo Wang: "AFAIK, yes." ] However, there's a problem with commit 6d76013381ed28979cd122eb4b249a88b5e384fa in that if you fail to allocate a mod->sect_attrs (in this case it's null because of the duplication), it still gets used without checking in add_notes_attrs() This should fix it [ This patch leaves other problems, particularly the sections directory, but recent parisc toolchains seem to produce these modules and this prevents a crash and is a minimal change -- RR ] Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Tested-by: Helge Deller <deller@gmx.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-27 12:33:19 -07:00
Rusty Russell	7d1d16e416	module: fix BUG_ON() for powerpc (and other function descriptor archs) The rarely-used symbol_put_addr() needs to use dereference_function_descriptor on powerpc. Reported-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-27 12:33:19 -07:00
Oleg Nesterov	4ab6c08336	clone(): fix race between copy_process() and de_thread() Spotted by Hiroshi Shimamoto who also provided the test-case below. copy_process() uses signal->count as a reference counter, but it is not. This test case #include <sys/types.h> #include <sys/wait.h> #include <unistd.h> #include <stdio.h> #include <errno.h> #include <pthread.h> void null_thread(void p) { for (;;) sleep(1); return NULL; } void exec_thread(void p) { execl("/bin/true", "/bin/true", NULL); return null_thread(p); } int main(int argc, char **argv) { for (;;) { pid_t pid; int ret, status; pid = fork(); if (pid < 0) break; if (!pid) { pthread_t tid; pthread_create(&tid, NULL, exec_thread, NULL); for (;;) pthread_create(&tid, NULL, null_thread, NULL); } do { ret = waitpid(pid, &status, 0); } while (ret == -1 && errno == EINTR); } return 0; } quickly creates an unkillable task. If copy_process(CLONE_THREAD) races with de_thread() copy_signal()->atomic(signal->count) breaks the signal->notify_count logic, and the execing thread can hang forever in kernel space. Change copy_process() to increment count/live only when we know for sure we can't fail. In this case the forked thread will take care of its reference to signal correctly. If copy_process() fails, check CLONE_THREAD flag. If it it set - do nothing, the counters were not changed and current belongs to the same thread group. If it is not set, ->signal must be released in any case (and ->count must be == 1), the forked child is the only thread in the thread group. We need more cleanups here, in particular signal->count should not be used by de_thread/__exit_signal at all. This patch only fixes the bug. Reported-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Tested-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Roland McGrath <roland@redhat.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-26 20:06:52 -07:00
Linus Torvalds	9c93768866	Merge branch 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf_counter: Fix typo in read() output generation perf tools: Check perf.data owner	2009-08-25 11:24:37 -07:00
Linus Torvalds	44afa9a4b8	Merge branch 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: clockevent: Prevent dead lock on clockevents_lock timers: Drop write permission on /proc/timer_list	2009-08-25 11:24:04 -07:00
Linus Torvalds	7d63e6359a	Merge branch 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'tracing-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tracing: Fix too large stack usage in do_one_initcall() tracing: handle broken names in ftrace filter ftrace: Unify effect of writing to trace_options and option/*	2009-08-25 11:23:43 -07:00
Peter Zijlstra	4464fcaa9c	perf_counter: Fix typo in read() output generation When you iterate a list, using the iterator is useful. Before: ID: 5 ID: 5 ID: 5 ID: 5 EVNT: 0x40088b scale: nan ID: 5 CNT: 1006252 ID: 6 CNT: 1011090 ID: 7 CNT: 1011196 ID: 8 CNT: 1011095 EVNT: 0x40088c scale: 1.000000 ID: 5 CNT: 2003065 ID: 6 CNT: 2011671 ID: 7 CNT: 2012620 ID: 8 CNT: 2013479 EVNT: 0x40088c scale: 1.000000 ID: 5 CNT: 3002390 ID: 6 CNT: 3015996 ID: 7 CNT: 3018019 ID: 8 CNT: 3020006 EVNT: 0x40088b scale: 1.000000 ID: 5 CNT: 4002406 ID: 6 CNT: 4021120 ID: 7 CNT: 4024241 ID: 8 CNT: 4027059 After: ID: 1 ID: 2 ID: 3 ID: 4 EVNT: 0x400889 scale: nan ID: 1 CNT: 1005270 ID: 2 CNT: 1009833 ID: 3 CNT: 1010065 ID: 4 CNT: 1010088 EVNT: 0x400898 scale: nan ID: 1 CNT: 2001531 ID: 2 CNT: 2022309 ID: 3 CNT: 2022470 ID: 4 CNT: 2022627 EVNT: 0x400888 scale: 0.489467 ID: 1 CNT: 3001261 ID: 2 CNT: 3027088 ID: 3 CNT: 3027941 ID: 4 CNT: 3028762 Reported-by: stephane eranian <eranian@googlemail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Mackerras <paulus@samba.org> Cc: Corey J Ashford <cjashfor@us.ibm.com> Cc: perfmon2-devel <perfmon2-devel@lists.sourceforge.net> LKML-Reference: <1250867976.7538.73.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-21 18:00:35 +02:00
Peter Zijlstra	a8af7246c1	sched: Avoid division by zero Patch a5004278f0525dcb9aa43703ef77bf371ea837cd (sched: Fix cgroup smp fairness) introduced the possibility of a divide-by-zero because load-balancing is not synchronized between sched_domains. This can cause the state of cpus to change between the first and second loop over the sched domain in tg_shares_up(). Reported-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Jes Sorensen <jes@sgi.com> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <1250855934.7538.30.camel@twins> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-21 14:15:10 +02:00
Hiroshi Shimamoto	cde7e5ca4e	sched: Use for_each_class macro in move_one_task() Replace for loop with the macro for_each_class to cleanup. Signed-off-by: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> LKML-Reference: <4A8A277D.4090304@ct.jp.nec.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-20 13:00:30 +02:00
Linus Torvalds	d46c7d9ab8	Merge branch 'perfcounters-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perfcounters-fixes-for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf tools: Make 'make html' work perf annotate: Fix segmentation fault perf_counter: Fix the PARISC build perf_counter: Check task on counter read IPI perf: Rename perf-examples.txt to examples.txt perf record: Fix typo in pid_synthesize_comm_event	2009-08-19 09:43:19 -07:00
Suresh Siddha	f833bab87f	clockevent: Prevent dead lock on clockevents_lock Currently clockevents_notify() is called with interrupts enabled at some places and interrupts disabled at some other places. This results in a deadlock in this scenario. cpu A holds clockevents_lock in clockevents_notify() with irqs enabled cpu B waits for clockevents_lock in clockevents_notify() with irqs disabled cpu C doing set_mtrr() which will try to rendezvous of all the cpus. This will result in C and A come to the rendezvous point and waiting for B. B is stuck forever waiting for the spinlock and thus not reaching the rendezvous point. Fix the clockevents code so that clockevents_lock is taken with interrupts disabled and thus avoid the above deadlock. Also call lapic_timer_propagate_broadcast() on the destination cpu so that we avoid calling smp_call_function() in the clockevents notifier chain. This issue left us wondering if we need to change the MTRR rendezvous logic to use stop machine logic (instead of smp_call_function) or add a check in spinlock debug code to see if there are other spinlocks which gets taken under both interrupts enabled/disabled conditions. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com> Cc: "Pallipadi Venkatesh" <venkatesh.pallipadi@intel.com> Cc: "Brown Len" <len.brown@intel.com> LKML-Reference: <1250544899.2709.210.camel@sbs-t61.sc.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2009-08-19 18:15:10 +02:00
Jiri Olsa	eda1e32855	tracing: handle broken names in ftrace filter If one filter item (for set_ftrace_filter and set_ftrace_notrace) is being setup by more than 1 consecutive writes (FTRACE_ITER_CONT flag), it won't be handled corretly. I used following program to test/verify: [snip] #include <stdio.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <string.h> int main(int argc, char *argv) { int fd, i; char file = argv[1]; if (-1 == (fd = open(file, O_WRONLY))) { perror("open failed"); return -1; } for(i = 0; i < (argc - 2); i++) { int len = strlen(argv[2+i]); int cnt, off = 0; while(len) { cnt = write(fd, argv[2+i] + off, len); len -= cnt; off += cnt; } } close(fd); return 0; } [snip] before change: sh-4.0# echo > ./set_ftrace_filter sh-4.0# /test ./set_ftrace_filter "sys" "_open " sh-4.0# cat ./set_ftrace_filter #### all functions enabled #### sh-4.0# after change: sh-4.0# echo > ./set_ftrace_notrace sh-4.0# test ./set_ftrace_notrace "sys" "_open " sh-4.0# cat ./set_ftrace_notrace sys_open sh-4.0# Signed-off-by: Jiri Olsa <jolsa@redhat.com> LKML-Reference: <20090811152904.GA26065@jolsa.lab.eng.brq.redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>	2009-08-18 20:39:48 -04:00
KOSAKI Motohiro	0753ba01e1	mm: revert "oom: move oom_adj value" The commit 2ff05b2b (oom: move oom_adj value) moveed the oom_adj value to the mm_struct. It was a very good first step for sanitize OOM. However Paul Menage reported the commit makes regression to his job scheduler. Current OOM logic can kill OOM_DISABLED process. Why? His program has the code of similar to the following. ... set_oom_adj(OOM_DISABLE); /* The job scheduler never killed by oom / ... if (vfork() == 0) { set_oom_adj(0); / Invoked child can be killed */ execve("foo-bar-cmd"); } .... vfork() parent and child are shared the same mm_struct. then above set_oom_adj(0) doesn't only change oom_adj for vfork() child, it's also change oom_adj for vfork() parent. Then, vfork() parent (job scheduler) lost OOM immune and it was killed. Actually, fork-setting-exec idiom is very frequently used in userland program. We must not break this assumption. Then, this patch revert commit 2ff05b2b and related commit. Reverted commit list --------------------- - commit 2ff05b2b4e (oom: move oom_adj value from task_struct to mm_struct) - commit 4d8b9135c3 (oom: avoid unnecessary mm locking and scanning for OOM_DISABLE) - commit 8123681022 (oom: only oom kill exiting tasks with attached memory) - commit 933b787b57 (mm: copy over oom_adj value at fork time) Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Paul Menage <menage@google.com> Cc: David Rientjes <rientjes@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Rik van Riel <riel@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Mel Gorman <mel@csn.ul.ie> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-08-18 16:31:13 -07:00
Linus Torvalds	dcd94dbdaf	Merge branch 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: genirq: Wake up irq thread after action has been installed	2009-08-18 13:57:38 -07:00
Andreas Herrmann	294b0c9619	sched: Consolidate definition of variable sd in __build_sched_domains Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818110229.GM29515@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-18 18:35:45 +02:00
Andreas Herrmann	0601a88d8f	sched: Separate out build of NUMA sched groups from __build_sched_domains ... to further strip down __build_sched_domains(). Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818110111.GL29515@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-18 18:35:44 +02:00
Andreas Herrmann	de616e36c7	sched: Separate out build of ALLNODES sched groups from __build_sched_domains For the sake of completeness. Now all calls to init_sched_build_groups() are contained in build_sched_groups(). Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818110013.GK29515@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-18 18:35:44 +02:00
Andreas Herrmann	86548096f2	sched: Separate out build of CPU sched groups from __build_sched_domains ... to further strip down __build_sched_domains(). Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818105928.GJ29515@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-18 18:35:43 +02:00
Andreas Herrmann	a2af04cdbb	sched: Separate out build of MC sched groups from __build_sched_domains ... to further strip down __build_sched_domains(). Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818105838.GI29515@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-18 18:35:43 +02:00
Andreas Herrmann	0e8e85c941	sched: Separate out build of SMT sched groups from __build_sched_domains ... to further strip down __build_sched_domains(). Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818105751.GH29515@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-18 18:35:42 +02:00
Andreas Herrmann	d817353555	sched: Separate out build of SMT sched domain from __build_sched_domains ... to further strip down __build_sched_domains(). Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818105703.GG29515@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-18 18:35:42 +02:00
Andreas Herrmann	410c408108	sched: Separate out build of MC sched domain from __build_sched_domains ... to further strip down __build_sched_domains(). Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818105614.GF29515@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-18 18:35:41 +02:00
Andreas Herrmann	87cce6622c	sched: Separate out build of CPU sched domain from __build_sched_domains ... to further strip down __build_sched_domains(). Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818105455.GE29515@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-18 18:35:40 +02:00
Andreas Herrmann	7f4588f3aa	sched: Separate out build of NUMA sched domain from __build_sched_domains ... to further strip down __build_sched_domains(). Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818105406.GD29515@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-18 18:35:40 +02:00
Andreas Herrmann	2109b99ee1	sched: Separate out allocation/free/goto-hell from __build_sched_domains Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818105300.GC29515@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-18 18:35:39 +02:00
Andreas Herrmann	49a02c514d	sched: Use structure to store local data in __build_sched_domains Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20090818105152.GB29515@alberich.amd.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-18 18:35:39 +02:00
Thomas Gleixner	69ab849439	genirq: Wake up irq thread after action has been installed The wake_up_process() of the new irq thread in __setup_irq() is too early as the irqaction is not yet fully initialized especially action->irq is not yet set. The interrupt thread might dereference the wrong irq descriptor. Move the wakeup after the action is installed and action->irq has been set. Reported-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Michael Buesch <mb@bu3sch.de>	2009-08-18 17:22:43 +02:00
Ingo Molnar	f738eb1b63	perf_counter: Fix the PARISC build PARISC does not build: /home/mingo/tip/kernel/perf_counter.c: In function 'perf_counter_index': /home/mingo/tip/kernel/perf_counter.c:2016: error: 'PERF_COUNTER_INDEX_OFFSET' undeclared (first use in this function) /home/mingo/tip/kernel/perf_counter.c:2016: error: (Each undeclared identifier is reported only once /home/mingo/tip/kernel/perf_counter.c:2016: error: for each function it appears in.) As PERF_COUNTER_INDEX_OFFSET is not defined. Now, we could define it in the architecture - but lets also provide a core default of 0 (which happens to be what all but one architecture uses at the moment). Architectures that need a different index offset should set this value in their asm/perf_counter.h files. Cc: Kyle McMartin <kyle@mcmartin.ca> Cc: Helge Deller <deller@gmx.de> Cc: linux-parisc@vger.kernel.org Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-18 11:34:13 +02:00
Zhaolei	f2d84b65b9	ftrace: Unify effect of writing to trace_options and option/* "echo noglobal-clock > trace_options" can be used to change trace clock but "echo 0 > options/global-clock" can't. The flag toggling will be silently accepted without actually changing the clock callback. We can fix it by using set_tracer_flags() in trace_options_core_write(). Changelog: v1->v2: Simplified switch() after Li Zefan <lizf@cn.fujitsu.com>'s suggestion Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2009-08-18 02:07:04 +02:00
Amerigo Wang	de809347ae	timers: Drop write permission on /proc/timer_list /proc/timer_list and /proc/slabinfo are not supposed to be written, so there should be no write permissions on it. Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Vegard Nossum <vegard.nossum@gmail.com> Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Cc: linux-mm@kvack.org Cc: Christoph Lameter <cl@linux-foundation.org> Cc: David Rientjes <rientjes@google.com> Cc: Amerigo Wang <amwang@redhat.com> Cc: Matt Mackall <mpm@selenic.com> Cc: Arjan van de Ven <arjan@linux.intel.com> LKML-Reference: <20090817094525.6355.88682.sendpatchset@localhost.localdomain> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-17 11:47:31 +02:00
Paul Mackerras	e1ac3614ff	perf_counter: Check task on counter read IPI In general, code in perf_counter.c that is called through an IPI checks, for per-task counters, that the counter's task is still the current task. This is to handle the race condition where the cpu switches from the task we want to another task in the interval between sending the IPI and the IPI arriving and being handled on the target CPU. For some reason, __perf_counter_read is missing this check, yet there is no reason why the race condition can't occur. This adds a check that the current task is the one we want. If it isn't, we just return. In that case the counter->count value should be up to date, since it will have been updated when the counter was scheduled out, which must have happened since the IPI was sent. I don't have an example of an actual failure due to this race, but it seems obvious that it could occur and we need to guard against it. Signed-off-by: Paul Mackerras <paulus@samba.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <19076.63614.277861.368125@drongo.ozlabs.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-08-17 11:38:13 +02:00

1 2 3 4 5 ...

7783 Commits