linux

iv/linux

Author	SHA1	Message	Date
Anton Blanchard	db5ae97dd4	kernel/smp.c: fix smp_call_function_many() SMP race commit 6dc19899958e420a931274b94019e267e2396d3e upstream. I noticed a failure where we hit the following WARN_ON in generic_smp_call_function_interrupt: if (!cpumask_test_and_clear_cpu(cpu, data->cpumask)) continue; data->csd.func(data->csd.info); refs = atomic_dec_return(&data->refs); WARN_ON(refs < 0); <------------------------- We atomically tested and cleared our bit in the cpumask, and yet the number of cpus left (ie refs) was 0. How can this be? It turns out commit 54fdade1c3332391948ec43530c02c4794a38172 ("generic-ipi: make struct call_function_data lockless") is at fault. It removes locking from smp_call_function_many and in doing so creates a rather complicated race. The problem comes about because: - The smp_call_function_many interrupt handler walks call_function.queue without any locking. - We reuse a percpu data structure in smp_call_function_many. - We do not wait for any RCU grace period before starting the next smp_call_function_many. Imagine a scenario where CPU A does two smp_call_functions back to back, and CPU B does an smp_call_function in between. We concentrate on how CPU C handles the calls: CPU A CPU B CPU C CPU D smp_call_function smp_call_function_interrupt walks call_function.queue sees data from CPU A on list smp_call_function smp_call_function_interrupt walks call_function.queue sees (stale) CPU A on list smp_call_function int clears last ref on A list_del_rcu, unlock smp_call_function reuses percpu data A data->cpumask sees and clears bit in cpumask might be using old or new fn! decrements refs below 0 set data->refs (too late!) The important thing to note is since the interrupt handler walks a potentially stale call_function.queue without any locking, then another cpu can view the percpu data structure at any time, even when the owner is in the process of initialising it. The following test case hits the WARN_ON 100% of the time on my PowerPC box (having 128 threads does help :) #include <linux/module.h> #include <linux/init.h> #define ITERATIONS 100 static void do_nothing_ipi(void dummy) { } static void do_ipis(struct work_struct dummy) { int i; for (i = 0; i < ITERATIONS; i++) smp_call_function(do_nothing_ipi, NULL, 1); printk(KERN_DEBUG "cpu %d finished\n", smp_processor_id()); } static struct work_struct work[NR_CPUS]; static int __init testcase_init(void) { int cpu; for_each_online_cpu(cpu) { INIT_WORK(&work[cpu], do_ipis); schedule_work_on(cpu, &work[cpu]); } return 0; } static void __exit testcase_exit(void) { } module_init(testcase_init) module_exit(testcase_exit) MODULE_LICENSE("GPL"); MODULE_AUTHOR("Anton Blanchard"); I tried to fix it by ordering the read and the write of ->cpumask and ->refs. In doing so I missed a critical case but Paul McKenney was able to spot my bug thankfully :) To ensure we arent viewing previous iterations the interrupt handler needs to read ->refs then ->cpumask then ->refs _again_. Thanks to Milton Miller and Paul McKenney for helping to debug this issue. [miltonm@bga.com: add WARN_ON and BUG_ON, remove extra read of refs before initial read of mask that doesn't help (also noted by Peter Zijlstra), adjust comments, hopefully clarify scenario ] [miltonm@bga.com: remove excess tests] Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Milton Miller <miltonm@bga.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:51 -08:00
David Dillow	044b93a01a	fs/direct-io.c: don't try to allocate more than BIO_MAX_PAGES in a bio commit 20d9600cb407b0b55fef6ee814b60345c6f58264 upstream. When using devices that support max_segments > BIO_MAX_PAGES (256), direct IO tries to allocate a bio with more pages than allowed, which leads to an oops in dio_bio_alloc(). Clamp the request to the supported maximum, and change dio_bio_alloc() to reflect that bio_alloc() will always return a bio when called with __GFP_WAIT and a valid number of vectors. [akpm@linux-foundation.org: remove redundant BUG_ON()] Signed-off-by: David Dillow <dillowda@ornl.gov> Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:51 -08:00
Randy Dunlap	8e1fbc87e5	backlight: fix 88pm860x_bl macro collision commit 2550326ac7a062fdfc204f9a3b98bdb9179638fc upstream. Fix collision with kernel-supplied #define: drivers/video/backlight/88pm860x_bl.c:24:1: warning: "CURRENT_MASK" redefined arch/x86/include/asm/page_64_types.h:6:1: warning: this is the location of the previous definition Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Haojian Zhuang <haojian.zhuang@marvell.com> Cc: Richard Purdie <rpurdie@rpsys.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:50 -08:00
KAMEZAWA Hiroyuki	0712c4990d	memory hotplug: one more lock on memory hotplug commit 925268a06dc2b1ff7bfcc37419a6827a0e739639 upstream. Now, memory_hotplug_(un)lock() is used for add/remove/offline pages for avoiding races with hibernation. But this should be held in online_pages(), too. It seems asymmetric. There are cases where one has to avoid a race with memory hotplug notifier and his own local code, and hotplug v.s. hotplug. This will add a generic solution for avoiding races. In other view, having lock here has no big impacts. online pages is tend to be done by udev script at el against each memory section one by one. Then, it's better to have lock here, too. Reviewed-by: Christoph Lameter <cl@linux.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:50 -08:00
Soren Hansen	4c9a57b6e0	nbd: remove module-level ioctl mutex commit de1f016f882e52facc3c8609599f827bcdd14af9 upstream. Commit 2a48fc0ab242417 ("block: autoconvert trivial BKL users to private mutex") replaced uses of the BKL in the nbd driver with mutex operations. Since then, I've been been seeing these lock ups: INFO: task qemu-nbd:16115 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. qemu-nbd D 0000000000000001 0 16115 16114 0x00000004 ffff88007d775d98 0000000000000082 ffff88007d775fd8 ffff88007d774000 0000000000013a80 ffff8800020347e0 ffff88007d775fd8 0000000000013a80 ffff880133730000 ffff880002034440 ffffea0004333db8 ffffffffa071c020 Call Trace: [<ffffffff815b9997>] __mutex_lock_slowpath+0xf7/0x180 [<ffffffff815b93eb>] mutex_lock+0x2b/0x50 [<ffffffffa071a21c>] nbd_ioctl+0x6c/0x1c0 [nbd] [<ffffffff812cb970>] blkdev_ioctl+0x230/0x730 [<ffffffff811967a1>] block_ioctl+0x41/0x50 [<ffffffff81175c03>] do_vfs_ioctl+0x93/0x370 [<ffffffff81175f61>] sys_ioctl+0x81/0xa0 [<ffffffff8100c0c2>] system_call_fastpath+0x16/0x1b Instrumenting the nbd module's ioctl handler with some extra logging clearly shows the NBD_DO_IT ioctl being invoked which is a long-lived ioctl in the sense that it doesn't return until another ioctl asks the driver to disconnect. However, that other ioctl blocks, waiting for the module-level mutex that replaced the BKL, and then we're stuck. This patch removes the module-level mutex altogether. It's clearly wrong, and as far as I can see, it's entirely unnecessary, since the nbd driver maintains per-device mutexes, and I don't see anything that would require a module-level (or kernel-level, for that matter) mutex. Signed-off-by: Soren Hansen <soren@linux2go.dk> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Paul Clements <paul.clements@steeleye.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:50 -08:00
Michael Büsch	1abdb48041	ssb-pcmcia: Fix parsing of invariants tuples commit dd3cb633078fb12e06ce6cebbdfbf55a7562e929 upstream. This fixes parsing of the device invariants (MAC address) for PCMCIA SSB devices. ssb_pcmcia_do_get_invariants expects an iv pointer as data argument. Tested-by: dylan cristiani <d.cristiani@idem-tech.it> Signed-off-by: Michael Buesch <mb@bu3sch.de> Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:50 -08:00
Corey Minyard	a7c9851a10	char/ipmi: fix OOPS caused by pnp_unregister_driver on unregistered driver commit d2478521afc20227658a10a8c5c2bf1a2aa615b3 upstream. This patch fixes an OOPS triggered when calling modprobe ipmi_si a second time after the first modprobe returned without finding any ipmi devices. This can happen if you reload the module after having the first module load fail. The driver was not deregistering from PNP in that case. Peter Huewe originally reported this patch and supplied a fix, I have a different patch based on Linus' suggestion that cleans things up a bit more. Cc: openipmi-developer@lists.sourceforge.net Reviewed-by: Peter Huewe <peterhuewe@gmx.de> Cc: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:49 -08:00
Oleg Nesterov	c2e884b07e	perf: Validate cpu early in perf_event_alloc() commit 66832eb4baaaa9abe4c993ddf9113a79e39b9915 upstream. Starting from perf_event_alloc()->perf_init_event(), the kernel assumes that event->cpu is either -1 or the valid CPU number. Change perf_event_alloc() to validate this argument early. This also means we can remove the similar check in find_get_context(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Prasad <prasad@linux.vnet.ibm.com> Cc: Roland McGrath <roland@redhat.com> LKML-Reference: <20110118161032.GC693@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:48 -08:00
Oleg Nesterov	ccb6b70738	perf: Find_get_context: fix the per-cpu-counter check commit 22a4ec729017ba613337a28f306f94ba5023fe2e upstream. If task == NULL, find_get_context() should always check that cpu is correct. Afaics, the bug was introduced by 38a81da2 "perf events: Clean up pid passing", but even before that commit "&& cpu != -1" was not exactly right, -ESRCH from find_task_by_vpid() is not accurate. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Prasad <prasad@linux.vnet.ibm.com> Cc: Roland McGrath <roland@redhat.com> LKML-Reference: <20110118161008.GB693@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:47 -08:00
Eric Dumazet	785fb57c98	perf: Fix alloc_callchain_buffers() commit 88d4f0db7fa8785859c1d637f9aac210932b6216 upstream. Commit 927c7a9e92c4 ("perf: Fix race in callchains") introduced a mismatch in the sizing of struct callchain_cpus_entries. nr_cpu_ids must be used instead of num_possible_cpus(), or we might get out of bound memory accesses on some machines. Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Miller <davem@davemloft.net> Cc: Stephane Eranian <eranian@google.com> LKML-Reference: <1295980851.3588.351.camel@edumazet-laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:46 -08:00
Michal Hocko	e7e78456b4	memsw: handle swapaccount kernel parameter correctly commit fceda1bf498677501befc7da72fd2e4de7f18466 upstream. __setup based kernel command line parameters handlers which are handled in obsolete_checksetup are provided with the parameter value including = (more precisely everything right after the parameter name). This means that the current implementation of swapaccount[=1\|0] doesn't work at all because if there is a value for the parameter then we are testing for "0" resp. "1" but we are getting "=0" resp. "=1" and if there is no parameter value we are getting an empty string rather than NULL. The original noswapccount parameter, which doesn't care about the value, works correctly. Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:46 -08:00
Marcin Slusarz	1ac0a5efc2	watchdog: Don't change watchdog state on read of sysctl commit 9ffdc6c37df131f89d52001e0ef03091b158826f upstream. Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> [ add {}'s to fix a warning ] Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1296230433-6261-3-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:45 -08:00
Marcin Slusarz	a716ed74c7	watchdog: Fix sysctl consistency commit 397357666de6b5b6adb5fa99f9758ec8cf30ac34 upstream. If it was not possible to enable watchdog for any cpu, switch watchdog_enabled back to 0, because it's visible via kernel.watchdog sysctl. Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com> Signed-off-by: Don Zickus <dzickus@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Frederic Weisbecker <fweisbec@gmail.com> LKML-Reference: <1296230433-6261-2-git-send-email-dzickus@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:45 -08:00
Guy Martin	49ae676572	parisc : Remove broken line wrapping handling pdc_iodc_print() commit fbea668498e93bb38ac9226c7af9120a25957375 upstream. Remove the broken line wrapping handling in pdc_iodc_print(). It is broken in 3 ways : - It doesn't keep track of the current screen position, it just assumes that the new buffer will be printed at the begining of the screen. - It doesn't take in account that non printable characters won't increase the current position on the screen. - And last but not least, it triggers a kernel panic if a backspace is the first char in the provided buffer : Backtrace: [<0000000040128ec4>] pdc_console_write+0x44/0x78 [<0000000040128f18>] pdc_console_tty_write+0x20/0x38 [<000000004032f1ac>] n_tty_write+0x2a4/0x550 [<000000004032b158>] tty_write+0x1e0/0x2d8 [<00000000401bb420>] vfs_write+0xb8/0x188 [<00000000401bb630>] sys_write+0x68/0xb8 [<0000000040104eb8>] syscall_exit+0x0/0x14 Most terminals handle the line wrapping just fine. I've confirmed that it works correctly on a C8000 with both vga and serial output. Signed-off-by: Guy Martin <gmsoft@tuxicoman.be> Signed-off-by: James Bottomley <James.Bottomley@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:45 -08:00
Shirish Pargaonkar	97a3d0d988	cifs: Fix regression during share-level security mounts (Repost) commit 540b2e377797d8715469d408b887baa9310c5f3e upstream. NTLM response length was changed to 16 bytes instead of 24 bytes that are sent in Tree Connection Request during share-level security share mounts. Revert it back to 24 bytes. Reported-and-Tested-by: Grzegorz Ozanski <grzegorz.ozanski@intel.com> Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Shirish Pargaonkar <shirishpargaonkar@gmail.com> Acked-by: Suresh Jayaraman <sjayaraman@suse.de> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:44 -08:00
Pavel Shilovsky	c4eb47bcb0	CIFS: Fix oplock break handling (try #2 ) commit 12fed00de963433128b5366a21a55808fab2f756 upstream. When we get oplock break notification we should set the appropriate value of OplockLevel field in oplock break acknowledge according to the oplock level held by the client in this time. As we only can have level II oplock or no oplock in the case of oplock break, we should be aware only about clientCanCacheRead field in cifsInodeInfo structure. Also fix bug connected with wrong interpretation of OplockLevel field during oplock break notification processing. Signed-off-by: Pavel Shilovsky <piastryyy@gmail.com> Signed-off-by: Steve French <sfrench@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:44 -08:00
Russell King	b770e599b2	ARM: smp_on_up: allow non-ARM SMP processors commit e98ff0f55a0232b578c9aa7f1c245868277ac7bc upstream. Allow non-ARM SMP processors to use the SMP_ON_UP feature. CPUs supporting SMP must have the new CPU ID format, so check for this first. Then check for ARM11MPCore, which fails the MPIDR check. Lastly check the MPIDR reports multiprocessing extensions and that the CPU is part of a multiprocessing system. Reported-and-Tested-by: Stephen Boyd <sboyd@codeaurora.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:44 -08:00
Russell King	3eb02dfd31	ARM: SMP: use more sane register allocation for __fixup_smp_on_up commit 0eb0511d176534674600a1986c3c766756288908 upstream. Use r0,r3-r6 rather than r0,r3,r4,r6,r7, which makes it easier to understand which registers can be modified. Also document which registers hold values which must be preserved. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:43 -08:00
Tejun Heo	fbd4bd2f9e	workqueue: relax lockdep annotation on flush_work() commit e159489baa717dbae70f9903770a6a4990865887 upstream. Currently, the lockdep annotation in flush_work() requires exclusive access on the workqueue the target work is queued on and triggers warning if a work is trying to flush another work on the same workqueue; however, this is no longer true as workqueues can now execute multiple works concurrently. This patch adds lock_map_acquire_read() and make process_one_work() hold read access to the workqueue while executing a work and start_flush_work() check for write access if concurrnecy level is one or the workqueue has a rescuer (as only one execution resource - the rescuer - is guaranteed to be available under memory pressure), and read access if higher. This better represents what's going on and removes spurious lockdep warnings which are triggered by fake dependency chain created through flush_work(). * Peter pointed out that flushing another work from a WQ_MEM_RECLAIM wq breaks forward progress guarantee under memory pressure. Condition check accordingly updated. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: "Rafael J. Wysocki" <rjw@sisk.pl> Tested-by: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:43 -08:00
Stefan Richter	fd920389c2	firewire: core: fix unstable I/O with Canon camcorder commit 6044565af458e7fa6e748bff437ecc49dea88d79 upstream. Regression since commit 10389536742c, "firewire: core: check for 1394a compliant IRM, fix inaccessibility of Sony camcorder": The camcorder Canon MV5i generates lots of bus resets when asynchronous requests are sent to it (e.g. Config ROM read requests or FCP Command write requests) if the camcorder is not root node. This causes drop- outs in videos or makes the camcorder entirely inaccessible. https://bugzilla.redhat.com/show_bug.cgi?id=633260 Fix this by allowing any Canon device, even if it is a pre-1394a IRM like MV5i are, to remain root node (if it is at least Cycle Master capable). With the FireWire controller cards that I tested, MV5i always becomes root node when plugged in and left to its own devices. Reported-by: Ralf Lange Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:42 -08:00
Ken Mills	1902395cb9	n_gsm: copy mtu over when configuring via ioctl interface commit 91f78f36694b8748fda855b1f9e3614b027a744f upstream. This field is settable but did not get copied. Signed-off-by: Ken Mills <ken.k.mills@intel.com> Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:42 -08:00
Benjamin Herrenschmidt	90e1fed6f5	powerpc: Fix some 6xx/7xxx CPU setup functions commit 1f1936ff3febf38d582177ea319eaa278f32c91f upstream. Some of those functions try to adjust the CPU features, for example to remove NAP support on some revisions. However, they seem to use r5 as an index into the CPU table entry, which might have been right a long time ago but no longer is. r4 is the right register to use. This probably caused some off behaviours on some PowerMac variants using 750cx or 7455 processor revisions. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:42 -08:00
Anton Blanchard	5c41035f08	powerpc/numa: Fix bug in unmap_cpu_from_node commit 429f4d8d20b91e4a6c239f951c06a56a6ac22957 upstream. When converting to the new cpumask code I screwed up: - if (cpu_isset(cpu, numa_cpumask_lookup_table[node])) { - cpu_clear(cpu, numa_cpumask_lookup_table[node]); + if (cpumask_test_cpu(cpu, node_to_cpumask_map[node])) { + cpumask_set_cpu(cpu, node_to_cpumask_map[node]); This was introduced in commit 25863de07af9 (powerpc/cpumask: Convert NUMA code to new cpumask API) Fix it. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:41 -08:00
Anton Blanchard	e702bb3517	powerpc: Fix hcall tracepoint recursion commit 57cdfdf829a850a317425ed93c6a576c9ee6329c upstream. Spinlocks on shared processor partitions use H_YIELD to notify the hypervisor we are waiting on another virtual CPU. Unfortunately this means the hcall tracepoints can recurse. The patch below adds a percpu depth and checks it on both the entry and exit hcall tracepoints. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:41 -08:00
Timur Tabi	ce8f917731	powerpc/85xx: fix compatible properties of the P1022DS DMA nodes used for audio commit b2e0861e51f2961954330dcafe6d148ee3ab5cff upstream. In order to prevent the fsl_dma driver from claiming the DMA channels that the P1022DS audio driver needs, the compatible properties for those nodes must say "fsl,ssi-dma-channel" instead of "fsl,eloplus-dma-channel". Signed-off-by: Timur Tabi <timur@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:41 -08:00
Justin TerAvest	d8077df787	cfq-iosched: Don't wait if queue already has requests. commit 02a8f01b5a9f396d0327977af4c232d0f94c45fd upstream. Commit 7667aa0630407bc07dc38dcc79d29cc0a65553c1 added logic to wait for the last queue of the group to become busy (have at least one request), so that the group does not lose out for not being continuously backlogged. The commit did not check for the condition that the last queue already has some requests. As a result, if the queue already has requests, wait_busy is set. Later on, cfq_select_queue() checks the flag, and decides that since the queue has a request now and wait_busy is set, the queue is expired. This results in early expiration of the queue. This patch fixes the problem by adding a check to see if queue already has requests. If it does, wait_busy is not set. As a result, time slices do not expire early. The queues with more than one request are usually buffered writers. Testing shows improvement in isolation between buffered writers. Signed-off-by: Justin TerAvest <teravest@google.com> Reviewed-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:40 -08:00
Peter Zijlstra	1cdc65e140	sched: Fix update_curr_rt() commit 06c3bc655697b19521901f9254eb0bbb2c67e7e8 upstream. cpu_stopper_thread() migration_cpu_stop() __migrate_task() deactivate_task() dequeue_task() dequeue_task_rq() update_curr_rt() Will call update_curr_rt() on rq->curr, which at that time is rq->stop. The problem is that rq->stop.prio matches an RT prio and thus falsely assumes its a rt_sched_class task. Reported-Debuged-Tested-Acked-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:40 -08:00
Peter Zijlstra	44e6ae1221	sched, cgroup: Use exit hook to avoid use-after-free crash commit 068c5cc5ac7414a8e9eb7856b4bf3cc4d4744267 upstream. By not notifying the controller of the on-exit move back to init_css_set, we fail to move the task out of the previous cgroup's cfs_rq. This leads to an opportunity for a cgroup-destroy to come in and free the cgroup (there are no active tasks left in it after all) to which the not-quite dead task is still enqueued. Reported-by: Miklos Vajna <vmiklos@frugalware.org> Fixed-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> LKML-Reference: <1293206353.29444.205.camel@laptop> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:40 -08:00
NeilBrown	fd607feccf	sched: Change wait_for_completion__timeout() to return a signed long commit 6bf4123760a5aece6e4829ce90b70b6ffd751d65 upstream. wait_for_completion__timeout() can return: 0: if the wait timed out -ve: if the wait was interrupted +ve: if the completion was completed. As they currently return an 'unsigned long', the last two cases are not easily distinguished which can easily result in buggy code, as is the case for the recently added wait_for_completion_interruptible_timeout() call in net/sunrpc/cache.c So change them both to return 'long'. As MAX_SCHEDULE_TIMEOUT is LONG_MAX, a large +ve return value should never overflow. Signed-off-by: NeilBrown <neilb@suse.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: J. Bruce Fields <bfields@fieldses.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <20110105125016.64ccab0e@notabene.brown> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:39 -08:00
Eric Dumazet	168adf9338	epoll: epoll_wait() should not use timespec_add_ns() commit 0781b909b5586f4db720b5d1838b78f9d8e42f14 upstream. commit 95aac7b1cd224f ("epoll: make epoll_wait() use the hrtimer range feature") added a performance regression because it uses timespec_add_ns() with potential very large 'ns' values. [akpm@linux-foundation.org: s/epoll_set_mstimeout/ep_set_mstimeout/, per Davide] Reported-by: Simon Kirby <sim@hostway.ca> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Cc: Shawn Bohrer <shawn.bohrer@gmail.com> Acked-by: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:39 -08:00
David Miller	cea229ad1a	klist: Fix object alignment on 64-bit. commit 795abaf1e4e188c4171e3cd3dbb11a9fcacaf505 upstream. Commit c0e69a5bbc6f ("klist.c: bit 0 in pointer can't be used as flag") intended to make sure that all klist objects were at least pointer size aligned, but used the constant "4" which only works on 32-bit. Use "sizeof(void *)" which is correct in all cases. Signed-off-by: David S. Miller <davem@davemloft.net> Acked-by: Jesper Nilsson <jesper.nilsson@axis.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:39 -08:00
Mel Gorman	39a06a2ac3	mm: page allocator: adjust the per-cpu counter threshold when memory is low commit 88f5acf88ae6a9778f6d25d0d5d7ec2d57764a97 upstream. Commit aa45484 ("calculate a better estimate of NR_FREE_PAGES when memory is low") noted that watermarks were based on the vmstat NR_FREE_PAGES. To avoid synchronization overhead, these counters are maintained on a per-cpu basis and drained both periodically and when a threshold is above a threshold. On large CPU systems, the difference between the estimate and real value of NR_FREE_PAGES can be very high. The system can get into a case where pages are allocated far below the min watermark potentially causing livelock issues. The commit solved the problem by taking a better reading of NR_FREE_PAGES when memory was low. Unfortately, as reported by Shaohua Li this accurate reading can consume a large amount of CPU time on systems with many sockets due to cache line bouncing. This patch takes a different approach. For large machines where counter drift might be unsafe and while kswapd is awake, the per-cpu thresholds for the target pgdat are reduced to limit the level of drift to what should be a safe level. This incurs a performance penalty in heavy memory pressure by a factor that depends on the workload and the machine but the machine should function correctly without accidentally exhausting all memory on a node. There is an additional cost when kswapd wakes and sleeps but the event is not expected to be frequent - in Shaohua's test case, there was one recorded sleep and wake event at least. To ensure that kswapd wakes up, a safe version of zone_watermark_ok() is introduced that takes a more accurate reading of NR_FREE_PAGES when called from wakeup_kswapd, when deciding whether it is really safe to go back to sleep in sleeping_prematurely() and when deciding if a zone is really balanced or not in balance_pgdat(). We are still using an expensive function but limiting how often it is called. When the test case is reproduced, the time spent in the watermark functions is reduced. The following report is on the percentage of time spent cumulatively spent in the functions zone_nr_free_pages(), zone_watermark_ok(), __zone_watermark_ok(), zone_watermark_ok_safe(), zone_page_state_snapshot(), zone_page_state(). vanilla 11.6615% disable-threshold 0.2584% David said: : We had to pull aa454840 "mm: page allocator: calculate a better estimate : of NR_FREE_PAGES when memory is low and kswapd is awake" from 2.6.36 : internally because tests showed that it would cause the machine to stall : as the result of heavy kswapd activity. I merged it back with this fix as : it is pending in the -mm tree and it solves the issue we were seeing, so I : definitely think this should be pushed to -stable (and I would seriously : consider it for 2.6.37 inclusion even at this late date). Signed-off-by: Mel Gorman <mel@csn.ul.ie> Reported-by: Shaohua Li <shaohua.li@intel.com> Reviewed-by: Christoph Lameter <cl@linux.com> Tested-by: Nicolas Bareil <nico@chdir.org> Cc: David Rientjes <rientjes@google.com> Cc: Kyle McMartin <kyle@mcmartin.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:38 -08:00
Ian Campbell	697627a212	xen-platform: use PCI interfaces to request IO and MEM resources. commit 00f28e4037c8d5782fa7a1b2666b0dca21522d69 upstream. This is the correct interface to use and something has broken the use of the previous incorrect interface (which fails because the request conflicts with the resources assigned for the PCI device itself instead of nesting like the PCI interfaces do). Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:38 -08:00
Kees Cook	6b1c3b60a4	net: ax25: fix information leak to userland harder commit 5b919f833d9d60588d026ad82d17f17e8872c7a9 upstream. Commit fe10ae53384e48c51996941b7720ee16995cbcb7 adds a memset() to clear the structure being sent back to userspace, but accidentally used the wrong size. Reported-by: Brad Spengler <spender@grsecurity.net> Signed-off-by: Kees Cook <kees.cook@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:37 -08:00
H. Peter Anvin	1fe7a67d02	x86, olpc: Add missing Kconfig dependencies commit 76d1f7bfcd5872056902c5a88b5fcd5d4d00a7a9 upstream. OLPC uses select for OLPC_OPENFIRMWARE, which means OLPC has to enforce the dependencies for OLPC_OPENFIRMWARE. Make sure it does so. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com> Cc: Daniel Drake <dsd@laptop.org> Cc: Andres Salomon <dilinger@queued.net> Cc: Grant Likely <grant.likely@secretlab.ca> LKML-Reference: <20100923162846.D8D409D401B@zog.reactivated.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:37 -08:00
Christoph Lameter	8255012e8b	slub: Avoid use of slub_lock in show_slab_objects() commit 04d94879c8a4973b5499dc26b9d38acee8928791 upstream. The purpose of the locking is to prevent removal and additions of nodes when statistics are gathered for a slab cache. So we need to avoid racing with memory hotplug functionality. It is enough to take the memory hotplug locks there instead of the slub_lock. online_pages() currently does not acquire the memory_hotplug lock. Another patch will be submitted by the memory hotplug authors to take the memory hotplug lock and describe the uses of the memory hotplug lock to protect against adding and removal of nodes from non hotplug data structures. Reported-and-tested-by: Bart Van Assche <bvanassche@acm.org> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Christoph Lameter <cl@linux.com> Signed-off-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:37 -08:00
Boaz Harrosh	e8160e170b	Revert "exofs: Set i_mapping->backing_dev_info anyway" commit 0b0abeaf3d30cec03ac6497fe978b8f7edecc5ae upstream. This reverts commit 115e19c53501edc11f730191f7f047736815ae3d. Apparently setting inode->bdi to one's own sb->s_bdi stops VFS from sending read-aheads. This problem was bisected to this commit. A revert fixes it. I'll investigate farther why is this happening for the next Kernel, but for now a revert. I'm sending to stable@kernel.org as well, since it exists also in 2.6.37. 2.6.36 is good and does not have this patch. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:36 -08:00
Sonic Zhang	23a3d92534	mmc: bfin_sdh: fix alloc size for private data commit a34650f0f1ca589cda09c48cb62baf15e680a247 upstream. The bfin_sdh driver allocates the wrong size for the private data in the mmc_host. The first parameter of mmc_alloc_host should be the size of the local driver struct rather than the common mmc_host. Signed-off-by: Sonic Zhang <sonic.zhang@analog.com> Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: Chris Ball <cjb@laptop.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:36 -08:00
KAMEZAWA Hiroyuki	e0b1361245	memcg: fix account leak at failure of memsw acconting commit 01c88e2d6b7330c0cc5867fe2297e7d826e1337d upstream. Commit 4b53433468 ("memcg: clean up try_charge main loop") removes a cancel of charge at case: memory charge-> success. mem+swap charge-> failure. This leaks usage of memory. Fix it. Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Reviewed-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@in.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:36 -08:00
Russell King	f7095bc222	ARM: initrd: disable initrd if passed address overlaps reserved region commit b0a2679d27408d97ce31e5f800b44227d3388b84 upstream. Disable the initrd if the passed address already overlaps the reserved region. This avoids oopses on Netwinders when NeTTrom tells the kernel that an initrd is located at mem+4MB, but this overlaps the BSS, resulting in the kernels in-use BSS being freed. This should be applied to v2.6.37-stable. Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:35 -08:00
Changhwan Youn	afbbe5ac9f	ARM: S5PV310: Set bit 22 in the PL310 (cache controller) AuxCtlr register commit a50eb1c7680973f5441ca20ac4da0af2055d0d87 upstream. This patch is applied according to the commit 1a8e41cd672f894bbd74874eac601e6cedf838fb (ARM: 6395/1: VExpress: Set bit 22 in the PL310 (cache controller) AuxCtlr register). Actually, S5PV310 has same cache controller(PL310). Following is from Catalin Marinas' commit. Clearing bit 22 in the PL310 Auxiliary Control register (shared attribute override enable) has the side effect of transforming Normal Shared Non-cacheable reads into Cacheable no-allocate reads. Coherent DMA buffers in Linux always have a Cacheable alias via the kernel linear mapping and the processor can speculatively load cache lines into the PL310 controller. With bit 22 cleared, Non-cacheable reads would unexpectedly hit such cache lines leading to buffer corruption. Signed-off-by: Changhwan Youn <chaos.youn@samsung.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Kukjin Kim <kgene.kim@samsung.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:35 -08:00
Kacper Kornet	191c11d606	Fix prlimit64 for suid/sgid processes commit aa5bd67dcfdf9af34c7fa36ebc87d4e1f7e91873 upstream. Since check_prlimit_permission always fails in the case of SUID/GUID processes, such processes are not able to read or set their own limits. This commit changes this by assuming that process can always read/change its own limits. Signed-off-by: Kacper Kornet <kornet@camk.edu.pl> Acked-by: Jiri Slaby <jslaby@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:34 -08:00
Roland Stigge	f045961680	iio: Fixpoint formatted output bugfix commit e71a7fd259943a2c2e11484880c80248ad139fe5 upstream. Fix some ADC drivers' _scale interface to correct fixpoint formatted output Signed-off-by: Roland Stigge <stigge@antcom.de> Acked-by: Jonathan Cameron <jic23@cam.ac.uk> Acked-by: Michael Hennerich <Michael.Hennerich@analog.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:34 -08:00
Don Skidmore	42633db909	ixgbe: fix for 82599 erratum on Header Splitting commit a124339ad28389093ed15eca990d39c51c5736cc upstream. We have found a hardware erratum on 82599 hardware that can lead to unpredictable behavior when Header Splitting mode is enabled. So we are no longer enabling this feature on affected hardware. Please see the 82599 Specification Update for more information. Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com> Tested-by: Stephen Ko <stephen.s.ko@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:34 -08:00
Tim Deegan	ae03b63f92	fix jiffy calculations in calibrate_delay_direct to handle overflow commit 70a062286b9dfcbd24d2e11601aecfead5cf709a upstream. Fixes a hang when booting as dom0 under Xen, when jiffies can be quite large by the time the kernel init gets this far. Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com> [jbeulich@novell.com: !time_after() -> time_before_eq() as suggested by Jiri Slaby] Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:34 -08:00
Suresh Siddha	23665d48ff	x86, mtrr: Avoid MTRR reprogramming on BP during boot on UP platforms commit f7448548a9f32db38f243ccd4271617758ddfe2c upstream. Markus Kohn ran into a hard hang regression on an acer aspire 1310, when acpi is enabled. git bisect showed the following commit as the bad one that introduced the boot regression. commit d0af9eed5aa91b6b7b5049cae69e5ea956fd85c3 Author: Suresh Siddha <suresh.b.siddha@intel.com> Date: Wed Aug 19 18:05:36 2009 -0700 x86, pat/mtrr: Rendezvous all the cpus for MTRR/PAT init Because of the UP configuration of that platform, native_smp_prepare_cpus() bailed out (in smp_sanity_check()) before doing the set_mtrr_aps_delayed_init() Further down the boot path, native_smp_cpus_done() will call the delayed MTRR initialization for the AP's (mtrr_aps_init()) with mtrr_aps_delayed_init not set. This resulted in the boot processor reprogramming its MTRR's to the values seen during the start of the OS boot. While this is not needed ideally, this shouldn't have caused any side-effects. This is because the reprogramming of MTRR's (set_mtrr_state() that gets called via set_mtrr()) will check if the live register contents are different from what is being asked to write and will do the actual write only if they are different. BP's mtrr state is read during the start of the OS boot and typically nothing would have changed when we ask to reprogram it on BP again because of the above scenario on an UP platform. So on a normal UP platform no reprogramming of BP MTRR MSR's happens and all is well. However, on this platform, bios seems to be modifying the fixed mtrr range registers between the start of OS boot and when we double check the live registers for reprogramming BP MTRR registers. And as the live registers are modified, we end up reprogramming the MTRR's to the state seen during the start of the OS boot. During ACPI initialization, something in the bios (probably smi handler?) don't like this fact and results in a hard lockup. We didn't see this boot hang issue on this platform before the commit d0af9eed5aa91b6b7b5049cae69e5ea956fd85c3, because only the AP's (if any) will program its MTRR's to the value that BP had at the start of the OS boot. Fix this issue by checking mtrr_aps_delayed_init before continuing further in the mtrr_aps_init(). Now, only AP's (if any) will program its MTRR's to the BP values during boot. Addresses https://bugzilla.novell.com/show_bug.cgi?id=623393 [ By the way, this behavior of the bios modifying MTRR's after the start of the OS boot is not common and the kernel is not prepared to handle this situation well. Irrespective of this issue, during suspend/resume, linux kernel will try to reprogram the BP's MTRR values to the values seen during the start of the OS boot. So suspend/resume might be already broken on this platform for all linux kernel versions. ] Reported-and-bisected-by: Markus Kohn <jabber@gmx.org> Tested-by: Markus Kohn <jabber@gmx.org> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Cc: Thomas Renninger <trenn@novell.com> Cc: Rafael Wysocki <rjw@novell.com> Cc: Venkatesh Pallipadi <venki@google.com> LKML-Reference: <1296694975.4418.402.camel@sbsiddha-MOBL3.sc.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:33 -08:00
Eric W. Biederman	6940b3914f	net: Fix ip link add netns oops commit 13ad17745c2cbd437d9e24b2d97393e0be11c439 upstream. Ed Swierk <eswierk@bigswitch.com> writes: > On 2.6.35.7 > ip link add link eth0 netns 9999 type macvlan > where 9999 is a nonexistent PID triggers an oops and causes all network functions to hang: > [10663.821898] BUG: unable to handle kernel NULL pointer dereference at 000000000000006d > [10663.821917] IP: [<ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170 > [10663.821933] PGD 1d3927067 PUD 22f5c5067 PMD 0 > [10663.821944] Oops: 0000 [#1] SMP > [10663.821953] last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq > [10663.821959] CPU 3 > [10663.821963] Modules linked in: macvlan ip6table_filter ip6_tables rfcomm ipt_MASQUERADE binfmt_misc iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack sco ipt_REJECT bnep l2cap xt_tcpudp iptable_filter ip_tables x_tables bridge stp vboxnetadp vboxnetflt vboxdrv kvm_intel kvm parport_pc ppdev snd_hda_codec_intelhdmi snd_hda_codec_conexant arc4 iwlagn iwlcore mac80211 snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi i915 snd_seq_midi_event snd_seq thinkpad_acpi drm_kms_helper btusb tpm_tis nvram uvcvideo snd_timer snd_seq_device bluetooth videodev v4l1_compat v4l2_compat_ioctl32 tpm drm tpm_bios snd cfg80211 psmouse serio_raw intel_ips soundcore snd_page_alloc intel_agp i2c_algo_bit video output netconsole configfs lp parport usbhid hid e1000e sdhci_pci ahci libahci sdhci led_class > [10663.822155] > [10663.822161] Pid: 6000, comm: ip Not tainted 2.6.35-23-generic #41-Ubuntu 2901CTO/2901CTO > [10663.822167] RIP: 0010:[<ffffffff8149c2fa>] [<ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170 > [10663.822177] RSP: 0018:ffff88014aebf7b8 EFLAGS: 00010286 > [10663.822182] RAX: 00000000fffffff4 RBX: ffff8801ad900800 RCX: 0000000000000000 > [10663.822187] RDX: ffff880000000000 RSI: 0000000000000000 RDI: ffff88014ad63000 > [10663.822191] RBP: ffff88014aebf808 R08: 0000000000000041 R09: 0000000000000041 > [10663.822196] R10: 0000000000000000 R11: dead000000200200 R12: ffff88014aebf818 > [10663.822201] R13: fffffffffffffffd R14: ffff88014aebf918 R15: ffff88014ad62000 > [10663.822207] FS: 00007f00c487f700(0000) GS:ffff880001f80000(0000) knlGS:0000000000000000 > [10663.822212] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [10663.822216] CR2: 000000000000006d CR3: 0000000231f19000 CR4: 00000000000026e0 > [10663.822221] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > [10663.822226] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 > [10663.822231] Process ip (pid: 6000, threadinfo ffff88014aebe000, task ffff88014afb16e0) > [10663.822236] Stack: > [10663.822240] ffff88014aebf808 ffffffff814a2bb5 ffff88014aebf7e8 00000000a00ee8d6 > [10663.822251] <0> 0000000000000000 ffffffffa00ef940 ffff8801ad900800 ffff88014aebf818 > [10663.822265] <0> ffff88014aebf918 ffff8801ad900800 ffff88014aebf858 ffffffff8149c413 > [10663.822281] Call Trace: > [10663.822290] [<ffffffff814a2bb5>] ? dev_addr_init+0x75/0xb0 > [10663.822298] [<ffffffff8149c413>] dev_alloc_name+0x43/0x90 > [10663.822307] [<ffffffff814a85ee>] rtnl_create_link+0xbe/0x1b0 > [10663.822314] [<ffffffff814ab2aa>] rtnl_newlink+0x48a/0x570 > [10663.822321] [<ffffffff814aafcc>] ? rtnl_newlink+0x1ac/0x570 > [10663.822332] [<ffffffff81030064>] ? native_x2apic_icr_read+0x4/0x20 > [10663.822339] [<ffffffff814a8c17>] rtnetlink_rcv_msg+0x177/0x290 > [10663.822346] [<ffffffff814a8aa0>] ? rtnetlink_rcv_msg+0x0/0x290 > [10663.822354] [<ffffffff814c25d9>] netlink_rcv_skb+0xa9/0xd0 > [10663.822360] [<ffffffff814a8a85>] rtnetlink_rcv+0x25/0x40 > [10663.822367] [<ffffffff814c223e>] netlink_unicast+0x2de/0x2f0 > [10663.822374] [<ffffffff814c303e>] netlink_sendmsg+0x1fe/0x2e0 > [10663.822383] [<ffffffff81488533>] sock_sendmsg+0xf3/0x120 > [10663.822391] [<ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20 > [10663.822400] [<ffffffff81168656>] ? __d_lookup+0x136/0x150 > [10663.822406] [<ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20 > [10663.822414] [<ffffffff812b7a0d>] ? _atomic_dec_and_lock+0x4d/0x80 > [10663.822422] [<ffffffff8116ea90>] ? mntput_no_expire+0x30/0x110 > [10663.822429] [<ffffffff81486ff5>] ? move_addr_to_kernel+0x65/0x70 > [10663.822435] [<ffffffff81493308>] ? verify_iovec+0x88/0xe0 > [10663.822442] [<ffffffff81489020>] sys_sendmsg+0x240/0x3a0 > [10663.822450] [<ffffffff8111e2a9>] ? __do_fault+0x479/0x560 > [10663.822457] [<ffffffff815899fe>] ? _raw_spin_lock+0xe/0x20 > [10663.822465] [<ffffffff8116cf4a>] ? alloc_fd+0x10a/0x150 > [10663.822473] [<ffffffff8158d76e>] ? do_page_fault+0x15e/0x350 > [10663.822482] [<ffffffff8100a0f2>] system_call_fastpath+0x16/0x1b > [10663.822487] Code: 90 48 8d 78 02 be 25 00 00 00 e8 92 1d e2 ff 48 85 c0 75 cf bf 20 00 00 00 e8 c3 b1 c6 ff 49 89 c7 b8 f4 ff ff ff 4d 85 ff 74 bd <4d> 8b 75 70 49 8d 45 70 48 89 45 b8 49 83 ee 58 eb 28 48 8d 55 > [10663.822618] RIP [<ffffffff8149c2fa>] __dev_alloc_name+0x9a/0x170 > [10663.822627] RSP <ffff88014aebf7b8> > [10663.822631] CR2: 000000000000006d > [10663.822636] ---[ end trace 3dfd6c3ad5327ca7 ]--- This bug was introduced in: commit 81adee47dfb608df3ad0b91d230fb3cef75f0060 Author: Eric W. Biederman <ebiederm@aristanetworks.com> Date: Sun Nov 8 00:53:51 2009 -0800 net: Support specifying the network namespace upon device creation. There is no good reason to not support userspace specifying the network namespace during device creation, and it makes it easier to create a network device and pass it to a child network namespace with a well known name. We have to be careful to ensure that the target network namespace for the new device exists through the life of the call. To keep that logic clear I have factored out the network namespace grabbing logic into rtnl_link_get_net. In addtion we need to continue to pass the source network namespace to the rtnl_link_ops.newlink method so that we can find the base device source network namespace. Signed-off-by: Eric W. Biederman <ebiederm@aristanetworks.com> Acked-by: Eric Dumazet <eric.dumazet@gmail.com> Where apparently I forgot to add error handling to the path where we create a new network device in a new network namespace, and pass in an invalid pid. Reported-by: Ed Swierk <eswierk@bigswitch.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:33 -08:00
Tejun Heo	bf7adcf7ea	ptrace: use safer wake up on ptrace_detach() commit 01e05e9a90b8f4c3997ae0537e87720eb475e532 upstream. The wake_up_process() call in ptrace_detach() is spurious and not interlocked with the tracee state. IOW, the tracee could be running or sleeping in any place in the kernel by the time wake_up_process() is called. This can lead to the tracee waking up unexpectedly which can be dangerous. The wake_up is spurious and should be removed but for now reduce its toxicity by only waking up if the tracee is in TRACED or STOPPED state. This bug can possibly be used as an attack vector. I don't think it will take too much effort to come up with an attack which triggers oops somewhere. Most sleeps are wrapped in condition test loops and should be safe but we have quite a number of places where sleep and wakeup conditions are expected to be interlocked. Although the window of opportunity is tiny, ptrace can be used by non-privileged users and with some loading the window can definitely be extended and exploited. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Roland McGrath <roland@redhat.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:32 -08:00
Pavel Machek	ee48eb6bd8	serial: unbreak billionton CF card commit d0694e2aeb815042aa0f3e5036728b3db4446f1d upstream. Unbreak Billionton CF bluetooth card. This actually fixes a regression on zaurus. Signed-off-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:32 -08:00
Dario Lombardo	4572634fff	drivers: update to pl2303 usb-serial to support Motorola cables commit 96a3e79edff6f41b0f115a82f1a39d66218077a7 upstream. Added 0x0307 device id to support Motorola cables to the pl2303 usb serial driver. This cable has a modified chip that is a pl2303, but declares itself as 0307. Fixed by adding the right device id to the supported devices list, assigning it the code labeled PL2303_PRODUCT_ID_MOTOROLA. Signed-off-by: Dario Lombardo <dario.lombardo@libero.it> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-17 15:14:32 -08:00

1 2 3 4 5 ...

223985 Commits