linux

iv/linux

Author	SHA1	Message	Date
Paul E. McKenney	dc36be4419	Merge branches 'barrier.2012.05.09a', 'fixes.2012.04.26a', 'inline.2012.05.02b' and 'srcu.2012.05.07b' into HEAD barrier: Reduce the amount of disturbance by rcu_barrier() to the rest of the system. This branch also includes improvements to RCU_FAST_NO_HZ, which are included here due to conflicts. fixes: Miscellaneous fixes. inline: Remaining changes from an abortive attempt to inline preemptible RCU's __rcu_read_lock(). These are (1) making exit_rcu() avoid unnecessary work and (2) avoiding having preemptible RCU record a blocked thread when the scheduler declines to do a context switch. srcu: Lai Jiangshan's algorithmic implementation of SRCU, including call_srcu().	2012-05-11 10:14:21 -07:00
Paul E. McKenney	b1420f1c8b	rcu: Make rcu_barrier() less disruptive The rcu_barrier() primitive interrupts each and every CPU, registering a callback on every CPU. Once all of these callbacks have been invoked, rcu_barrier() knows that every callback that was registered before the call to rcu_barrier() has also been invoked. However, there is no point in registering a callback on a CPU that currently has no callbacks, most especially if that CPU is in a deep idle state. This commit therefore makes rcu_barrier() avoid interrupting CPUs that have no callbacks. Doing this requires reworking the handling of orphaned callbacks, otherwise callbacks could slip through rcu_barrier()'s net by being orphaned from a CPU that rcu_barrier() had not yet interrupted to a CPU that rcu_barrier() had already interrupted. This reworking was needed anyway to take a first step towards weaning RCU from the CPU_DYING notifier's use of stop_cpu(). Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-05-09 14:27:54 -07:00
Paul E. McKenney	98248a0e24	rcu: Explicitly initialize RCU_FAST_NO_HZ per-CPU variables The current initialization of the RCU_FAST_NO_HZ per-CPU variables makes needless and fragile assumptions about the initial value of things like the jiffies counter. This commit therefore explicitly initializes all of them that are better started with a non-zero value. It also adds some comments describing the per-CPU state variables. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-05-09 14:26:57 -07:00
Paul E. McKenney	21e52e1566	rcu: Make RCU_FAST_NO_HZ handle timer migration The current RCU_FAST_NO_HZ assumes that timers do not migrate unless a CPU goes offline, in which case it assumes that the CPU will have to come out of dyntick-idle mode (cancelling the timer) in order to go offline. This is important because when RCU_FAST_NO_HZ permits a CPU to enter dyntick-idle mode despite having RCU callbacks pending, it posts a timer on that CPU to force a wakeup on that CPU. This wakeup ensures that the CPU will eventually handle the end of the grace period, including invoking its RCU callbacks. However, Pascal Chapperon's test setup shows that the timer handler rcu_idle_gp_timer_func() really does get invoked in some cases. This is problematic because this can cause the CPU that entered dyntick-idle mode despite still having RCU callbacks pending to remain in dyntick-idle mode indefinitely, which means that its RCU callbacks might never be invoked. This situation can result in grace-period delays or even system hangs, which matches Pascal's observations of slow boot-up and shutdown (https://lkml.org/lkml/2012/4/5/142). See also the bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=806548 This commit therefore causes the "should never be invoked" timer handler rcu_idle_gp_timer_func() to use smp_call_function_single() to wake up the CPU for which the timer was intended, allowing that CPU to invoke its RCU callbacks in a timely manner. Reported-by: Pascal Chapperon <pascal.chapperon@wanadoo.fr> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-05-09 14:26:56 -07:00
Paul E. McKenney	9fab97876a	rcu: Update RCU maintainership Split SRCU out and add Lai Jiangshan as SRCU co-maintainer. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-05-07 09:36:34 -07:00
Paul E. McKenney	9dd8fb16c3	rcu: Make exit_rcu() more precise and consolidate When running preemptible RCU, if a task exits in an RCU read-side critical section having blocked within that same RCU read-side critical section, the task must be removed from the list of tasks blocking a grace period (perhaps the current grace period, perhaps the next grace period, depending on timing). The exit() path invokes exit_rcu() to do this cleanup. However, the current implementation of exit_rcu() needlessly does the cleanup even if the task did not block within the current RCU read-side critical section, which wastes time and needlessly increases the size of the state space. Fix this by only doing the cleanup if the current task is actually on the list of tasks blocking some grace period. While we are at it, consolidate the two identical exit_rcu() functions into a single function. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Linus Torvalds <torvalds@linux-foundation.org> Conflicts: kernel/rcupdate.c	2012-05-02 14:48:27 -07:00
Paul E. McKenney	616c310e83	rcu: Move PREEMPT_RCU preemption to switch_to() invocation Currently, PREEMPT_RCU readers are enqueued upon entry to the scheduler. This is inefficient because enqueuing is required only if there is a context switch, and entry to the scheduler does not guarantee a context switch. The commit therefore moves the enqueuing to immediately precede the call to switch_to() from the scheduler. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-02 14:43:23 -07:00
Paul E. McKenney	f511fc6246	rcu: Ensure that RCU_FAST_NO_HZ timers expire on correct CPU Timers are subject to migration, which can lead to the following system-hang scenario when CONFIG_RCU_FAST_NO_HZ=y: 1. CPU 0 executes synchronize_rcu(), which posts an RCU callback. 2. CPU 0 then goes idle. It cannot immediately invoke the callback, but there is nothing RCU needs from ti, so it enters dyntick-idle mode after posting a timer. 3. The timer gets migrated to CPU 1. 4. CPU 0 never wakes up, so the synchronize_rcu() never returns, so the system hangs. This commit fixes this problem by using mod_timer_pinned(), as suggested by Peter Zijlstra, to ensure that the timer is actually posted on the running CPU. Reported-by: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-05-01 08:22:50 -07:00
Lai Jiangshan	9059c94017	rcu: Add rcutorture test for call_srcu() Add srcu_torture_deferred_free() for srcu_ops so as to test the new call_srcu(). Rename the original srcu_ops to srcu_sync_ops. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-30 10:48:26 -07:00
Lai Jiangshan	931ea9d1a6	rcu: Implement per-domain single-threaded call_srcu() state machine This commit implements an SRCU state machine in support of call_srcu(). The state machine is preemptible, light-weight, and single-threaded, minimizing synchronization overhead. In particular, there is no longer any need for synchronize_srcu() to be guarded by a mutex. Expedited processing is handled, at least in the absence of concurrent grace-period operations on that same srcu_struct structure, by having the synchronize_srcu_expedited() thread take on the role of the workqueue thread for one iteration. There is a reasonable probability that a given SRCU callback will be invoked on the same CPU that registered it, however, there is no guarantee. Concurrent SRCU grace-period primitives can cause callbacks to be executed elsewhere, even in absence of CPU-hotplug operations. Callbacks execute in process context, but under the influence of local_bh_disable(), so it is illegal to sleep in an SRCU callback function. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-30 10:48:25 -07:00
Lai Jiangshan	d9792edd7a	rcu: Use single value to handle expedited SRCU grace periods The earlier algorithm used an "expedited" flag combined with a "trycount" counter to differentiate between normal and expedited SRCU grace periods. However, the difference can be encoded into a single counter with a cutoff value and different initial values for expedited and normal SRCU grace periods. This commit makes that change. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Conflicts: kernel/srcu.c	2012-04-30 10:48:24 -07:00
Lai Jiangshan	dc87917501	rcu: Improve srcu_readers_active_idx()'s cache locality Expand the calls to srcu_readers_active_idx() from srcu_readers_active() inline. This change improves cache locality by interating over the CPUs once rather than twice. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-30 10:48:24 -07:00
Lai Jiangshan	966f58c2f6	rcu: Remove unused srcu_barrier() The old srcu_barrier() macro is now unused. This commit removes it so that it may be used for the SRCU flavor of rcu_barrier(), which will in turn be needed to allow the upcoming call_srcu() to be used from within modules. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-30 10:48:23 -07:00
Lai Jiangshan	b52ce066c5	rcu: Implement a variant of Peter's SRCU algorithm This commit implements a variant of Peter's algorithm, which may be found at https://lkml.org/lkml/2012/2/1/119. o Make the checking lock-free to enable parallel checking. Parallel checking is required when (1) the original checking task is preempted for a long time, (2) sychronize_srcu_expedited() starts during an ongoing SRCU grace period, or (3) we wish to avoid acquiring a lock. o Since the checking is lock-free, we avoid a mutex in state machine for call_srcu(). o Remove the SRCU_REF_MASK and remove the coupling with the flipping. This might allow us to remove the preempt_disable() in future versions, though such removal will need great care because it rescinds the one-old-reader-per-CPU guarantee. o Remove a smp_mb(), simplify the comments and make the smp_mb() pairs more intuitive. Inspired-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-30 10:48:22 -07:00
Lai Jiangshan	18108ebfeb	rcu: Improve SRCU's wait_idx() comments The safety of SRCU is provided byy wait_idx() rather than flipping. The flipping actually prevents starvation. This commit therefore updates the comments to more accurately and precisely describe what is going on. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-30 10:48:22 -07:00
Lai Jiangshan	944ce9af47	rcu: Flip ->completed only once per SRCU grace period This is an optimization of the SRCU grace period. To guard against preempted readers with old values of the counter, it suffices to scan the old counters once more, then flip ->completed only one time. The reason this works is that the old readers must have incremented the old set of counters (if they have not yet incremented, then their critical section starts after this grace period, so they may be safely ignored). This commit therefore optimizes the second flip out in favor of a simple rescan. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-30 10:48:21 -07:00
Lai Jiangshan	440253c17f	rcu: Increment upper bit only for srcu_read_lock() The purpose of the upper bit of SRCU's per-CPU counters is to guarantee that no reasonable series of srcu_read_lock() and srcu_read_unlock() operations can return the value of the counter to its original value. This guarantee is require only after the index has been switched to the other set of counters, so at most one srcu_read_lock() can affect a given CPU's counter. The number of srcu_read_unlock() operations on a given counter is limited to the number of tasks in the system, which given the Linux kernel's current structure is limited to far less than 2^30 on 32-bit systems and far less than 2^62 on 64-bit systems. (Something about a limited number of bytes in the kernel's address space.) Therefore, if srcu_read_lock() increments the upper bits, then srcu_read_unlock() need not do so. In this case, an srcu_read_lock() and an srcu_read_unlock() will flip the lower bit of the upper field of the counter. An unreasonably large additional number of srcu_read_unlock() operations would be required to return the counter to its initial value, thus preserving the guarantee. This commit takes this approach, which further allows it to shrink the size of the upper field to one bit, making the number of srcu_read_unlock() operations required to return the counter to its initial value even more unreasonable than before. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-30 10:48:20 -07:00
Lai Jiangshan	4b7a3e9e32	rcu: Remove fast check path from __synchronize_srcu() The fastpath in __synchronize_srcu() is designed to handle cases where there are a large number of concurrent calls for the same srcu_struct structure. However, the Linux kernel currently does not use SRCU in this manner, so remove the fastpath checks for simplicity. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-30 10:48:20 -07:00
Paul E. McKenney	cef50120b6	rcu: Direct algorithmic SRCU implementation The current implementation of synchronize_srcu_expedited() can cause severe OS jitter due to its use of synchronize_sched(), which in turn invokes try_stop_cpus(), which causes each CPU to be sent an IPI. This can result in severe performance degradation for real-time workloads and especially for short-interation-length HPC workloads. Furthermore, because only one instance of try_stop_cpus() can be making forward progress at a given time, only one instance of synchronize_srcu_expedited() can make forward progress at a time, even if they are all operating on distinct srcu_struct structures. This commit, inspired by an earlier implementation by Peter Zijlstra (https://lkml.org/lkml/2012/1/31/211) and by further offline discussions, takes a strictly algorithmic bits-in-memory approach. This has the disadvantage of requiring one explicit memory-barrier instruction in each of srcu_read_lock() and srcu_read_unlock(), but on the other hand completely dispenses with OS jitter and furthermore allows SRCU to be used freely by CPUs that RCU believes to be idle or offline. The update-side implementation handles the single read-side memory barrier by rechecking the per-CPU counters after summing them and by running through the update-side state machine twice. This implementation has passed moderate rcutorture testing on both x86 and Power. Also updated to use this_cpu_ptr() instead of per_cpu_ptr(), as suggested by Peter Zijlstra. Reported-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Reviewed-by: Lai Jiangshan <laijs@cn.fujitsu.com>	2012-04-30 10:48:19 -07:00
Paul E. McKenney	fae4b54f28	rcu: Introduce rcutorture testing for rcu_barrier() Although rcutorture does invoke rcu_barrier() and friends, it cannot really be called a torture test given that it invokes them only once at the end of the test. This commit therefore introduces heavy-duty rcutorture testing for rcu_barrier(), which may be carried out concurrently with normal rcutorture testing. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-30 10:48:18 -07:00
Paul E. McKenney	048a0e8f5e	timer: Fix mod_timer_pinned() header comment The mod_timer_pinned() header comment states that it prevents timers from being migrated to a different CPU. This is not the case, instead, it ensures that the timer is posted to the current CPU, but does nothing to prevent CPU-hotplug operations from migrating the timer. This commit therefore brings the comment header into alignment with reality. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Steven Rostedt <rostedt@goodmis.org>	2012-04-26 12:28:03 -07:00
Paul E. McKenney	79b9a75fb7	rcu: Add warning for RCU_FAST_NO_HZ timer firing RCU_FAST_NO_HZ uses a timer to limit the time that a CPU with callbacks can remain in dyntick-idle mode. This timer is cancelled when the CPU exits idle, and therefore should never fire. However, if the timer were migrated to some other CPU for whatever reason (1) the timer could actually fire and (2) firing on some other CPU would fail to wake up the CPU with callbacks, possibly resulting in sluggishness or a system hang. This commit therfore adds a WARN_ON_ONCE() to the timer handler in order to detect this condition. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-26 08:49:05 -07:00
Paul E. McKenney	37e377d282	rcu: Fixes to rcutorture error handling and cleanup The rcutorture initialization code ignored the error returns from rcu_torture_onoff_init() and rcu_torture_stall_init(). The rcutorture cleanup code failed to NULL out a number of pointers. These bugs will normally have no effect, but this commit fixes them nevertheless. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-24 20:55:39 -07:00
Paul E. McKenney	c57afe80db	rcu: Make RCU_FAST_NO_HZ account for pauses out of idle Both Steven Rostedt's new idle-capable trace macros and the RCU_NONIDLE() macro can cause RCU to momentarily pause out of idle without the rest of the system being involved. This can cause rcu_prepare_for_idle() to run through its state machine too quickly, which can in turn result in needless scheduling-clock interrupts. This commit therefore adds code to enable rcu_prepare_for_idle() to distinguish between an initial entry to idle on the one hand (which needs to advance the rcu_prepare_for_idle() state machine) and an idle reentry due to idle-capable trace macros and RCU_NONIDLE() on the other hand (which should avoid advancing the rcu_prepare_for_idle() state machine). Additional state is maintained to allow the timer to be correctly reposted when returning after a momentary pause out of idle, and even more state is maintained to detect when new non-lazy callbacks have been enqueued (which may require re-evaluation of the approach to idleness). Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-24 20:55:20 -07:00
Paul E. McKenney	2ee3dc8066	rcu: Make RCU_FAST_NO_HZ use timer rather than hrtimer The RCU_FAST_NO_HZ facility uses an hrtimer to wake up a CPU when it is allowed to go into dyntick-idle mode, which is almost always cancelled soon after. This is not what hrtimers are good at, so this commit switches to the timer wheel. Reported-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-24 20:55:19 -07:00
Paul E. McKenney	2fdbb31b66	rcu: Add RCU_FAST_NO_HZ tracing for idle exit Traces of rcu_prep_idle events can be confusing because rcu_cleanup_after_idle() does no tracing. This commit therefore adds this tracing. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-24 20:55:19 -07:00
Paul E. McKenney	6d8133919b	rcu: Document why rcu_blocking_is_gp() is safe The rcu_blocking_is_gp() function tests to see if there is only one online CPU, and if so, synchronize_sched() and friends become no-ops. However, for larger systems, num_online_cpus() scans a large vector, and might be preempted while doing so. While preempted, any number of CPUs might come online and go offline, potentially resulting in num_online_cpus() returning 1 when there never had only been one CPU online. This could result in a too-short RCU grace period, which could in turn result in total failure, except that the only way that the grace period is too short is if there is an RCU read-side critical section spanning it. For RCU-sched and RCU-bh (which are the only cases using rcu_blocking_is_gp()), RCU read-side critical sections have either preemption or bh disabled, which prevents CPUs from going offline. This in turn prevents actual failures from occurring. This commit therefore adds a large block comment to rcu_blocking_is_gp() documenting why it is safe. This commit also moves rcu_blocking_is_gp() into kernel/rcutree.c, which should help prevent unwary developers from mistaking it for a generally useful function. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-24 20:54:53 -07:00
Paul E. McKenney	dabb8aa960	rcu: Document kernel command-line parameters Bring RCU's kernel command-line parameter documentation up to date. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-24 20:54:53 -07:00
Paul E. McKenney	8932a63d5e	rcu: Reduce cache-miss initialization latencies for large systems Commit #0209f649 (rcu: limit rcu_node leaf-level fanout) set an upper limit of 16 on the leaf-level fanout for the rcu_node tree. This was needed to reduce lock contention that was induced by the synchronization of scheduling-clock interrupts, which was in turn needed to improve energy efficiency for moderate-sized lightly loaded servers. However, reducing the leaf-level fanout means that there are more leaf-level rcu_node structures in the tree, which in turn means that RCU's grace-period initialization incurs more cache misses. This is not a problem on moderate-sized servers with only a few tens of CPUs, but becomes a major source of real-time latency spikes on systems with many hundreds of CPUs. In addition, the workloads running on these large systems tend to be CPU-bound, which eliminates the energy-efficiency advantages of synchronizing scheduling-clock interrupts. Therefore, these systems need maximal values for the rcu_node leaf-level fanout. This commit addresses this problem by introducing a new kernel parameter named RCU_FANOUT_LEAF that directly controls the leaf-level fanout. This parameter defaults to 16 to handle the common case of a moderate sized lightly loaded servers, but may be set higher on larger systems. Reported-by: Mike Galbraith <efault@gmx.de> Reported-by: Dimitri Sivanich <sivanich@sgi.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-24 20:54:52 -07:00
Jan Engelhardt	d8169d4c36	rcu: Make __kfree_rcu() less dependent on compiler choices Currently, __kfree_rcu() is implemented as an inline function, and contains a BUILD_BUG_ON() that malfunctions if __kfree_rcu() is compiled as an out-of-line function. Unfortunately, there are compiler settings (e.g., -O0) that can result in __kfree_rcu() being compiled out of line, resulting in annoying build breakage. This commit therefore converts both __kfree_rcu() and __is_kfree_rcu_offset() from inline functions to macros to prevent such misbehavior on the part of the compiler. Signed-off-by: Jan Engelhardt <jengelh@medozas.de> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2012-04-24 20:54:51 -07:00
Paul E. McKenney	c9336643e1	rcu: Clarify help text for RCU_BOOST_PRIO The old text confused real-time applications with real-time threads, so that you pretty much needed to understand how this kernel configuration parameter worked to understand the help text. This commit therefore attempts to make the help text human-readable. Reported-by: Jörn Engel <joern@purestorage.com> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-24 20:54:50 -07:00
Michel Machado	f88022a4f6	rcu: Replace list_first_entry_rcu() with list_first_or_null_rcu() The list_first_entry_rcu() macro is inherently unsafe because it cannot be applied to an empty list. But because RCU readers do not exclude updaters, a list might become empty between the time that list_empty() claimed it was non-empty and the time that list_first_entry_rcu() is invoked. Therefore, the list_empty() test cannot be separated from the list_first_entry_rcu() call. This commit therefore combines these to macros to create a new list_first_or_null_rcu() macro that replaces the old (and unsafe) list_first_entry_rcu() macro. This patch incorporates Paul's review comments on the previous version of this patch available here: https://lkml.org/lkml/2012/4/2/536 This patch cannot break any upstream code because list_first_entry_rcu() is not being used anywhere in the kernel (tested with grep(1)), and any external code using it is probably broken as a result of using it. Signed-off-by: Michel Machado <michel@digirati.com.br> CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> CC: Dipankar Sarma <dipankar@in.ibm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-24 20:54:49 -07:00
Dave Jones	559f9badd1	rcu: List-debug variants of rcu list routines. * Make __list_add_rcu check the next->prev and prev->next pointers just like __list_add does. * Make list_del_rcu use __list_del_entry, which does the same checking at deletion time. Has been running for a week here without anything being tripped up, but it seems worth adding for completeness just in case something ever does corrupt those lists. Signed-off-by: Dave Jones <davej@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-24 20:54:49 -07:00
Linus Torvalds	66f75a5d02	Linux 3.4-rc4	2012-04-21 14:47:52 -07:00
Yong Zhang	e9a5ea1852	sparc32,leon: add notify_cpu_starting() Otherwise cpu_active_mask will not set, which lead to other issue. Signed-off-by: Yong Zhang <yong.zhang0@gmail.com> Signed-off-by: Konrad Eisele <konrad@gaisler.com> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2012-04-21 16:35:06 -04:00
Linus Torvalds	8f4f9d4d3c	ARM: SoC fixes for 3.4-rc - at91, ux500, imx, omap and bcmring: - at91 fixes for =m driver build issues, irqdomain fixes and config dependency fixes - ux500 kconfig dependency fixes and a smp wakeup bugfix - imx idle bugfix and build fix due to irq domain changes - omap uart pinmux fixes, softreset regression revert and misc fixes - bcmring build error regression fix - ux500 and imx had some small defconfig updates in this branch -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPksdXAAoJEIwa5zzehBx3iqIP/ibJhM5QYWCXCoudOHouvXW7 FALYsWTJodKf3qN0SQtty3RdmEKjdvCHGPcwtSEtjIc5xngCtYPEd5JWNl1u94nb f0+rhELfHljG+5XSyqnZfPmAN74ApvULl52hGXVudLjEwCB4uoYV5BN4c0dyr3Of Wm1+9HUyUo/WwXbE9UxxJkLDVsB+eAm2iOeAcerxCqsgKUzUqGP3fgp2eVJ7Q8LT f9tiSzaLeQnbYVymNeAiCzk3L9lKFx5r3QoxH2QOW6ieNUqAZC11X3L9anj30joG Ns1dMf5jGbWoSbHGMbff+PWj1sRxOzuoksjZ/jEZ2eXgNF4maoFrPqFZe6+o4K36 pekjYwKfeuKdT9JXPSwJe3yQULxfWMTwvg2ZF86R9KOIcRGI2gwGDDTYZ4LOr8mp vQbOMWRGDNFxzWPZA9BfMDnG5AQxuoBlprnZQht2HqLar0dbHTx0qsqoxttOwenG GwnLG0ZiwsCkrXcAJ/PSSlHfmhEE37H8NsCzBXMeF1kpWwDJZROjEOdDQREuwtYB qcQ7GfQ9u8ysemMbXyrAM+SySsc9r8HXw/J2NMIlEBSdPba6CIgIQigUMYnWgFt8 oJAPuTsT4/VIJbqCLkEOai3m5oaGllZ0zd5+SxzSbmmhybtFUrttjMP4WCUTqtuX lEKPTuNko5mb+2q6MEW0 =mzPZ -----END PGP SIGNATURE----- Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull "ARM: SoC fixes" from Olof Johansson: * at91, ux500, imx, omap and bcmring: - at91 fixes for =m driver build issues, irqdomain fixes and config dependency fixes - ux500 kconfig dependency fixes and a smp wakeup bugfix - imx idle bugfix and build fix due to irq domain changes - omap uart pinmux fixes, softreset regression revert and misc fixes - bcmring build error regression fix * ux500 and imx had some small defconfig updates in this branch * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (27 commits) ARM: bcmring: fix UART declarations ARM: imx: Fix imx5 idle logic bug ARM: imx27-dt: Fix build due to removal of irq_domain_add_simple() ARM: imx_v4_v5_defconfig: Add support for CONFIG_REGULATOR_FIXED_VOLTAGE ARM: OMAP1: DMTIMER: fix broken timer clock source selection ARM: OMAP: serial: Fix the ocp smart idlemode handling bug ARM: OMAP2+: UART: Fix incorrect population of default uart pads ARM: OMAP: sram: fix BUG in dpll code for !PM case dmaengine: Kconfig: fix Atmel at_hdmac entry USB: gadget/at91_udc: add gpio_to_irq() function to vbus interrupt USB: ohci-at91: change annotations for probe/remove functions leds-atmel-pwm.c: Make pwmled_probe() __devinit ARM: at91: fix at91sam9261ek Ethernet dm9000 irq ARM: at91: fix rm9200ek flash size ARM: at91: remove empty at91_init_serial function ARM: at91: fix typo in at91_pmc_base assembly declaration ARM: at91: Export at91_matrix_base ARM: at91: Export at91_pmc_base ARM: at91: Export at91_ramc_base ARM: at91: Export at91_st_base ...	2012-04-21 12:45:52 -07:00
Linus Torvalds	126a3483d6	MMC fixes for 3.4-rc4: The major fixes here are: * Build fix for omap_hsmmc with OF against 3.4-rc1. * Fix CONFIG_MMC_UNSAFE_RESUME semantics regression against 3.3, which broke hotplug card detection when UNSAFE_RESUME is set. * Fix a race condition in omap_hsmmc with runtime PM. * Fix two libertas SDIO-powered-resume regressions. Also small fixes for discard/sanitize, dw_mmc, cd-gpio and esdhc-imx. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPkh1jAAoJEHNBYZ7TNxYMe0EQAIpvPan5J3eT9rQU2R0Fsgbd hd10NVOPQAQaBDc59r/jmTj5iqDQQ1PGkrHti94IECCPWr3dbeTamhmUfYt6+XXZ Ae9dlYQ8llKa6ZLSVPmNsMLmFnsqDek3LIsxk/viJIKqSGee/yOHeM6W7v6d2T+z GCLFtV3xADU4F9PrZxdfWOJ3kDsV9WCTYnn5/WR78maSURo8iAfck5u+isitO0rn NaXfgO29WIHlYmRHRpXMGk2pCmv8cMX8i89mBExeIWdfn40yGCZRi+kUb0VquR1W 7lqdeKEF0NY1EU9ejuPZK2VepvxEhC1ibOlPkGfgHBd52YLrDaSl3WcqKdzKILIm wYKEls8AU5ZnWiuRpJq7QpPemTsbFQpI72O6OgntYhwiKE2TkJ0BY3KtjvFpVCxi s+zqZsQwIt6rjJZyQzg1Y5yGA42JF75rkzWnRtRfWcSma83MFiZawAVAV/qGZsrn EddUdOodvvO3owC9dD4wI0Q2eCRSZog3sVO9uSQg8zVHvfRGD61GBvqguyR8mbiH +CUAMMD9mpKiApu3gvHbjn3OjZuKQ9AGDz0SYj+PbQEwsEBF+qXs5BeytBC8S1Yh cNNdpvzjZl7EM/lJWywZ1XQ741Tg1Qgdgtx8njsoilfE51UTlR9TUd04V0bg0Fk2 fS3P+BRKf+4WR+iGyXJ1 =Zl8K -----END PGP SIGNATURE----- Merge tag 'mmc-fixes-for-3.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc Pull MMC fixes from Chris Ball: - Build fix for omap_hsmmc with OF against 3.4-rc1. - Fix CONFIG_MMC_UNSAFE_RESUME semantics regression against 3.3, which broke hotplug card detection when UNSAFE_RESUME is set. - Fix a race condition in omap_hsmmc with runtime PM. - Fix two libertas SDIO-powered-resume regressions. - Small fixes for discard/sanitize, dw_mmc, cd-gpio and esdhc-imx. * tag 'mmc-fixes-for-3.4-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc: mmc: core: Do not pre-claim host in suspend mmc: dw_mmc: prevent NULL dereference for dma_ops mmc: unbreak sdhci-esdhc-imx on i.MX25 mmc: cd-gpio: Include header to pickup exported symbol prototypes mmc: sdhci: refine non-removable card checking for card detection mmc: dw_mmc: Fix switch from DMA to PIO mmc: remove MMC bus legacy suspend/resume method mmc: omap_hsmmc: Get rid of of_have_populated_dt() usage mmc: omap_hsmmc: build fix for CONFIG_OF=y and CONFIG_MMC_OMAP_HS=m mmc: fixes for eMMC v4.5 sanitize operation mmc: fixes for eMMC v4.5 discard operation	2012-04-21 12:44:37 -07:00
Linus Torvalds	8898159650	Merge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: - Fixes a regression at DVB core when switching from DVB-S2 to DVB-S on Kaffeine (Fedora 16 Bugzilla #812895); - Fixes a mutex unlock at an error condition at drx-k; - Fix winbond-cir set mode; - mt9m032: Fix a compilation breakage with some random Kconfig; - mt9m032: fix two dead locks; - xc5000: don't require an special firmware (that won't be provided by the vendor) just because the xtal frequency is different; - V4L DocBook: fix some typos at multi-plane formats description. * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] xc5000: support 32MHz & 31.875MHz xtal using the 41.024.5 firmware [media] V4L: mt9m032: fix compilation breakage [media] V4L: DocBook: Fix typos in the multi-plane formats description [media] V4L: mt9m032: fix two dead-locks [media] rc-core: set mode for winbond-cir [media] drxk: Does not unlock mutex if sanity check failed in scu_command() [media] dvb_frontend: Fix a regression when switching back to DVB-S	2012-04-21 12:43:23 -07:00
Linus Torvalds	9f24ff6f42	First MFD pull request for 3.4 fixes -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPkXdoAAoJEIqAPN1PVmxKOuMP/3K87kcpwUUI/vA0pSPYf58T Q+Sxsd85C6c0SOvE3MOI+1stibLAXeeT+MsmMKYIhmAXbTtKsmMW5TC1aTapJHQx kDGuhqiw5Zyk5tPrZ333cLBdgiDDr8qWUBRzcNCK5O1xuDET76JtQwqtehSoDXDh Afcg3BLzYA3HIz0nm+Wlll1yeyKrAg20dESOCvl1ptNbb2BVBSfaBpOqTjw6R88J BRtua//L9HGHQIRntYnrH6/nzwDAhkrw2m3p1ZGWG+y5j88cQy4s0/dtZ7FJ8ZAE qoUx2YqH6dPYGZa2A6XaOkF4hvDC6iAXawWllvsDxcQSYRWR4qxmHYm5KkxyT6y9 UACk+c7qdRmZgHfPcNNaq5CPDAEFvSFRKfDBpXUJdO6O/bVzBsA/P4fCjYFZ1FOC NQtouAbz2BpH1iwCMRWtTsCSwiVXSHQL/jR4vQrtXU6KwX1ArKF5W1zTvnbaK13c Bc9E4Se4Hn5Bs+FkJIbBnViAW/9gv7KUe9AtDjhcrUWkxZLswDnXhUd1k2x1Gxfp WQf29FZmoLiITA4ffsizqR6wC98lzIrHW29FdoSyTnz9SSoqo6J10l82w8ED45lJ wGanen7Txjsc2ub9GYqzCUYHGBitLfaQSkSvBIRSWc43Ju3b0l/esH12ioajjSEu sAvMHCkaR7l7NZVEt6rS =gHlK -----END PGP SIGNATURE----- Merge tag 'mfd-for-linus-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6 Pull MFD fixes from Samuel Ortiz: "We have 3 build fixes, a OMAP USB host PHY reset fix and the twl6040 conversion to an i2c driver. The latter may not sound like a fix but the twl6040 MFD driver won't probe without it, triggering an OMAP4 audio regression." * tag 'mfd-for-linus-3.4' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6: mfd: Fix modular builds of rc5t583 regulator support mfd: Fix asic3_gpio_to_irq ARM: OMAP3: USB: Fix the EHCI ULPI PHY reset issue mfd: Convert twl6040 to i2c driver, and separate it from twl core mfd : Fix dbx500 compilation error	2012-04-21 12:42:12 -07:00
Al Viro	bfce281c28	kill mm argument of vm_munmap() it's always current->mm Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-04-21 01:58:20 -04:00
Al Viro	9f3a4afb27	perfmon: kill some helpers and arguments pfm_vm_munmap() is simply vm_munmap() and pfm_remove_smpl_mapping() always get current as the first argument. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-04-21 01:58:18 -04:00
Al Viro	936af1576e	aio: don't bother with unmapping when aio_free_ring() is coming from exit_aio() ... since exit_mmap() is coming and it will munmap() everything anyway. In all other cases aio_free_ring() has ctx->mm == current->mm; moreover, all other callers of vm_munmap() have mm == current->mm, so this will allow us to get rid of mm argument of vm_munmap(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-04-21 01:58:16 -04:00
Ulf Hansson	7c57091940	mmc: core: Do not pre-claim host in suspend Since SDIO drivers may want to do some SDIO operations in their suspend callback functions, we must not keep the host claimed when calling them. Daniel Drake reported that libertas_sdio encountered a deadlock in its suspend function. Signed-off-by: Ulf Hansson <ulf.hansson@stericsson.com> Tested-by: Daniel Drake <dsd@laptop.org> [stable@: please apply to 3.2-stable and 3.3-stable] Cc: stable <stable@vger.kernel.org> Signed-off-by: Chris Ball <cjb@laptop.org>	2012-04-20 21:52:13 -04:00
Jaehoon Chung	e1631f989e	mmc: dw_mmc: prevent NULL dereference for dma_ops Now, dma_ops is assumed that use the IDMAC. But if dma_ops is assigned the pdata->dma_ops, we didn't ensure that callback function is defined. If the callback isn't defined, then we should run in PIO mode. Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com> Acked-by: Will Newton <will.newton@imgtec.com> Signed-off-by: Chris Ball <cjb@laptop.org>	2012-04-20 21:52:05 -04:00
Eric Bénard	b89152824f	mmc: unbreak sdhci-esdhc-imx on i.MX25 This was broken by me in 37865fe91582582a6f6c00652f6a2b1ff71f8a78 ("mmc: sdhci-esdhc-imx: fix timeout on i.MX's sdhci") where more extensive tests would have shown that read or write of data to the card were failing (even if the partition table was correctly read). Signed-off-by: Eric Bénard <eric@eukrea.com> Acked-by: Wolfram Sang <w.sang@pengutronix.de> Cc: stable <stable@vger.kernel.org> Signed-off-by: Chris Ball <cjb@laptop.org>	2012-04-20 20:45:00 -04:00
H Hartley Sweeten	5ca6518832	mmc: cd-gpio: Include header to pickup exported symbol prototypes Include the linux/mmc/cd-gpio.h header to pickup the prototypes for the two exported symbols. This quiets the sparse warnings: warning: symbol 'mmc_cd_gpio_request' was not declared. Should it be static? warning: symbol 'mmc_cd_gpio_free' was not declared. Should it be static? Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Acked-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de> Acked-by: Venkatraman S <svenkatr@ti.com> Signed-off-by: Chris Ball <cjb@laptop.org>	2012-04-20 20:45:00 -04:00
Daniel Drake	87b87a3fc0	mmc: sdhci: refine non-removable card checking for card detection Commit c79396c191bc19 ("mmc: sdhci: prevent card detection activity for non-removable cards") disables card detection where the cards are marked as non-removable. This makes sense, but the implementation detail of calling mmc_card_is_removable() causes some problems, because mmc_card_is_removable() is overloaded with CONFIG_MMC_UNSAFE_RESUME semantics. In the OLPC XO case, we need CONFIG_MMC_UNSAFE_RESUME because our root filesystem is stored on SD, but we also have external SD card slots where we want automatic card detection. Refine the check to only apply to hosts marked as MMC_CAP_NONREMOVABLE, which is defined to mean that the card is really nonremovable. This could be revisited in future if we find a way to improve CONFIG_MMC_UNSAFE_RESUME semantics. Signed-off-by: Daniel Drake <dsd@laptop.org> Acked-by: Chuanxiao Dong <chuanxiao.dong@intel.com> [stable@: please apply to 3.3-stable] Cc: stable <stable@vger.kernel.org> Signed-off-by: Chris Ball <cjb@laptop.org>	2012-04-20 20:44:25 -04:00
Seungwon Jeon	a99aa9b9b4	mmc: dw_mmc: Fix switch from DMA to PIO When dw_mci_pre_dma_transfer returns failure in some reasons, dw_mci_submit_data will prepare to switch the PIO mode from DMA. After switching to PIO mode, DMA(IDMAC in particular) is still enabled. This makes the corruption in handling interrupt and the driver lock-up. Signed-off-by: Seungwon Jeon <tgih.jun@samsung.com> Acked-by: Will Newton <will.newton@imgtec.com> Signed-off-by: Chris Ball <cjb@laptop.org>	2012-04-20 20:30:37 -04:00
Chuanxiao Dong	32d317c60e	mmc: remove MMC bus legacy suspend/resume method MMC bus is using legacy suspend/resume method, which is not compatible if runtime pm callbacks are used. In this scenario, MMC bus suspend/resume callbacks cannot be called when system entering S3. So change to use the new defined dev_pm_ops for system sleeping mode. Tested on AM335x Platform. Solves major issue/crash reported at http://www.mail-archive.com/linux-omap@vger.kernel.org/msg65425.html Signed-off-by: Chuanxiao Dong <chuanxiao.dong@intel.com> Tested-by: Hebbar, Gururaja <gururaja.hebbar@ti.com> Acked-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Ulf Hansson <ulf.hansson@stericsson.com> Signed-off-by: Chris Ball <cjb@laptop.org>	2012-04-20 20:30:19 -04:00
Linus Torvalds	6be5ceb02e	VM: add "vm_mmap()" helper function This continues the theme started with vm_brk() and vm_munmap(): vm_mmap() does the same thing as do_mmap(), but additionally does the required VM locking. This uninlines (and rewrites it to be clearer) do_mmap(), which sadly duplicates it in mm/mmap.c and mm/nommu.c. But that way we don't have to export our internal do_mmap_pgoff() function. Some day we hopefully don't have to export do_mmap() either, if all modular users can become the simpler vm_mmap() instead. We're actually very close to that already, with the notable exception of the (broken) use in i810, and a couple of stragglers in binfmt_elf. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-04-20 17:29:13 -07:00

1 2 3 4 5 ...

299548 Commits