iv/linux

Ben Zhang 62572e29bc kernel/watchdog.c: touch_nmi_watchdog should only touch local cpu not every one

I ran into a scenario where while one cpu was stuck and should have panic'd because of the NMI watchdog, it didn't. The reason was another cpu was spewing stack dumps on to the console. Upon investigation, I noticed that when writing to the console and also when dumping the stack, the watchdog is touched. This causes all the cpus to reset their NMI watchdog flags and the 'stuck' cpu just spins forever.

This change causes the semantics of touch_nmi_watchdog to be changed slightly. Previously, I accidentally changed the semantics and we noticed there was a codepath in which touch_nmi_watchdog could be touched from a preemptible area. That caused a BUG() to happen when CONFIG_DEBUG_PREEMPT was enabled. I believe it was the acpi code.

My attempt here re-introduces the change to have the touch_nmi_watchdog() code only touch the local cpu instead of all of the cpus. But instead of using __get_cpu_var(), I use the __raw_get_cpu_var() version.

This avoids the preemption problem. However my reasoning wasn't because I was trying to be lazy. Instead I rationalized it as, well if preemption is enabled then interrupts should be enabled too and the NMI watchdog will have no reason to trigger. So it won't matter if the wrong cpu is touched because the percpu interrupt counters the NMI watchdog uses should still be incrementing.

Don said:

: I'm ok with this patch, though it does alter the behaviour of how
: touch_nmi_watchdog works. For the most part I don't think most callers
: need to touch all of the watchdogs (on each cpu). Perhaps a corner case
: will pop up (the scheduler?? to mimic touch_all_softlockup_watchdogs() ).
:
: But this does address an issue where if a system is locked up and one cpu
: is spewing out useful debug messages (or error messages), the hard lockup
: will fail to go off. We have seen this on RHEL also.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Signed-off-by: Ben Zhang <benzh@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2014-04-03 16:20:58 -07:00
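
The difference between the two touch semantics described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions and is not the code in kernel/watchdog.c: every name here (MODEL_NR_CPUS, model_touched, model_touch_all, model_touch_local, model_check_lockup, model_reset) is hypothetical, and a plain array stands in for the per-cpu watchdog flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_NR_CPUS 4	/* hypothetical cpu count for this model */

/* One "touched" flag per modeled cpu, standing in for a per-cpu flag. */
static bool model_touched[MODEL_NR_CPUS];

/* Clear all flags so each scenario starts from a known state. */
static void model_reset(void)
{
	memset(model_touched, 0, sizeof(model_touched));
}

/* Old semantics from the commit message: a touch marks every cpu. */
static void model_touch_all(void)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		model_touched[cpu] = true;
}

/* New semantics: a touch marks only the calling cpu. */
static void model_touch_local(int this_cpu)
{
	assert(this_cpu >= 0 && this_cpu < MODEL_NR_CPUS);
	model_touched[this_cpu] = true;
}

/*
 * Modeled watchdog check for one cpu. A set flag suppresses the report
 * once and is then cleared; otherwise a stuck cpu is reported.
 * Returns true if a hard lockup would be reported.
 */
static bool model_check_lockup(int cpu, bool cpu_is_stuck)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	if (model_touched[cpu]) {
		model_touched[cpu] = false;
		return false;
	}
	return cpu_is_stuck;
}
```

In this model, if cpu 0 keeps calling model_touch_all() while dumping stacks, model_check_lockup(1, true) keeps returning false and the stuck cpu 1 is never reported. With model_touch_local(0), only cpu 0's flag is set, so the check for the stuck cpu 1 reports the lockup.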
..
debug	KGDB: make kgdb_breakpoint() as noinline	2014-02-26 11:16:25 +00:00
events	perf: Optimize group_sched_in()	2014-02-27 12:43:26 +01:00
gcov	gcov: reuse kbasename helper	2013-11-13 12:09:34 +09:00
irq	genirq: Export symbol no_action()	2014-03-22 11:33:09 +01:00
locking	Merge branch 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2014-03-31 14:13:25 -07:00
power	Merge branches 'pm-runtime' and 'pm-sleep'	2014-03-20 13:25:54 +01:00
printk	printk: fix syslog() overflowing user buffer	2014-02-17 12:24:45 -08:00
rcu	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2014-03-31 11:21:19 -07:00
sched	Merge branch 'sched-idle-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2014-04-02 16:22:27 -07:00
time	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2014-04-01 11:00:07 -07:00
trace	Merge branch 'for-3.15/core' of git://git.kernel.dk/linux-block	2014-04-01 19:19:15 -07:00
.gitignore	Ignore generated file kernel/x509_certificate_list	2013-12-10 18:21:34 +00:00
acct.c
async.c
audit_tree.c	inotify: Fix reporting of cookies for inotify events	2014-02-18 11:17:17 +01:00
audit_watch.c	inotify: Fix reporting of cookies for inotify events	2014-02-18 11:17:17 +01:00
audit.c	AUDIT: Allow login in non-init namespaces	2014-03-30 17:02:53 -07:00
audit.h	audit: Use struct net not pid_t to remember the network namespce to reply in	2014-02-28 04:04:33 -08:00
auditfilter.c	audit: Update kdoc for audit_send_reply and audit_list_rules_send	2014-03-08 15:31:54 -08:00
auditsc.c	execve: use 'struct filename *' for executable name passing	2014-02-05 12:54:53 -08:00
backtracetest.c
bounds.c	mm: do not allocate page->ptl dynamically, if spinlock_t fits to long	2013-12-20 12:25:45 -08:00
capability.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security	2014-04-03 09:26:18 -07:00
cgroup_freezer.c	cgroup: replace cftype->read_seq_string() with cftype->seq_show()	2013-12-05 12:28:04 -05:00
cgroup.c	cgroup: fix a failure path in create_css()	2014-03-18 17:15:36 -04:00
compat.c	Merge branch 'x86-x32-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2014-04-02 12:51:41 -07:00
configs.c
context_tracking.c	context_tracking: Wrap static key check into more intuitive function name	2013-12-02 20:43:14 +01:00
cpu_pm.c
cpu.c	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2013-11-14 16:55:11 +09:00
cpuset.c	cpuset: fix a race condition in __cpuset_node_allowed_softwall()	2014-02-27 09:39:54 -05:00
crash_dump.c
cred.c
delayacct.c	kernel/delayacct.c: remove redundant checking in __delayacct_add_tsk()	2013-11-13 12:09:12 +09:00
dma.c
elfcore.c	switch elf_core_write_extra_phdrs() to dump_emit()	2013-11-09 00:16:23 -05:00
exec_domain.c
exit.c	introduce for_each_thread() to replace the buggy while_each_thread()	2014-01-21 16:19:46 -08:00
extable.c	asmlinkage: Make main_extable_sort_needed visible	2014-02-13 18:13:22 -08:00
fork.c	sched/numa: Move task_numa_free() to __put_task_struct()	2014-03-11 12:05:43 +01:00
freezer.c	libata, freezer: avoid block device removal while system is frozen	2013-12-19 13:50:32 -05:00
futex_compat.c	compat: Get rid of (get\|put)_compat_time(val\|spec)	2014-02-02 14:09:12 -08:00
futex.c	Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2014-03-31 10:59:39 -07:00
groups.c	userns: Kill nsown_capable it makes the wrong thing easy	2013-08-30 23:44:11 -07:00
hrtimer.c	timer: Remove code redundancy while calling get_nohz_timer_target()	2014-03-20 12:35:46 +01:00
hung_task.c	hung_task: Display every hung task warning	2014-01-25 12:13:33 +01:00
irq_work.c	perf/x86: Warn to early_printk() in case irq_work is too slow	2014-02-21 21:49:07 +01:00
itimer.c
jump_label.c	static_key: WARN on usage before jump_label_init was called	2013-10-19 19:45:35 -04:00
kallsyms.c
kcmp.c
Kconfig.freezer
Kconfig.hz	kernel: remove CONFIG_USE_GENERIC_SMP_HELPERS	2013-11-15 09:32:22 +09:00
Kconfig.locks
Kconfig.preempt
kexec.c	kexec/compat: convert to COMPAT_SYSCALL_DEFINE with changing parameter types	2014-03-06 16:30:46 +01:00
kmod.c	execve: use 'struct filename *' for executable name passing	2014-02-05 12:54:53 -08:00
kprobes.c	kprobes: use KSYM_NAME_LEN to size identifier buffers	2013-11-13 12:09:26 +09:00
ksysfs.c	rcu: Fix sparse warning for rcu_expedited from kernel/ksysfs.c	2014-02-26 06:35:16 -08:00
kthread.c	kthread: ensure locality of task_struct allocations	2014-04-03 16:20:49 -07:00
latencytop.c
Makefile	Merge branch 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2014-03-31 14:13:25 -07:00
module_signing.c	keys: change asymmetric keys to use common hash definitions	2013-10-25 17:15:18 -04:00
module-internal.h	KEYS: Separate the kernel signature checking keyring from module signing	2013-09-25 17:17:01 +01:00
module.c	Merge branch 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2014-03-31 14:13:25 -07:00
notifier.c	notifier: Substitute rcu_access_pointer() for rcu_dereference_raw()	2014-02-26 06:35:13 -08:00
nsproxy.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace	2013-09-07 14:35:32 -07:00
padata.c	padata: Fix wrong usage of rcu_dereference()	2013-12-05 21:28:42 +08:00
panic.c	Merge branch 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2014-03-31 14:13:25 -07:00
params.c	params: improve standard definitions	2013-12-04 14:09:46 +10:30
pid_namespace.c	pid_namespace: pidns_get() should check task_active_pid_ns() != NULL	2014-04-02 16:20:21 -07:00
pid.c	pidns: fix free_pid() to handle the first fork failure	2013-09-30 14:31:03 -07:00
posix-cpu-timers.c	posix-timers: Convert abuses of BUG_ON to WARN_ON	2013-12-09 16:56:29 +01:00
posix-timers.c
profile.c	mm: fix GFP_THISNODE callers and clarify	2014-03-10 17:26:19 -07:00
ptrace.c	kernel/compat: convert to COMPAT_SYSCALL_DEFINE	2014-03-06 15:35:10 +01:00
range.c
reboot.c	kexec: migrate to reboot cpu	2013-12-18 19:04:50 -08:00
relay.c	treewide: Fix typo in Documentation/DocBook	2014-02-19 14:58:17 +01:00
res_counter.c	memcg: reduce function dereference	2013-09-12 15:38:02 -07:00
resource.c	resources: Set type in __request_region()	2014-03-19 15:00:16 -06:00
seccomp.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security	2014-04-03 09:26:18 -07:00
signal.c	Merge branch 'master' into for-next	2014-02-20 14:54:28 +01:00
smp.c	smp: Rename __smp_call_function_single() to smp_call_function_single_async()	2014-02-24 14:47:15 -08:00
smpboot.c
smpboot.h
softirq.c	softirq: Add linux/irq.h to make it compile again	2014-03-19 11:28:14 +01:00
stacktrace.c
stop_machine.c	stop_machine: Fix^2 race between stop_two_cpus() and stop_cpus()	2014-03-11 11:33:47 +01:00
sys_ni.c
sys.c	sys: Replace hardcoding of -20 and 19 with MIN_NICE and MAX_NICE	2014-02-22 18:16:19 +01:00
sysctl_binary.c	kernel/sysctl_binary.c: use scnprintf() instead of snprintf()	2013-11-13 12:09:33 +09:00
sysctl.c	Merge branch 'for-3.15/core' of git://git.kernel.dk/linux-block	2014-04-01 19:19:15 -07:00
system_certificates.S	KEYS: correct alignment of system_certificate_list content in assembly file	2013-12-10 18:25:28 +00:00
system_keyring.c	KEYS: correct alignment of system_certificate_list content in assembly file	2013-12-10 18:25:28 +00:00
task_work.c	task_work: documentation	2013-09-11 15:58:27 -07:00
taskstats.c	genetlink: only pass array to genl_register_family_with_ops()	2013-11-19 16:39:05 -05:00
test_kprobes.c
time.c
timeconst.bc
timer.c	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2014-04-01 11:00:07 -07:00
torture.c	rcutorture: Gracefully handle NULL cleanup hooks	2014-02-23 09:04:39 -08:00
tracepoint.c	tracing: Do not add event files for modules that fail tracepoints	2014-03-03 21:11:05 -05:00
tsacct.c
uid16.c	userns: Kill nsown_capable it makes the wrong thing easy	2013-08-30 23:44:11 -07:00
up.c	smp: Rename __smp_call_function_single() to smp_call_function_single_async()	2014-02-24 14:47:15 -08:00
user_namespace.c	user_namespace.c: Remove duplicated word in comment	2014-02-20 11:58:35 -08:00
user-return-notifier.c
user.c	KEYS: fix uninitialized persistent_keyring_register_sem	2013-12-13 15:59:11 +00:00
utsname_sysctl.c
utsname.c	userns: Kill nsown_capable it makes the wrong thing easy	2013-08-30 23:44:11 -07:00
watchdog.c	kernel/watchdog.c: touch_nmi_watchdog should only touch local cpu not every one	2014-04-03 16:20:58 -07:00
workqueue_internal.h
workqueue.c	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2014-04-01 11:00:07 -07:00