Go to file
Petr Mladek 9bf3bc949f watchdog: cleanup handling of false positives
Commit d6ad3e286d ("softlockup: Add sched_clock_tick() to avoid kernel
warning on kgdb resume") introduced touch_softlockup_watchdog_sync().

It solved a problem when the watchdog was touched in an atomic context,
the timer callback was proceed right after releasing interrupts, and the
local clock has not been updated yet.  In this case, sched_clock_tick()
was called in watchdog_timer_fn() before updating the timer.

So far so good.

Later commit 5d1c0f4a80 ("watchdog: add check for suspended vm in
softlockup detector") added two kvm_check_and_clear_guest_paused()
calls.  They touch the watchdog when the guest has been sleeping.

The code makes my head spin around.

Scenario 1:

    + guest did sleep:
	+ PVCLOCK_GUEST_STOPPED is set

    + 1st watchdog_timer_fn() invocation:
	+ the watchdog is not touched yet
	+ is_softlockup() returns too big delay
	+ kvm_check_and_clear_guest_paused():
	   + clear PVCLOCK_GUEST_STOPPED
	   + call touch_softlockup_watchdog_sync()
		+ set SOFTLOCKUP_DELAY_REPORT
		+ set softlockup_touch_sync
	+ return from the timer callback

      + 2nd watchdog_timer_fn() invocation:

	+ call sched_clock_tick() even though it is not needed.
	  The timer callback was invoked again only because the clock
	  has already been updated in the meantime.

	+ call kvm_check_and_clear_guest_paused() that does nothing
	  because PVCLOCK_GUEST_STOPPED has been cleared already.

	+ call update_report_ts() and return. This is fine. Except
	  that sched_clock_tick() might allow to set it already
	  during the 1st invocation.

Scenario 2:

	+ guest did sleep

	+ 1st watchdog_timer_fn() invocation
	    + same as in 1st scenario

	+ guest did sleep again:
	    + set PVCLOCK_GUEST_STOPPED again

	+ 2nd watchdog_timer_fn() invocation
	    + SOFTLOCKUP_DELAY_REPORT is set from 1st invocation
	    + call sched_clock_tick()
	    + call kvm_check_and_clear_guest_paused()
		+ clear PVCLOCK_GUEST_STOPPED
		+ call touch_softlockup_watchdog_sync()
		    + set SOFTLOCKUP_DELAY_REPORT
		    + set softlockup_touch_sync
	    + call update_report_ts() (set real timestamp immediately)
	    + return from the timer callback

	+ 3rd watchdog_timer_fn() invocation
	    + timestamp is set from 2nd invocation
	    + softlockup_touch_sync is set but not checked because
	      the real timestamp is already set

Make the code more straightforward:

1. Always call kvm_check_and_clear_guest_paused() at the very
   beginning to handle PVCLOCK_GUEST_STOPPED. It touches the watchdog
   when the quest did sleep.

2. Handle the situation when the watchdog has been touched
   (SOFTLOCKUP_DELAY_REPORT is set).

   Call sched_clock_tick() when touch_*sync() variant was used. It makes
   sure that the timestamp will be up to date even when it has been
   touched in atomic context or quest did sleep.

As a result, kvm_check_and_clear_guest_paused() is called on a single
location.  And the right timestamp is always set when returning from the
timer callback.

Link: https://lkml.kernel.org/r/20210311122130.6788-7-pmladek@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Laurence Oberman <loberman@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-04-30 11:20:36 -07:00
arch arch/sh/include/asm/tlb.h: remove duplicate include 2021-04-30 11:20:35 -07:00
block SCSI misc on 20210428 2021-04-28 17:22:10 -07:00
certs certs: add 'x509_revocation_list' to gitignore 2021-04-26 10:48:07 -07:00
crypto for-5.13/drivers-2021-04-27 2021-04-28 14:39:37 -07:00
Documentation Kconfig updates for v5.13 2021-04-29 14:32:00 -07:00
drivers Networking changes for 5.13. 2021-04-29 11:57:23 -07:00
fs vfs: fs_parser: clean up kernel-doc warnings 2021-04-30 11:20:35 -07:00
include include/linux/compiler-gcc.h: sparse can do constant folding of __builtin_bswap*() 2021-04-30 11:20:35 -07:00
init Kconfig updates for v5.13 2021-04-29 14:32:00 -07:00
ipc fs: make helpers idmap mount aware 2021-01-24 14:27:20 +01:00
kernel watchdog: cleanup handling of false positives 2021-04-30 11:20:36 -07:00
lib Kbuild updates for v5.13 2021-04-29 14:24:39 -07:00
LICENSES LICENSES: Add the CC-BY-4.0 license 2020-12-08 10:33:27 -07:00
mm \n 2021-04-29 11:06:13 -07:00
net Networking changes for 5.13. 2021-04-29 11:57:23 -07:00
samples kfifo: fix ternary sign extension bugs 2021-04-30 11:20:35 -07:00
scripts scripts: a new script for checking duplicate struct declaration 2021-04-30 11:20:35 -07:00
security Networking changes for 5.13. 2021-04-29 11:57:23 -07:00
sound - Core Frameworks 2021-04-28 15:59:13 -07:00
tools Kbuild updates for v5.13 2021-04-29 14:24:39 -07:00
usr Kbuild updates for v5.12 2021-02-25 10:17:31 -08:00
virt KVM: x86/mmu: Consider the hva in mmu_notifier retry 2021-02-22 13:16:53 -05:00
.clang-format cxl for 5.12 2021-02-24 09:38:36 -08:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore kbuild: generate Module.symvers only when vmlinux exists 2021-04-25 05:17:02 +09:00
.mailmap It's been a relatively busy cycle in docsland, though more than usually 2021-04-26 13:22:43 -07:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS - Core Frameworks 2021-04-28 15:59:13 -07:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS Kbuild updates for v5.13 2021-04-29 14:24:39 -07:00
Makefile Kconfig updates for v5.13 2021-04-29 14:32:00 -07:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.