// SPDX-License-Identifier: GPL-2.0-only
/*
 * kernel/power/hibernate.c - Hibernation (a.k.a suspend-to-disk) support.
 *
 * Copyright (c) 2003 Patrick Mochel
 * Copyright (c) 2003 Open Source Development Lab
 * Copyright (c) 2004 Pavel Machek <pavel@ucw.cz>
 * Copyright (c) 2009 Rafael J. Wysocki, Novell Inc.
 * Copyright (C) 2012 Bojan Smojver <bojan@rexursive.com>
 */

#define pr_fmt(fmt) "PM: hibernation: " fmt

#include <linux/blkdev.h>
#include <linux/export.h>
#include <linux/suspend.h>
#include <linux/reboot.h>
#include <linux/string.h>
#include <linux/device.h>
#include <linux/async.h>
#include <linux/delay.h>
#include <linux/fs.h>
#include <linux/mount.h>
#include <linux/pm.h>
#include <linux/nmi.h>
#include <linux/console.h>
#include <linux/cpu.h>
#include <linux/freezer.h>
#include <linux/gfp.h>
#include <linux/syscore_ops.h>
#include <linux/ctype.h>
#include <linux/ktime.h>
#include <linux/security.h>
#include <linux/secretmem.h>
#include <trace/events/power.h>

#include "power.h"

static int nocompress;
static int noresume;
static int nohibernate;
static int resume_wait;
static unsigned int resume_delay;
static char resume_file[256] = CONFIG_PM_STD_PARTITION;
dev_t swsusp_resume_device;
sector_t swsusp_resume_block;
__visible int in_suspend __nosavedata;

static char hibernate_compressor[CRYPTO_MAX_ALG_NAME] = CONFIG_HIBERNATION_DEF_COMP;

/*
 * Compression/decompression algorithm to be used while saving/loading
 * image to/from disk. This would later be used in 'kernel/power/swap.c'
 * to allocate comp streams.
 */
char hib_comp_algo[CRYPTO_MAX_ALG_NAME];

enum {
	HIBERNATION_INVALID,
	HIBERNATION_PLATFORM,
	HIBERNATION_SHUTDOWN,
	HIBERNATION_REBOOT,
#ifdef CONFIG_SUSPEND
	HIBERNATION_SUSPEND,
#endif
	HIBERNATION_TEST_RESUME,
	/* keep last */
	__HIBERNATION_AFTER_LAST
};
#define HIBERNATION_MAX (__HIBERNATION_AFTER_LAST-1)
#define HIBERNATION_FIRST (HIBERNATION_INVALID + 1)

static int hibernation_mode = HIBERNATION_SHUTDOWN;

bool freezer_test_done;

static const struct platform_hibernation_ops *hibernation_ops;

static atomic_t hibernate_atomic = ATOMIC_INIT(1);

bool hibernate_acquire(void)
{
	return atomic_add_unless(&hibernate_atomic, -1, 0);
}

void hibernate_release(void)
{
	atomic_inc(&hibernate_atomic);
}
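
/*
 * Illustration only, not part of this file: hibernate_acquire() and
 * hibernate_release() treat hibernate_atomic as a single-slot token. The
 * counter starts at 1, acquiring moves it to 0 unless it is already 0, and
 * releasing adds 1 back. The userspace sketch below mimics that behaviour
 * with C11 atomics. Every sketch_* name is hypothetical, and the
 * compare-and-swap loop is an assumed stand-in for the kernel's
 * atomic_add_unless(), not its actual implementation.
 */

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for atomic_add_unless(v, a, u): add 'a' to '*v'
 * unless '*v' equals 'u'. Returns true if the addition was performed.
 */
static bool sketch_add_unless(atomic_int *v, int a, int u)
{
	int old = atomic_load(v);

	while (old != u) {
		/* On failure 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + a))
			return true;
	}
	return false;
}

/* One token available, mirroring ATOMIC_INIT(1). */
static atomic_int sketch_slot = 1;

static bool sketch_acquire(void)
{
	/* Succeeds only for the caller that moves the counter from 1 to 0. */
	return sketch_add_unless(&sketch_slot, -1, 0);
}

static void sketch_release(void)
{
	/* Hand the token back so a later acquire can succeed. */
	atomic_fetch_add(&sketch_slot, 1);
}
```

/*
 * In this sketch a second sketch_acquire() fails until sketch_release() has
 * been called, which is the mutual exclusion that the pair above provides
 * to its callers.
 */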

bool hibernation_available(void)
{
	return nohibernate == 0 &&
		!security_locked_down(LOCKDOWN_HIBERNATION) &&
		!secretmem_active() && !cxl_mem_active();
}

/**
 * hibernation_set_ops - Set the global hibernate operations.
 * @ops: Hibernation operations to use in subsequent hibernation transitions.
 */
void hibernation_set_ops(const struct platform_hibernation_ops *ops)
{
	unsigned int sleep_flags;

	if (ops && !(ops->begin && ops->end && ops->pre_snapshot
	    && ops->prepare && ops->finish && ops->enter && ops->pre_restore
	    && ops->restore_cleanup && ops->leave)) {
		WARN_ON(1);
		return;
	}

	sleep_flags = lock_system_sleep();

	hibernation_ops = ops;
	if (ops)
		hibernation_mode = HIBERNATION_PLATFORM;
	else if (hibernation_mode == HIBERNATION_PLATFORM)
		hibernation_mode = HIBERNATION_SHUTDOWN;

	unlock_system_sleep(sleep_flags);
}
EXPORT_SYMBOL_GPL(hibernation_set_ops);
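
/*
 * Illustration only, not part of this file: hibernation_set_ops() accepts
 * either NULL or a table in which every mandatory callback is set, and it
 * moves the default mode to "platform" on registration and back to
 * "shutdown" on removal. The userspace sketch below reproduces that logic
 * with a reduced, hypothetical callback table (three mandatory callbacks
 * with simplified signatures, plus one optional callback). All sketch_*
 * names are assumptions, and the locking done by lock_system_sleep() is
 * deliberately left out.
 */

```c
#include <stdbool.h>
#include <stddef.h>

enum sketch_mode_id { SKETCH_SHUTDOWN, SKETCH_PLATFORM };

/* Reduced, hypothetical callback table. */
struct sketch_ops {
	int  (*begin)(void);
	void (*end)(void);
	int  (*enter)(void);
	void (*recover)(void);	/* optional, may be NULL */
};

static const struct sketch_ops *sketch_cur_ops;
static enum sketch_mode_id sketch_cur_mode = SKETCH_SHUTDOWN;

/* Returns false (and changes nothing) if a mandatory callback is missing. */
static bool sketch_set_ops(const struct sketch_ops *ops)
{
	if (ops && !(ops->begin && ops->end && ops->enter))
		return false;

	sketch_cur_ops = ops;
	if (ops)
		sketch_cur_mode = SKETCH_PLATFORM;
	else if (sketch_cur_mode == SKETCH_PLATFORM)
		sketch_cur_mode = SKETCH_SHUTDOWN;
	return true;
}

/* Trivial callbacks used to build example tables. */
static int sketch_nop_int(void) { return 0; }
static void sketch_nop_void(void) { }
```

/*
 * A table that sets only .begin is rejected by the sketch, while a table
 * with all three mandatory callbacks switches the mode to SKETCH_PLATFORM.
 * Passing NULL afterwards drops the table and restores SKETCH_SHUTDOWN.
 */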

static bool entering_platform_hibernation;

bool system_entering_hibernation(void)
{
	return entering_platform_hibernation;
}
EXPORT_SYMBOL(system_entering_hibernation);

#ifdef CONFIG_PM_DEBUG
static void hibernation_debug_sleep(void)
{
	pr_info("debug: Waiting for 5 seconds.\n");
	mdelay(5000);
}

static int hibernation_test(int level)
{
	if (pm_test_level == level) {
		hibernation_debug_sleep();
		return 1;
	}
	return 0;
}
#else /* !CONFIG_PM_DEBUG */
static int hibernation_test(int level) { return 0; }
#endif /* !CONFIG_PM_DEBUG */

/**
 * platform_begin - Call platform to start hibernation.
 * @platform_mode: Whether or not to use the platform driver.
 */
static int platform_begin(int platform_mode)
{
	return (platform_mode && hibernation_ops) ?
		hibernation_ops->begin(PMSG_FREEZE) : 0;
}

/**
 * platform_end - Call platform to finish transition to the working state.
 * @platform_mode: Whether or not to use the platform driver.
 */
static void platform_end(int platform_mode)
{
	if (platform_mode && hibernation_ops)
		hibernation_ops->end();
}

/**
 * platform_pre_snapshot - Call platform to prepare the machine for hibernation.
 * @platform_mode: Whether or not to use the platform driver.
 *
 * Use the platform driver to prepare the system for creating a hibernate image,
 * if so configured, and return an error code if that fails.
 */
static int platform_pre_snapshot(int platform_mode)
{
	return (platform_mode && hibernation_ops) ?
		hibernation_ops->pre_snapshot() : 0;
}

/**
 * platform_leave - Call platform to prepare a transition to the working state.
 * @platform_mode: Whether or not to use the platform driver.
 *
 * Use the platform driver to prepare the machine for switching to the
 * normal mode of operation.
 *
 * This routine is called on one CPU with interrupts disabled.
 */
static void platform_leave(int platform_mode)
{
	if (platform_mode && hibernation_ops)
		hibernation_ops->leave();
}

/**
 * platform_finish - Call platform to switch the system to the working state.
 * @platform_mode: Whether or not to use the platform driver.
 *
 * Use the platform driver to switch the machine to the normal mode of
 * operation.
 *
 * This routine must be called after platform_prepare().
 */
static void platform_finish(int platform_mode)
{
	if (platform_mode && hibernation_ops)
		hibernation_ops->finish();
}

/**
 * platform_pre_restore - Prepare for hibernate image restoration.
 * @platform_mode: Whether or not to use the platform driver.
 *
 * Use the platform driver to prepare the system for resume from a hibernation
 * image.
 *
 * If the restore fails after this function has been called,
 * platform_restore_cleanup() must be called.
 */
static int platform_pre_restore(int platform_mode)
{
	return (platform_mode && hibernation_ops) ?
		hibernation_ops->pre_restore() : 0;
}

/**
 * platform_restore_cleanup - Switch to the working state after failing restore.
 * @platform_mode: Whether or not to use the platform driver.
 *
 * Use the platform driver to switch the system to the normal mode of operation
 * after a failing restore.
 *
 * If platform_pre_restore() has been called before the failing restore, this
 * function must be called too, regardless of the result of
 * platform_pre_restore().
 */
static void platform_restore_cleanup(int platform_mode)
{
	if (platform_mode && hibernation_ops)
		hibernation_ops->restore_cleanup();
}

/**
 * platform_recover - Recover from a failure to suspend devices.
 * @platform_mode: Whether or not to use the platform driver.
 */
static void platform_recover(int platform_mode)
{
	if (platform_mode && hibernation_ops && hibernation_ops->recover)
		hibernation_ops->recover();
}

/**
 * swsusp_show_speed - Print time elapsed between two events during hibernation.
 * @start: Starting event.
 * @stop: Final event.
 * @nr_pages: Number of memory pages processed between @start and @stop.
 * @msg: Additional diagnostic message to print.
 */
void swsusp_show_speed(ktime_t start, ktime_t stop,
		      unsigned nr_pages, char *msg)
{
	ktime_t diff;
	u64 elapsed_centisecs64;
	unsigned int centisecs;
	unsigned int k;
	unsigned int kps;

	diff = ktime_sub(stop, start);
	elapsed_centisecs64 = ktime_divns(diff, 10*NSEC_PER_MSEC);
	centisecs = elapsed_centisecs64;
	if (centisecs == 0)
		centisecs = 1;	/* avoid div-by-zero */
	k = nr_pages * (PAGE_SIZE / 1024);
	kps = (k * 100) / centisecs;
	pr_info("%s %u kbytes in %u.%02u seconds (%u.%02u MB/s)\n",
		msg, k, centisecs / 100, centisecs % 100, kps / 1000,
		(kps % 1000) / 10);
}
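
/*
 * Illustration only, not part of this file: the throughput arithmetic in
 * swsusp_show_speed() can be checked by hand. The userspace sketch below
 * repeats the same integer steps, taking the elapsed time in nanoseconds
 * and the page size as plain parameters instead of ktime_t and PAGE_SIZE.
 * The sketch_* names and the result struct are hypothetical; the kernel
 * function only prints.
 */

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical holder for the intermediate values. */
struct sketch_speed {
	unsigned int kbytes;	/* data processed, in kilobytes */
	unsigned int centisecs;	/* elapsed time, in 1/100 s */
	unsigned int kps;	/* kilobytes per second */
};

static struct sketch_speed sketch_calc_speed(unsigned long long elapsed_ns,
					     unsigned int nr_pages,
					     unsigned int page_size)
{
	struct sketch_speed s;

	/* 10 ms = 10,000,000 ns, so this yields centiseconds. */
	s.centisecs = (unsigned int)(elapsed_ns / 10000000ULL);
	if (s.centisecs == 0)
		s.centisecs = 1;	/* avoid div-by-zero */
	s.kbytes = nr_pages * (page_size / 1024);
	s.kps = (s.kbytes * 100) / s.centisecs;
	return s;
}

/* Formats the values the same way as the pr_info() call, minus the newline. */
static int sketch_format_speed(char *buf, size_t len, const char *msg,
			       struct sketch_speed s)
{
	return snprintf(buf, len,
			"%s %u kbytes in %u.%02u seconds (%u.%02u MB/s)",
			msg, s.kbytes, s.centisecs / 100, s.centisecs % 100,
			s.kps / 1000, (s.kps % 1000) / 10);
}
```

/*
 * Worked example under these assumptions: 25600 pages of 4096 bytes in
 * 2.5 seconds give k = 25600 * 4 = 102400, centisecs = 250 and
 * kps = 10240000 / 250 = 40960, which is printed as 40.96 MB/s.
 */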

__weak int arch_resume_nosmt(void)
{
	return 0;
}

/**
 * create_image - Create a hibernation image.
 * @platform_mode: Whether or not to use the platform driver.
 *
 * Execute device drivers' "late" and "noirq" freeze callbacks, create a
 * hibernation image and run the drivers' "noirq" and "early" thaw callbacks.
 *
 * Control reappears in this routine after the subsequent restore.
 */
static int create_image(int platform_mode)
{
	int error;

	error = dpm_suspend_end(PMSG_FREEZE);
	if (error) {
		pr_err("Some devices failed to power down, aborting\n");
		return error;
	}

	error = platform_pre_snapshot(platform_mode);
	if (error || hibernation_test(TEST_PLATFORM))
		goto Platform_finish;

	error = pm_sleep_disable_secondary_cpus();
	if (error || hibernation_test(TEST_CPUS))
		goto Enable_cpus;

	local_irq_disable();

	system_state = SYSTEM_SUSPEND;

	error = syscore_suspend();
	if (error) {
		pr_err("Some system devices failed to power down, aborting\n");
		goto Enable_irqs;
	}

	if (hibernation_test(TEST_CORE) || pm_wakeup_pending())
		goto Power_up;

	in_suspend = 1;
	save_processor_state();
	trace_suspend_resume(TPS("machine_suspend"), PM_EVENT_HIBERNATE, true);
	error = swsusp_arch_suspend();
	/* Restore control flow magically appears here */
	restore_processor_state();
	trace_suspend_resume(TPS("machine_suspend"), PM_EVENT_HIBERNATE, false);
	if (error)
		pr_err("Error %d creating image\n", error);

	if (!in_suspend) {
		events_check_enabled = false;
		clear_or_poison_free_pages();
	}

	platform_leave(platform_mode);

 Power_up:
	syscore_resume();

 Enable_irqs:
	system_state = SYSTEM_RUNNING;
	local_irq_enable();

 Enable_cpus:
PM: sleep: Pause cpuidle later and resume it earlier during system transitions
Commit 8651f97bd951 ("PM / cpuidle: System resume hang fix with
cpuidle") that introduced cpuidle pausing during system suspend
did that to work around a platform firmware issue causing systems
to hang during resume if CPUs were allowed to enter idle states
in the system suspend and resume code paths.
However, pausing cpuidle before the last phase of suspending
devices is the source of an otherwise arbitrary difference between
the suspend-to-idle path and other system suspend variants, so it is
cleaner to do that later, before taking secondary CPUs offline (it
is still safer to take secondary CPUs offline with cpuidle paused,
though).
Modify the code accordingly, but in order to avoid code duplication,
introduce new wrapper functions, pm_sleep_disable_secondary_cpus()
and pm_sleep_enable_secondary_cpus(), to combine cpuidle_pause()
and cpuidle_resume(), respectively, with the handling of secondary
CPUs during system-wide transitions to sleep states.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-10-22 18:07:47 +02:00
pm_sleep_enable_secondary_cpus ( ) ;
2009-03-16 22:34:26 +01:00
x86/power: Fix 'nosmt' vs hibernation triple fault during resume
As explained in
0cc3cd21657b ("cpu/hotplug: Boot HT siblings at least once")
we always, no matter what, have to bring up x86 HT siblings during boot at
least once in order to avoid first MCE bringing the system to its knees.
That means that whenever 'nosmt' is supplied on the kernel command-line,
all the HT siblings are as a result sitting in mwait or cpuidle after
going through the online-offline cycle at least once.
This causes a serious issue though when a kernel, which saw 'nosmt' on its
commandline, is going to perform resume from hibernation: if the resume
from the hibernated image is successful, cr3 is flipped in order to point
to the address space of the kernel that is being resumed, which in turn
means that all the HT siblings are all of a sudden mwaiting on address
which is no longer valid.
That results in triple fault shortly after cr3 is switched, and machine
reboots.
Fix this by always waking up all the SMT siblings before initiating the
'restore from hibernation' process; this guarantees that all the HT
siblings will be properly carried over to the resumed kernel waiting in
resume_play_dead(), and acted upon accordingly afterwards, based on the
target kernel configuration.
Symmetrically, the resumed kernel has to push the SMT siblings to mwait
again in case it has SMT disabled; this means it has to online all
the siblings when resuming (so that they come out of hlt) and offline
them again to let them reach mwait.
Cc: 4.19+ <stable@vger.kernel.org> # v4.19+
Debugged-by: Thomas Gleixner <tglx@linutronix.de>
Fixes: 0cc3cd21657b ("cpu/hotplug: Boot HT siblings at least once")
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Acked-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-05-30 00:09:39 +02:00
/* Allow architectures to do nosmt-specific post-resume dances */
if ( ! in_suspend )
error = arch_resume_nosmt ( ) ;
2009-03-16 22:34:26 +01:00
Platform_finish :
platform_finish ( platform_mode ) ;
2012-01-29 20:38:29 +01:00
dpm_resume_start ( in_suspend ?
Introduce new top level suspend and hibernation callbacks
Introduce 'struct pm_ops' and 'struct pm_ext_ops' ('ext' meaning
'extended') representing suspend and hibernation operations for bus
types, device classes, device types and device drivers.
Modify the PM core to use 'struct pm_ops' and 'struct pm_ext_ops'
objects, if defined, instead of the ->suspend(), ->resume(),
->suspend_late(), and ->resume_early() callbacks (the old callbacks
will be considered as legacy and gradually phased out).
The main purpose of doing this is to separate suspend (aka S2RAM and
standby) callbacks from hibernation callbacks in such a way that the
new callbacks won't take arguments and the semantics of each of them
will be clearly specified. This has been requested multiple
times by many people, including Linus himself, and the reason is that
within the current scheme if ->resume() is called, for example, it's
difficult to say why it's been called (ie. is it a resume from RAM or
from hibernation or a suspend/hibernation failure etc.?).
The second purpose is to make the suspend/hibernation callbacks more
flexible so that device drivers can handle more than they can within
the current scheme. For example, some drivers may need to prevent
new children of the device from being registered before their
->suspend() callbacks are executed or they may want to carry out some
operations requiring the availability of some other devices, not
directly bound via the parent-child relationship, in order to prepare
for the execution of ->suspend(), etc.
Ultimately, we'd like to stop using the freezing of tasks for suspend
and therefore the drivers' suspend/hibernation code will have to take
care of the handling of the user space during suspend/hibernation.
That, in turn, would be difficult within the current scheme, without
the new ->prepare() and ->complete() callbacks.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2008-05-20 23:00:01 +02:00
( error ? PMSG_RECOVER : PMSG_THAW ) : PMSG_RESTORE ) ;
2009-03-16 22:34:06 +01:00
2007-10-18 03:04:55 -07:00
return error ;
}
2007-07-19 01:47:29 -07:00
/**
2011-05-24 23:36:06 +02:00
* hibernation_snapshot - Quiesce devices and create a hibernation image .
* @ platform_mode : If set , use platform driver to prepare for the transition .
2007-07-19 01:47:29 -07:00
*
2018-07-31 16:51:32 +08:00
* This routine must be called with system_transition_mutex held .
2007-07-19 01:47:29 -07:00
*/
int hibernation_snapshot ( int platform_mode )
{
2011-11-22 23:20:31 +01:00
pm_message_t msg ;
2008-11-23 10:37:12 +01:00
int error ;
2007-07-19 01:47:29 -07:00
2016-03-23 00:11:20 +01:00
pm_suspend_clear_flags ( ) ;
2008-10-26 20:50:26 +01:00
error = platform_begin ( platform_mode ) ;
2007-07-19 01:47:29 -07:00
if ( error )
2010-07-07 23:43:35 +02:00
goto Close ;
2007-07-19 01:47:29 -07:00
2009-07-08 13:24:05 +02:00
/* Preallocate image memory before shutting down devices. */
error = hibernate_preallocate_memory ( ) ;
2011-09-26 20:32:27 +02:00
if ( error )
goto Close ;
error = freeze_kernel_threads ( ) ;
if ( error )
2011-11-22 23:08:10 +01:00
goto Cleanup ;
2011-09-26 20:32:27 +02:00
2011-12-01 22:33:20 +01:00
if ( hibernation_test ( TEST_FREEZER ) ) {
2011-11-18 23:02:42 +01:00
/*
* Indicate to the caller that we are returning due to a
* successful freezer test .
*/
freezer_test_done = true ;
2012-02-04 22:26:38 +01:00
goto Thaw ;
2011-11-18 23:02:42 +01:00
}
2011-09-26 20:32:27 +02:00
error = dpm_prepare ( PMSG_FREEZE ) ;
2011-11-22 23:08:10 +01:00
if ( error ) {
2011-11-22 23:20:31 +01:00
dpm_complete ( PMSG_RECOVER ) ;
2012-02-04 22:26:38 +01:00
goto Thaw ;
2011-11-22 23:08:10 +01:00
}
2007-10-18 03:04:42 -07:00
2007-07-19 01:47:29 -07:00
suspend_console ( ) ;
2010-12-03 22:57:45 +01:00
pm_restrict_gfp_mask ( ) ;
2011-11-22 23:20:31 +01:00
2011-05-17 23:26:00 +02:00
error = dpm_suspend ( PMSG_FREEZE ) ;
2007-07-19 01:47:31 -07:00
2011-11-22 23:20:31 +01:00
if ( error || hibernation_test ( TEST_DEVICES ) )
platform_recover ( platform_mode ) ;
else
error = create_image ( platform_mode ) ;
2007-07-19 01:47:29 -07:00
2010-12-03 22:57:45 +01:00
/*
2011-11-22 23:20:31 +01:00
* In the case that we call create_image ( ) above , the control
* returns here ( 1 ) after the image has been created or the
2010-12-03 22:57:45 +01:00
* image creation has failed and ( 2 ) after a successful restore .
*/
2007-11-19 23:42:31 +01:00
2009-07-08 13:24:05 +02:00
/* We may need to release the preallocated image pages here. */
if ( error || ! in_suspend )
swsusp_free ( ) ;
2011-05-17 23:26:00 +02:00
msg = in_suspend ? ( error ? PMSG_RECOVER : PMSG_THAW ) : PMSG_RESTORE ;
dpm_resume ( msg ) ;
2010-12-03 22:57:45 +01:00
if ( error || ! in_suspend )
pm_restore_gfp_mask ( ) ;
2007-07-19 01:47:29 -07:00
resume_console ( ) ;
2011-05-17 23:26:00 +02:00
dpm_complete ( msg ) ;
2008-01-08 00:08:44 +01:00
Close :
platform_end ( platform_mode ) ;
2007-07-19 01:47:29 -07:00
return error ;
2008-06-12 23:24:06 +02:00
2012-02-04 22:26:38 +01:00
Thaw :
thaw_kernel_threads ( ) ;
2011-11-22 23:08:10 +01:00
Cleanup :
swsusp_free ( ) ;
goto Close ;
2007-07-19 01:47:29 -07:00
}
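The choice of resume message in hibernation_snapshot() above can be read as a small decision table. The following is a minimal sketch of that table only, not kernel code: the enum resume_msg type, its MSG_* values and pick_resume_msg() are hypothetical stand-ins for pm_message_t and the PMSG_* events, and in_suspend is passed as a parameter instead of being read from the global.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for PMSG_RECOVER, PMSG_THAW and PMSG_RESTORE. */
enum resume_msg { MSG_RECOVER, MSG_THAW, MSG_RESTORE };

/*
 * in_suspend is assumed to be true while the kernel that created the
 * image is still on the snapshot path, and false once control has come
 * back here from a restored image.
 */
enum resume_msg pick_resume_msg(bool in_suspend, int error)
{
	if (!in_suspend)
		return MSG_RESTORE;	/* running again after a restore */

	/* Still on the snapshot path: recover on failure, thaw on success. */
	return error ? MSG_RECOVER : MSG_THAW;
}
```

Under these assumptions, a failed snapshot maps to the "recover" message, a successful one to "thaw", and any return after a restore to "restore", regardless of the error value.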
x86 / hibernate: Use hlt_play_dead() when resuming from hibernation
On Intel hardware, native_play_dead() uses mwait_play_dead() by
default and only falls back to the other methods if that fails.
That also happens during resume from hibernation, when the restore
(boot) kernel runs disable_nonboot_cpus() to take all of the CPUs
except for the boot one offline.
However, that is problematic, because the address passed to
__monitor() in mwait_play_dead() is likely to be written to in the
last phase of hibernate image restoration and that causes the "dead"
CPU to start executing instructions again. Unfortunately, the page
containing the address in that CPU's instruction pointer may not be
valid any more at that point.
First, that page may have been overwritten with image kernel memory
contents already, so the instructions the CPU attempts to execute may
simply be invalid. Second, the page tables previously used by that
CPU may have been overwritten by image kernel memory contents, so the
address in its instruction pointer is impossible to resolve then.
A report from Varun Koyyalagunta and investigation carried out by
Chen Yu show that the latter sometimes happens in practice.
To prevent it from happening, temporarily change the smp_ops.play_dead
pointer during resume from hibernation so that it points to a special
"play dead" routine which uses hlt_play_dead() and avoids the
inadvertent "revivals" of "dead" CPUs this way.
A slightly unpleasant consequence of this change is that if the
system is hibernated with one or more CPUs offline, it will generally
draw more power after resume than it did before hibernation, because
the physical state entered by CPUs via hlt_play_dead() is higher-power
than the mwait_play_dead() one in the majority of cases. It is
possible to work around this, but it is unclear how much of a problem
that's going to be in practice, so the workaround will be implemented
later if it turns out to be necessary.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=106371
Reported-by: Varun Koyyalagunta <cpudebug@centtech.com>
Original-by: Chen Yu <yu.c.chen@intel.com>
Tested-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
2016-07-14 03:55:23 +02:00
int __weak hibernate_resume_nonboot_cpu_disable ( void )
{
2019-04-11 13:34:45 +10:00
return suspend_disable_secondary_cpus ( ) ;
2016-07-14 03:55:23 +02:00
}
2007-12-08 02:04:21 +01:00
/**
2011-05-24 23:36:06 +02:00
* resume_target_kernel - Restore system state from a hibernation image .
* @ platform_mode : Whether or not to use the platform driver .
*
2012-01-29 20:38:29 +01:00
* Execute device drivers ' " noirq " and " late " freeze callbacks , restore the
* contents of highmem that have not been restored yet from the image and run
* the low - level code that will restore the remaining contents of memory and
* switch to the just restored target kernel .
2007-12-08 02:04:21 +01:00
*/
2009-03-16 22:34:26 +01:00
static int resume_target_kernel ( bool platform_mode )
2007-12-08 02:04:21 +01:00
{
int error ;
2012-01-29 20:38:29 +01:00
error = dpm_suspend_end ( PMSG_QUIESCE ) ;
2007-12-08 02:04:21 +01:00
if ( error ) {
2017-02-24 00:26:15 +01:00
pr_err ( " Some devices failed to power down, aborting resume \n " ) ;
2009-05-24 21:15:07 +02:00
return error ;
2007-12-08 02:04:21 +01:00
}
2009-03-16 22:34:06 +01:00
2009-03-16 22:34:26 +01:00
error = platform_pre_restore ( platform_mode ) ;
if ( error )
goto Cleanup ;
2021-10-22 18:07:47 +02:00
cpuidle_pause ( ) ;
2016-07-14 03:55:23 +02:00
error = hibernate_resume_nonboot_cpu_disable ( ) ;
2009-03-16 22:34:26 +01:00
if ( error )
goto Enable_cpus ;
2009-03-16 22:34:06 +01:00
local_irq_disable ( ) ;
2018-05-25 17:54:41 +02:00
system_state = SYSTEM_SUSPEND ;
2009-03-16 22:34:06 +01:00
2011-04-26 19:15:07 +02:00
error = syscore_suspend ( ) ;
2009-03-16 22:34:26 +01:00
if ( error )
goto Enable_irqs ;
2007-12-08 02:04:21 +01:00
save_processor_state ( ) ;
error = restore_highmem ( ) ;
if ( ! error ) {
error = swsusp_arch_resume ( ) ;
/*
* The code below is only ever reached in case of a failure .
2011-05-24 00:21:26 +02:00
* Otherwise , execution continues at the place where
* swsusp_arch_suspend ( ) was called .
2007-12-08 02:04:21 +01:00
*/
BUG_ON ( ! error ) ;
2011-05-24 00:21:26 +02:00
/*
* This call to restore_highmem ( ) reverts the changes made by
* the previous one .
*/
2007-12-08 02:04:21 +01:00
restore_highmem ( ) ;
}
/*
* The only reason why swsusp_arch_resume ( ) can fail is memory being
* very tight , so we have to free it as soon as we can to avoid
2011-05-24 00:21:26 +02:00
* subsequent failures .
2007-12-08 02:04:21 +01:00
*/
swsusp_free ( ) ;
restore_processor_state ( ) ;
touch_softlockup_watchdog ( ) ;
2009-03-16 22:34:06 +01:00
2011-03-15 00:43:46 +01:00
syscore_resume ( ) ;
2009-03-16 22:34:06 +01:00
2009-03-16 22:34:26 +01:00
Enable_irqs :
2018-05-25 17:54:41 +02:00
system_state = SYSTEM_RUNNING ;
2007-12-08 02:04:21 +01:00
local_irq_enable ( ) ;
2009-03-16 22:34:06 +01:00
2009-03-16 22:34:26 +01:00
Enable_cpus :
2021-10-22 18:07:47 +02:00
pm_sleep_enable_secondary_cpus ( ) ;
2009-03-16 22:34:26 +01:00
Cleanup :
platform_restore_cleanup ( platform_mode ) ;
2012-01-29 20:38:29 +01:00
dpm_resume_start ( PMSG_RECOVER ) ;
2009-03-16 22:34:06 +01:00
2007-12-08 02:04:21 +01:00
return error ;
}
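The label chain in resume_target_kernel() above (Enable_irqs, Enable_cpus, Cleanup) undoes the preparation steps in reverse order, and each failure jumps to the label that matches how far preparation got. The following is a minimal user-space sketch of that control flow only. The step names, the struct trace recorder and the error values are all hypothetical; the real function calls into the PM core rather than recording letters.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical markers for the point at which preparation fails. */
enum step { STEP_DEVICES, STEP_PLATFORM, STEP_CPUS, STEP_SYSCORE, STEP_NONE };

/* Records which undo actions ran, one letter per action. */
struct trace {
	char buf[8];
	size_t len;
};

static void note(struct trace *t, char c)
{
	if (t->len + 1 < sizeof(t->buf)) {
		t->buf[t->len++] = c;
		t->buf[t->len] = '\0';
	}
}

/* Walks the same label chain; returns a made-up negative error code. */
int staged_teardown(enum step fail_at, struct trace *t)
{
	int error;

	memset(t, 0, sizeof(*t));

	if (fail_at == STEP_DEVICES)
		return -1;		/* nothing has been set up yet */
	if (fail_at == STEP_PLATFORM) {
		error = -2;
		goto cleanup;
	}
	if (fail_at == STEP_CPUS) {
		error = -3;
		goto enable_cpus;
	}
	if (fail_at == STEP_SYSCORE) {
		error = -4;
		goto enable_irqs;
	}

	/* The restore itself failed; a successful one would not return. */
	error = -5;
	note(t, 'S');			/* undo the syscore suspend */
enable_irqs:
	note(t, 'I');			/* re-enable interrupts */
enable_cpus:
	note(t, 'C');			/* bring secondary CPUs back */
cleanup:
	note(t, 'P');			/* platform cleanup */
	note(t, 'D');			/* resume devices */
	return error;
}
```

In this sketch a failure at the syscore step leaves "ICPD" in the trace, while a failure of the final restore leaves "SICPD", which mirrors how later failures in the function above fall through more labels.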
2007-07-19 01:47:29 -07:00
/**
2011-05-24 23:36:06 +02:00
* hibernation_restore - Quiesce devices and restore from a hibernation image .
* @ platform_mode : If set , use platform driver to prepare for the transition .
2007-07-19 01:47:29 -07:00
*
2018-07-31 16:51:32 +08:00
* This routine must be called with system_transition_mutex held . If it is
* successful , control reappears in the restored target kernel in
* hibernation_snapshot ( ) .
2007-07-19 01:47:29 -07:00
*/
swsusp: introduce restore platform operations
At least on some machines it is necessary to prepare the ACPI firmware for the
restoration of the system memory state from the hibernation image if the
"platform" mode of hibernation has been used. Namely, in those cases we need
to disable the GPEs before replacing the "boot" kernel with the "frozen"
kernel (cf. http://bugzilla.kernel.org/show_bug.cgi?id=7887). After the
restore they will be re-enabled by hibernation_ops->finish(), but if the
restore fails, they have to be re-enabled by the restore code explicitly.
For this purpose we can introduce two additional hibernation operations,
called pre_restore() and restore_cleanup() and call them from the restore code
path. Still, they should be called if the "platform" mode of hibernation has
been used, so we need to pass the information about the hibernation mode from
the "frozen" kernel to the "boot" kernel in the image header.
Apparently, we can't drop the disabling of GPEs before the restore because of
Bug #7887 . We also can't do it unconditionally, because the GPEs wouldn't
have been enabled after a successful restore if the suspend had been done in
the 'shutdown' or 'reboot' mode.
In principle we could (and probably should) unconditionally disable the GPEs
before each snapshot creation *and* before the restore, but then we'd have to
unconditionally enable them after the snapshot creation as well as after the
restore (or restore failure). Still, for this purpose we'd need to modify
acpi_enter_sleep_state_prep() and acpi_leave_sleep_state() and we'd have to
introduce some mechanism synchronizing the disabling/enabling of the GPEs with
the device drivers' .suspend()/.resume() routines and with
disable_/enable_nonboot_cpus(). However, this would have affected the
suspend (ie. s2ram) code as well as the hibernation, which I'd like to avoid
in this patch series.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 01:47:30 -07:00
int hibernation_restore ( int platform_mode )
2007-07-19 01:47:29 -07:00
{
2008-11-23 10:37:12 +01:00
int error ;
2007-07-19 01:47:29 -07:00
pm_prepare_console ( ) ;
suspend_console ( ) ;
2010-12-03 22:57:45 +01:00
pm_restrict_gfp_mask ( ) ;
2009-05-24 22:05:42 +02:00
error = dpm_suspend_start ( PMSG_QUIESCE ) ;
2007-07-19 01:47:30 -07:00
if ( ! error ) {
2009-03-16 22:34:26 +01:00
error = resume_target_kernel ( platform_mode ) ;
2014-10-24 20:29:10 +03:00
/*
* The above should either succeed and jump to the new kernel ,
* or return with an error . Otherwise things are just
* undefined , so let ' s be paranoid .
*/
BUG_ON ( ! error ) ;
2007-07-19 01:47:30 -07:00
}
2014-10-24 20:29:10 +03:00
dpm_resume_end ( PMSG_RECOVER ) ;
2010-12-03 22:57:45 +01:00
pm_restore_gfp_mask ( ) ;
2007-07-19 01:47:29 -07:00
resume_console ( ) ;
pm_restore_console ( ) ;
return error ;
}
/**
2011-05-24 23:36:06 +02:00
* hibernation_platform_enter - Power off the system using the platform driver .
2007-07-19 01:47:29 -07:00
*/
int hibernation_platform_enter ( void )
{
2008-11-23 10:37:12 +01:00
int error ;
2007-07-19 01:47:31 -07:00
2007-10-18 03:04:56 -07:00
if ( ! hibernation_ops )
return - ENOSYS ;
/*
* We have cancelled the power transition by running
* hibernation_ops -> finish ( ) before saving the image , so we should let
* the firmware know that we ' re going to enter the sleep state after all .
*/
2019-05-16 12:43:19 +02:00
error = hibernation_ops -> begin ( PMSG_HIBERNATE ) ;
2007-10-18 03:04:56 -07:00
if ( error )
2008-01-08 00:08:44 +01:00
goto Close ;
2007-10-18 03:04:56 -07:00
2009-01-19 20:54:54 +01:00
entering_platform_hibernation = true ;
2007-10-18 03:04:56 -07:00
suspend_console ( ) ;
2009-05-24 22:05:42 +02:00
error = dpm_suspend_start ( PMSG_HIBERNATE ) ;
2008-06-12 23:24:06 +02:00
if ( error ) {
if ( hibernation_ops -> recover )
hibernation_ops -> recover ( ) ;
goto Resume_devices ;
}
2007-10-18 03:04:56 -07:00
2012-01-29 20:38:29 +01:00
error = dpm_suspend_end ( PMSG_HIBERNATE ) ;
2009-03-16 22:34:26 +01:00
if ( error )
2009-05-24 21:15:07 +02:00
goto Resume_devices ;
2009-03-16 22:34:26 +01:00
2007-10-18 03:04:56 -07:00
error = hibernation_ops -> prepare ( ) ;
if ( error )
2009-07-08 13:23:32 +02:00
goto Platform_finish ;
2007-10-18 03:04:56 -07:00
2021-10-22 18:07:47 +02:00
error = pm_sleep_disable_secondary_cpus ( ) ;
2007-10-18 03:04:56 -07:00
if ( error )
2015-06-24 16:02:06 +02:00
goto Enable_cpus ;
2009-03-16 22:34:06 +01:00
2009-03-16 22:34:26 +01:00
local_irq_disable ( ) ;
2018-05-25 17:54:41 +02:00
system_state = SYSTEM_SUSPEND ;
2011-03-15 00:43:46 +01:00
syscore_suspend ( ) ;
2010-12-03 22:58:31 +01:00
if ( pm_wakeup_pending ( ) ) {
PM: Make it possible to avoid races between wakeup and system sleep
One of the arguments during the suspend blockers discussion was that
the mainline kernel didn't contain any mechanisms making it possible
to avoid races between wakeup and system suspend.
Generally, there are two problems in that area. First, if a wakeup
event occurs exactly when /sys/power/state is being written to, it
may be delivered to user space right before the freezer kicks in, so
the user space consumer of the event may not be able to process it
before the system is suspended. Second, if a wakeup event occurs
after user space has been frozen, it is not generally guaranteed that
the ongoing transition of the system into a sleep state will be
aborted.
To address these issues introduce a new global sysfs attribute,
/sys/power/wakeup_count, associated with a running counter of wakeup
events and three helper functions, pm_stay_awake(), pm_relax(), and
pm_wakeup_event(), that may be used by kernel subsystems to control
the behavior of this attribute and to request the PM core to abort
system transitions into a sleep state already in progress.
The /sys/power/wakeup_count file may be read from or written to by
user space. Reads will always succeed (unless interrupted by a
signal) and return the current value of the wakeup events counter.
Writes, however, will only succeed if the written number is equal to
the current value of the wakeup events counter. If a write is
successful, it will cause the kernel to save the current value of the
wakeup events counter and to abort the subsequent system transition
into a sleep state if any wakeup events are reported after the write
has returned.
[The assumption is that before writing to /sys/power/state user space
will first read from /sys/power/wakeup_count. Next, user space
consumers of wakeup events will have a chance to acknowledge or
veto the upcoming system transition to a sleep state. Finally, if
the transition is allowed to proceed, /sys/power/wakeup_count will
be written to and if that succeeds, /sys/power/state will be written
to as well. Still, if any wakeup events are reported to the PM core
by kernel subsystems after that point, the transition will be
aborted.]
Additionally, put a wakeup events counter into struct dev_pm_info and
make these per-device wakeup event counters available via sysfs,
so that it's possible to check the activity of various wakeup event
sources within the kernel.
To illustrate how subsystems can use pm_wakeup_event(), make the
low-level PCI runtime PM wakeup-handling code use it.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: markgross <markgross@thegnar.org>
Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
2010-07-05 22:43:53 +02:00
error = - EAGAIN ;
goto Power_up ;
}
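The /sys/power/wakeup_count semantics described in the commit message above can be sketched as a small user-space model. This is only an illustration under stated assumptions: struct wakeup_model and the model_*() helpers are hypothetical names that do not exist in the kernel, and the model ignores locking and signals.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
struct wakeup_model {
	unsigned int count;   /* running counter of wakeup events */
	unsigned int saved;   /* value stored by the last successful write */
	bool check_enabled;   /* set once a write has succeeded */
};

/* Reading wakeup_count returns the current counter value. */
static unsigned int model_read_count(const struct wakeup_model *m)
{
	return m->count;
}

/* A write succeeds only if the written value equals the current counter. */
static bool model_write_count(struct wakeup_model *m, unsigned int value)
{
	if (value != m->count)
		return false;
	m->saved = value;
	m->check_enabled = true;
	return true;
}

/* Stand-in for a kernel subsystem reporting a wakeup event. */
static void model_report_event(struct wakeup_model *m)
{
	m->count++;
}

/* Stand-in for the check that makes the transition bail out with -EAGAIN. */
static bool model_wakeup_pending(const struct wakeup_model *m)
{
	return m->check_enabled && m->count != m->saved;
}
```

In this model, a write that races with a new event fails, so user space has to read the counter again before retrying; an event reported after a successful write makes model_wakeup_pending() return true, which corresponds to the abort path taken above.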
2009-03-16 22:34:26 +01:00
hibernation_ops->enter ( ) ;
/* We should never get here */
while ( 1 ) ;
2007-10-18 03:04:56 -07:00
2010-07-05 22:43:53 +02:00
Power_up :
2011-03-15 00:43:46 +01:00
syscore_resume ( ) ;
2018-05-25 17:54:41 +02:00
system_state = SYSTEM_RUNNING ;
2010-07-05 22:43:53 +02:00
local_irq_enable ( ) ;
2015-06-24 16:02:06 +02:00
Enable_cpus :
2021-10-22 18:07:47 +02:00
pm_sleep_enable_secondary_cpus ( ) ;
2010-07-05 22:43:53 +02:00
2009-07-08 13:23:32 +02:00
Platform_finish :
2007-10-18 03:04:56 -07:00
hibernation_ops->finish ( ) ;
2009-03-16 22:34:06 +01:00
2012-01-29 20:38:29 +01:00
dpm_resume_start ( PMSG_RESTORE ) ;
2009-03-16 22:34:26 +01:00
2007-10-18 03:04:56 -07:00
Resume_devices :
2009-01-19 20:54:54 +01:00
entering_platform_hibernation = false ;
2009-05-24 22:05:42 +02:00
dpm_resume_end ( PMSG_RESTORE ) ;
2007-10-18 03:04:56 -07:00
resume_console ( ) ;
2009-03-16 22:34:06 +01:00
2008-01-08 00:08:44 +01:00
Close :
hibernation_ops->end ( ) ;
2009-03-16 22:34:06 +01:00
2007-07-19 01:47:31 -07:00
return error ;
2007-07-19 01:47:29 -07:00
}
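The label chain at the end of the function above (Platform_finish, Resume_devices, Close) undoes the setup steps in reverse order, and each failure jumps to the label that matches how far setup got. A minimal sketch of that unwinding pattern follows; struct trace, record(), enter_sketch() and the fail_at parameter are hypothetical and exist only to make the call order observable. It does not model CPU or interrupt handling.

```c
#include <stddef.h>

#define MAX_STEPS 8

/* Hypothetical trace buffer, used only to make the call order observable. */
struct trace {
	const char *step[MAX_STEPS];
	size_t len;
};

static void record(struct trace *t, const char *name)
{
	if (t->len < MAX_STEPS)
		t->step[t->len++] = name;
}

/* fail_at picks the setup step that reports an error (0 means none). */
static int enter_sketch(struct trace *t, int fail_at)
{
	int error;

	record(t, "begin");
	error = (fail_at == 1) ? -1 : 0;
	if (error)
		goto Close;

	record(t, "suspend_devices");
	error = (fail_at == 2) ? -1 : 0;
	if (error)
		goto Resume_devices;

	record(t, "prepare");
	error = (fail_at == 3) ? -1 : 0;
	if (error)
		goto Platform_finish;

	/* The real ->enter() is not expected to return; this model falls through. */
	record(t, "enter");

 Platform_finish:
	record(t, "finish");
 Resume_devices:
	record(t, "resume_devices");
 Close:
	record(t, "end");
	return error;
}
```

Because the labels fall through into one another, a failure at a later step runs every cleanup below its label, while a failure at an earlier step skips the cleanups for steps that never ran.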
2005-04-16 15:20:36 -07:00
/**
2011-05-24 23:36:06 +02:00
* power_down - Shut the machine down for hibernation.
2005-04-16 15:20:36 -07:00
*
2011-05-24 23:36:06 +02:00
* Use the platform driver, if configured, to put the system into the sleep
* state corresponding to hibernation, or try to power it off or reboot,
* depending on the value of hibernation_mode.
2005-04-16 15:20:36 -07:00
*/
rework pm_ops pm_disk_mode, kill misuse
This patch series cleans up some misconceptions about pm_ops. Some users of
the pm_ops structure attempt to use it to stop the user from entering suspend
to disk, this, however, is not possible since the user can always use
"shutdown" in /sys/power/disk and then the pm_ops are never invoked. Also,
platforms that don't support suspend to disk simply should not allow
configuring SOFTWARE_SUSPEND (read the help text on it, it only selects
suspend to disk and nothing else, all the other stuff depends on PM).
The pm_ops structure is actually intended to provide a way to enter
platform-defined sleep states (currently supported states are "standby" and
"mem" (suspend to ram)) and additionally (if SOFTWARE_SUSPEND is configured)
allows a platform to support a platform specific way to enter low-power mode
once everything has been saved to disk. This is currently only used by ACPI
(S4).
This patch:
The pm_ops.pm_disk_mode is used in totally bogus ways since nobody really
seems to understand what it actually does.
This patch clarifies the pm_disk_mode description.
It also removes all the arm and sh users that think they can veto suspend to
disk via pm_ops; not so since the user can always do echo shutdown >
/sys/power/disk, they need to find a better way involving Kconfig or such.
ACPI is the only user left with a non-zero pm_disk_mode.
The patch also sets the default mode to shutdown again, but when a new pm_ops
is registered its pm_disk_mode is selected as default, that way the default
stays for ACPI where it is apparently required.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: David Brownell <david-b@pacbell.net>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: <linux-pm@lists.linux-foundation.org>
Cc: Len Brown <lenb@kernel.org>
Acked-by: Russell King <rmk@arm.linux.org.uk>
Cc: Greg KH <greg@kroah.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-30 15:09:51 -07:00
static void power_down ( void )
2005-04-16 15:20:36 -07:00
{
2012-06-16 00:09:58 +02:00
int error ;
2017-02-24 00:25:28 +01:00
2023-12-13 16:32:51 +08:00
# ifdef CONFIG_SUSPEND
2017-02-24 00:25:28 +01:00
if ( hibernation_mode == HIBERNATION_SUSPEND ) {
2022-10-12 22:50:17 -05:00
error = suspend_devices_and_enter ( mem_sleep_current ) ;
2017-02-24 00:25:28 +01:00
if ( error ) {
hibernation_mode = hibernation_ops ?
HIBERNATION_PLATFORM :
HIBERNATION_SHUTDOWN ;
} else {
/* Restore swap signature. */
error = swsusp_unmark ( ) ;
if ( error )
2017-02-24 00:26:15 +01:00
pr_err ( "Swap will be unusable! Try swapon -a.\n" ) ;
2017-02-24 00:25:28 +01:00
return ;
}
}
2012-06-16 00:09:58 +02:00
# endif
2007-05-09 02:33:18 -07:00
switch ( hibernation_mode ) {
case HIBERNATION_REBOOT :
2005-07-26 12:01:17 -06:00
kernel_restart ( NULL ) ;
2005-04-16 15:20:36 -07:00
break ;
2007-05-09 02:33:18 -07:00
case HIBERNATION_PLATFORM :
2023-12-13 16:32:51 +08:00
error = hibernation_platform_enter ( ) ;
if ( error == - EAGAIN || error == - EBUSY ) {
swsusp_unmark ( ) ;
events_check_enabled = false ;
pr_info ( "Wakeup event detected during hibernation, rolling back.\n" ) ;
return ;
}
2020-08-23 17:36:59 -05:00
fallthrough ;
2007-10-18 03:04:56 -07:00
case HIBERNATION_SHUTDOWN :
2022-06-17 15:24:02 +03:00
if ( kernel_can_power_off ( ) )
2014-04-21 17:30:46 -07:00
kernel_power_off ( ) ;
2007-10-18 03:04:56 -07:00
break ;
2005-04-16 15:20:36 -07:00
}
2005-07-26 12:01:17 -06:00
kernel_halt ( ) ;
2007-04-30 15:09:51 -07:00
/*
* Valid image is on the disk, if we continue we risk serious data
* corruption after resume.
*/
2017-02-24 00:26:15 +01:00
pr_crit ( "Power down manually\n" ) ;
2014-04-21 17:30:46 -07:00
while ( 1 )
cpu_relax ( ) ;
2005-04-16 15:20:36 -07:00
}
2023-09-27 11:34:23 +02:00
static int load_image_and_restore ( void )
2016-07-22 10:30:47 +08:00
{
int error ;
unsigned int flags ;
2017-07-19 02:38:44 +02:00
pm_pr_dbg ( "Loading hibernation image.\n" ) ;
2016-07-22 10:30:47 +08:00
lock_device_hotplug ( ) ;
error = create_basic_memory_bitmaps ( ) ;
2022-02-09 19:29:51 +08:00
if ( error ) {
2023-09-27 11:34:23 +02:00
swsusp_close ( ) ;
2016-07-22 10:30:47 +08:00
goto Unlock ;
2022-02-09 19:29:51 +08:00
}
2016-07-22 10:30:47 +08:00
error = swsusp_read ( & flags ) ;
2023-09-27 11:34:23 +02:00
swsusp_close ( ) ;
2016-07-22 10:30:47 +08:00
if ( ! error )
2020-03-31 08:55:25 -07:00
error = hibernation_restore ( flags & SF_PLATFORM_MODE ) ;
2016-07-22 10:30:47 +08:00
2020-01-02 15:19:40 -08:00
pr_err ( "Failed to load image, recovering.\n" ) ;
2016-07-22 10:30:47 +08:00
swsusp_free ( ) ;
free_basic_memory_bitmaps ( ) ;
Unlock :
unlock_device_hotplug ( ) ;
return error ;
}
2024-01-22 18:45:27 +05:30
# define COMPRESSION_ALGO_LZO "lzo"
# define COMPRESSION_ALGO_LZ4 "lz4"
2005-04-16 15:20:36 -07:00
/**
2011-05-24 23:36:06 +02:00
* hibernate - Carry out system hibernation, including saving the image.
2005-04-16 15:20:36 -07:00
*/
2007-05-09 02:33:18 -07:00
int hibernate ( void )
2005-04-16 15:20:36 -07:00
{
2023-05-31 14:55:14 +02:00
bool snapshot_test = false ;
2022-08-22 13:18:17 +02:00
unsigned int sleep_flags ;
notifier: Fix broken error handling pattern
The current notifiers have the following error handling pattern all
over the place:
int err, nr;
err = __foo_notifier_call_chain(&chain, val_up, v, -1, &nr);
if (err & NOTIFIER_STOP_MASK)
__foo_notifier_call_chain(&chain, val_down, v, nr-1, NULL)
And aside from the endless repetition thereof, it is broken. Consider
blocking notifiers; both calls take and drop the rwsem, this means
that the notifier list can change in between the two calls, making @nr
meaningless.
Fix this by replacing all the __foo_notifier_call_chain() functions
with foo_notifier_call_chain_robust() that embeds the above pattern,
but ensures it is inside a single lock region.
Note: I switched atomic_notifier_call_chain_robust() to use
the spinlock, since RCU cannot provide the guarantee
required for the recovery.
Note: software_resume() error handling was broken afaict.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20200818135804.325626653@infradead.org
2020-08-18 15:57:36 +02:00
int error ;
2005-04-16 15:20:36 -07:00
2014-06-13 13:30:35 -07:00
if ( ! hibernation_available ( ) ) {
2017-07-19 02:38:44 +02:00
pm_pr_dbg ( "Hibernation not available.\n" ) ;
2014-06-13 13:30:35 -07:00
return - EPERM ;
}
2024-01-22 18:45:26 +05:30
/*
* Query for the compression algorithm support if compression is enabled.
*/
if ( ! nocompress ) {
2024-02-14 13:09:32 +05:30
strscpy ( hib_comp_algo , hibernate_compressor , sizeof ( hib_comp_algo ) ) ;
2024-01-22 18:45:26 +05:30
if ( crypto_has_comp ( hib_comp_algo , 0 , 0 ) != 1 ) {
pr_err ( "%s compression is not available\n" , hib_comp_algo ) ;
return - EOPNOTSUPP ;
}
}
2022-08-22 13:18:17 +02:00
sleep_flags = lock_system_sleep ( ) ;
2007-05-06 14:50:45 -07:00
/* The snapshot device should not be opened while we're running */
2020-05-07 09:19:52 +02:00
if ( ! hibernate_acquire ( ) ) {
2007-07-19 01:47:36 -07:00
error = - EBUSY ;
goto Unlock ;
}
2017-07-20 03:38:07 +02:00
pr_info ( "hibernation entry\n" ) ;
2008-01-11 01:25:21 +01:00
pm_prepare_console ( ) ;
2020-08-18 15:57:36 +02:00
error = pm_notifier_call_chain_robust ( PM_HIBERNATION_PREPARE , PM_POST_HIBERNATION ) ;
if ( error )
goto Restore ;
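The "robust" notifier pattern that the commit message describes (notify with the "up" value, and on a veto notify the callbacks that already ran with the "down" value) can be modelled in plain C. This is a single-threaded sketch under stated assumptions: notify_fn, call_chain_robust(), the EV_* values and the two sample callbacks are hypothetical, and the lock that the kernel holds across both passes is not modelled.

```c
#include <stddef.h>

/* Hypothetical event values standing in for a prepare/post pair. */
enum { EV_PREPARE = 1, EV_POST = 2 };

/* Hypothetical callback type: returns 0 on success, nonzero to veto. */
typedef int (*notify_fn)(int event);

/*
 * Call every callback with val_up. If callback i vetoes, call the i
 * callbacks that ran before it with val_down and return the error.
 * The kernel performs both passes inside one lock region so that the
 * list cannot change in between; this model has no lock.
 */
static int call_chain_robust(notify_fn *chain, size_t n, int val_up, int val_down)
{
	size_t i, j;
	int err = 0;

	for (i = 0; i < n; i++) {
		err = chain[i](val_up);
		if (err)
			break;
	}
	if (err) {
		for (j = 0; j < i; j++)
			chain[j](val_down);
	}
	return err;
}

static int prepared; /* number of callbacks currently in the prepared state */

/* Sample callback that always accepts and tracks its own state. */
static int ok_cb(int event)
{
	prepared += (event == EV_PREPARE) ? 1 : -1;
	return 0;
}

/* Sample callback that vetoes the prepare event. */
static int veto_cb(int event)
{
	return (event == EV_PREPARE) ? -1 : 0;
}
```

With a chain of ok_cb, ok_cb, veto_cb, ok_cb, the first two callbacks are prepared and then rolled back, the vetoing callback is not rolled back, and the last callback is never called.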
2007-05-06 14:50:45 -07:00
2019-02-25 20:36:41 +08:00
ksys_sync_helper ( ) ;
2007-10-18 03:04:44 -07:00
2011-11-21 12:32:24 -08:00
error = freeze_processes ( ) ;
2005-06-25 14:55:06 -07:00
if ( error )
2013-08-30 14:19:38 +02:00
goto Exit ;
2013-08-30 14:19:46 +02:00
lock_device_hotplug ( ) ;
2013-08-30 14:19:38 +02:00
/* Allocate memory management structures */
error = create_basic_memory_bitmaps ( ) ;
if ( error )
goto Thaw ;
2005-04-16 15:20:36 -07:00
2007-07-19 01:47:29 -07:00
error = hibernation_snapshot ( hibernation_mode == HIBERNATION_PLATFORM ) ;
2012-02-04 23:39:56 +01:00
if ( error || freezer_test_done )
2013-08-30 14:19:38 +02:00
goto Free_bitmaps ;
2009-07-08 13:24:05 +02:00
if ( in_suspend ) {
swsusp: introduce restore platform operations
At least on some machines it is necessary to prepare the ACPI firmware for the
restoration of the system memory state from the hibernation image if the
"platform" mode of hibernation has been used. Namely, in those cases we need
to disable the GPEs before replacing the "boot" kernel with the "frozen"
kernel (cf. http://bugzilla.kernel.org/show_bug.cgi?id=7887). After the
restore they will be re-enabled by hibernation_ops->finish(), but if the
restore fails, they have to be re-enabled by the restore code explicitly.
For this purpose we can introduce two additional hibernation operations,
called pre_restore() and restore_cleanup() and call them from the restore code
path. Still, they should be called if the "platform" mode of hibernation has
been used, so we need to pass the information about the hibernation mode from
the "frozen" kernel to the "boot" kernel in the image header.
Apparently, we can't drop the disabling of GPEs before the restore because of
Bug #7887. We also can't do it unconditionally, because the GPEs wouldn't
have been enabled after a successful restore if the suspend had been done in
the 'shutdown' or 'reboot' mode.
In principle we could (and probably should) unconditionally disable the GPEs
before each snapshot creation *and* before the restore, but then we'd have to
unconditionally enable them after the snapshot creation as well as after the
restore (or restore failure). Still, for this purpose we'd need to modify
acpi_enter_sleep_state_prep() and acpi_leave_sleep_state() and we'd have to
introduce some mechanism synchronizing the disabling/enabling of the GPEs with
the device drivers' .suspend()/.resume() routines and with
disable_/enable_nonboot_cpus(). However, this would have affected the
suspend (ie. s2ram) code as well as the hibernation, which I'd like to avoid
in this patch series.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-19 01:47:30 -07:00
unsigned int flags = 0 ;
if ( hibernation_mode == HIBERNATION_PLATFORM )
flags |= SF_PLATFORM_MODE ;
2024-01-22 18:45:27 +05:30
if ( nocompress ) {
2010-09-09 23:06:23 +02:00
flags |= SF_NOCOMPRESS_MODE ;
2024-01-22 18:45:27 +05:30
} else {
2011-10-13 23:58:07 +02:00
flags |= SF_CRC32_MODE ;
2024-01-22 18:45:27 +05:30
/*
* By default, LZO compression is enabled. Use SF_COMPRESSION_ALG_LZ4
* to override this behaviour and use LZ4.
*
* Refer to kernel/power/power.h for more details.
*/
if ( ! strcmp ( hib_comp_algo , COMPRESSION_ALGO_LZ4 ) )
flags |= SF_COMPRESSION_ALG_LZ4 ;
else
flags |= SF_COMPRESSION_ALG_LZO ;
}
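The flag word assembled above travels in the image header, so the kernel that later loads the image can tell how the image was written. A minimal sketch of building and decoding such a word follows. The DEMO_SF_* bit positions are illustrative assumptions (the authoritative SF_* definitions are in kernel/power/power.h), LZO is modelled as the absence of the LZ4 bit, and the demo_*() helpers are hypothetical.

```c
#include <stdbool.h>

/* Illustrative bit assignments only; see kernel/power/power.h for the real ones. */
#define DEMO_SF_PLATFORM_MODE       (1u << 0)
#define DEMO_SF_NOCOMPRESS_MODE     (1u << 1)
#define DEMO_SF_CRC32_MODE          (1u << 2)
#define DEMO_SF_COMPRESSION_ALG_LZ4 (1u << 4)

/* Mirrors the branch structure used when the image is written. */
static unsigned int demo_build_flags(bool platform_mode, bool nocompress, bool use_lz4)
{
	unsigned int flags = 0;

	if (platform_mode)
		flags |= DEMO_SF_PLATFORM_MODE;

	if (nocompress) {
		flags |= DEMO_SF_NOCOMPRESS_MODE;
	} else {
		flags |= DEMO_SF_CRC32_MODE;
		if (use_lz4)
			flags |= DEMO_SF_COMPRESSION_ALG_LZ4;
	}
	return flags;
}

/* What a restoring kernel would test before running platform hooks. */
static bool demo_wants_platform_restore(unsigned int flags)
{
	return (flags & DEMO_SF_PLATFORM_MODE) != 0;
}
```

Keeping the compression bits inside the else branch means an uncompressed image never carries an algorithm bit, which matches the branch structure of the code above.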
2020-01-02 15:19:40 -08:00
pm_pr_dbg ( "Writing hibernation image.\n" ) ;
2007-07-19 01:47:30 -07:00
error = swsusp_write ( flags ) ;
2007-07-19 01:47:29 -07:00
swsusp_free ( ) ;
2016-07-22 10:30:47 +08:00
if ( ! error ) {
if ( hibernation_mode == HIBERNATION_TEST_RESUME )
snapshot_test = true ;
else
power_down ( ) ;
}
2010-11-26 23:07:56 +01:00
in_suspend = 0 ;
2010-12-03 22:57:45 +01:00
pm_restore_gfp_mask ( ) ;
2006-11-02 22:07:19 -08:00
} else {
2020-01-02 15:19:40 -08:00
pm_pr_dbg ( "Hibernation image restored successfully.\n" ) ;
2006-11-02 22:07:19 -08:00
}
2009-07-08 13:24:05 +02:00
2013-08-30 14:19:38 +02:00
Free_bitmaps :
free_basic_memory_bitmaps ( ) ;
2006-11-02 22:07:19 -08:00
Thaw :
2013-08-30 14:19:46 +02:00
unlock_device_hotplug ( ) ;
2016-07-22 10:30:47 +08:00
if ( snapshot_test ) {
2017-07-19 02:38:44 +02:00
pm_pr_dbg ( "Checking hibernation image\n" ) ;
2023-09-06 12:18:52 +08:00
error = swsusp_check ( false ) ;
2016-07-22 10:30:47 +08:00
if ( ! error )
2023-09-27 11:34:23 +02:00
error = load_image_and_restore ( ) ;
2016-07-22 10:30:47 +08:00
}
2008-01-11 01:25:21 +01:00
thaw_processes ( ) ;
2012-02-04 23:39:56 +01:00
/* Don't bother checking whether freezer_test_done is true */
freezer_test_done = false ;
2007-05-06 14:50:45 -07:00
Exit :
2020-08-18 15:57:36 +02:00
	pm_notifier_call_chain(PM_POST_HIBERNATION);
 Restore:
2008-01-11 01:25:21 +01:00
	pm_restore_console();
2020-05-07 09:19:52 +02:00
	hibernate_release();
2007-07-19 01:47:36 -07:00
 Unlock:
2022-08-22 13:18:17 +02:00
	unlock_system_sleep(sleep_flags);
2017-07-20 03:38:07 +02:00
	pr_info("hibernation exit\n");
2005-04-16 15:20:36 -07:00
	return error;
}
2020-07-20 15:08:18 -07:00
/**
 * hibernate_quiet_exec - Execute a function with all devices frozen.
 * @func: Function to execute.
 * @data: Data pointer to pass to @func.
 *
 * Return the @func return value or an error code if it cannot be executed.
 */
int hibernate_quiet_exec(int (*func)(void *data), void *data)
{
2022-08-22 13:18:17 +02:00
	unsigned int sleep_flags;
2020-08-18 15:57:36 +02:00
	int error;
2020-07-20 15:08:18 -07:00
2022-08-22 13:18:17 +02:00
	sleep_flags = lock_system_sleep();
2020-07-20 15:08:18 -07:00
	if (!hibernate_acquire()) {
		error = -EBUSY;
		goto unlock;
	}
	pm_prepare_console();
2020-08-18 15:57:36 +02:00
	error = pm_notifier_call_chain_robust(PM_HIBERNATION_PREPARE, PM_POST_HIBERNATION);
	if (error)
		goto restore;
2020-07-20 15:08:18 -07:00
	error = freeze_processes();
	if (error)
		goto exit;
	lock_device_hotplug();
	pm_suspend_clear_flags();
	error = platform_begin(true);
	if (error)
		goto thaw;
	error = freeze_kernel_threads();
	if (error)
		goto thaw;
	error = dpm_prepare(PMSG_FREEZE);
	if (error)
		goto dpm_complete;
	suspend_console();
	error = dpm_suspend(PMSG_FREEZE);
	if (error)
		goto dpm_resume;
	error = dpm_suspend_end(PMSG_FREEZE);
	if (error)
		goto dpm_resume;
	error = platform_pre_snapshot(true);
	if (error)
		goto skip;
	error = func(data);
skip:
	platform_finish(true);
	dpm_resume_start(PMSG_THAW);
dpm_resume:
	dpm_resume(PMSG_THAW);
	resume_console();
dpm_complete:
	dpm_complete(PMSG_THAW);
	thaw_kernel_threads();
thaw:
	platform_end(true);
	unlock_device_hotplug();
	thaw_processes();
exit:
2020-08-18 15:57:36 +02:00
	pm_notifier_call_chain(PM_POST_HIBERNATION);
2020-07-20 15:08:18 -07:00
2020-08-18 15:57:36 +02:00
restore:
2020-07-20 15:08:18 -07:00
	pm_restore_console();
	hibernate_release();
unlock:
2022-08-22 13:18:17 +02:00
	unlock_system_sleep(sleep_flags);
2020-07-20 15:08:18 -07:00
	return error;
}
EXPORT_SYMBOL_GPL(hibernate_quiet_exec);
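The label ladder in hibernate_quiet_exec() undoes only the stages that were reached before a failure, in reverse order. The following is a minimal userspace sketch of that pattern, not kernel code: staged_exec(), note() and the trace buffer are hypothetical names introduced purely for illustration, and the three stages stand in for the much longer sequence above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical trace buffer: records the order in which steps run. */
static char trace[64];

static void note(const char *step)
{
	assert(strlen(trace) + strlen(step) + 2 < sizeof(trace));
	strcat(trace, step);
	strcat(trace, " ");
}

/*
 * Run three stages in order. fail_at (1..3) makes that stage fail;
 * 0 lets all of them succeed. Each label undoes only what was set up
 * before the jump, in reverse order of setup.
 */
static int staged_exec(int fail_at)
{
	int error = 0;

	trace[0] = '\0';

	if (fail_at == 1) {
		error = -1;
		goto out;		/* nothing acquired yet */
	}
	note("lock");

	if (fail_at == 2) {
		error = -2;
		goto unlock;		/* only the lock is held */
	}
	note("freeze");

	if (fail_at == 3) {
		error = -3;
		goto thaw;		/* lock held, tasks frozen */
	}
	note("run");

thaw:
	note("thaw");
unlock:
	note("unlock");
out:
	return error;
}
```

The success path falls through every label, so setup and teardown stay paired in one place instead of being repeated at each failure site.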
2005-04-16 15:20:36 -07:00
2023-05-31 14:55:15 +02:00
static int __init find_resume_device(void)
2023-05-31 14:55:13 +02:00
{
	if (!strlen(resume_file))
		return -ENOENT;
	pm_pr_dbg("Checking hibernation image partition %s\n", resume_file);
	if (resume_delay) {
		pr_info("Waiting %dsec before reading resume device ...\n",
			resume_delay);
		ssleep(resume_delay);
	}
	/* Check if the device is there */
2023-05-31 14:55:24 +02:00
	if (!early_lookup_bdev(resume_file, &swsusp_resume_device))
2023-05-31 14:55:13 +02:00
		return 0;
	/*
	 * Some device discovery might still be in progress; we need to wait for
	 * this to finish.
	 */
	wait_for_device_probe();
	if (resume_wait) {
2023-05-31 14:55:24 +02:00
		while (early_lookup_bdev(resume_file, &swsusp_resume_device))
2023-05-31 14:55:13 +02:00
			msleep(10);
		async_synchronize_full();
	}
2023-05-31 14:55:24 +02:00
	return early_lookup_bdev(resume_file, &swsusp_resume_device);
2023-05-31 14:55:13 +02:00
}
2005-04-16 15:20:36 -07:00
static int software_resume(void)
{
2020-08-18 15:57:36 +02:00
	int error;
2005-04-16 15:20:36 -07:00
2017-07-19 02:38:44 +02:00
	pm_pr_dbg("Hibernation image partition %d:%d present\n",
2009-04-25 00:16:06 +02:00
		MAJOR(swsusp_resume_device), MINOR(swsusp_resume_device));
2005-04-16 15:20:36 -07:00
2017-07-19 02:38:44 +02:00
	pm_pr_dbg("Looking for hibernation image.\n");
2023-05-31 14:55:15 +02:00
	mutex_lock(&system_transition_mutex);
2023-09-06 12:18:52 +08:00
	error = swsusp_check(true);
2007-02-10 01:43:32 -08:00
	if (error)
2007-05-06 14:50:43 -07:00
		goto Unlock;
2005-04-16 15:20:36 -07:00
2024-01-22 18:45:26 +05:30
	/*
	 * Check if the hibernation image is compressed. If so, query for
	 * the algorithm support.
	 */
	if (!(swsusp_header_flags & SF_NOCOMPRESS_MODE)) {
2024-01-22 18:45:27 +05:30
		if (swsusp_header_flags & SF_COMPRESSION_ALG_LZ4)
			strscpy(hib_comp_algo, COMPRESSION_ALGO_LZ4, sizeof(hib_comp_algo));
		else
2024-02-14 13:09:32 +05:30
			strscpy(hib_comp_algo, COMPRESSION_ALGO_LZO, sizeof(hib_comp_algo));
2024-01-22 18:45:26 +05:30
		if (crypto_has_comp(hib_comp_algo, 0, 0) != 1) {
			pr_err("%s compression is not available\n", hib_comp_algo);
			error = -EOPNOTSUPP;
			goto Unlock;
		}
	}
2007-05-06 14:50:45 -07:00
	/* The snapshot device should not be opened while we're running */
2020-05-07 09:19:52 +02:00
	if (!hibernate_acquire()) {
2007-05-06 14:50:45 -07:00
		error = -EBUSY;
2023-09-27 11:34:23 +02:00
		swsusp_close();
2007-05-06 14:50:45 -07:00
		goto Unlock;
	}
2017-07-20 03:38:07 +02:00
	pr_info("resume from hibernation\n");
2008-01-11 01:25:21 +01:00
	pm_prepare_console();
2020-08-18 15:57:36 +02:00
	error = pm_notifier_call_chain_robust(PM_RESTORE_PREPARE, PM_POST_RESTORE);
	if (error)
		goto Restore;
2008-10-15 22:01:21 -07:00
2020-01-02 15:19:40 -08:00
	pm_pr_dbg("Preparing processes for hibernation restore.\n");
2011-11-21 12:32:24 -08:00
	error = freeze_processes();
2013-08-30 14:19:38 +02:00
	if (error)
		goto Close_Finish;
2020-04-23 20:40:16 -07:00
	error = freeze_kernel_threads();
	if (error) {
		thaw_processes();
		goto Close_Finish;
	}
2023-09-27 11:34:23 +02:00
	error = load_image_and_restore();
2013-08-30 14:19:38 +02:00
	thaw_processes();
2007-05-06 14:50:45 -07:00
 Finish:
2020-08-18 15:57:36 +02:00
	pm_notifier_call_chain(PM_POST_RESTORE);
 Restore:
2008-01-11 01:25:21 +01:00
	pm_restore_console();
2020-01-02 15:19:40 -08:00
	pr_info("resume failed (%d)\n", error);
2020-05-07 09:19:52 +02:00
	hibernate_release();
2005-09-03 15:57:04 -07:00
	/* For success case, the suspend path will release the lock */
2007-05-06 14:50:43 -07:00
 Unlock:
2018-07-31 16:51:32 +08:00
	mutex_unlock(&system_transition_mutex);
2017-07-19 02:38:44 +02:00
	pm_pr_dbg("Hibernation image not present or could not be loaded.\n");
2007-07-19 01:47:29 -07:00
	return error;
2013-08-30 14:19:38 +02:00
 Close_Finish:
2023-09-27 11:34:23 +02:00
	swsusp_close();
2009-10-07 22:37:35 +02:00
	goto Finish;
2005-04-16 15:20:36 -07:00
}
2023-05-31 14:55:15 +02:00
/**
 * software_resume_initcall - Resume from a saved hibernation image.
 *
 * This routine is called as a late initcall, when all devices have been
 * discovered and initialized already.
 *
 * The image reading code is called to see if there is a hibernation image
 * available for reading.  If that is the case, devices are quiesced and the
 * contents of memory are restored from the saved image.
 *
 * If this is successful, control reappears in the restored target kernel in
 * hibernation_snapshot() which returns to hibernate().  Otherwise, the routine
 * attempts to recover gracefully and make the kernel return to the normal mode
 * of operation.
 */
static int __init software_resume_initcall(void)
{
	/*
	 * If the user said "noresume".. bail out early.
	 */
	if (noresume || !hibernation_available())
		return 0;
	if (!swsusp_resume_device) {
		int error = find_resume_device();
		if (error)
			return error;
	}
	return software_resume();
}
late_initcall_sync(software_resume_initcall);
2005-04-16 15:20:36 -07:00
2007-05-09 02:33:18 -07:00
static const char * const hibernation_modes[] = {
	[HIBERNATION_PLATFORM]	= "platform",
	[HIBERNATION_SHUTDOWN]	= "shutdown",
	[HIBERNATION_REBOOT]	= "reboot",
2012-06-16 00:09:58 +02:00
#ifdef CONFIG_SUSPEND
	[HIBERNATION_SUSPEND]	= "suspend",
#endif
2016-07-22 10:30:47 +08:00
	[HIBERNATION_TEST_RESUME]	= "test_resume",
2005-04-16 15:20:36 -07:00
};
2011-05-24 23:36:06 +02:00
/*
 * /sys/power/disk - Control hibernation mode.
2005-04-16 15:20:36 -07:00
 *
2011-05-24 23:36:06 +02:00
 * Hibernation can be handled in several ways.  There are a few different ways
 * to put the system into the sleep state: using the platform driver (e.g. ACPI
 * or other hibernation_ops), powering it off or rebooting it (for testing
2011-12-01 22:33:20 +01:00
 * mostly).
2005-04-16 15:20:36 -07:00
 *
2011-05-24 23:36:06 +02:00
 * The sysfs file /sys/power/disk provides an interface for selecting the
 * hibernation mode to use.  Reading from this file causes the available modes
2011-12-01 22:33:20 +01:00
 * to be printed.  The modes that can be supported are:
2005-04-16 15:20:36 -07:00
 *
 *	'platform'
 *	'shutdown'
 *	'reboot'
 *	'suspend' (only if CONFIG_SUSPEND is set)
 *	'test_resume'
 *
2011-05-24 23:36:06 +02:00
 * If a platform hibernation driver is in use, 'platform' will be supported
 * and will be used by default.  Otherwise, 'shutdown' will be used by default.
 * The selected option (i.e. the one corresponding to the current value of
 * hibernation_mode) is enclosed in square brackets.
 *
 * To select a given hibernation mode it is necessary to write the mode's
 * string representation (as returned by reading from /sys/power/disk) back
 * into /sys/power/disk.
2005-04-16 15:20:36 -07:00
 */
2007-11-02 13:47:53 +01:00
static ssize_t disk_show(struct kobject *kobj, struct kobj_attribute *attr,
			 char *buf)
2005-04-16 15:20:36 -07:00
{
2007-05-06 14:50:50 -07:00
	int i;
	char *start = buf;
2014-06-13 13:30:35 -07:00
	if (!hibernation_available())
		return sprintf(buf, "[disabled]\n");
2007-05-09 02:33:18 -07:00
	for (i = HIBERNATION_FIRST; i <= HIBERNATION_MAX; i++) {
		if (!hibernation_modes[i])
2007-05-06 14:50:50 -07:00
			continue;
		switch (i) {
2007-05-09 02:33:18 -07:00
		case HIBERNATION_SHUTDOWN:
		case HIBERNATION_REBOOT:
2012-06-16 00:09:58 +02:00
#ifdef CONFIG_SUSPEND
		case HIBERNATION_SUSPEND:
#endif
2016-07-22 10:30:47 +08:00
		case HIBERNATION_TEST_RESUME:
2007-05-06 14:50:50 -07:00
			break;
2007-05-09 02:33:18 -07:00
		case HIBERNATION_PLATFORM:
			if (hibernation_ops)
2007-05-06 14:50:50 -07:00
				break;
			/* not a valid mode, continue with loop */
			continue;
		}
2007-05-09 02:33:18 -07:00
		if (i == hibernation_mode)
			buf += sprintf(buf, "[%s] ", hibernation_modes[i]);
2007-05-06 14:50:50 -07:00
		else
2007-05-09 02:33:18 -07:00
			buf += sprintf(buf, "%s ", hibernation_modes[i]);
2007-05-06 14:50:50 -07:00
	}
	buf += sprintf(buf, "\n");
	return buf - start;
2005-04-16 15:20:36 -07:00
}
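disk_show() marks the active mode by wrapping it in square brackets, so a reader of /sys/power/disk has to pick out the bracketed token. The following is a minimal userspace sketch of that parsing step, assuming the line has already been read into a NUL-terminated string; selected_mode() is a hypothetical helper and not part of the kernel or of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy the bracketed token of a /sys/power/disk style line into out.
 * Returns 0 on success, -1 if no well-formed "[mode]" token is found
 * or if out is too small to hold it.
 */
static int selected_mode(const char *line, char *out, size_t outlen)
{
	const char *open, *close;
	size_t len;

	assert(line != NULL && out != NULL);

	open = strchr(line, '[');
	if (!open)
		return -1;
	close = strchr(open + 1, ']');
	if (!close)
		return -1;

	/* Length of the text strictly between the two brackets. */
	len = (size_t)(close - open - 1);
	if (len == 0 || len >= outlen)
		return -1;

	memcpy(out, open + 1, len);
	out[len] = '\0';
	return 0;
}
```

Note that the same helper also returns "disabled" for the "[disabled]" line printed when hibernation is unavailable, so a caller would likely want to treat that string specially.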
2007-11-02 13:47:53 +01:00
static ssize_t disk_store(struct kobject *kobj, struct kobj_attribute *attr,
			  const char *buf, size_t n)
2005-04-16 15:20:36 -07:00
{
2022-08-22 13:18:17 +02:00
	int mode = HIBERNATION_INVALID;
	unsigned int sleep_flags;
2005-04-16 15:20:36 -07:00
	int error = 0;
	int len;
	char *p;
2022-08-22 13:18:17 +02:00
	int i;
2005-04-16 15:20:36 -07:00
2014-06-13 13:30:35 -07:00
	if (!hibernation_available())
		return -EPERM;
2005-04-16 15:20:36 -07:00
	p = memchr(buf, '\n', n);
	len = p ? p - buf : n;
2022-08-22 13:18:17 +02:00
	sleep_flags = lock_system_sleep();
2007-05-09 02:33:18 -07:00
	for (i = HIBERNATION_FIRST; i <= HIBERNATION_MAX; i++) {
2007-05-16 22:11:19 -07:00
		if (len == strlen(hibernation_modes[i])
		    && !strncmp(buf, hibernation_modes[i], len)) {
2005-04-16 15:20:36 -07:00
			mode = i;
			break;
		}
	}
2007-05-09 02:33:18 -07:00
	if (mode != HIBERNATION_INVALID) {
rework pm_ops pm_disk_mode, kill misuse
This patch series cleans up some misconceptions about pm_ops. Some users of
the pm_ops structure attempt to use it to stop the user from entering suspend
to disk, this, however, is not possible since the user can always use
"shutdown" in /sys/power/disk and then the pm_ops are never invoked. Also,
platforms that don't support suspend to disk simply should not allow
configuring SOFTWARE_SUSPEND (read the help text on it, it only selects
suspend to disk and nothing else, all the other stuff depends on PM).
The pm_ops structure is actually intended to provide a way to enter
platform-defined sleep states (currently supported states are "standby" and
"mem" (suspend to ram)) and additionally (if SOFTWARE_SUSPEND is configured)
allows a platform to support a platform specific way to enter low-power mode
once everything has been saved to disk. This is currently only used by ACPI
(S4).
This patch:
The pm_ops.pm_disk_mode is used in totally bogus ways since nobody really
seems to understand what it actually does.
This patch clarifies the pm_disk_mode description.
It also removes all the arm and sh users that think they can veto suspend to
disk via pm_ops; not so since the user can always do echo shutdown >
/sys/power/disk, they need to find a better way involving Kconfig or such.
ACPI is the only user left with a non-zero pm_disk_mode.
The patch also sets the default mode to shutdown again, but when a new pm_ops
is registered its pm_disk_mode is selected as default, that way the default
stays for ACPI where it is apparently required.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Cc: David Brownell <david-b@pacbell.net>
Acked-by: Pavel Machek <pavel@ucw.cz>
Cc: <linux-pm@lists.linux-foundation.org>
Cc: Len Brown <lenb@kernel.org>
Acked-by: Russell King <rmk@arm.linux.org.uk>
Cc: Greg KH <greg@kroah.com>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-04-30 15:09:51 -07:00
		switch (mode) {
2007-05-09 02:33:18 -07:00
		case HIBERNATION_SHUTDOWN:
		case HIBERNATION_REBOOT:
2012-06-16 00:09:58 +02:00
#ifdef CONFIG_SUSPEND
		case HIBERNATION_SUSPEND:
#endif
2016-07-22 10:30:47 +08:00
		case HIBERNATION_TEST_RESUME:
2007-05-09 02:33:18 -07:00
			hibernation_mode = mode;
2007-04-30 15:09:51 -07:00
			break;
2007-05-09 02:33:18 -07:00
		case HIBERNATION_PLATFORM:
			if (hibernation_ops)
				hibernation_mode = mode;
2005-04-16 15:20:36 -07:00
			else
				error = -EINVAL;
		}
2007-05-09 02:33:18 -07:00
	} else
2005-04-16 15:20:36 -07:00
		error = -EINVAL;
2007-05-09 02:33:18 -07:00
	if (!error)
2017-07-19 02:38:44 +02:00
		pm_pr_dbg("Hibernation mode set to '%s'\n",
			  hibernation_modes[mode]);
2022-08-22 13:18:17 +02:00
	unlock_system_sleep(sleep_flags);
2005-04-16 15:20:36 -07:00
	return error ? error : n;
}
power_attr(disk);
2007-11-02 13:47:53 +01:00
static ssize_t resume_show(struct kobject *kobj, struct kobj_attribute *attr,
			   char *buf)
2005-04-16 15:20:36 -07:00
{
2020-06-22 17:09:13 +08:00
	return sprintf(buf, "%d:%d\n", MAJOR(swsusp_resume_device),
2005-04-16 15:20:36 -07:00
		       MINOR(swsusp_resume_device));
}
2007-11-02 13:47:53 +01:00
static ssize_t resume_store(struct kobject *kobj, struct kobj_attribute *attr,
			    const char *buf, size_t n)
2005-04-16 15:20:36 -07:00
{
2022-08-22 13:18:17 +02:00
	unsigned int sleep_flags;
2014-02-14 14:52:56 -08:00
	int len = n;
	char *name;
2023-05-31 14:55:24 +02:00
	dev_t dev;
	int error;
2005-04-16 15:20:36 -07:00
2023-05-31 14:55:15 +02:00
	if (!hibernation_available())
2023-08-07 10:33:57 +02:00
		return n;
2023-05-31 14:55:15 +02:00
2014-02-14 14:52:56 -08:00
	if (len && buf[len - 1] == '\n')
		len--;
	name = kstrndup(buf, len, GFP_KERNEL);
	if (!name)
		return -ENOMEM;
2005-04-16 15:20:36 -07:00
2023-05-31 14:55:32 +02:00
	error = lookup_bdev(name, &dev);
	if (error) {
		unsigned maj, min, offset;
		char *p, dummy;
2023-07-11 13:48:12 +02:00
		error = 0;
2023-05-31 14:55:32 +02:00
		if (sscanf(name, "%u:%u%c", &maj, &min, &dummy) == 2 ||
		    sscanf(name, "%u:%u:%u:%c", &maj, &min, &offset,
				&dummy) == 3) {
			dev = MKDEV(maj, min);
			if (maj != MAJOR(dev) || min != MINOR(dev))
				error = -EINVAL;
		} else {
			dev = new_decode_dev(simple_strtoul(name, &p, 16));
			if (*p)
				error = -EINVAL;
		}
	}
2014-02-14 14:52:56 -08:00
	kfree(name);
2023-05-31 14:55:24 +02:00
	if (error)
		return error;
2005-04-16 15:20:36 -07:00
2022-08-22 13:18:17 +02:00
	sleep_flags = lock_system_sleep();
2023-05-31 14:55:24 +02:00
	swsusp_resume_device = dev;
2022-08-22 13:18:17 +02:00
	unlock_system_sleep(sleep_flags);
2020-01-02 15:19:40 -08:00
	pm_pr_dbg("Configured hibernation resume from disk to %u\n",
		  swsusp_resume_device);
2006-01-06 00:09:50 -08:00
	noresume = 0;
	software_resume();
2014-02-14 14:52:56 -08:00
	return n;
2005-04-16 15:20:36 -07:00
}
power_attr(resume);
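When lookup_bdev() fails, resume_store() falls back to parsing the string itself: "maj:min", "maj:min:offset", or a bare hexadecimal device number. The trailing %c conversion is what rejects junk after the numbers, because a matched extra character raises the sscanf() return count. The following is a userspace sketch of that fallback only. parse_resume_spec() and the SKETCH_* limits are hypothetical names, the 12-bit major / 20-bit minor limits are an assumption standing in for the MKDEV() round-trip check, and the hex branch reproduces the bit layout that new_decode_dev() is understood to use.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumed limits: 12-bit major, 20-bit minor. */
#define SKETCH_MAJOR_MAX 0xfffu
#define SKETCH_MINOR_MAX 0xfffffu

/*
 * Parse "maj:min", "maj:min:offset" or a hex device number.
 * Returns 0 and fills *maj / *min on success, -1 on malformed input
 * (in which case *maj and *min hold unspecified values).
 */
static int parse_resume_spec(const char *name, unsigned *maj, unsigned *min)
{
	unsigned offset;
	unsigned long dev;
	char *end, dummy;

	assert(name != NULL && maj != NULL && min != NULL);

	/* A return of exactly 2 (or 3) means no stray trailing character. */
	if (sscanf(name, "%u:%u%c", maj, min, &dummy) == 2 ||
	    sscanf(name, "%u:%u:%u:%c", maj, min, &offset, &dummy) == 3) {
		if (*maj > SKETCH_MAJOR_MAX || *min > SKETCH_MINOR_MAX)
			return -1;
		return 0;
	}

	/* Fallback: whole string must be a hex number. */
	dev = strtoul(name, &end, 16);
	if (*end != '\0')
		return -1;
	*maj = (unsigned)((dev & 0xfff00) >> 8);
	*min = (unsigned)((dev & 0xff) | ((dev >> 12) & 0xfff00));
	return 0;
}
```

As in the kernel code, the offset component is accepted syntactically but otherwise ignored by this parser.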
2018-03-28 12:01:09 -05:00
static ssize_t resume_offset_show(struct kobject *kobj,
				  struct kobj_attribute *attr, char *buf)
{
	return sprintf(buf, "%llu\n", (unsigned long long)swsusp_resume_block);
}
static ssize_t resume_offset_store(struct kobject *kobj,
				   struct kobj_attribute *attr, const char *buf,
				   size_t n)
{
	unsigned long long offset;
	int rc;
	rc = kstrtoull(buf, 0, &offset);
	if (rc)
		return rc;
	swsusp_resume_block = offset;
	return n;
}
power_attr(resume_offset);
2007-11-02 13:47:53 +01:00
static ssize_t image_size_show(struct kobject *kobj, struct kobj_attribute *attr,
			       char *buf)
2006-01-06 00:15:56 -08:00
{
2006-02-01 03:05:07 -08:00
	return sprintf(buf, "%lu\n", image_size);
2006-01-06 00:15:56 -08:00
}
2007-11-02 13:47:53 +01:00
static ssize_t image_size_store(struct kobject *kobj, struct kobj_attribute *attr,
				const char *buf, size_t n)
2006-01-06 00:15:56 -08:00
{
2006-02-01 03:05:07 -08:00
	unsigned long size;
2006-01-06 00:15:56 -08:00
2006-02-01 03:05:07 -08:00
	if (sscanf(buf, "%lu", &size) == 1) {
2006-01-06 00:15:56 -08:00
		image_size = size;
		return n;
	}
	return -EINVAL;
}
power_attr(image_size);
2011-05-15 11:38:48 +02:00
static ssize_t reserved_size_show(struct kobject *kobj,
				  struct kobj_attribute *attr, char *buf)
{
	return sprintf(buf, "%lu\n", reserved_size);
}
static ssize_t reserved_size_store(struct kobject *kobj,
				   struct kobj_attribute *attr,
				   const char *buf, size_t n)
{
	unsigned long size;
	if (sscanf(buf, "%lu", &size) == 1) {
		reserved_size = size;
		return n;
	}
	return -EINVAL;
}
power_attr(reserved_size);
2020-06-22 17:09:13 +08:00
static struct attribute *g[] = {
2005-04-16 15:20:36 -07:00
	&disk_attr.attr,
2018-03-28 12:01:09 -05:00
	&resume_offset_attr.attr,
2005-04-16 15:20:36 -07:00
	&resume_attr.attr,
2006-01-06 00:15:56 -08:00
	&image_size_attr.attr,
2011-05-15 11:38:48 +02:00
	&reserved_size_attr.attr,
2005-04-16 15:20:36 -07:00
	NULL,
};
2017-06-29 16:58:40 +05:30
static const struct attribute_group attr_group = {
2005-04-16 15:20:36 -07:00
	.attrs = g,
};
static int __init pm_disk_init(void)
{
2007-11-27 11:28:26 -08:00
	return sysfs_create_group(power_kobj, &attr_group);
2005-04-16 15:20:36 -07:00
}
core_initcall(pm_disk_init);
static int __init resume_setup(char *str)
{
	if (noresume)
		return 1;
2024-04-29 20:50:30 +00:00
	strscpy(resume_file, str);
2005-04-16 15:20:36 -07:00
	return 1;
}
2006-12-06 20:34:12 -08:00
static int __init resume_offset_setup(char *str)
{
	unsigned long long offset;
	if (noresume)
		return 1;
	if (sscanf(str, "%llu", &offset) == 1)
		swsusp_resume_block = offset;
	return 1;
}
2010-09-09 23:06:23 +02:00
static int __init hibernate_setup(char *str)
{
2016-07-06 02:40:56 +02:00
	if (!strncmp(str, "noresume", 8)) {
2010-09-09 23:06:23 +02:00
		noresume = 1;
2016-07-06 02:40:56 +02:00
	} else if (!strncmp(str, "nocompress", 10)) {
2010-09-09 23:06:23 +02:00
		nocompress = 1;
2016-07-06 02:40:56 +02:00
	} else if (!strncmp(str, "no", 2)) {
2014-06-13 13:30:35 -07:00
		noresume = 1;
		nohibernate = 1;
2017-02-06 16:31:58 -08:00
	} else if (IS_ENABLED(CONFIG_STRICT_KERNEL_RWX)
2016-07-10 02:12:10 +02:00
		   && !strncmp(str, "protect_image", 13)) {
		enable_restore_image_protection();
2014-06-13 13:30:35 -07:00
	}
2010-09-09 23:06:23 +02:00
	return 1;
}
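hibernate_setup() matches by prefix, so the order of the tests matters: "noresume" and "nocompress" both start with "no" and must be checked before the bare "no" case. The following is a minimal userspace sketch of that ordering. The enum and classify_hibernate_opt() are hypothetical names for illustration, and the CONFIG_STRICT_KERNEL_RWX condition on "protect_image" is deliberately left out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result codes for the sketch. */
enum hib_opt {
	HIB_OPT_NONE,
	HIB_OPT_NORESUME,
	HIB_OPT_NOCOMPRESS,
	HIB_OPT_DISABLE,
	HIB_OPT_PROTECT_IMAGE,
};

static enum hib_opt classify_hibernate_opt(const char *str)
{
	assert(str != NULL);

	/* Longer "no..." keywords first, or the bare "no" test would win. */
	if (!strncmp(str, "noresume", 8))
		return HIB_OPT_NORESUME;
	if (!strncmp(str, "nocompress", 10))
		return HIB_OPT_NOCOMPRESS;
	if (!strncmp(str, "no", 2))
		return HIB_OPT_DISABLE;
	if (!strncmp(str, "protect_image", 13))
		return HIB_OPT_PROTECT_IMAGE;
	return HIB_OPT_NONE;
}
```

One consequence of prefix matching is that any other value beginning with "no" is also treated as a request to disable hibernation.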
2005-04-16 15:20:36 -07:00
static int __init noresume_setup(char *str)
{
	noresume = 1;
	return 1;
}
2011-10-06 20:34:46 +02:00
static int __init resumewait_setup(char *str)
{
	resume_wait = 1;
	return 1;
}
2011-10-10 23:38:41 +02:00
static int __init resumedelay_setup(char *str)
{
2014-05-14 19:08:46 +03:00
	int rc = kstrtouint(str, 0, &resume_delay);
2014-05-09 23:32:08 +02:00
	if (rc)
2022-02-28 14:05:32 -08:00
		pr_warn("resumedelay: bad option string '%s'\n", str);
2011-10-10 23:38:41 +02:00
	return 1;
}
2014-06-13 13:30:35 -07:00
static int __init nohibernate_setup(char *str)
{
	noresume = 1;
	nohibernate = 1;
	return 1;
}
2024-02-14 13:09:32 +05:30
static const char * const comp_alg_enabled[] = {
#if IS_ENABLED(CONFIG_CRYPTO_LZO)
	COMPRESSION_ALGO_LZO,
#endif
#if IS_ENABLED(CONFIG_CRYPTO_LZ4)
	COMPRESSION_ALGO_LZ4,
#endif
};
static int hibernate_compressor_param_set(const char *compressor,
		const struct kernel_param *kp)
{
	unsigned int sleep_flags;
	int index, ret;
	sleep_flags = lock_system_sleep();
	index = sysfs_match_string(comp_alg_enabled, compressor);
	if (index >= 0) {
		ret = param_set_copystring(comp_alg_enabled[index], kp);
		if (!ret)
			strscpy(hib_comp_algo, comp_alg_enabled[index],
				sizeof(hib_comp_algo));
	} else {
		ret = index;
	}
	unlock_system_sleep(sleep_flags);
	if (ret)
		pr_debug("Cannot set specified compressor %s\n",
			 compressor);
	return ret;
}
static const struct kernel_param_ops hibernate_compressor_param_ops = {
	.set	= hibernate_compressor_param_set,
	.get	= param_get_string,
};
static struct kparam_string hibernate_compressor_param_string = {
	.maxlen	= sizeof(hibernate_compressor),
	.string	= hibernate_compressor,
};
module_param_cb(compressor, &hibernate_compressor_param_ops,
		&hibernate_compressor_param_string, 0644);
MODULE_PARM_DESC(compressor,
		 "Compression algorithm to be used with hibernation");
2005-04-16 15:20:36 -07:00
__setup("noresume", noresume_setup);
2006-12-06 20:34:12 -08:00
__setup("resume_offset=", resume_offset_setup);
2005-04-16 15:20:36 -07:00
__setup("resume=", resume_setup);
2010-09-09 23:06:23 +02:00
__setup("hibernate=", hibernate_setup);
2011-10-06 20:34:46 +02:00
__setup("resumewait", resumewait_setup);
2011-10-10 23:38:41 +02:00
__setup("resumedelay=", resumedelay_setup);
2014-06-13 13:30:35 -07:00
__setup("nohibernate", nohibernate_setup);