2019-06-01 11:08:42 +03:00
// SPDX-License-Identifier: GPL-2.0-only
2009-06-10 03:27:12 +04:00
/*
* kernel / power / suspend . c - Suspend to RAM and standby functionality .
*
* Copyright ( c ) 2003 Patrick Mochel
* Copyright ( c ) 2003 Open Source Development Lab
* Copyright ( c ) 2009 Rafael J . Wysocki < rjw @ sisk . pl > , Novell Inc .
*/
2017-07-21 15:50:48 +03:00
# define pr_fmt(fmt) "PM: " fmt
2009-06-10 03:27:12 +04:00
# include <linux/string.h>
# include <linux/delay.h>
# include <linux/errno.h>
# include <linux/init.h>
# include <linux/console.h>
# include <linux/cpu.h>
2014-04-21 01:43:01 +04:00
# include <linux/cpuidle.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 11:04:11 +03:00
# include <linux/gfp.h>
2010-05-29 00:32:14 +04:00
# include <linux/io.h>
# include <linux/kernel.h>
# include <linux/list.h>
# include <linux/mm.h>
# include <linux/slab.h>
2011-05-27 00:00:52 +04:00
# include <linux/export.h>
2010-05-29 00:32:14 +04:00
# include <linux/suspend.h>
2011-03-15 02:43:46 +03:00
# include <linux/syscore_ops.h>
2018-05-25 12:46:47 +03:00
# include <linux/swait.h>
2012-06-16 17:30:45 +04:00
# include <linux/ftrace.h>
2011-01-05 21:49:01 +03:00
# include <trace/events/power.h>
2014-04-08 02:39:20 +04:00
# include <linux/compiler.h>
PM / sleep: add configurable delay for pm_test
When CONFIG_PM_DEBUG=y, we provide a sysfs file (/sys/power/pm_test) for
selecting one of a few suspend test modes, where rather than entering a
full suspend state, the kernel will perform some subset of suspend
steps, wait 5 seconds, and then resume back to normal operation.
This mode is useful for (among other things) observing the state of the
system just before entering a sleep mode, for debugging or analysis
purposes. However, a constant 5 second wait is not sufficient for some
sorts of analysis; for example, on an SoC, one might want to use
external tools to probe the power states of various on-chip controllers
or clocks.
This patch turns this 5 second delay into a configurable module
parameter, so users can determine how long to wait in this
pseudo-suspend state before resuming the system.
Example (wait 30 seconds);
# echo 30 > /sys/module/suspend/parameters/pm_test_delay
# echo core > /sys/power/pm_test
# time echo mem > /sys/power/state
...
[ 17.583625] suspend debug: Waiting for 30 second(s).
...
real 0m30.381s
user 0m0.017s
sys 0m0.080s
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Kevin Cernekee <cernekee@chromium.org>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-23 08:16:49 +03:00
# include <linux/moduleparam.h>
2009-06-10 03:27:12 +04:00
# include "power.h"
PM / sleep: System sleep state selection interface rework
There are systems in which the platform doesn't support any special
sleep states, so suspend-to-idle (PM_SUSPEND_FREEZE) is the only
available system sleep state. However, some user space frameworks
only use the "mem" and (sometimes) "standby" sleep state labels, so
the users of those systems need to modify user space in order to be
able to use system suspend at all and that may be a pain in practice.
Commit 0399d4db3edf (PM / sleep: Introduce command line argument for
sleep state enumeration) attempted to address this problem by adding
a command line argument to change the meaning of the "mem" string in
/sys/power/state to make it trigger suspend-to-idle (instead of
suspend-to-RAM).
However, there also are systems in which the platform does support
special sleep states, but suspend-to-idle is the preferred one anyway
(it even may save more energy than the platform-provided sleep states
in some cases) and the above commit doesn't help in those cases.
For this reason, rework the system sleep state selection interface
again (but preserve backwards compatibiliby). Namely, add a new
sysfs file, /sys/power/mem_sleep, that will control the system
suspend mode triggered by writing "mem" to /sys/power/state (in
analogy with what /sys/power/disk does for hibernation). Make it
select suspend-to-RAM ("deep" sleep) by default (if supported) and
fall back to suspend-to-idle ("s2idle") otherwise and add a new
command line argument, mem_sleep_default, allowing that default to
be overridden if need be.
At the same time, drop the relative_sleep_states command line
argument that doesn't make sense any more.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mario Limonciello <mario.limonciello@dell.com>
2016-11-22 00:45:40 +03:00
const char * const pm_labels [ ] = {
2017-08-10 01:13:07 +03:00
[ PM_SUSPEND_TO_IDLE ] = " freeze " ,
PM / sleep: System sleep state selection interface rework
There are systems in which the platform doesn't support any special
sleep states, so suspend-to-idle (PM_SUSPEND_FREEZE) is the only
available system sleep state. However, some user space frameworks
only use the "mem" and (sometimes) "standby" sleep state labels, so
the users of those systems need to modify user space in order to be
able to use system suspend at all and that may be a pain in practice.
Commit 0399d4db3edf (PM / sleep: Introduce command line argument for
sleep state enumeration) attempted to address this problem by adding
a command line argument to change the meaning of the "mem" string in
/sys/power/state to make it trigger suspend-to-idle (instead of
suspend-to-RAM).
However, there also are systems in which the platform does support
special sleep states, but suspend-to-idle is the preferred one anyway
(it even may save more energy than the platform-provided sleep states
in some cases) and the above commit doesn't help in those cases.
For this reason, rework the system sleep state selection interface
again (but preserve backwards compatibiliby). Namely, add a new
sysfs file, /sys/power/mem_sleep, that will control the system
suspend mode triggered by writing "mem" to /sys/power/state (in
analogy with what /sys/power/disk does for hibernation). Make it
select suspend-to-RAM ("deep" sleep) by default (if supported) and
fall back to suspend-to-idle ("s2idle") otherwise and add a new
command line argument, mem_sleep_default, allowing that default to
be overridden if need be.
At the same time, drop the relative_sleep_states command line
argument that doesn't make sense any more.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mario Limonciello <mario.limonciello@dell.com>
2016-11-22 00:45:40 +03:00
[ PM_SUSPEND_STANDBY ] = " standby " ,
[ PM_SUSPEND_MEM ] = " mem " ,
} ;
2014-07-16 00:02:11 +04:00
const char * pm_states [ PM_SUSPEND_MAX ] ;
PM / sleep: System sleep state selection interface rework
There are systems in which the platform doesn't support any special
sleep states, so suspend-to-idle (PM_SUSPEND_FREEZE) is the only
available system sleep state. However, some user space frameworks
only use the "mem" and (sometimes) "standby" sleep state labels, so
the users of those systems need to modify user space in order to be
able to use system suspend at all and that may be a pain in practice.
Commit 0399d4db3edf (PM / sleep: Introduce command line argument for
sleep state enumeration) attempted to address this problem by adding
a command line argument to change the meaning of the "mem" string in
/sys/power/state to make it trigger suspend-to-idle (instead of
suspend-to-RAM).
However, there also are systems in which the platform does support
special sleep states, but suspend-to-idle is the preferred one anyway
(it even may save more energy than the platform-provided sleep states
in some cases) and the above commit doesn't help in those cases.
For this reason, rework the system sleep state selection interface
again (but preserve backwards compatibiliby). Namely, add a new
sysfs file, /sys/power/mem_sleep, that will control the system
suspend mode triggered by writing "mem" to /sys/power/state (in
analogy with what /sys/power/disk does for hibernation). Make it
select suspend-to-RAM ("deep" sleep) by default (if supported) and
fall back to suspend-to-idle ("s2idle") otherwise and add a new
command line argument, mem_sleep_default, allowing that default to
be overridden if need be.
At the same time, drop the relative_sleep_states command line
argument that doesn't make sense any more.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mario Limonciello <mario.limonciello@dell.com>
2016-11-22 00:45:40 +03:00
static const char * const mem_sleep_labels [ ] = {
2017-08-10 01:13:07 +03:00
[ PM_SUSPEND_TO_IDLE ] = " s2idle " ,
PM / sleep: System sleep state selection interface rework
There are systems in which the platform doesn't support any special
sleep states, so suspend-to-idle (PM_SUSPEND_FREEZE) is the only
available system sleep state. However, some user space frameworks
only use the "mem" and (sometimes) "standby" sleep state labels, so
the users of those systems need to modify user space in order to be
able to use system suspend at all and that may be a pain in practice.
Commit 0399d4db3edf (PM / sleep: Introduce command line argument for
sleep state enumeration) attempted to address this problem by adding
a command line argument to change the meaning of the "mem" string in
/sys/power/state to make it trigger suspend-to-idle (instead of
suspend-to-RAM).
However, there also are systems in which the platform does support
special sleep states, but suspend-to-idle is the preferred one anyway
(it even may save more energy than the platform-provided sleep states
in some cases) and the above commit doesn't help in those cases.
For this reason, rework the system sleep state selection interface
again (but preserve backwards compatibiliby). Namely, add a new
sysfs file, /sys/power/mem_sleep, that will control the system
suspend mode triggered by writing "mem" to /sys/power/state (in
analogy with what /sys/power/disk does for hibernation). Make it
select suspend-to-RAM ("deep" sleep) by default (if supported) and
fall back to suspend-to-idle ("s2idle") otherwise and add a new
command line argument, mem_sleep_default, allowing that default to
be overridden if need be.
At the same time, drop the relative_sleep_states command line
argument that doesn't make sense any more.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mario Limonciello <mario.limonciello@dell.com>
2016-11-22 00:45:40 +03:00
[ PM_SUSPEND_STANDBY ] = " shallow " ,
[ PM_SUSPEND_MEM ] = " deep " ,
} ;
const char * mem_sleep_states [ PM_SUSPEND_MAX ] ;
2017-08-10 01:13:07 +03:00
suspend_state_t mem_sleep_current = PM_SUSPEND_TO_IDLE ;
2017-08-01 00:43:18 +03:00
suspend_state_t mem_sleep_default = PM_SUSPEND_MAX ;
2017-07-18 03:19:25 +03:00
suspend_state_t pm_suspend_target_state ;
EXPORT_SYMBOL_GPL ( pm_suspend_target_state ) ;
2009-06-10 03:27:12 +04:00
2015-10-07 01:49:34 +03:00
unsigned int pm_suspend_global_flags ;
EXPORT_SYMBOL_GPL ( pm_suspend_global_flags ) ;
2010-11-16 16:14:02 +03:00
static const struct platform_suspend_ops * suspend_ops ;
2017-08-10 01:15:30 +03:00
static const struct platform_s2idle_ops * s2idle_ops ;
2018-05-25 12:46:47 +03:00
static DECLARE_SWAIT_QUEUE_HEAD ( s2idle_wait_head ) ;
2015-02-13 01:33:15 +03:00
2017-08-10 01:13:56 +03:00
enum s2idle_states __read_mostly s2idle_state ;
2018-05-25 12:46:48 +03:00
static DEFINE_RAW_SPINLOCK ( s2idle_lock ) ;
PM: Introduce suspend state PM_SUSPEND_FREEZE
PM_SUSPEND_FREEZE state is a general state that
does not need any platform specific support, it equals
frozen processes + suspended devices + idle processors.
Compared with PM_SUSPEND_MEMORY,
PM_SUSPEND_FREEZE saves less power
because the system is still in a running state.
PM_SUSPEND_FREEZE has less resume latency because it does not
touch BIOS, and the processors are in idle state.
Compared with RTPM/idle,
PM_SUSPEND_FREEZE saves more power as
1. the processor has longer sleep time because processes are frozen.
The deeper c-state the processor supports, more power saving we can get.
2. PM_SUSPEND_FREEZE uses system suspend code path, thus we can get
more power saving from the devices that does not have good RTPM support.
This state is useful for
1) platforms that do not have STR, or have a broken STR.
2) platforms that have an extremely low power idle state,
which can be used to replace STR.
The following describes how PM_SUSPEND_FREEZE state works.
1. echo freeze > /sys/power/state
2. the processes are frozen.
3. all the devices are suspended.
4. all the processors are blocked by a wait queue
5. all the processors idles and enters (Deep) c-state.
6. an interrupt fires.
7. a processor is woken up and handles the irq.
8. if it is a general event,
a) the irq handler runs and quites.
b) goto step 4.
9. if it is a real wake event, say, power button pressing, keyboard touch, mouse moving,
a) the irq handler runs and activate the wakeup source
b) wakeup_source_activate() notifies the wait queue.
c) system starts resuming from PM_SUSPEND_FREEZE
10. all the devices are resumed.
11. all the processes are unfrozen.
12. system is back to working.
Known Issue:
The wakeup of this new PM_SUSPEND_FREEZE state may behave differently
from the previous suspend state.
Take ACPI platform for example, there are some GPEs that only enabled
when the system is in sleep state, to wake the system backk from S3/S4.
But we are not touching these GPEs during transition to PM_SUSPEND_FREEZE.
This means we may lose some wake event.
But on the other hand, as we do not disable all the Interrupts during
PM_SUSPEND_FREEZE, we may get some extra "wakeup" Interrupts, that are
not available for S3/S4.
The patches has been tested on an old Sony laptop, and here are the results:
Average Power:
1. RPTM/idle for half an hour:
14.8W, 12.6W, 14.1W, 12.5W, 14.4W, 13.2W, 12.9W
2. Freeze for half an hour:
11W, 10.4W, 9.4W, 11.3W 10.5W
3. RTPM/idle for three hours:
11.6W
4. Freeze for three hours:
10W
5. Suspend to Memory:
0.5~0.9W
Average Resume Latency:
1. RTPM/idle with a black screen: (From pressing keyboard to screen back)
Less than 0.2s
2. Freeze: (From pressing power button to screen back)
2.50s
3. Suspend to Memory: (From pressing power button to screen back)
4.33s
>From the results, we can see that all the platforms should benefit from
this patch, even if it does not have Low Power S0.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-06 16:00:36 +04:00
2019-05-27 13:45:18 +03:00
/**
2019-06-18 11:18:28 +03:00
* pm_suspend_default_s2idle - Check if suspend - to - idle is the default suspend .
2019-05-27 13:45:18 +03:00
*
* Return ' true ' if suspend - to - idle has been selected as the default system
* suspend method .
*/
2019-06-18 11:18:28 +03:00
bool pm_suspend_default_s2idle ( void )
2018-10-02 01:55:22 +03:00
{
return mem_sleep_current = = PM_SUSPEND_TO_IDLE ;
}
2019-06-18 11:18:28 +03:00
EXPORT_SYMBOL_GPL ( pm_suspend_default_s2idle ) ;
2018-10-02 01:55:22 +03:00
2017-08-10 01:15:30 +03:00
void s2idle_set_ops ( const struct platform_s2idle_ops * ops )
2014-05-16 01:29:57 +04:00
{
lock_system_sleep ( ) ;
2017-08-10 01:15:30 +03:00
s2idle_ops = ops ;
2014-05-16 01:29:57 +04:00
unlock_system_sleep ( ) ;
}
2017-08-10 01:13:56 +03:00
static void s2idle_begin ( void )
PM: Introduce suspend state PM_SUSPEND_FREEZE
PM_SUSPEND_FREEZE state is a general state that
does not need any platform specific support, it equals
frozen processes + suspended devices + idle processors.
Compared with PM_SUSPEND_MEMORY,
PM_SUSPEND_FREEZE saves less power
because the system is still in a running state.
PM_SUSPEND_FREEZE has less resume latency because it does not
touch BIOS, and the processors are in idle state.
Compared with RTPM/idle,
PM_SUSPEND_FREEZE saves more power as
1. the processor has longer sleep time because processes are frozen.
The deeper c-state the processor supports, more power saving we can get.
2. PM_SUSPEND_FREEZE uses system suspend code path, thus we can get
more power saving from the devices that does not have good RTPM support.
This state is useful for
1) platforms that do not have STR, or have a broken STR.
2) platforms that have an extremely low power idle state,
which can be used to replace STR.
The following describes how PM_SUSPEND_FREEZE state works.
1. echo freeze > /sys/power/state
2. the processes are frozen.
3. all the devices are suspended.
4. all the processors are blocked by a wait queue
5. all the processors idles and enters (Deep) c-state.
6. an interrupt fires.
7. a processor is woken up and handles the irq.
8. if it is a general event,
a) the irq handler runs and quites.
b) goto step 4.
9. if it is a real wake event, say, power button pressing, keyboard touch, mouse moving,
a) the irq handler runs and activate the wakeup source
b) wakeup_source_activate() notifies the wait queue.
c) system starts resuming from PM_SUSPEND_FREEZE
10. all the devices are resumed.
11. all the processes are unfrozen.
12. system is back to working.
Known Issue:
The wakeup of this new PM_SUSPEND_FREEZE state may behave differently
from the previous suspend state.
Take ACPI platform for example, there are some GPEs that only enabled
when the system is in sleep state, to wake the system backk from S3/S4.
But we are not touching these GPEs during transition to PM_SUSPEND_FREEZE.
This means we may lose some wake event.
But on the other hand, as we do not disable all the Interrupts during
PM_SUSPEND_FREEZE, we may get some extra "wakeup" Interrupts, that are
not available for S3/S4.
The patches has been tested on an old Sony laptop, and here are the results:
Average Power:
1. RPTM/idle for half an hour:
14.8W, 12.6W, 14.1W, 12.5W, 14.4W, 13.2W, 12.9W
2. Freeze for half an hour:
11W, 10.4W, 9.4W, 11.3W 10.5W
3. RTPM/idle for three hours:
11.6W
4. Freeze for three hours:
10W
5. Suspend to Memory:
0.5~0.9W
Average Resume Latency:
1. RTPM/idle with a black screen: (From pressing keyboard to screen back)
Less than 0.2s
2. Freeze: (From pressing power button to screen back)
2.50s
3. Suspend to Memory: (From pressing power button to screen back)
4.33s
>From the results, we can see that all the platforms should benefit from
this patch, even if it does not have Low Power S0.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-06 16:00:36 +04:00
{
2017-08-10 01:13:56 +03:00
s2idle_state = S2IDLE_STATE_NONE ;
PM: Introduce suspend state PM_SUSPEND_FREEZE
PM_SUSPEND_FREEZE state is a general state that
does not need any platform specific support, it equals
frozen processes + suspended devices + idle processors.
Compared with PM_SUSPEND_MEMORY,
PM_SUSPEND_FREEZE saves less power
because the system is still in a running state.
PM_SUSPEND_FREEZE has less resume latency because it does not
touch BIOS, and the processors are in idle state.
Compared with RTPM/idle,
PM_SUSPEND_FREEZE saves more power as
1. the processor has longer sleep time because processes are frozen.
The deeper c-state the processor supports, more power saving we can get.
2. PM_SUSPEND_FREEZE uses system suspend code path, thus we can get
more power saving from the devices that does not have good RTPM support.
This state is useful for
1) platforms that do not have STR, or have a broken STR.
2) platforms that have an extremely low power idle state,
which can be used to replace STR.
The following describes how PM_SUSPEND_FREEZE state works.
1. echo freeze > /sys/power/state
2. the processes are frozen.
3. all the devices are suspended.
4. all the processors are blocked by a wait queue
5. all the processors idles and enters (Deep) c-state.
6. an interrupt fires.
7. a processor is woken up and handles the irq.
8. if it is a general event,
a) the irq handler runs and quites.
b) goto step 4.
9. if it is a real wake event, say, power button pressing, keyboard touch, mouse moving,
a) the irq handler runs and activate the wakeup source
b) wakeup_source_activate() notifies the wait queue.
c) system starts resuming from PM_SUSPEND_FREEZE
10. all the devices are resumed.
11. all the processes are unfrozen.
12. system is back to working.
Known Issue:
The wakeup of this new PM_SUSPEND_FREEZE state may behave differently
from the previous suspend state.
Take ACPI platform for example, there are some GPEs that only enabled
when the system is in sleep state, to wake the system backk from S3/S4.
But we are not touching these GPEs during transition to PM_SUSPEND_FREEZE.
This means we may lose some wake event.
But on the other hand, as we do not disable all the Interrupts during
PM_SUSPEND_FREEZE, we may get some extra "wakeup" Interrupts, that are
not available for S3/S4.
The patches has been tested on an old Sony laptop, and here are the results:
Average Power:
1. RPTM/idle for half an hour:
14.8W, 12.6W, 14.1W, 12.5W, 14.4W, 13.2W, 12.9W
2. Freeze for half an hour:
11W, 10.4W, 9.4W, 11.3W 10.5W
3. RTPM/idle for three hours:
11.6W
4. Freeze for three hours:
10W
5. Suspend to Memory:
0.5~0.9W
Average Resume Latency:
1. RTPM/idle with a black screen: (From pressing keyboard to screen back)
Less than 0.2s
2. Freeze: (From pressing power button to screen back)
2.50s
3. Suspend to Memory: (From pressing power button to screen back)
4.33s
>From the results, we can see that all the platforms should benefit from
this patch, even if it does not have Low Power S0.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-06 16:00:36 +04:00
}
2017-08-10 01:13:56 +03:00
static void s2idle_enter ( void )
PM: Introduce suspend state PM_SUSPEND_FREEZE
PM_SUSPEND_FREEZE state is a general state that
does not need any platform specific support, it equals
frozen processes + suspended devices + idle processors.
Compared with PM_SUSPEND_MEMORY,
PM_SUSPEND_FREEZE saves less power
because the system is still in a running state.
PM_SUSPEND_FREEZE has less resume latency because it does not
touch BIOS, and the processors are in idle state.
Compared with RTPM/idle,
PM_SUSPEND_FREEZE saves more power as
1. the processor has longer sleep time because processes are frozen.
The deeper c-state the processor supports, more power saving we can get.
2. PM_SUSPEND_FREEZE uses system suspend code path, thus we can get
more power saving from the devices that does not have good RTPM support.
This state is useful for
1) platforms that do not have STR, or have a broken STR.
2) platforms that have an extremely low power idle state,
which can be used to replace STR.
The following describes how PM_SUSPEND_FREEZE state works.
1. echo freeze > /sys/power/state
2. the processes are frozen.
3. all the devices are suspended.
4. all the processors are blocked by a wait queue
5. all the processors idles and enters (Deep) c-state.
6. an interrupt fires.
7. a processor is woken up and handles the irq.
8. if it is a general event,
a) the irq handler runs and quites.
b) goto step 4.
9. if it is a real wake event, say, power button pressing, keyboard touch, mouse moving,
a) the irq handler runs and activate the wakeup source
b) wakeup_source_activate() notifies the wait queue.
c) system starts resuming from PM_SUSPEND_FREEZE
10. all the devices are resumed.
11. all the processes are unfrozen.
12. system is back to working.
Known Issue:
The wakeup of this new PM_SUSPEND_FREEZE state may behave differently
from the previous suspend state.
Take ACPI platform for example, there are some GPEs that only enabled
when the system is in sleep state, to wake the system backk from S3/S4.
But we are not touching these GPEs during transition to PM_SUSPEND_FREEZE.
This means we may lose some wake event.
But on the other hand, as we do not disable all the Interrupts during
PM_SUSPEND_FREEZE, we may get some extra "wakeup" Interrupts, that are
not available for S3/S4.
The patches has been tested on an old Sony laptop, and here are the results:
Average Power:
1. RPTM/idle for half an hour:
14.8W, 12.6W, 14.1W, 12.5W, 14.4W, 13.2W, 12.9W
2. Freeze for half an hour:
11W, 10.4W, 9.4W, 11.3W 10.5W
3. RTPM/idle for three hours:
11.6W
4. Freeze for three hours:
10W
5. Suspend to Memory:
0.5~0.9W
Average Resume Latency:
1. RTPM/idle with a black screen: (From pressing keyboard to screen back)
Less than 0.2s
2. Freeze: (From pressing power button to screen back)
2.50s
3. Suspend to Memory: (From pressing power button to screen back)
4.33s
>From the results, we can see that all the platforms should benefit from
this patch, even if it does not have Low Power S0.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-06 16:00:36 +04:00
{
2017-08-10 01:13:07 +03:00
trace_suspend_resume ( TPS ( " machine_suspend " ) , PM_SUSPEND_TO_IDLE , true ) ;
ACPI / PM: Ignore spurious SCI wakeups from suspend-to-idle
The ACPI SCI (System Control Interrupt) is set up as a wakeup IRQ
during suspend-to-idle transitions and, consequently, any events
signaled through it wake up the system from that state. However,
on some systems some of the events signaled via the ACPI SCI while
suspended to idle should not cause the system to wake up. In fact,
quite often they should just be discarded.
Arguably, systems should not resume entirely on such events, but in
order to decide which events really should cause the system to resume
and which are spurious, it is necessary to resume up to the point
when ACPI SCIs are actually handled and processed, which is after
executing dpm_resume_noirq() in the system resume path.
For this reasons, add a loop around freeze_enter() in which the
platforms can process events signaled via multiplexed IRQ lines
like the ACPI SCI and add suspend-to-idle hooks that can be
used for this purpose to struct platform_freeze_ops.
In the ACPI case, the ->wake hook is used for checking if the SCI
has triggered while suspended and deferring the interrupt-induced
system wakeup until the events signaled through it are actually
processed sufficiently to decide whether or not the system should
resume. In turn, the ->sync hook allows all of the relevant event
queues to be flushed so as to prevent events from being missed due
to race conditions.
In addition to that, some ACPI code processing wakeup events needs
to be modified to use the "hard" version of wakeup triggers, so that
it will cause a system resume to happen on device-induced wakeup
events even if the "soft" mechanism to prevent the system from
suspending is not enabled. However, to preserve the existing
behavior with respect to suspend-to-RAM, this only is done in
the suspend-to-idle case and only if an SCI has occurred while
suspended.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-12 23:56:34 +03:00
2018-05-25 12:46:48 +03:00
raw_spin_lock_irq ( & s2idle_lock ) ;
2015-02-13 01:33:15 +03:00
if ( pm_wakeup_pending ( ) )
goto out ;
2017-08-10 01:13:56 +03:00
s2idle_state = S2IDLE_STATE_ENTER ;
2018-05-25 12:46:48 +03:00
raw_spin_unlock_irq ( & s2idle_lock ) ;
2015-02-13 01:33:15 +03:00
2021-08-03 17:16:13 +03:00
cpus_read_lock ( ) ;
2015-02-13 01:33:15 +03:00
/* Push all the CPUs into the idle loop. */
wake_up_all_idle_cpus ( ) ;
/* Make the current CPU wait so it can enter the idle loop too. */
2018-06-12 11:34:52 +03:00
swait_event_exclusive ( s2idle_wait_head ,
2018-05-25 12:46:47 +03:00
s2idle_state = = S2IDLE_STATE_WAKE ) ;
2015-02-13 01:33:15 +03:00
2021-08-03 17:16:13 +03:00
cpus_read_unlock ( ) ;
2015-02-13 01:33:15 +03:00
2018-05-25 12:46:48 +03:00
raw_spin_lock_irq ( & s2idle_lock ) ;
2015-02-13 01:33:15 +03:00
out :
2017-08-10 01:13:56 +03:00
s2idle_state = S2IDLE_STATE_NONE ;
2018-05-25 12:46:48 +03:00
raw_spin_unlock_irq ( & s2idle_lock ) ;
ACPI / PM: Ignore spurious SCI wakeups from suspend-to-idle
The ACPI SCI (System Control Interrupt) is set up as a wakeup IRQ
during suspend-to-idle transitions and, consequently, any events
signaled through it wake up the system from that state. However,
on some systems some of the events signaled via the ACPI SCI while
suspended to idle should not cause the system to wake up. In fact,
quite often they should just be discarded.
Arguably, systems should not resume entirely on such events, but in
order to decide which events really should cause the system to resume
and which are spurious, it is necessary to resume up to the point
when ACPI SCIs are actually handled and processed, which is after
executing dpm_resume_noirq() in the system resume path.
For this reasons, add a loop around freeze_enter() in which the
platforms can process events signaled via multiplexed IRQ lines
like the ACPI SCI and add suspend-to-idle hooks that can be
used for this purpose to struct platform_freeze_ops.
In the ACPI case, the ->wake hook is used for checking if the SCI
has triggered while suspended and deferring the interrupt-induced
system wakeup until the events signaled through it are actually
processed sufficiently to decide whether or not the system should
resume. In turn, the ->sync hook allows all of the relevant event
queues to be flushed so as to prevent events from being missed due
to race conditions.
In addition to that, some ACPI code processing wakeup events needs
to be modified to use the "hard" version of wakeup triggers, so that
it will cause a system resume to happen on device-induced wakeup
events even if the "soft" mechanism to prevent the system from
suspending is not enabled. However, to preserve the existing
behavior with respect to suspend-to-RAM, this only is done in
the suspend-to-idle case and only if an SCI has occurred while
suspended.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-12 23:56:34 +03:00
2017-08-10 01:13:07 +03:00
trace_suspend_resume ( TPS ( " machine_suspend " ) , PM_SUSPEND_TO_IDLE , false ) ;
ACPI / PM: Ignore spurious SCI wakeups from suspend-to-idle
The ACPI SCI (System Control Interrupt) is set up as a wakeup IRQ
during suspend-to-idle transitions and, consequently, any events
signaled through it wake up the system from that state. However,
on some systems some of the events signaled via the ACPI SCI while
suspended to idle should not cause the system to wake up. In fact,
quite often they should just be discarded.
Arguably, systems should not resume entirely on such events, but in
order to decide which events really should cause the system to resume
and which are spurious, it is necessary to resume up to the point
when ACPI SCIs are actually handled and processed, which is after
executing dpm_resume_noirq() in the system resume path.
For this reasons, add a loop around freeze_enter() in which the
platforms can process events signaled via multiplexed IRQ lines
like the ACPI SCI and add suspend-to-idle hooks that can be
used for this purpose to struct platform_freeze_ops.
In the ACPI case, the ->wake hook is used for checking if the SCI
has triggered while suspended and deferring the interrupt-induced
system wakeup until the events signaled through it are actually
processed sufficiently to decide whether or not the system should
resume. In turn, the ->sync hook allows all of the relevant event
queues to be flushed so as to prevent events from being missed due
to race conditions.
In addition to that, some ACPI code processing wakeup events needs
to be modified to use the "hard" version of wakeup triggers, so that
it will cause a system resume to happen on device-induced wakeup
events even if the "soft" mechanism to prevent the system from
suspending is not enabled. However, to preserve the existing
behavior with respect to suspend-to-RAM, this only is done in
the suspend-to-idle case and only if an SCI has occurred while
suspended.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-12 23:56:34 +03:00
}
static void s2idle_loop ( void )
{
2017-07-19 03:38:44 +03:00
pm_pr_dbg ( " suspend-to-idle \n " ) ;
ACPI / PM: Ignore spurious SCI wakeups from suspend-to-idle
The ACPI SCI (System Control Interrupt) is set up as a wakeup IRQ
during suspend-to-idle transitions and, consequently, any events
signaled through it wake up the system from that state. However,
on some systems some of the events signaled via the ACPI SCI while
suspended to idle should not cause the system to wake up. In fact,
quite often they should just be discarded.
Arguably, systems should not resume entirely on such events, but in
order to decide which events really should cause the system to resume
and which are spurious, it is necessary to resume up to the point
when ACPI SCIs are actually handled and processed, which is after
executing dpm_resume_noirq() in the system resume path.
For this reasons, add a loop around freeze_enter() in which the
platforms can process events signaled via multiplexed IRQ lines
like the ACPI SCI and add suspend-to-idle hooks that can be
used for this purpose to struct platform_freeze_ops.
In the ACPI case, the ->wake hook is used for checking if the SCI
has triggered while suspended and deferring the interrupt-induced
system wakeup until the events signaled through it are actually
processed sufficiently to decide whether or not the system should
resume. In turn, the ->sync hook allows all of the relevant event
queues to be flushed so as to prevent events from being missed due
to race conditions.
In addition to that, some ACPI code processing wakeup events needs
to be modified to use the "hard" version of wakeup triggers, so that
it will cause a system resume to happen on device-induced wakeup
events even if the "soft" mechanism to prevent the system from
suspending is not enabled. However, to preserve the existing
behavior with respect to suspend-to-RAM, this only is done in
the suspend-to-idle case and only if an SCI has occurred while
suspended.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-12 23:56:34 +03:00
PM: sleep: Simplify suspend-to-idle control flow
After commit 33e4f80ee69b ("ACPI / PM: Ignore spurious SCI wakeups
from suspend-to-idle") the "noirq" phases of device suspend and
resume may run for multiple times during suspend-to-idle, if there
are spurious system wakeup events while suspended. However, this
is complicated and fragile and actually unnecessary.
The main reason for doing this is that on some systems the EC may
signal system wakeup events (power button events, for example) as
well as events that should not cause the system to resume (spurious
system wakeup events). Thus, in order to determine whether or not
a given event signaled by the EC while suspended is a proper system
wakeup one, the EC GPE needs to be dispatched and to start with that
was achieved by allowing the ACPI SCI action handler to run, which
was only possible after calling resume_device_irqs().
However, dispatching the EC GPE this way turned out to take too much
time in some cases and some EC events might be missed due to that, so
commit 68e22011856f ("ACPI: EC: Dispatch the EC GPE directly on
s2idle wake") started to dispatch the EC GPE right after a wakeup
event has been detected, so in fact the full ACPI SCI action handler
doesn't need to run any more to deal with the wakeups coming from the
EC.
Use this observation to simplify the suspend-to-idle control flow
so that the "noirq" phases of device suspend and resume are each
run only once in every suspend-to-idle cycle, which is reported to
significantly reduce power drawn by some systems when suspended to
idle (by allowing them to reach a deep platform-wide low-power state
through the suspend-to-idle flow). [What appears to happen is that
the "noirq" resume of devices after a spurious EC wakeup brings some
devices into a state in which they prevent the platform from reaching
the deep low-power state going forward, even after a subsequent
"noirq" suspend phase, and on some systems the EC triggers such
wakeups already when the "noirq" suspend of devices is running for
the first time in the given suspend/resume cycle, so the platform
cannot reach the deep low-power state at all.]
First, make acpi_s2idle_wake() use the acpi_ec_dispatch_gpe() return
value to determine whether or not the wakeup may have been triggered
by the EC (in which case the system wakeup is canceled and ACPI
events are processed in order to determine whether or not the event
is a proper system wakeup one) and use rearm_wake_irq() (introduced
by a previous change) in it to rearm the ACPI SCI for system wakeup
detection in case the system will remain suspended.
Second, drop acpi_s2idle_sync(), which is not needed any more, and
the corresponding global platform suspend-to-idle callback.
Next, drop the pm_wakeup_pending() check (which is an optimization
only) from __device_suspend_noirq() to prevent it from returning
errors on system wakeups occurring before the "noirq" phase of
device suspend is complete (as in the case of suspend-to-idle it is
not known whether or not these wakeups are suprious at that point),
in order to avoid having to carry out a "noirq" resume of devices
on a spurious system wakeup.
Finally, change the code flow in s2idle_loop() to (1) run the
"noirq" suspend of devices once before starting the loop, (2) check
for spurious EC wakeups (via the platform ->wake callback) for the
first time before calling s2idle_enter(), and (3) run the "noirq"
resume of devices once after leaving the loop.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
2019-07-16 00:52:03 +03:00
/*
* Suspend - to - idle equals :
* frozen processes + suspended devices + idle processors .
* Thus s2idle_enter ( ) should be called right after all devices have
* been suspended .
*
* Wakeups during the noirq suspend of devices may be spurious , so try
* to avoid them upfront .
*/
2017-07-21 03:12:10 +03:00
for ( ; ; ) {
2020-02-11 12:11:02 +03:00
if ( s2idle_ops & & s2idle_ops - > wake ) {
if ( s2idle_ops - > wake ( ) )
break ;
} else if ( pm_wakeup_pending ( ) ) {
ACPI / PM: Ignore spurious SCI wakeups from suspend-to-idle
The ACPI SCI (System Control Interrupt) is set up as a wakeup IRQ
during suspend-to-idle transitions and, consequently, any events
signaled through it wake up the system from that state. However,
on some systems some of the events signaled via the ACPI SCI while
suspended to idle should not cause the system to wake up. In fact,
quite often they should just be discarded.
Arguably, systems should not resume entirely on such events, but in
order to decide which events really should cause the system to resume
and which are spurious, it is necessary to resume up to the point
when ACPI SCIs are actually handled and processed, which is after
executing dpm_resume_noirq() in the system resume path.
For this reasons, add a loop around freeze_enter() in which the
platforms can process events signaled via multiplexed IRQ lines
like the ACPI SCI and add suspend-to-idle hooks that can be
used for this purpose to struct platform_freeze_ops.
In the ACPI case, the ->wake hook is used for checking if the SCI
has triggered while suspended and deferring the interrupt-induced
system wakeup until the events signaled through it are actually
processed sufficiently to decide whether or not the system should
resume. In turn, the ->sync hook allows all of the relevant event
queues to be flushed so as to prevent events from being missed due
to race conditions.
In addition to that, some ACPI code processing wakeup events needs
to be modified to use the "hard" version of wakeup triggers, so that
it will cause a system resume to happen on device-induced wakeup
events even if the "soft" mechanism to prevent the system from
suspending is not enabled. However, to preserve the existing
behavior with respect to suspend-to-RAM, this only is done in
the suspend-to-idle case and only if an SCI has occurred while
suspended.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-12 23:56:34 +03:00
break ;
2020-02-11 12:11:02 +03:00
}
ACPI / PM: Ignore spurious SCI wakeups from suspend-to-idle
The ACPI SCI (System Control Interrupt) is set up as a wakeup IRQ
during suspend-to-idle transitions and, consequently, any events
signaled through it wake up the system from that state. However,
on some systems some of the events signaled via the ACPI SCI while
suspended to idle should not cause the system to wake up. In fact,
quite often they should just be discarded.
Arguably, systems should not resume entirely on such events, but in
order to decide which events really should cause the system to resume
and which are spurious, it is necessary to resume up to the point
when ACPI SCIs are actually handled and processed, which is after
executing dpm_resume_noirq() in the system resume path.
For this reasons, add a loop around freeze_enter() in which the
platforms can process events signaled via multiplexed IRQ lines
like the ACPI SCI and add suspend-to-idle hooks that can be
used for this purpose to struct platform_freeze_ops.
In the ACPI case, the ->wake hook is used for checking if the SCI
has triggered while suspended and deferring the interrupt-induced
system wakeup until the events signaled through it are actually
processed sufficiently to decide whether or not the system should
resume. In turn, the ->sync hook allows all of the relevant event
queues to be flushed so as to prevent events from being missed due
to race conditions.
In addition to that, some ACPI code processing wakeup events needs
to be modified to use the "hard" version of wakeup triggers, so that
it will cause a system resume to happen on device-induced wakeup
events even if the "soft" mechanism to prevent the system from
suspending is not enabled. However, to preserve the existing
behavior with respect to suspend-to-RAM, this only is done in
the suspend-to-idle case and only if an SCI has occurred while
suspended.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-12 23:56:34 +03:00
pm_wakeup_clear ( false ) ;
PM: sleep: Simplify suspend-to-idle control flow
After commit 33e4f80ee69b ("ACPI / PM: Ignore spurious SCI wakeups
from suspend-to-idle") the "noirq" phases of device suspend and
resume may run for multiple times during suspend-to-idle, if there
are spurious system wakeup events while suspended. However, this
is complicated and fragile and actually unnecessary.
The main reason for doing this is that on some systems the EC may
signal system wakeup events (power button events, for example) as
well as events that should not cause the system to resume (spurious
system wakeup events). Thus, in order to determine whether or not
a given event signaled by the EC while suspended is a proper system
wakeup one, the EC GPE needs to be dispatched and to start with that
was achieved by allowing the ACPI SCI action handler to run, which
was only possible after calling resume_device_irqs().
However, dispatching the EC GPE this way turned out to take too much
time in some cases and some EC events might be missed due to that, so
commit 68e22011856f ("ACPI: EC: Dispatch the EC GPE directly on
s2idle wake") started to dispatch the EC GPE right after a wakeup
event has been detected, so in fact the full ACPI SCI action handler
doesn't need to run any more to deal with the wakeups coming from the
EC.
Use this observation to simplify the suspend-to-idle control flow
so that the "noirq" phases of device suspend and resume are each
run only once in every suspend-to-idle cycle, which is reported to
significantly reduce power drawn by some systems when suspended to
idle (by allowing them to reach a deep platform-wide low-power state
through the suspend-to-idle flow). [What appears to happen is that
the "noirq" resume of devices after a spurious EC wakeup brings some
devices into a state in which they prevent the platform from reaching
the deep low-power state going forward, even after a subsequent
"noirq" suspend phase, and on some systems the EC triggers such
wakeups already when the "noirq" suspend of devices is running for
the first time in the given suspend/resume cycle, so the platform
cannot reach the deep low-power state at all.]
First, make acpi_s2idle_wake() use the acpi_ec_dispatch_gpe() return
value to determine whether or not the wakeup may have been triggered
by the EC (in which case the system wakeup is canceled and ACPI
events are processed in order to determine whether or not the event
is a proper system wakeup one) and use rearm_wake_irq() (introduced
by a previous change) in it to rearm the ACPI SCI for system wakeup
detection in case the system will remain suspended.
Second, drop acpi_s2idle_sync(), which is not needed any more, and
the corresponding global platform suspend-to-idle callback.
Next, drop the pm_wakeup_pending() check (which is an optimization
only) from __device_suspend_noirq() to prevent it from returning
errors on system wakeups occurring before the "noirq" phase of
device suspend is complete (as in the case of suspend-to-idle it is
not known whether or not these wakeups are suprious at that point),
in order to avoid having to carry out a "noirq" resume of devices
on a spurious system wakeup.
Finally, change the code flow in s2idle_loop() to (1) run the
"noirq" suspend of devices once before starting the loop, (2) check
for spurious EC wakeups (via the platform ->wake callback) for the
first time before calling s2idle_enter(), and (3) run the "noirq"
resume of devices once after leaving the loop.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
2019-07-16 00:52:03 +03:00
s2idle_enter ( ) ;
2017-07-21 03:07:54 +03:00
}
ACPI / PM: Ignore spurious SCI wakeups from suspend-to-idle
The ACPI SCI (System Control Interrupt) is set up as a wakeup IRQ
during suspend-to-idle transitions and, consequently, any events
signaled through it wake up the system from that state. However,
on some systems some of the events signaled via the ACPI SCI while
suspended to idle should not cause the system to wake up. In fact,
quite often they should just be discarded.
Arguably, systems should not resume entirely on such events, but in
order to decide which events really should cause the system to resume
and which are spurious, it is necessary to resume up to the point
when ACPI SCIs are actually handled and processed, which is after
executing dpm_resume_noirq() in the system resume path.
For this reasons, add a loop around freeze_enter() in which the
platforms can process events signaled via multiplexed IRQ lines
like the ACPI SCI and add suspend-to-idle hooks that can be
used for this purpose to struct platform_freeze_ops.
In the ACPI case, the ->wake hook is used for checking if the SCI
has triggered while suspended and deferring the interrupt-induced
system wakeup until the events signaled through it are actually
processed sufficiently to decide whether or not the system should
resume. In turn, the ->sync hook allows all of the relevant event
queues to be flushed so as to prevent events from being missed due
to race conditions.
In addition to that, some ACPI code processing wakeup events needs
to be modified to use the "hard" version of wakeup triggers, so that
it will cause a system resume to happen on device-induced wakeup
events even if the "soft" mechanism to prevent the system from
suspending is not enabled. However, to preserve the existing
behavior with respect to suspend-to-RAM, this only is done in
the suspend-to-idle case and only if an SCI has occurred while
suspended.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-12 23:56:34 +03:00
2017-07-19 03:38:44 +03:00
pm_pr_dbg ( " resume from suspend-to-idle \n " ) ;
PM: Introduce suspend state PM_SUSPEND_FREEZE
PM_SUSPEND_FREEZE state is a general state that
does not need any platform specific support, it equals
frozen processes + suspended devices + idle processors.
Compared with PM_SUSPEND_MEMORY,
PM_SUSPEND_FREEZE saves less power
because the system is still in a running state.
PM_SUSPEND_FREEZE has less resume latency because it does not
touch BIOS, and the processors are in idle state.
Compared with RTPM/idle,
PM_SUSPEND_FREEZE saves more power as
1. the processor has longer sleep time because processes are frozen.
The deeper c-state the processor supports, more power saving we can get.
2. PM_SUSPEND_FREEZE uses system suspend code path, thus we can get
more power saving from the devices that does not have good RTPM support.
This state is useful for
1) platforms that do not have STR, or have a broken STR.
2) platforms that have an extremely low power idle state,
which can be used to replace STR.
The following describes how PM_SUSPEND_FREEZE state works.
1. echo freeze > /sys/power/state
2. the processes are frozen.
3. all the devices are suspended.
4. all the processors are blocked by a wait queue
5. all the processors idles and enters (Deep) c-state.
6. an interrupt fires.
7. a processor is woken up and handles the irq.
8. if it is a general event,
a) the irq handler runs and quites.
b) goto step 4.
9. if it is a real wake event, say, power button pressing, keyboard touch, mouse moving,
a) the irq handler runs and activate the wakeup source
b) wakeup_source_activate() notifies the wait queue.
c) system starts resuming from PM_SUSPEND_FREEZE
10. all the devices are resumed.
11. all the processes are unfrozen.
12. system is back to working.
Known Issue:
The wakeup of this new PM_SUSPEND_FREEZE state may behave differently
from the previous suspend state.
Take ACPI platform for example, there are some GPEs that only enabled
when the system is in sleep state, to wake the system backk from S3/S4.
But we are not touching these GPEs during transition to PM_SUSPEND_FREEZE.
This means we may lose some wake event.
But on the other hand, as we do not disable all the Interrupts during
PM_SUSPEND_FREEZE, we may get some extra "wakeup" Interrupts, that are
not available for S3/S4.
The patches has been tested on an old Sony laptop, and here are the results:
Average Power:
1. RPTM/idle for half an hour:
14.8W, 12.6W, 14.1W, 12.5W, 14.4W, 13.2W, 12.9W
2. Freeze for half an hour:
11W, 10.4W, 9.4W, 11.3W 10.5W
3. RTPM/idle for three hours:
11.6W
4. Freeze for three hours:
10W
5. Suspend to Memory:
0.5~0.9W
Average Resume Latency:
1. RTPM/idle with a black screen: (From pressing keyboard to screen back)
Less than 0.2s
2. Freeze: (From pressing power button to screen back)
2.50s
3. Suspend to Memory: (From pressing power button to screen back)
4.33s
>From the results, we can see that all the platforms should benefit from
this patch, even if it does not have Low Power S0.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-06 16:00:36 +04:00
}
2017-08-10 01:13:56 +03:00
void s2idle_wake ( void )
PM: Introduce suspend state PM_SUSPEND_FREEZE
PM_SUSPEND_FREEZE state is a general state that
does not need any platform specific support, it equals
frozen processes + suspended devices + idle processors.
Compared with PM_SUSPEND_MEMORY,
PM_SUSPEND_FREEZE saves less power
because the system is still in a running state.
PM_SUSPEND_FREEZE has less resume latency because it does not
touch BIOS, and the processors are in idle state.
Compared with RTPM/idle,
PM_SUSPEND_FREEZE saves more power as
1. the processor has longer sleep time because processes are frozen.
The deeper c-state the processor supports, more power saving we can get.
2. PM_SUSPEND_FREEZE uses system suspend code path, thus we can get
more power saving from the devices that does not have good RTPM support.
This state is useful for
1) platforms that do not have STR, or have a broken STR.
2) platforms that have an extremely low power idle state,
which can be used to replace STR.
The following describes how PM_SUSPEND_FREEZE state works.
1. echo freeze > /sys/power/state
2. the processes are frozen.
3. all the devices are suspended.
4. all the processors are blocked by a wait queue
5. all the processors idles and enters (Deep) c-state.
6. an interrupt fires.
7. a processor is woken up and handles the irq.
8. if it is a general event,
a) the irq handler runs and quites.
b) goto step 4.
9. if it is a real wake event, say, power button pressing, keyboard touch, mouse moving,
a) the irq handler runs and activate the wakeup source
b) wakeup_source_activate() notifies the wait queue.
c) system starts resuming from PM_SUSPEND_FREEZE
10. all the devices are resumed.
11. all the processes are unfrozen.
12. system is back to working.
Known Issue:
The wakeup of this new PM_SUSPEND_FREEZE state may behave differently
from the previous suspend state.
Take ACPI platform for example, there are some GPEs that only enabled
when the system is in sleep state, to wake the system backk from S3/S4.
But we are not touching these GPEs during transition to PM_SUSPEND_FREEZE.
This means we may lose some wake event.
But on the other hand, as we do not disable all the Interrupts during
PM_SUSPEND_FREEZE, we may get some extra "wakeup" Interrupts, that are
not available for S3/S4.
The patches has been tested on an old Sony laptop, and here are the results:
Average Power:
1. RPTM/idle for half an hour:
14.8W, 12.6W, 14.1W, 12.5W, 14.4W, 13.2W, 12.9W
2. Freeze for half an hour:
11W, 10.4W, 9.4W, 11.3W 10.5W
3. RTPM/idle for three hours:
11.6W
4. Freeze for three hours:
10W
5. Suspend to Memory:
0.5~0.9W
Average Resume Latency:
1. RTPM/idle with a black screen: (From pressing keyboard to screen back)
Less than 0.2s
2. Freeze: (From pressing power button to screen back)
2.50s
3. Suspend to Memory: (From pressing power button to screen back)
4.33s
>From the results, we can see that all the platforms should benefit from
this patch, even if it does not have Low Power S0.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-06 16:00:36 +04:00
{
2015-02-13 01:33:15 +03:00
unsigned long flags ;
2018-05-25 12:46:48 +03:00
raw_spin_lock_irqsave ( & s2idle_lock , flags ) ;
2017-08-10 01:13:56 +03:00
if ( s2idle_state > S2IDLE_STATE_NONE ) {
s2idle_state = S2IDLE_STATE_WAKE ;
2018-06-12 11:34:52 +03:00
swake_up_one ( & s2idle_wait_head ) ;
2015-02-13 01:33:15 +03:00
}
2018-05-25 12:46:48 +03:00
raw_spin_unlock_irqrestore ( & s2idle_lock , flags ) ;
PM: Introduce suspend state PM_SUSPEND_FREEZE
PM_SUSPEND_FREEZE state is a general state that
does not need any platform specific support, it equals
frozen processes + suspended devices + idle processors.
Compared with PM_SUSPEND_MEMORY,
PM_SUSPEND_FREEZE saves less power
because the system is still in a running state.
PM_SUSPEND_FREEZE has less resume latency because it does not
touch BIOS, and the processors are in idle state.
Compared with RTPM/idle,
PM_SUSPEND_FREEZE saves more power as
1. the processor has longer sleep time because processes are frozen.
The deeper c-state the processor supports, more power saving we can get.
2. PM_SUSPEND_FREEZE uses system suspend code path, thus we can get
more power saving from the devices that does not have good RTPM support.
This state is useful for
1) platforms that do not have STR, or have a broken STR.
2) platforms that have an extremely low power idle state,
which can be used to replace STR.
The following describes how PM_SUSPEND_FREEZE state works.
1. echo freeze > /sys/power/state
2. the processes are frozen.
3. all the devices are suspended.
4. all the processors are blocked by a wait queue
5. all the processors idles and enters (Deep) c-state.
6. an interrupt fires.
7. a processor is woken up and handles the irq.
8. if it is a general event,
a) the irq handler runs and quites.
b) goto step 4.
9. if it is a real wake event, say, power button pressing, keyboard touch, mouse moving,
a) the irq handler runs and activate the wakeup source
b) wakeup_source_activate() notifies the wait queue.
c) system starts resuming from PM_SUSPEND_FREEZE
10. all the devices are resumed.
11. all the processes are unfrozen.
12. system is back to working.
Known Issue:
The wakeup of this new PM_SUSPEND_FREEZE state may behave differently
from the previous suspend state.
Take ACPI platform for example, there are some GPEs that only enabled
when the system is in sleep state, to wake the system backk from S3/S4.
But we are not touching these GPEs during transition to PM_SUSPEND_FREEZE.
This means we may lose some wake event.
But on the other hand, as we do not disable all the Interrupts during
PM_SUSPEND_FREEZE, we may get some extra "wakeup" Interrupts, that are
not available for S3/S4.
The patches has been tested on an old Sony laptop, and here are the results:
Average Power:
1. RPTM/idle for half an hour:
14.8W, 12.6W, 14.1W, 12.5W, 14.4W, 13.2W, 12.9W
2. Freeze for half an hour:
11W, 10.4W, 9.4W, 11.3W 10.5W
3. RTPM/idle for three hours:
11.6W
4. Freeze for three hours:
10W
5. Suspend to Memory:
0.5~0.9W
Average Resume Latency:
1. RTPM/idle with a black screen: (From pressing keyboard to screen back)
Less than 0.2s
2. Freeze: (From pressing power button to screen back)
2.50s
3. Suspend to Memory: (From pressing power button to screen back)
4.33s
>From the results, we can see that all the platforms should benefit from
this patch, even if it does not have Low Power S0.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-06 16:00:36 +04:00
}
2017-08-10 01:13:56 +03:00
EXPORT_SYMBOL_GPL ( s2idle_wake ) ;
PM: Introduce suspend state PM_SUSPEND_FREEZE
PM_SUSPEND_FREEZE state is a general state that
does not need any platform specific support, it equals
frozen processes + suspended devices + idle processors.
Compared with PM_SUSPEND_MEMORY,
PM_SUSPEND_FREEZE saves less power
because the system is still in a running state.
PM_SUSPEND_FREEZE has less resume latency because it does not
touch BIOS, and the processors are in idle state.
Compared with RTPM/idle,
PM_SUSPEND_FREEZE saves more power as
1. the processor has longer sleep time because processes are frozen.
The deeper c-state the processor supports, more power saving we can get.
2. PM_SUSPEND_FREEZE uses system suspend code path, thus we can get
more power saving from the devices that does not have good RTPM support.
This state is useful for
1) platforms that do not have STR, or have a broken STR.
2) platforms that have an extremely low power idle state,
which can be used to replace STR.
The following describes how PM_SUSPEND_FREEZE state works.
1. echo freeze > /sys/power/state
2. the processes are frozen.
3. all the devices are suspended.
4. all the processors are blocked by a wait queue
5. all the processors idles and enters (Deep) c-state.
6. an interrupt fires.
7. a processor is woken up and handles the irq.
8. if it is a general event,
a) the irq handler runs and quites.
b) goto step 4.
9. if it is a real wake event, say, power button pressing, keyboard touch, mouse moving,
a) the irq handler runs and activate the wakeup source
b) wakeup_source_activate() notifies the wait queue.
c) system starts resuming from PM_SUSPEND_FREEZE
10. all the devices are resumed.
11. all the processes are unfrozen.
12. system is back to working.
Known Issue:
The wakeup of this new PM_SUSPEND_FREEZE state may behave differently
from the previous suspend state.
Take ACPI platform for example, there are some GPEs that only enabled
when the system is in sleep state, to wake the system backk from S3/S4.
But we are not touching these GPEs during transition to PM_SUSPEND_FREEZE.
This means we may lose some wake event.
But on the other hand, as we do not disable all the Interrupts during
PM_SUSPEND_FREEZE, we may get some extra "wakeup" Interrupts, that are
not available for S3/S4.
The patches has been tested on an old Sony laptop, and here are the results:
Average Power:
1. RPTM/idle for half an hour:
14.8W, 12.6W, 14.1W, 12.5W, 14.4W, 13.2W, 12.9W
2. Freeze for half an hour:
11W, 10.4W, 9.4W, 11.3W 10.5W
3. RTPM/idle for three hours:
11.6W
4. Freeze for three hours:
10W
5. Suspend to Memory:
0.5~0.9W
Average Resume Latency:
1. RTPM/idle with a black screen: (From pressing keyboard to screen back)
Less than 0.2s
2. Freeze: (From pressing power button to screen back)
2.50s
3. Suspend to Memory: (From pressing power button to screen back)
4.33s
>From the results, we can see that all the platforms should benefit from
this patch, even if it does not have Low Power S0.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-06 16:00:36 +04:00
2014-05-26 15:40:53 +04:00
static bool valid_state ( suspend_state_t state )
{
/*
2021-10-22 18:53:43 +03:00
* The PM_SUSPEND_STANDBY and PM_SUSPEND_MEM states require low - level
* support and need to be valid to the low - level implementation .
*
* No - > valid ( ) or - > enter ( ) callback implies that none are valid .
2014-05-26 15:40:53 +04:00
*/
2021-10-22 18:53:43 +03:00
return suspend_ops & & suspend_ops - > valid & & suspend_ops - > valid ( state ) & &
suspend_ops - > enter ;
2014-05-26 15:40:53 +04:00
}
2016-08-19 16:41:00 +03:00
void __init pm_states_init ( void )
{
2021-10-19 21:55:04 +03:00
/* "mem" and "freeze" are always present in /sys/power/state. */
pm_states [ PM_SUSPEND_MEM ] = pm_labels [ PM_SUSPEND_MEM ] ;
2017-08-10 01:13:07 +03:00
pm_states [ PM_SUSPEND_TO_IDLE ] = pm_labels [ PM_SUSPEND_TO_IDLE ] ;
2016-08-19 16:41:00 +03:00
/*
PM / sleep: System sleep state selection interface rework
There are systems in which the platform doesn't support any special
sleep states, so suspend-to-idle (PM_SUSPEND_FREEZE) is the only
available system sleep state. However, some user space frameworks
only use the "mem" and (sometimes) "standby" sleep state labels, so
the users of those systems need to modify user space in order to be
able to use system suspend at all and that may be a pain in practice.
Commit 0399d4db3edf (PM / sleep: Introduce command line argument for
sleep state enumeration) attempted to address this problem by adding
a command line argument to change the meaning of the "mem" string in
/sys/power/state to make it trigger suspend-to-idle (instead of
suspend-to-RAM).
However, there also are systems in which the platform does support
special sleep states, but suspend-to-idle is the preferred one anyway
(it even may save more energy than the platform-provided sleep states
in some cases) and the above commit doesn't help in those cases.
For this reason, rework the system sleep state selection interface
again (but preserve backwards compatibiliby). Namely, add a new
sysfs file, /sys/power/mem_sleep, that will control the system
suspend mode triggered by writing "mem" to /sys/power/state (in
analogy with what /sys/power/disk does for hibernation). Make it
select suspend-to-RAM ("deep" sleep) by default (if supported) and
fall back to suspend-to-idle ("s2idle") otherwise and add a new
command line argument, mem_sleep_default, allowing that default to
be overridden if need be.
At the same time, drop the relative_sleep_states command line
argument that doesn't make sense any more.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mario Limonciello <mario.limonciello@dell.com>
2016-11-22 00:45:40 +03:00
* Suspend - to - idle should be supported even without any suspend_ops ,
* initialize mem_sleep_states [ ] accordingly here .
2016-08-19 16:41:00 +03:00
*/
2017-08-10 01:13:07 +03:00
mem_sleep_states [ PM_SUSPEND_TO_IDLE ] = mem_sleep_labels [ PM_SUSPEND_TO_IDLE ] ;
2016-08-19 16:41:00 +03:00
}
PM / sleep: System sleep state selection interface rework
There are systems in which the platform doesn't support any special
sleep states, so suspend-to-idle (PM_SUSPEND_FREEZE) is the only
available system sleep state. However, some user space frameworks
only use the "mem" and (sometimes) "standby" sleep state labels, so
the users of those systems need to modify user space in order to be
able to use system suspend at all and that may be a pain in practice.
Commit 0399d4db3edf (PM / sleep: Introduce command line argument for
sleep state enumeration) attempted to address this problem by adding
a command line argument to change the meaning of the "mem" string in
/sys/power/state to make it trigger suspend-to-idle (instead of
suspend-to-RAM).
However, there also are systems in which the platform does support
special sleep states, but suspend-to-idle is the preferred one anyway
(it even may save more energy than the platform-provided sleep states
in some cases) and the above commit doesn't help in those cases.
For this reason, rework the system sleep state selection interface
again (but preserve backwards compatibiliby). Namely, add a new
sysfs file, /sys/power/mem_sleep, that will control the system
suspend mode triggered by writing "mem" to /sys/power/state (in
analogy with what /sys/power/disk does for hibernation). Make it
select suspend-to-RAM ("deep" sleep) by default (if supported) and
fall back to suspend-to-idle ("s2idle") otherwise and add a new
command line argument, mem_sleep_default, allowing that default to
be overridden if need be.
At the same time, drop the relative_sleep_states command line
argument that doesn't make sense any more.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mario Limonciello <mario.limonciello@dell.com>
2016-11-22 00:45:40 +03:00
static int __init mem_sleep_default_setup ( char * str )
PM / sleep: Introduce command line argument for sleep state enumeration
On some systems the platform doesn't support neither
PM_SUSPEND_MEM nor PM_SUSPEND_STANDBY, so PM_SUSPEND_FREEZE is the
only available system sleep state. However, some user space frameworks
only use the "mem" and (sometimes) "standby" sleep state labels, so
the users of those systems need to modify user space in order to be
able to use system suspend at all and that is not always possible.
For this reason, add a new kernel command line argument,
relative_sleep_states, allowing the users of those systems to change
the way in which the kernel assigns labels to system sleep states.
Namely, for relative_sleep_states=1, the "mem", "standby" and "freeze"
labels will enumerate the available system sleem states from the
deepest to the shallowest, respectively, so that "mem" is always
present in /sys/power/state and the other state strings may or may
not be presend depending on what is supported by the platform.
Update system sleep states documentation to reflect this change.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-05-26 15:40:59 +04:00
{
PM / sleep: System sleep state selection interface rework
There are systems in which the platform doesn't support any special
sleep states, so suspend-to-idle (PM_SUSPEND_FREEZE) is the only
available system sleep state. However, some user space frameworks
only use the "mem" and (sometimes) "standby" sleep state labels, so
the users of those systems need to modify user space in order to be
able to use system suspend at all and that may be a pain in practice.
Commit 0399d4db3edf (PM / sleep: Introduce command line argument for
sleep state enumeration) attempted to address this problem by adding
a command line argument to change the meaning of the "mem" string in
/sys/power/state to make it trigger suspend-to-idle (instead of
suspend-to-RAM).
However, there also are systems in which the platform does support
special sleep states, but suspend-to-idle is the preferred one anyway
(it even may save more energy than the platform-provided sleep states
in some cases) and the above commit doesn't help in those cases.
For this reason, rework the system sleep state selection interface
again (but preserve backwards compatibiliby). Namely, add a new
sysfs file, /sys/power/mem_sleep, that will control the system
suspend mode triggered by writing "mem" to /sys/power/state (in
analogy with what /sys/power/disk does for hibernation). Make it
select suspend-to-RAM ("deep" sleep) by default (if supported) and
fall back to suspend-to-idle ("s2idle") otherwise and add a new
command line argument, mem_sleep_default, allowing that default to
be overridden if need be.
At the same time, drop the relative_sleep_states command line
argument that doesn't make sense any more.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mario Limonciello <mario.limonciello@dell.com>
2016-11-22 00:45:40 +03:00
suspend_state_t state ;
2017-08-10 01:13:07 +03:00
for ( state = PM_SUSPEND_TO_IDLE ; state < = PM_SUSPEND_MEM ; state + + )
PM / sleep: System sleep state selection interface rework
There are systems in which the platform doesn't support any special
sleep states, so suspend-to-idle (PM_SUSPEND_FREEZE) is the only
available system sleep state. However, some user space frameworks
only use the "mem" and (sometimes) "standby" sleep state labels, so
the users of those systems need to modify user space in order to be
able to use system suspend at all and that may be a pain in practice.
Commit 0399d4db3edf (PM / sleep: Introduce command line argument for
sleep state enumeration) attempted to address this problem by adding
a command line argument to change the meaning of the "mem" string in
/sys/power/state to make it trigger suspend-to-idle (instead of
suspend-to-RAM).
However, there also are systems in which the platform does support
special sleep states, but suspend-to-idle is the preferred one anyway
(it even may save more energy than the platform-provided sleep states
in some cases) and the above commit doesn't help in those cases.
For this reason, rework the system sleep state selection interface
again (but preserve backwards compatibiliby). Namely, add a new
sysfs file, /sys/power/mem_sleep, that will control the system
suspend mode triggered by writing "mem" to /sys/power/state (in
analogy with what /sys/power/disk does for hibernation). Make it
select suspend-to-RAM ("deep" sleep) by default (if supported) and
fall back to suspend-to-idle ("s2idle") otherwise and add a new
command line argument, mem_sleep_default, allowing that default to
be overridden if need be.
At the same time, drop the relative_sleep_states command line
argument that doesn't make sense any more.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mario Limonciello <mario.limonciello@dell.com>
2016-11-22 00:45:40 +03:00
if ( mem_sleep_labels [ state ] & &
! strcmp ( str , mem_sleep_labels [ state ] ) ) {
mem_sleep_default = state ;
break ;
}
PM / sleep: Introduce command line argument for sleep state enumeration
On some systems the platform doesn't support neither
PM_SUSPEND_MEM nor PM_SUSPEND_STANDBY, so PM_SUSPEND_FREEZE is the
only available system sleep state. However, some user space frameworks
only use the "mem" and (sometimes) "standby" sleep state labels, so
the users of those systems need to modify user space in order to be
able to use system suspend at all and that is not always possible.
For this reason, add a new kernel command line argument,
relative_sleep_states, allowing the users of those systems to change
the way in which the kernel assigns labels to system sleep states.
Namely, for relative_sleep_states=1, the "mem", "standby" and "freeze"
labels will enumerate the available system sleem states from the
deepest to the shallowest, respectively, so that "mem" is always
present in /sys/power/state and the other state strings may or may
not be presend depending on what is supported by the platform.
Update system sleep states documentation to reflect this change.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-05-26 15:40:59 +04:00
return 1 ;
}
PM / sleep: System sleep state selection interface rework
There are systems in which the platform doesn't support any special
sleep states, so suspend-to-idle (PM_SUSPEND_FREEZE) is the only
available system sleep state. However, some user space frameworks
only use the "mem" and (sometimes) "standby" sleep state labels, so
the users of those systems need to modify user space in order to be
able to use system suspend at all and that may be a pain in practice.
Commit 0399d4db3edf (PM / sleep: Introduce command line argument for
sleep state enumeration) attempted to address this problem by adding
a command line argument to change the meaning of the "mem" string in
/sys/power/state to make it trigger suspend-to-idle (instead of
suspend-to-RAM).
However, there also are systems in which the platform does support
special sleep states, but suspend-to-idle is the preferred one anyway
(it even may save more energy than the platform-provided sleep states
in some cases) and the above commit doesn't help in those cases.
For this reason, rework the system sleep state selection interface
again (but preserve backwards compatibiliby). Namely, add a new
sysfs file, /sys/power/mem_sleep, that will control the system
suspend mode triggered by writing "mem" to /sys/power/state (in
analogy with what /sys/power/disk does for hibernation). Make it
select suspend-to-RAM ("deep" sleep) by default (if supported) and
fall back to suspend-to-idle ("s2idle") otherwise and add a new
command line argument, mem_sleep_default, allowing that default to
be overridden if need be.
At the same time, drop the relative_sleep_states command line
argument that doesn't make sense any more.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mario Limonciello <mario.limonciello@dell.com>
2016-11-22 00:45:40 +03:00
__setup ( " mem_sleep_default= " , mem_sleep_default_setup ) ;
PM / sleep: Introduce command line argument for sleep state enumeration
On some systems the platform doesn't support neither
PM_SUSPEND_MEM nor PM_SUSPEND_STANDBY, so PM_SUSPEND_FREEZE is the
only available system sleep state. However, some user space frameworks
only use the "mem" and (sometimes) "standby" sleep state labels, so
the users of those systems need to modify user space in order to be
able to use system suspend at all and that is not always possible.
For this reason, add a new kernel command line argument,
relative_sleep_states, allowing the users of those systems to change
the way in which the kernel assigns labels to system sleep states.
Namely, for relative_sleep_states=1, the "mem", "standby" and "freeze"
labels will enumerate the available system sleem states from the
deepest to the shallowest, respectively, so that "mem" is always
present in /sys/power/state and the other state strings may or may
not be presend depending on what is supported by the platform.
Update system sleep states documentation to reflect this change.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-05-26 15:40:59 +04:00
2009-06-10 03:27:12 +04:00
/**
2012-02-13 19:29:14 +04:00
* suspend_set_ops - Set the global suspend method table .
* @ ops : Suspend operations to use .
2009-06-10 03:27:12 +04:00
*/
2010-11-16 16:14:02 +03:00
void suspend_set_ops ( const struct platform_suspend_ops * ops )
2009-06-10 03:27:12 +04:00
{
2011-12-08 01:29:54 +04:00
lock_system_sleep ( ) ;
2014-05-26 15:40:53 +04:00
2009-06-10 03:27:12 +04:00
suspend_ops = ops ;
PM / sleep: Introduce command line argument for sleep state enumeration
On some systems the platform doesn't support neither
PM_SUSPEND_MEM nor PM_SUSPEND_STANDBY, so PM_SUSPEND_FREEZE is the
only available system sleep state. However, some user space frameworks
only use the "mem" and (sometimes) "standby" sleep state labels, so
the users of those systems need to modify user space in order to be
able to use system suspend at all and that is not always possible.
For this reason, add a new kernel command line argument,
relative_sleep_states, allowing the users of those systems to change
the way in which the kernel assigns labels to system sleep states.
Namely, for relative_sleep_states=1, the "mem", "standby" and "freeze"
labels will enumerate the available system sleem states from the
deepest to the shallowest, respectively, so that "mem" is always
present in /sys/power/state and the other state strings may or may
not be presend depending on what is supported by the platform.
Update system sleep states documentation to reflect this change.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-05-26 15:40:59 +04:00
PM / sleep: System sleep state selection interface rework
There are systems in which the platform doesn't support any special
sleep states, so suspend-to-idle (PM_SUSPEND_FREEZE) is the only
available system sleep state. However, some user space frameworks
only use the "mem" and (sometimes) "standby" sleep state labels, so
the users of those systems need to modify user space in order to be
able to use system suspend at all and that may be a pain in practice.
Commit 0399d4db3edf (PM / sleep: Introduce command line argument for
sleep state enumeration) attempted to address this problem by adding
a command line argument to change the meaning of the "mem" string in
/sys/power/state to make it trigger suspend-to-idle (instead of
suspend-to-RAM).
However, there also are systems in which the platform does support
special sleep states, but suspend-to-idle is the preferred one anyway
(it even may save more energy than the platform-provided sleep states
in some cases) and the above commit doesn't help in those cases.
For this reason, rework the system sleep state selection interface
again (but preserve backwards compatibiliby). Namely, add a new
sysfs file, /sys/power/mem_sleep, that will control the system
suspend mode triggered by writing "mem" to /sys/power/state (in
analogy with what /sys/power/disk does for hibernation). Make it
select suspend-to-RAM ("deep" sleep) by default (if supported) and
fall back to suspend-to-idle ("s2idle") otherwise and add a new
command line argument, mem_sleep_default, allowing that default to
be overridden if need be.
At the same time, drop the relative_sleep_states command line
argument that doesn't make sense any more.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mario Limonciello <mario.limonciello@dell.com>
2016-11-22 00:45:40 +03:00
if ( valid_state ( PM_SUSPEND_STANDBY ) ) {
mem_sleep_states [ PM_SUSPEND_STANDBY ] = mem_sleep_labels [ PM_SUSPEND_STANDBY ] ;
pm_states [ PM_SUSPEND_STANDBY ] = pm_labels [ PM_SUSPEND_STANDBY ] ;
if ( mem_sleep_default = = PM_SUSPEND_STANDBY )
mem_sleep_current = PM_SUSPEND_STANDBY ;
}
if ( valid_state ( PM_SUSPEND_MEM ) ) {
mem_sleep_states [ PM_SUSPEND_MEM ] = mem_sleep_labels [ PM_SUSPEND_MEM ] ;
2017-08-01 00:43:18 +03:00
if ( mem_sleep_default > = PM_SUSPEND_MEM )
PM / sleep: System sleep state selection interface rework
There are systems in which the platform doesn't support any special
sleep states, so suspend-to-idle (PM_SUSPEND_FREEZE) is the only
available system sleep state. However, some user space frameworks
only use the "mem" and (sometimes) "standby" sleep state labels, so
the users of those systems need to modify user space in order to be
able to use system suspend at all and that may be a pain in practice.
Commit 0399d4db3edf (PM / sleep: Introduce command line argument for
sleep state enumeration) attempted to address this problem by adding
a command line argument to change the meaning of the "mem" string in
/sys/power/state to make it trigger suspend-to-idle (instead of
suspend-to-RAM).
However, there also are systems in which the platform does support
special sleep states, but suspend-to-idle is the preferred one anyway
(it even may save more energy than the platform-provided sleep states
in some cases) and the above commit doesn't help in those cases.
For this reason, rework the system sleep state selection interface
again (but preserve backwards compatibiliby). Namely, add a new
sysfs file, /sys/power/mem_sleep, that will control the system
suspend mode triggered by writing "mem" to /sys/power/state (in
analogy with what /sys/power/disk does for hibernation). Make it
select suspend-to-RAM ("deep" sleep) by default (if supported) and
fall back to suspend-to-idle ("s2idle") otherwise and add a new
command line argument, mem_sleep_default, allowing that default to
be overridden if need be.
At the same time, drop the relative_sleep_states command line
argument that doesn't make sense any more.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mario Limonciello <mario.limonciello@dell.com>
2016-11-22 00:45:40 +03:00
mem_sleep_current = PM_SUSPEND_MEM ;
}
2014-05-26 15:40:53 +04:00
2011-12-08 01:29:54 +04:00
unlock_system_sleep ( ) ;
2009-06-10 03:27:12 +04:00
}
2011-06-27 03:01:07 +04:00
EXPORT_SYMBOL_GPL ( suspend_set_ops ) ;
2009-06-10 03:27:12 +04:00
/**
2012-02-13 19:29:14 +04:00
* suspend_valid_only_mem - Generic memory - only valid callback .
2020-11-13 11:58:10 +03:00
* @ state : Target system sleep state .
2009-06-10 03:27:12 +04:00
*
2012-02-13 19:29:14 +04:00
* Platform drivers that implement mem suspend only and only need to check for
* that in their . valid ( ) callback can use this instead of rolling their own
* . valid ( ) callback .
2009-06-10 03:27:12 +04:00
*/
int suspend_valid_only_mem ( suspend_state_t state )
{
return state = = PM_SUSPEND_MEM ;
}
2011-06-27 03:01:07 +04:00
EXPORT_SYMBOL_GPL ( suspend_valid_only_mem ) ;
2009-06-10 03:27:12 +04:00
2014-07-23 02:57:53 +04:00
static bool sleep_state_supported ( suspend_state_t state )
{
2021-10-22 18:53:43 +03:00
return state = = PM_SUSPEND_TO_IDLE | | valid_state ( state ) ;
2014-07-23 02:57:53 +04:00
}
static int platform_suspend_prepare ( suspend_state_t state )
{
2017-08-10 01:13:07 +03:00
return state ! = PM_SUSPEND_TO_IDLE & & suspend_ops - > prepare ?
2014-07-23 02:57:53 +04:00
suspend_ops - > prepare ( ) : 0 ;
}
2014-09-30 04:29:01 +04:00
static int platform_suspend_prepare_late ( suspend_state_t state )
{
2017-08-10 01:15:30 +03:00
return state = = PM_SUSPEND_TO_IDLE & & s2idle_ops & & s2idle_ops - > prepare ?
s2idle_ops - > prepare ( ) : 0 ;
2014-09-30 04:29:01 +04:00
}
2014-09-30 04:22:24 +04:00
static int platform_suspend_prepare_noirq ( suspend_state_t state )
2014-07-23 02:57:53 +04:00
{
2019-08-10 14:18:06 +03:00
if ( state = = PM_SUSPEND_TO_IDLE )
return s2idle_ops & & s2idle_ops - > prepare_late ?
s2idle_ops - > prepare_late ( ) : 0 ;
ACPI: PM: s2idle: Execute LPS0 _DSM functions with suspended devices
According to Section 3.5 of the "Intel Low Power S0 Idle" document [1],
Function 5 of the LPS0 _DSM is expected to be invoked when the system
configuration matches the criteria for entering the target low-power
state of the platform. In particular, this means that all devices
should be suspended and in low-power states already when that function
is invoked.
This is not the case currently, however, because Function 5 of the
LPS0 _DSM is invoked by it before the "noirq" phase of device suspend,
which means that some devices may not have been put into low-power
states yet at that point. That is a consequence of the previous
design of the suspend-to-idle flow that allowed the "noirq" phase of
device suspend and the "noirq" phase of device resume to be carried
out for multiple times while "suspended" (if any spurious wakeup
events were detected) and the point of the LPS0 _DSM Function 5
invocation was chosen so as to call it (and LPS0 _DSM Function 6
analogously) once per suspend-resume cycle (regardless of how many
times the "noirq" phases of device suspend and resume were carried
out while "suspended").
Now that the suspend-to-idle flow has been redesigned to carry out
the "noirq" phases of device suspend and resume once in each cycle,
the code can be reordered to follow the specification that it is
based on more closely.
For this purpose, add ->prepare_late and ->restore_early platform
callbacks for suspend-to-idle, to be executed, respectively, after
the "noirq" phase of suspending devices and before the "noirq"
phase of resuming them and make ACPI use them for the invocation
of LPS0 _DSM functions as appropriate.
While at it, move the LPS0 entry requirements check to be made
before invoking Functions 3 and 5 of the LPS0 _DSM (also once
per cycle) as follows from the specification [1].
Link: https://uefi.org/sites/default/files/resources/Intel_ACPI_Low_Power_S0_Idle.pdf # [1]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
2019-08-01 20:31:10 +03:00
return suspend_ops - > prepare_late ? suspend_ops - > prepare_late ( ) : 0 ;
2014-07-23 02:57:53 +04:00
}
2014-09-30 04:22:24 +04:00
static void platform_resume_noirq ( suspend_state_t state )
2014-07-23 02:57:53 +04:00
{
ACPI: PM: s2idle: Execute LPS0 _DSM functions with suspended devices
According to Section 3.5 of the "Intel Low Power S0 Idle" document [1],
Function 5 of the LPS0 _DSM is expected to be invoked when the system
configuration matches the criteria for entering the target low-power
state of the platform. In particular, this means that all devices
should be suspended and in low-power states already when that function
is invoked.
This is not the case currently, however, because Function 5 of the
LPS0 _DSM is invoked by it before the "noirq" phase of device suspend,
which means that some devices may not have been put into low-power
states yet at that point. That is a consequence of the previous
design of the suspend-to-idle flow that allowed the "noirq" phase of
device suspend and the "noirq" phase of device resume to be carried
out for multiple times while "suspended" (if any spurious wakeup
events were detected) and the point of the LPS0 _DSM Function 5
invocation was chosen so as to call it (and LPS0 _DSM Function 6
analogously) once per suspend-resume cycle (regardless of how many
times the "noirq" phases of device suspend and resume were carried
out while "suspended").
Now that the suspend-to-idle flow has been redesigned to carry out
the "noirq" phases of device suspend and resume once in each cycle,
the code can be reordered to follow the specification that it is
based on more closely.
For this purpose, add ->prepare_late and ->restore_early platform
callbacks for suspend-to-idle, to be executed, respectively, after
the "noirq" phase of suspending devices and before the "noirq"
phase of resuming them and make ACPI use them for the invocation
of LPS0 _DSM functions as appropriate.
While at it, move the LPS0 entry requirements check to be made
before invoking Functions 3 and 5 of the LPS0 _DSM (also once
per cycle) as follows from the specification [1].
Link: https://uefi.org/sites/default/files/resources/Intel_ACPI_Low_Power_S0_Idle.pdf # [1]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
2019-08-01 20:31:10 +03:00
if ( state = = PM_SUSPEND_TO_IDLE ) {
if ( s2idle_ops & & s2idle_ops - > restore_early )
s2idle_ops - > restore_early ( ) ;
2019-08-10 14:18:06 +03:00
} else if ( suspend_ops - > wake ) {
2014-07-23 02:57:53 +04:00
suspend_ops - > wake ( ) ;
2019-08-10 14:18:06 +03:00
}
2014-07-23 02:57:53 +04:00
}
2014-09-30 04:29:01 +04:00
static void platform_resume_early ( suspend_state_t state )
{
2017-08-10 01:15:30 +03:00
if ( state = = PM_SUSPEND_TO_IDLE & & s2idle_ops & & s2idle_ops - > restore )
s2idle_ops - > restore ( ) ;
2014-09-30 04:29:01 +04:00
}
2014-09-30 04:22:24 +04:00
static void platform_resume_finish ( suspend_state_t state )
2014-07-23 02:57:53 +04:00
{
2017-08-10 01:13:07 +03:00
if ( state ! = PM_SUSPEND_TO_IDLE & & suspend_ops - > finish )
2014-07-23 02:57:53 +04:00
suspend_ops - > finish ( ) ;
}
static int platform_suspend_begin ( suspend_state_t state )
{
2017-08-10 01:15:30 +03:00
if ( state = = PM_SUSPEND_TO_IDLE & & s2idle_ops & & s2idle_ops - > begin )
return s2idle_ops - > begin ( ) ;
2016-08-19 16:41:00 +03:00
else if ( suspend_ops & & suspend_ops - > begin )
2014-07-23 02:57:53 +04:00
return suspend_ops - > begin ( state ) ;
else
return 0 ;
}
2014-09-30 04:22:24 +04:00
static void platform_resume_end ( suspend_state_t state )
2014-07-23 02:57:53 +04:00
{
2017-08-10 01:15:30 +03:00
if ( state = = PM_SUSPEND_TO_IDLE & & s2idle_ops & & s2idle_ops - > end )
s2idle_ops - > end ( ) ;
2016-08-19 16:41:00 +03:00
else if ( suspend_ops & & suspend_ops - > end )
2014-07-23 02:57:53 +04:00
suspend_ops - > end ( ) ;
}
2014-09-30 04:22:24 +04:00
static void platform_recover ( suspend_state_t state )
2014-07-23 02:57:53 +04:00
{
2017-08-10 01:13:07 +03:00
if ( state ! = PM_SUSPEND_TO_IDLE & & suspend_ops - > recover )
2014-07-23 02:57:53 +04:00
suspend_ops - > recover ( ) ;
}
static bool platform_suspend_again ( suspend_state_t state )
{
2017-08-10 01:13:07 +03:00
return state ! = PM_SUSPEND_TO_IDLE & & suspend_ops - > suspend_again ?
2014-07-23 02:57:53 +04:00
suspend_ops - > suspend_again ( ) : false ;
}
PM / sleep: add configurable delay for pm_test
When CONFIG_PM_DEBUG=y, we provide a sysfs file (/sys/power/pm_test) for
selecting one of a few suspend test modes, where rather than entering a
full suspend state, the kernel will perform some subset of suspend
steps, wait 5 seconds, and then resume back to normal operation.
This mode is useful for (among other things) observing the state of the
system just before entering a sleep mode, for debugging or analysis
purposes. However, a constant 5 second wait is not sufficient for some
sorts of analysis; for example, on an SoC, one might want to use
external tools to probe the power states of various on-chip controllers
or clocks.
This patch turns this 5 second delay into a configurable module
parameter, so users can determine how long to wait in this
pseudo-suspend state before resuming the system.
Example (wait 30 seconds);
# echo 30 > /sys/module/suspend/parameters/pm_test_delay
# echo core > /sys/power/pm_test
# time echo mem > /sys/power/state
...
[ 17.583625] suspend debug: Waiting for 30 second(s).
...
real 0m30.381s
user 0m0.017s
sys 0m0.080s
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Kevin Cernekee <cernekee@chromium.org>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-23 08:16:49 +03:00
# ifdef CONFIG_PM_DEBUG
static unsigned int pm_test_delay = 5 ;
module_param ( pm_test_delay , uint , 0644 ) ;
MODULE_PARM_DESC ( pm_test_delay ,
" Number of seconds to wait before resuming from suspend test " ) ;
# endif
2009-06-10 03:27:12 +04:00
static int suspend_test ( int level )
{
# ifdef CONFIG_PM_DEBUG
if ( pm_test_level = = level ) {
2015-10-28 06:24:01 +03:00
pr_info ( " suspend debug: Waiting for %d second(s). \n " ,
PM / sleep: add configurable delay for pm_test
When CONFIG_PM_DEBUG=y, we provide a sysfs file (/sys/power/pm_test) for
selecting one of a few suspend test modes, where rather than entering a
full suspend state, the kernel will perform some subset of suspend
steps, wait 5 seconds, and then resume back to normal operation.
This mode is useful for (among other things) observing the state of the
system just before entering a sleep mode, for debugging or analysis
purposes. However, a constant 5 second wait is not sufficient for some
sorts of analysis; for example, on an SoC, one might want to use
external tools to probe the power states of various on-chip controllers
or clocks.
This patch turns this 5 second delay into a configurable module
parameter, so users can determine how long to wait in this
pseudo-suspend state before resuming the system.
Example (wait 30 seconds);
# echo 30 > /sys/module/suspend/parameters/pm_test_delay
# echo core > /sys/power/pm_test
# time echo mem > /sys/power/state
...
[ 17.583625] suspend debug: Waiting for 30 second(s).
...
real 0m30.381s
user 0m0.017s
sys 0m0.080s
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Reviewed-by: Kevin Cernekee <cernekee@chromium.org>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-23 08:16:49 +03:00
pm_test_delay ) ;
mdelay ( pm_test_delay * 1000 ) ;
2009-06-10 03:27:12 +04:00
return 1 ;
}
# endif /* !CONFIG_PM_DEBUG */
return 0 ;
}
/**
2012-02-13 19:29:14 +04:00
* suspend_prepare - Prepare for entering system sleep state .
2020-11-13 11:58:10 +03:00
* @ state : Target system sleep state .
2009-06-10 03:27:12 +04:00
*
2012-02-13 19:29:14 +04:00
* Common code run for every system sleep state that can be entered ( except for
* hibernation ) . Run suspend notifiers , allocate the " suspend " console and
* freeze processes .
2009-06-10 03:27:12 +04:00
*/
PM: Introduce suspend state PM_SUSPEND_FREEZE
PM_SUSPEND_FREEZE state is a general state that
does not need any platform specific support, it equals
frozen processes + suspended devices + idle processors.
Compared with PM_SUSPEND_MEMORY,
PM_SUSPEND_FREEZE saves less power
because the system is still in a running state.
PM_SUSPEND_FREEZE has less resume latency because it does not
touch BIOS, and the processors are in idle state.
Compared with RTPM/idle,
PM_SUSPEND_FREEZE saves more power as
1. the processor has longer sleep time because processes are frozen.
The deeper c-state the processor supports, more power saving we can get.
2. PM_SUSPEND_FREEZE uses system suspend code path, thus we can get
more power saving from the devices that does not have good RTPM support.
This state is useful for
1) platforms that do not have STR, or have a broken STR.
2) platforms that have an extremely low power idle state,
which can be used to replace STR.
The following describes how PM_SUSPEND_FREEZE state works.
1. echo freeze > /sys/power/state
2. the processes are frozen.
3. all the devices are suspended.
4. all the processors are blocked by a wait queue
5. all the processors idles and enters (Deep) c-state.
6. an interrupt fires.
7. a processor is woken up and handles the irq.
8. if it is a general event,
a) the irq handler runs and quites.
b) goto step 4.
9. if it is a real wake event, say, power button pressing, keyboard touch, mouse moving,
a) the irq handler runs and activate the wakeup source
b) wakeup_source_activate() notifies the wait queue.
c) system starts resuming from PM_SUSPEND_FREEZE
10. all the devices are resumed.
11. all the processes are unfrozen.
12. system is back to working.
Known Issue:
The wakeup of this new PM_SUSPEND_FREEZE state may behave differently
from the previous suspend state.
Take ACPI platform for example, there are some GPEs that only enabled
when the system is in sleep state, to wake the system backk from S3/S4.
But we are not touching these GPEs during transition to PM_SUSPEND_FREEZE.
This means we may lose some wake event.
But on the other hand, as we do not disable all the Interrupts during
PM_SUSPEND_FREEZE, we may get some extra "wakeup" Interrupts, that are
not available for S3/S4.
The patches has been tested on an old Sony laptop, and here are the results:
Average Power:
1. RPTM/idle for half an hour:
14.8W, 12.6W, 14.1W, 12.5W, 14.4W, 13.2W, 12.9W
2. Freeze for half an hour:
11W, 10.4W, 9.4W, 11.3W 10.5W
3. RTPM/idle for three hours:
11.6W
4. Freeze for three hours:
10W
5. Suspend to Memory:
0.5~0.9W
Average Resume Latency:
1. RTPM/idle with a black screen: (From pressing keyboard to screen back)
Less than 0.2s
2. Freeze: (From pressing power button to screen back)
2.50s
3. Suspend to Memory: (From pressing power button to screen back)
4.33s
>From the results, we can see that all the platforms should benefit from
this patch, even if it does not have Low Power S0.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-06 16:00:36 +04:00
static int suspend_prepare ( suspend_state_t state )
2009-06-10 03:27:12 +04:00
{
notifier: Fix broken error handling pattern
The current notifiers have the following error handling pattern all
over the place:
int err, nr;
err = __foo_notifier_call_chain(&chain, val_up, v, -1, &nr);
if (err & NOTIFIER_STOP_MASK)
__foo_notifier_call_chain(&chain, val_down, v, nr-1, NULL)
And aside from the endless repetition thereof, it is broken. Consider
blocking notifiers; both calls take and drop the rwsem, this means
that the notifier list can change in between the two calls, making @nr
meaningless.
Fix this by replacing all the __foo_notifier_call_chain() functions
with foo_notifier_call_chain_robust() that embeds the above pattern,
but ensures it is inside a single lock region.
Note: I switched atomic_notifier_call_chain_robust() to use
the spinlock, since RCU cannot provide the guarantee
required for the recovery.
Note: software_resume() error handling was broken afaict.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20200818135804.325626653@infradead.org
2020-08-18 16:57:36 +03:00
int error ;
2009-06-10 03:27:12 +04:00
2014-07-23 02:57:53 +04:00
if ( ! sleep_state_supported ( state ) )
2009-06-10 03:27:12 +04:00
return - EPERM ;
pm_prepare_console ( ) ;
notifier: Fix broken error handling pattern
The current notifiers have the following error handling pattern all
over the place:
int err, nr;
err = __foo_notifier_call_chain(&chain, val_up, v, -1, &nr);
if (err & NOTIFIER_STOP_MASK)
__foo_notifier_call_chain(&chain, val_down, v, nr-1, NULL)
And aside from the endless repetition thereof, it is broken. Consider
blocking notifiers; both calls take and drop the rwsem, this means
that the notifier list can change in between the two calls, making @nr
meaningless.
Fix this by replacing all the __foo_notifier_call_chain() functions
with foo_notifier_call_chain_robust() that embeds the above pattern,
but ensures it is inside a single lock region.
Note: I switched atomic_notifier_call_chain_robust() to use
the spinlock, since RCU cannot provide the guarantee
required for the recovery.
Note: software_resume() error handling was broken afaict.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20200818135804.325626653@infradead.org
2020-08-18 16:57:36 +03:00
error = pm_notifier_call_chain_robust ( PM_SUSPEND_PREPARE , PM_POST_SUSPEND ) ;
if ( error )
goto Restore ;
2009-06-10 03:27:12 +04:00
2014-06-06 16:40:17 +04:00
trace_suspend_resume ( TPS ( " freeze_processes " ) , 0 , true ) ;
2009-06-10 03:27:12 +04:00
error = suspend_freeze_processes ( ) ;
2014-06-06 16:40:17 +04:00
trace_suspend_resume ( TPS ( " freeze_processes " ) , 0 , false ) ;
2011-11-22 00:32:24 +04:00
if ( ! error )
2009-06-10 03:27:12 +04:00
return 0 ;
2011-11-22 00:32:24 +04:00
suspend_stats . failed_freeze + + ;
dpm_save_failed_step ( SUSPEND_FREEZE ) ;
notifier: Fix broken error handling pattern
The current notifiers have the following error handling pattern all
over the place:
int err, nr;
err = __foo_notifier_call_chain(&chain, val_up, v, -1, &nr);
if (err & NOTIFIER_STOP_MASK)
__foo_notifier_call_chain(&chain, val_down, v, nr-1, NULL)
And aside from the endless repetition thereof, it is broken. Consider
blocking notifiers; both calls take and drop the rwsem, this means
that the notifier list can change in between the two calls, making @nr
meaningless.
Fix this by replacing all the __foo_notifier_call_chain() functions
with foo_notifier_call_chain_robust() that embeds the above pattern,
but ensures it is inside a single lock region.
Note: I switched atomic_notifier_call_chain_robust() to use
the spinlock, since RCU cannot provide the guarantee
required for the recovery.
Note: software_resume() error handling was broken afaict.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/r/20200818135804.325626653@infradead.org
2020-08-18 16:57:36 +03:00
pm_notifier_call_chain ( PM_POST_SUSPEND ) ;
Restore :
2009-06-10 03:27:12 +04:00
pm_restore_console ( ) ;
return error ;
}
/* default implementation */
2014-04-08 02:39:20 +04:00
void __weak arch_suspend_disable_irqs ( void )
2009-06-10 03:27:12 +04:00
{
local_irq_disable ( ) ;
}
/* default implementation */
2014-04-08 02:39:20 +04:00
void __weak arch_suspend_enable_irqs ( void )
2009-06-10 03:27:12 +04:00
{
local_irq_enable ( ) ;
}
/**
2012-02-13 19:29:14 +04:00
* suspend_enter - Make the system enter the given sleep state .
* @ state : System sleep state to enter .
* @ wakeup : Returns information that the sleep state should not be re - entered .
2009-06-10 03:27:12 +04:00
*
PM / Suspend: Add .suspend_again() callback to suspend_ops
A system or a device may need to control suspend/wakeup events. It may
want to wakeup the system after a predefined amount of time or at a
predefined event decided while entering suspend for polling or delayed
work. Then, it may want to enter suspend again if its predefined wakeup
condition is the only wakeup reason and there is no outstanding events;
thus, it does not wakeup the userspace unnecessary or unnecessary
devices and keeps suspended as long as possible (saving the power).
Enabling a system to wakeup after a specified time can be easily
achieved by using RTC. However, to enter suspend again immediately
without invoking userland and unrelated devices, we need additional
features in the suspend framework.
Such need comes from:
1. Monitoring a critical device status without interrupts that can
wakeup the system. (in-suspend polling)
An example is ambient temperature monitoring that needs to shut down
the system or a specific device function if it is too hot or cold. The
temperature of a specific device may be needed to be monitored as well;
e.g., a charger monitors battery temperature in order to stop charging
if overheated.
2. Execute critical "delayed work" at suspend.
A driver or a system/board may have a delayed work (or any similar
things) that it wants to execute at the requested time.
For example, some chargers want to check the battery voltage some
time (e.g., 30 seconds) after the battery is fully charged and the
charger has stopped. Then, the charger restarts charging if the voltage
has dropped more than a threshold, which is smaller than "restart-charger"
voltage, which is a threshold to restart charging regardless of the
time passed.
This patch allows to add "suspend_again" callback at struct
platform_suspend_ops and let the "suspend_again" callback return true if
the system is required to enter suspend again after the current instance
of wakeup. Device-wise suspend_again implemented at dev_pm_ops or
syscore is not done because: a) suspend_again feature is usually under
platform-wise decision and controls the behavior of the whole platform
and b) There are very limited devices related to the usage cases of
suspend_again; chargers and temperature sensors are mentioned so far.
With suspend_again callback registered at struct platform_suspend_ops
suspend_ops in kernel/power/suspend.c with suspend_set_ops by the
platform, the suspend framework tries to enter suspend again by
looping suspend_enter() if suspend_again has returned true and there has
been no errors in the suspending sequence or pending wakeups (by
pm_wakeup_pending).
Tested at Exynos4-NURI.
[rjw: Fixed up kerneldoc comment for suspend_enter().]
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-06-12 17:57:05 +04:00
* This function should be called after devices have been suspended .
2009-06-10 03:27:12 +04:00
*/
PM / Suspend: Add .suspend_again() callback to suspend_ops
A system or a device may need to control suspend/wakeup events. It may
want to wakeup the system after a predefined amount of time or at a
predefined event decided while entering suspend for polling or delayed
work. Then, it may want to enter suspend again if its predefined wakeup
condition is the only wakeup reason and there is no outstanding events;
thus, it does not wakeup the userspace unnecessary or unnecessary
devices and keeps suspended as long as possible (saving the power).
Enabling a system to wakeup after a specified time can be easily
achieved by using RTC. However, to enter suspend again immediately
without invoking userland and unrelated devices, we need additional
features in the suspend framework.
Such need comes from:
1. Monitoring a critical device status without interrupts that can
wakeup the system. (in-suspend polling)
An example is ambient temperature monitoring that needs to shut down
the system or a specific device function if it is too hot or cold. The
temperature of a specific device may be needed to be monitored as well;
e.g., a charger monitors battery temperature in order to stop charging
if overheated.
2. Execute critical "delayed work" at suspend.
A driver or a system/board may have a delayed work (or any similar
things) that it wants to execute at the requested time.
For example, some chargers want to check the battery voltage some
time (e.g., 30 seconds) after the battery is fully charged and the
charger has stopped. Then, the charger restarts charging if the voltage
has dropped more than a threshold, which is smaller than "restart-charger"
voltage, which is a threshold to restart charging regardless of the
time passed.
This patch allows to add "suspend_again" callback at struct
platform_suspend_ops and let the "suspend_again" callback return true if
the system is required to enter suspend again after the current instance
of wakeup. Device-wise suspend_again implemented at dev_pm_ops or
syscore is not done because: a) suspend_again feature is usually under
platform-wise decision and controls the behavior of the whole platform
and b) There are very limited devices related to the usage cases of
suspend_again; chargers and temperature sensors are mentioned so far.
With suspend_again callback registered at struct platform_suspend_ops
suspend_ops in kernel/power/suspend.c with suspend_set_ops by the
platform, the suspend framework tries to enter suspend again by
looping suspend_enter() if suspend_again has returned true and there has
been no errors in the suspending sequence or pending wakeups (by
pm_wakeup_pending).
Tested at Exynos4-NURI.
[rjw: Fixed up kerneldoc comment for suspend_enter().]
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-06-12 17:57:05 +04:00
static int suspend_enter ( suspend_state_t state , bool * wakeup )
2009-06-10 03:27:12 +04:00
{
int error ;
2014-07-23 02:57:53 +04:00
error = platform_suspend_prepare ( state ) ;
if ( error )
goto Platform_finish ;
2009-06-10 03:27:12 +04:00
2014-09-30 04:21:34 +04:00
error = dpm_suspend_late ( PMSG_SUSPEND ) ;
2009-06-10 03:27:12 +04:00
if ( error ) {
2017-07-21 15:50:48 +03:00
pr_err ( " late suspend of devices failed \n " ) ;
2010-07-08 01:43:45 +04:00
goto Platform_finish ;
2009-06-10 03:27:12 +04:00
}
2014-09-30 04:29:01 +04:00
error = platform_suspend_prepare_late ( state ) ;
if ( error )
goto Devices_early_resume ;
2014-09-30 04:21:34 +04:00
error = dpm_suspend_noirq ( PMSG_SUSPEND ) ;
if ( error ) {
2017-07-21 15:50:48 +03:00
pr_err ( " noirq suspend of devices failed \n " ) ;
2014-09-30 04:29:01 +04:00
goto Platform_early_resume ;
2014-09-30 04:21:34 +04:00
}
2014-09-30 04:22:24 +04:00
error = platform_suspend_prepare_noirq ( state ) ;
2014-07-23 02:57:53 +04:00
if ( error )
goto Platform_wake ;
2009-06-10 03:27:12 +04:00
2013-03-27 07:36:10 +04:00
if ( suspend_test ( TEST_PLATFORM ) )
goto Platform_wake ;
2019-07-16 00:52:12 +03:00
if ( state = = PM_SUSPEND_TO_IDLE ) {
s2idle_loop ( ) ;
goto Platform_wake ;
}
PM: sleep: Pause cpuidle later and resume it earlier during system transitions
Commit 8651f97bd951 ("PM / cpuidle: System resume hang fix with
cpuidle") that introduced cpuidle pausing during system suspend
did that to work around a platform firmware issue causing systems
to hang during resume if CPUs were allowed to enter idle states
in the system suspend and resume code paths.
However, pausing cpuidle before the last phase of suspending
devices is the source of an otherwise arbitrary difference between
the suspend-to-idle path and other system suspend variants, so it is
cleaner to do that later, before taking secondary CPUs offline (it
is still safer to take secondary CPUs offline with cpuidle paused,
though).
Modify the code accordingly, but in order to avoid code duplication,
introduce new wrapper functions, pm_sleep_disable_secondary_cpus()
and pm_sleep_enable_secondary_cpus(), to combine cpuidle_pause()
and cpuidle_resume(), respectively, with the handling of secondary
CPUs during system-wide transitions to sleep states.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-10-22 19:07:47 +03:00
error = pm_sleep_disable_secondary_cpus ( ) ;
2009-06-10 03:27:12 +04:00
if ( error | | suspend_test ( TEST_CPUS ) )
goto Enable_cpus ;
arch_suspend_disable_irqs ( ) ;
BUG_ON ( ! irqs_disabled ( ) ) ;
2018-05-25 18:54:41 +03:00
system_state = SYSTEM_SUSPEND ;
2011-04-26 21:15:07 +04:00
error = syscore_suspend ( ) ;
2009-06-10 03:27:12 +04:00
if ( ! error ) {
PM / Suspend: Add .suspend_again() callback to suspend_ops
A system or a device may need to control suspend/wakeup events. It may
want to wakeup the system after a predefined amount of time or at a
predefined event decided while entering suspend for polling or delayed
work. Then, it may want to enter suspend again if its predefined wakeup
condition is the only wakeup reason and there is no outstanding events;
thus, it does not wakeup the userspace unnecessary or unnecessary
devices and keeps suspended as long as possible (saving the power).
Enabling a system to wakeup after a specified time can be easily
achieved by using RTC. However, to enter suspend again immediately
without invoking userland and unrelated devices, we need additional
features in the suspend framework.
Such need comes from:
1. Monitoring a critical device status without interrupts that can
wakeup the system. (in-suspend polling)
An example is ambient temperature monitoring that needs to shut down
the system or a specific device function if it is too hot or cold. The
temperature of a specific device may be needed to be monitored as well;
e.g., a charger monitors battery temperature in order to stop charging
if overheated.
2. Execute critical "delayed work" at suspend.
A driver or a system/board may have a delayed work (or any similar
things) that it wants to execute at the requested time.
For example, some chargers want to check the battery voltage some
time (e.g., 30 seconds) after the battery is fully charged and the
charger has stopped. Then, the charger restarts charging if the voltage
has dropped more than a threshold, which is smaller than "restart-charger"
voltage, which is a threshold to restart charging regardless of the
time passed.
This patch allows to add "suspend_again" callback at struct
platform_suspend_ops and let the "suspend_again" callback return true if
the system is required to enter suspend again after the current instance
of wakeup. Device-wise suspend_again implemented at dev_pm_ops or
syscore is not done because: a) suspend_again feature is usually under
platform-wise decision and controls the behavior of the whole platform
and b) There are very limited devices related to the usage cases of
suspend_again; chargers and temperature sensors are mentioned so far.
With suspend_again callback registered at struct platform_suspend_ops
suspend_ops in kernel/power/suspend.c with suspend_set_ops by the
platform, the suspend framework tries to enter suspend again by
looping suspend_enter() if suspend_again has returned true and there has
been no errors in the suspending sequence or pending wakeups (by
pm_wakeup_pending).
Tested at Exynos4-NURI.
[rjw: Fixed up kerneldoc comment for suspend_enter().]
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-06-12 17:57:05 +04:00
* wakeup = pm_wakeup_pending ( ) ;
if ( ! ( suspend_test ( TEST_CORE ) | | * wakeup ) ) {
2014-06-06 16:40:17 +04:00
trace_suspend_resume ( TPS ( " machine_suspend " ) ,
state , true ) ;
2009-06-10 03:27:12 +04:00
error = suspend_ops - > enter ( state ) ;
2014-06-06 16:40:17 +04:00
trace_suspend_resume ( TPS ( " machine_suspend " ) ,
state , false ) ;
2015-05-08 05:07:42 +03:00
} else if ( * wakeup ) {
error = - EBUSY ;
PM: Make it possible to avoid races between wakeup and system sleep
One of the arguments during the suspend blockers discussion was that
the mainline kernel didn't contain any mechanisms making it possible
to avoid races between wakeup and system suspend.
Generally, there are two problems in that area. First, if a wakeup
event occurs exactly when /sys/power/state is being written to, it
may be delivered to user space right before the freezer kicks in, so
the user space consumer of the event may not be able to process it
before the system is suspended. Second, if a wakeup event occurs
after user space has been frozen, it is not generally guaranteed that
the ongoing transition of the system into a sleep state will be
aborted.
To address these issues introduce a new global sysfs attribute,
/sys/power/wakeup_count, associated with a running counter of wakeup
events and three helper functions, pm_stay_awake(), pm_relax(), and
pm_wakeup_event(), that may be used by kernel subsystems to control
the behavior of this attribute and to request the PM core to abort
system transitions into a sleep state already in progress.
The /sys/power/wakeup_count file may be read from or written to by
user space. Reads will always succeed (unless interrupted by a
signal) and return the current value of the wakeup events counter.
Writes, however, will only succeed if the written number is equal to
the current value of the wakeup events counter. If a write is
successful, it will cause the kernel to save the current value of the
wakeup events counter and to abort the subsequent system transition
into a sleep state if any wakeup events are reported after the write
has returned.
[The assumption is that before writing to /sys/power/state user space
will first read from /sys/power/wakeup_count. Next, user space
consumers of wakeup events will have a chance to acknowledge or
veto the upcoming system transition to a sleep state. Finally, if
the transition is allowed to proceed, /sys/power/wakeup_count will
be written to and if that succeeds, /sys/power/state will be written
to as well. Still, if any wakeup events are reported to the PM core
by kernel subsystems after that point, the transition will be
aborted.]
Additionally, put a wakeup events counter into struct dev_pm_info and
make these per-device wakeup event counters available via sysfs,
so that it's possible to check the activity of various wakeup event
sources within the kernel.
To illustrate how subsystems can use pm_wakeup_event(), make the
low-level PCI runtime PM wakeup-handling code use it.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Acked-by: markgross <markgross@thegnar.org>
Reviewed-by: Alan Stern <stern@rowland.harvard.edu>
2010-07-06 00:43:53 +04:00
}
2011-03-15 02:43:46 +03:00
syscore_resume ( ) ;
2009-06-10 03:27:12 +04:00
}
2018-05-25 18:54:41 +03:00
system_state = SYSTEM_RUNNING ;
2009-06-10 03:27:12 +04:00
arch_suspend_enable_irqs ( ) ;
BUG_ON ( irqs_disabled ( ) ) ;
Enable_cpus :
PM: sleep: Pause cpuidle later and resume it earlier during system transitions
Commit 8651f97bd951 ("PM / cpuidle: System resume hang fix with
cpuidle") that introduced cpuidle pausing during system suspend
did that to work around a platform firmware issue causing systems
to hang during resume if CPUs were allowed to enter idle states
in the system suspend and resume code paths.
However, pausing cpuidle before the last phase of suspending
devices is the source of an otherwise arbitrary difference between
the suspend-to-idle path and other system suspend variants, so it is
cleaner to do that later, before taking secondary CPUs offline (it
is still safer to take secondary CPUs offline with cpuidle paused,
though).
Modify the code accordingly, but in order to avoid code duplication,
introduce new wrapper functions, pm_sleep_disable_secondary_cpus()
and pm_sleep_enable_secondary_cpus(), to combine cpuidle_pause()
and cpuidle_resume(), respectively, with the handling of secondary
CPUs during system-wide transitions to sleep states.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
2021-10-22 19:07:47 +03:00
pm_sleep_enable_secondary_cpus ( ) ;
2009-06-10 03:27:12 +04:00
Platform_wake :
2014-09-30 04:22:24 +04:00
platform_resume_noirq ( state ) ;
2014-09-30 04:21:34 +04:00
dpm_resume_noirq ( PMSG_RESUME ) ;
2014-09-30 04:29:01 +04:00
Platform_early_resume :
platform_resume_early ( state ) ;
2014-09-30 04:21:34 +04:00
Devices_early_resume :
dpm_resume_early ( PMSG_RESUME ) ;
2009-06-10 03:27:12 +04:00
2010-07-08 01:43:45 +04:00
Platform_finish :
2014-09-30 04:22:24 +04:00
platform_resume_finish ( state ) ;
2009-06-10 03:27:12 +04:00
return error ;
}
/**
2012-02-13 19:29:14 +04:00
* suspend_devices_and_enter - Suspend devices and enter system sleep state .
* @ state : System sleep state to enter .
2009-06-10 03:27:12 +04:00
*/
int suspend_devices_and_enter ( suspend_state_t state )
{
int error ;
PM / Suspend: Add .suspend_again() callback to suspend_ops
A system or a device may need to control suspend/wakeup events. It may
want to wakeup the system after a predefined amount of time or at a
predefined event decided while entering suspend for polling or delayed
work. Then, it may want to enter suspend again if its predefined wakeup
condition is the only wakeup reason and there is no outstanding events;
thus, it does not wakeup the userspace unnecessary or unnecessary
devices and keeps suspended as long as possible (saving the power).
Enabling a system to wakeup after a specified time can be easily
achieved by using RTC. However, to enter suspend again immediately
without invoking userland and unrelated devices, we need additional
features in the suspend framework.
Such need comes from:
1. Monitoring a critical device status without interrupts that can
wakeup the system. (in-suspend polling)
An example is ambient temperature monitoring that needs to shut down
the system or a specific device function if it is too hot or cold. The
temperature of a specific device may be needed to be monitored as well;
e.g., a charger monitors battery temperature in order to stop charging
if overheated.
2. Execute critical "delayed work" at suspend.
A driver or a system/board may have a delayed work (or any similar
things) that it wants to execute at the requested time.
For example, some chargers want to check the battery voltage some
time (e.g., 30 seconds) after the battery is fully charged and the
charger has stopped. Then, the charger restarts charging if the voltage
has dropped more than a threshold, which is smaller than "restart-charger"
voltage, which is a threshold to restart charging regardless of the
time passed.
This patch allows to add "suspend_again" callback at struct
platform_suspend_ops and let the "suspend_again" callback return true if
the system is required to enter suspend again after the current instance
of wakeup. Device-wise suspend_again implemented at dev_pm_ops or
syscore is not done because: a) suspend_again feature is usually under
platform-wise decision and controls the behavior of the whole platform
and b) There are very limited devices related to the usage cases of
suspend_again; chargers and temperature sensors are mentioned so far.
With suspend_again callback registered at struct platform_suspend_ops
suspend_ops in kernel/power/suspend.c with suspend_set_ops by the
platform, the suspend framework tries to enter suspend again by
looping suspend_enter() if suspend_again has returned true and there has
been no errors in the suspending sequence or pending wakeups (by
pm_wakeup_pending).
Tested at Exynos4-NURI.
[rjw: Fixed up kerneldoc comment for suspend_enter().]
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-06-12 17:57:05 +04:00
bool wakeup = false ;
2009-06-10 03:27:12 +04:00
2014-07-23 02:57:53 +04:00
if ( ! sleep_state_supported ( state ) )
2009-06-10 03:27:12 +04:00
return - ENOSYS ;
2017-07-18 03:19:25 +03:00
pm_suspend_target_state = state ;
2019-06-26 01:20:23 +03:00
if ( state = = PM_SUSPEND_TO_IDLE )
pm_set_suspend_no_platform ( ) ;
2014-07-23 02:57:53 +04:00
error = platform_suspend_begin ( state ) ;
if ( error )
goto Close ;
2009-06-10 03:27:12 +04:00
suspend_console ( ) ;
suspend_test_start ( ) ;
error = dpm_suspend_start ( PMSG_SUSPEND ) ;
if ( error ) {
2017-07-21 15:50:48 +03:00
pr_err ( " Some devices failed to suspend, or early wake event detected \n " ) ;
2009-06-10 03:27:12 +04:00
goto Recover_platform ;
}
suspend_test_finish ( " suspend devices " ) ;
if ( suspend_test ( TEST_DEVICES ) )
goto Recover_platform ;
PM / Suspend: Add .suspend_again() callback to suspend_ops
A system or a device may need to control suspend/wakeup events. It may
want to wakeup the system after a predefined amount of time or at a
predefined event decided while entering suspend for polling or delayed
work. Then, it may want to enter suspend again if its predefined wakeup
condition is the only wakeup reason and there is no outstanding events;
thus, it does not wakeup the userspace unnecessary or unnecessary
devices and keeps suspended as long as possible (saving the power).
Enabling a system to wakeup after a specified time can be easily
achieved by using RTC. However, to enter suspend again immediately
without invoking userland and unrelated devices, we need additional
features in the suspend framework.
Such need comes from:
1. Monitoring a critical device status without interrupts that can
wakeup the system. (in-suspend polling)
An example is ambient temperature monitoring that needs to shut down
the system or a specific device function if it is too hot or cold. The
temperature of a specific device may be needed to be monitored as well;
e.g., a charger monitors battery temperature in order to stop charging
if overheated.
2. Execute critical "delayed work" at suspend.
A driver or a system/board may have a delayed work (or any similar
things) that it wants to execute at the requested time.
For example, some chargers want to check the battery voltage some
time (e.g., 30 seconds) after the battery is fully charged and the
charger has stopped. Then, the charger restarts charging if the voltage
has dropped more than a threshold, which is smaller than "restart-charger"
voltage, which is a threshold to restart charging regardless of the
time passed.
This patch allows to add "suspend_again" callback at struct
platform_suspend_ops and let the "suspend_again" callback return true if
the system is required to enter suspend again after the current instance
of wakeup. Device-wise suspend_again implemented at dev_pm_ops or
syscore is not done because: a) suspend_again feature is usually under
platform-wise decision and controls the behavior of the whole platform
and b) There are very limited devices related to the usage cases of
suspend_again; chargers and temperature sensors are mentioned so far.
With suspend_again callback registered at struct platform_suspend_ops
suspend_ops in kernel/power/suspend.c with suspend_set_ops by the
platform, the suspend framework tries to enter suspend again by
looping suspend_enter() if suspend_again has returned true and there has
been no errors in the suspending sequence or pending wakeups (by
pm_wakeup_pending).
Tested at Exynos4-NURI.
[rjw: Fixed up kerneldoc comment for suspend_enter().]
Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-06-12 17:57:05 +04:00
do {
error = suspend_enter ( state , & wakeup ) ;
2014-07-23 02:57:53 +04:00
} while ( ! error & & ! wakeup & & platform_suspend_again ( state ) ) ;
2009-06-10 03:27:12 +04:00
Resume_devices :
suspend_test_start ( ) ;
dpm_resume_end ( PMSG_RESUME ) ;
suspend_test_finish ( " resume devices " ) ;
2014-09-20 01:07:12 +04:00
trace_suspend_resume ( TPS ( " resume_console " ) , state , true ) ;
2009-06-10 03:27:12 +04:00
resume_console ( ) ;
2014-09-20 01:07:12 +04:00
trace_suspend_resume ( TPS ( " resume_console " ) , state , false ) ;
2014-05-16 01:29:57 +04:00
2014-07-23 02:57:53 +04:00
Close :
2014-09-30 04:22:24 +04:00
platform_resume_end ( state ) ;
2017-07-18 03:19:25 +03:00
pm_suspend_target_state = PM_SUSPEND_ON ;
2009-06-10 03:27:12 +04:00
return error ;
Recover_platform :
2014-09-30 04:22:24 +04:00
platform_recover ( state ) ;
2009-06-10 03:27:12 +04:00
goto Resume_devices ;
}
/**
2012-02-13 19:29:14 +04:00
* suspend_finish - Clean up before finishing the suspend sequence .
2009-06-10 03:27:12 +04:00
*
2012-02-13 19:29:14 +04:00
* Call platform code to clean up , restart processes , and free the console that
* we ' ve allocated . This routine is not called for hibernation .
2009-06-10 03:27:12 +04:00
*/
static void suspend_finish ( void )
{
suspend_thaw_processes ( ) ;
pm_notifier_call_chain ( PM_POST_SUSPEND ) ;
pm_restore_console ( ) ;
}
/**
2012-02-13 19:29:14 +04:00
* enter_state - Do common work needed to enter system sleep state .
* @ state : System sleep state to enter .
2009-06-10 03:27:12 +04:00
*
2012-02-13 19:29:14 +04:00
* Make sure that no one else is trying to put the system into a sleep state .
* Fail if that ' s not the case . Otherwise , prepare for system suspend , make the
* system enter the given sleep state and clean up after wakeup .
2009-06-10 03:27:12 +04:00
*/
2012-02-13 19:29:24 +04:00
static int enter_state ( suspend_state_t state )
2009-06-10 03:27:12 +04:00
{
int error ;
2014-06-06 16:40:17 +04:00
trace_suspend_resume ( TPS ( " suspend_enter " ) , state , true ) ;
2017-08-10 01:13:07 +03:00
if ( state = = PM_SUSPEND_TO_IDLE ) {
2014-05-26 15:40:53 +04:00
# ifdef CONFIG_PM_DEBUG
if ( pm_test_level ! = TEST_NONE & & pm_test_level < = TEST_CPUS ) {
2017-07-21 15:50:48 +03:00
pr_warn ( " Unsupported test mode for suspend to idle, please choose none/freezer/devices/platform. \n " ) ;
2014-05-26 15:40:53 +04:00
return - EAGAIN ;
}
# endif
} else if ( ! valid_state ( state ) ) {
return - EINVAL ;
}
2018-07-31 11:51:32 +03:00
if ( ! mutex_trylock ( & system_transition_mutex ) )
2009-06-10 03:27:12 +04:00
return - EBUSY ;
2017-08-10 01:13:07 +03:00
if ( state = = PM_SUSPEND_TO_IDLE )
2017-08-10 01:13:56 +03:00
s2idle_begin ( ) ;
PM: Introduce suspend state PM_SUSPEND_FREEZE
PM_SUSPEND_FREEZE state is a general state that
does not need any platform specific support, it equals
frozen processes + suspended devices + idle processors.
Compared with PM_SUSPEND_MEMORY,
PM_SUSPEND_FREEZE saves less power
because the system is still in a running state.
PM_SUSPEND_FREEZE has less resume latency because it does not
touch BIOS, and the processors are in idle state.
Compared with RTPM/idle,
PM_SUSPEND_FREEZE saves more power as
1. the processor has longer sleep time because processes are frozen.
The deeper c-state the processor supports, more power saving we can get.
2. PM_SUSPEND_FREEZE uses system suspend code path, thus we can get
more power saving from the devices that does not have good RTPM support.
This state is useful for
1) platforms that do not have STR, or have a broken STR.
2) platforms that have an extremely low power idle state,
which can be used to replace STR.
The following describes how PM_SUSPEND_FREEZE state works.
1. echo freeze > /sys/power/state
2. the processes are frozen.
3. all the devices are suspended.
4. all the processors are blocked by a wait queue
5. all the processors idles and enters (Deep) c-state.
6. an interrupt fires.
7. a processor is woken up and handles the irq.
8. if it is a general event,
a) the irq handler runs and quites.
b) goto step 4.
9. if it is a real wake event, say, power button pressing, keyboard touch, mouse moving,
a) the irq handler runs and activate the wakeup source
b) wakeup_source_activate() notifies the wait queue.
c) system starts resuming from PM_SUSPEND_FREEZE
10. all the devices are resumed.
11. all the processes are unfrozen.
12. system is back to working.
Known Issue:
The wakeup of this new PM_SUSPEND_FREEZE state may behave differently
from the previous suspend state.
Take ACPI platform for example, there are some GPEs that only enabled
when the system is in sleep state, to wake the system backk from S3/S4.
But we are not touching these GPEs during transition to PM_SUSPEND_FREEZE.
This means we may lose some wake event.
But on the other hand, as we do not disable all the Interrupts during
PM_SUSPEND_FREEZE, we may get some extra "wakeup" Interrupts, that are
not available for S3/S4.
The patches has been tested on an old Sony laptop, and here are the results:
Average Power:
1. RPTM/idle for half an hour:
14.8W, 12.6W, 14.1W, 12.5W, 14.4W, 13.2W, 12.9W
2. Freeze for half an hour:
11W, 10.4W, 9.4W, 11.3W 10.5W
3. RTPM/idle for three hours:
11.6W
4. Freeze for three hours:
10W
5. Suspend to Memory:
0.5~0.9W
Average Resume Latency:
1. RTPM/idle with a black screen: (From pressing keyboard to screen back)
Less than 0.2s
2. Freeze: (From pressing power button to screen back)
2.50s
3. Suspend to Memory: (From pressing power button to screen back)
4.33s
>From the results, we can see that all the platforms should benefit from
this patch, even if it does not have Low Power S0.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-06 16:00:36 +04:00
2020-01-16 14:53:54 +03:00
if ( sync_on_suspend_enabled ) {
2019-02-25 15:36:41 +03:00
trace_suspend_resume ( TPS ( " sync_filesystems " ) , 0 , true ) ;
ksys_sync_helper ( ) ;
trace_suspend_resume ( TPS ( " sync_filesystems " ) , 0 , false ) ;
}
2009-06-10 03:27:12 +04:00
2017-07-21 15:49:51 +03:00
pm_pr_dbg ( " Preparing system for sleep (%s) \n " , mem_sleep_labels [ state ] ) ;
2015-10-07 01:49:34 +03:00
pm_suspend_clear_flags ( ) ;
PM: Introduce suspend state PM_SUSPEND_FREEZE
PM_SUSPEND_FREEZE state is a general state that
does not need any platform specific support, it equals
frozen processes + suspended devices + idle processors.
Compared with PM_SUSPEND_MEMORY,
PM_SUSPEND_FREEZE saves less power
because the system is still in a running state.
PM_SUSPEND_FREEZE has less resume latency because it does not
touch BIOS, and the processors are in idle state.
Compared with RTPM/idle,
PM_SUSPEND_FREEZE saves more power as
1. the processor has longer sleep time because processes are frozen.
The deeper c-state the processor supports, more power saving we can get.
2. PM_SUSPEND_FREEZE uses system suspend code path, thus we can get
more power saving from the devices that does not have good RTPM support.
This state is useful for
1) platforms that do not have STR, or have a broken STR.
2) platforms that have an extremely low power idle state,
which can be used to replace STR.
The following describes how PM_SUSPEND_FREEZE state works.
1. echo freeze > /sys/power/state
2. the processes are frozen.
3. all the devices are suspended.
4. all the processors are blocked by a wait queue
5. all the processors idles and enters (Deep) c-state.
6. an interrupt fires.
7. a processor is woken up and handles the irq.
8. if it is a general event,
a) the irq handler runs and quites.
b) goto step 4.
9. if it is a real wake event, say, power button pressing, keyboard touch, mouse moving,
a) the irq handler runs and activate the wakeup source
b) wakeup_source_activate() notifies the wait queue.
c) system starts resuming from PM_SUSPEND_FREEZE
10. all the devices are resumed.
11. all the processes are unfrozen.
12. system is back to working.
Known Issue:
The wakeup of this new PM_SUSPEND_FREEZE state may behave differently
from the previous suspend state.
Take ACPI platform for example, there are some GPEs that only enabled
when the system is in sleep state, to wake the system backk from S3/S4.
But we are not touching these GPEs during transition to PM_SUSPEND_FREEZE.
This means we may lose some wake event.
But on the other hand, as we do not disable all the Interrupts during
PM_SUSPEND_FREEZE, we may get some extra "wakeup" Interrupts, that are
not available for S3/S4.
The patches has been tested on an old Sony laptop, and here are the results:
Average Power:
1. RPTM/idle for half an hour:
14.8W, 12.6W, 14.1W, 12.5W, 14.4W, 13.2W, 12.9W
2. Freeze for half an hour:
11W, 10.4W, 9.4W, 11.3W 10.5W
3. RTPM/idle for three hours:
11.6W
4. Freeze for three hours:
10W
5. Suspend to Memory:
0.5~0.9W
Average Resume Latency:
1. RTPM/idle with a black screen: (From pressing keyboard to screen back)
Less than 0.2s
2. Freeze: (From pressing power button to screen back)
2.50s
3. Suspend to Memory: (From pressing power button to screen back)
4.33s
>From the results, we can see that all the platforms should benefit from
this patch, even if it does not have Low Power S0.
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-02-06 16:00:36 +04:00
error = suspend_prepare ( state ) ;
2009-06-10 03:27:12 +04:00
if ( error )
goto Unlock ;
if ( suspend_test ( TEST_FREEZER ) )
goto Finish ;
2014-06-06 16:40:17 +04:00
trace_suspend_resume ( TPS ( " suspend_enter " ) , state , false ) ;
2017-07-21 15:49:51 +03:00
pm_pr_dbg ( " Suspending system (%s) \n " , mem_sleep_labels [ state ] ) ;
2011-05-10 23:09:53 +04:00
pm_restrict_gfp_mask ( ) ;
2009-06-10 03:27:12 +04:00
error = suspend_devices_and_enter ( state ) ;
2011-05-10 23:09:53 +04:00
pm_restore_gfp_mask ( ) ;
2009-06-10 03:27:12 +04:00
Finish :
2017-11-01 00:44:24 +03:00
events_check_enabled = false ;
2017-07-19 03:38:44 +03:00
pm_pr_dbg ( " Finishing wakeup. \n " ) ;
2009-06-10 03:27:12 +04:00
suspend_finish ( ) ;
Unlock :
2018-07-31 11:51:32 +03:00
mutex_unlock ( & system_transition_mutex ) ;
2009-06-10 03:27:12 +04:00
return error ;
}
/**
2012-02-13 19:29:14 +04:00
* pm_suspend - Externally visible function for suspending the system .
* @ state : System sleep state to enter .
2009-06-10 03:27:12 +04:00
*
2012-02-13 19:29:14 +04:00
* Check if the value of @ state represents one of the supported states ,
* execute enter_state ( ) and update system suspend statistics .
2009-06-10 03:27:12 +04:00
*/
int pm_suspend ( suspend_state_t state )
{
2012-02-13 19:29:33 +04:00
int error ;
if ( state < = PM_SUSPEND_ON | | state > = PM_SUSPEND_MAX )
return - EINVAL ;
2017-07-21 15:50:48 +03:00
pr_info ( " suspend entry (%s) \n " , mem_sleep_labels [ state ] ) ;
2012-02-13 19:29:33 +04:00
error = enter_state ( state ) ;
if ( error ) {
suspend_stats . fail + + ;
dpm_save_failed_errno ( error ) ;
} else {
suspend_stats . success + + ;
PM / Suspend: Add statistics debugfs file for suspend to RAM
Record S3 failure time about each reason and the latest two failed
devices' names in S3 progress.
We can check it through 'suspend_stats' entry in debugfs.
The motivation of the patch:
We are enabling power features on Medfield. Comparing with PC/notebook,
a mobile enters/exits suspend-2-ram (we call it s3 on Medfield) far
more frequently. If it can't enter suspend-2-ram in time, the power
might be used up soon.
We often find sometimes, a device suspend fails. Then, system retries
s3 over and over again. As display is off, testers and developers
don't know what happens.
Some testers and developers complain they don't know if system
tries suspend-2-ram, and what device fails to suspend. They need
such info for a quick check. The patch adds suspend_stats under
debugfs for users to check suspend to RAM statistics quickly.
If not using this patch, we have other methods to get info about
what device fails. One is to turn on CONFIG_PM_DEBUG, but users
would get too much info and testers need recompile the system.
In addition, dynamic debug is another good tool to dump debug info.
But it still doesn't match our utilization scenario closely.
1) user need write a user space parser to process the syslog output;
2) Our testing scenario is we leave the mobile for at least hours.
Then, check its status. No serial console available during the
testing. One is because console would be suspended, and the other
is serial console connecting with spi or HSU devices would consume
power. These devices are powered off at suspend-2-ram.
Signed-off-by: ShuoX Liu <shuox.liu@intel.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-08-11 01:01:26 +04:00
}
2017-07-21 15:50:48 +03:00
pr_info ( " suspend exit \n " ) ;
2012-02-13 19:29:33 +04:00
return error ;
2009-06-10 03:27:12 +04:00
}
EXPORT_SYMBOL ( pm_suspend ) ;