2019-05-27 08:55:06 +02:00
// SPDX-License-Identifier: GPL-2.0-or-later
2005-04-16 15:20:36 -07:00
/*
2015-02-05 16:27:35 +08:00
* ec . c - ACPI Embedded Controller Driver ( v3 )
2005-04-16 15:20:36 -07:00
*
2015-02-05 16:27:35 +08:00
* Copyright ( C ) 2001 - 2015 Intel Corporation
* Author : 2014 , 2015 Lv Zheng < lv . zheng @ intel . com >
2014-06-15 08:42:19 +08:00
* 2006 , 2007 Alexey Starikovskiy < alexey . y . starikovskiy @ intel . com >
* 2006 Denis Sadykov < denis . m . sadykov @ intel . com >
* 2004 Luming Yu < luming . yu @ intel . com >
* 2001 , 2002 Andy Grover < andrew . grover @ intel . com >
* 2001 , 2002 Paul Diefenbaugh < paul . s . diefenbaugh @ intel . com >
* Copyright ( C ) 2008 Alexey Starikovskiy < astarikovskiy @ suse . de >
2005-04-16 15:20:36 -07:00
*/
2008-09-25 21:00:31 +04:00
/* Uncomment next line to get verbose printout */
2008-01-23 22:34:09 -05:00
/* #define DEBUG */
2017-05-20 20:38:13 +02:00
# define pr_fmt(fmt) "ACPI: EC: " fmt
2008-01-23 22:34:09 -05:00
2005-04-16 15:20:36 -07:00
# include <linux/kernel.h>
# include <linux/module.h>
# include <linux/init.h>
# include <linux/types.h>
# include <linux/delay.h>
2005-03-19 01:10:05 -05:00
# include <linux/interrupt.h>
2007-05-29 16:43:02 +04:00
# include <linux/list.h>
2008-09-25 21:00:31 +04:00
# include <linux/spinlock.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 17:04:11 +09:00
# include <linux/slab.h>
2019-07-30 11:55:59 +02:00
# include <linux/suspend.h>
2013-12-03 08:49:16 +08:00
# include <linux/acpi.h>
2009-07-06 23:40:19 -04:00
# include <linux/dmi.h>
2013-12-03 08:49:16 +08:00
# include <asm/io.h>
2005-04-16 15:20:36 -07:00
2010-07-16 13:11:31 +02:00
# include "internal.h"
2005-04-16 15:20:36 -07:00
# define ACPI_EC_CLASS "embedded_controller"
# define ACPI_EC_DEVICE_NAME "Embedded Controller"
2007-05-29 16:43:02 +04:00
2006-09-26 19:50:33 +04:00
/* EC status register */
2005-04-16 15:20:36 -07:00
# define ACPI_EC_FLAG_OBF 0x01 /* Output buffer full */
# define ACPI_EC_FLAG_IBF 0x02 /* Input buffer full */
2014-06-15 08:42:42 +08:00
# define ACPI_EC_FLAG_CMD 0x08 /* Input buffer contains a command */
2005-03-19 01:10:05 -05:00
# define ACPI_EC_FLAG_BURST 0x10 /* burst mode */
2005-04-16 15:20:36 -07:00
# define ACPI_EC_FLAG_SCI 0x20 /* EC-SCI occurred */
2007-05-29 16:42:57 +04:00
2015-06-11 13:21:38 +08:00
/*
* The SCI_EVT clearing timing is not defined by the ACPI specification .
* This leads to lots of practical timing issues for the host EC driver .
* The following variations are defined ( from the target EC firmware ' s
* perspective ) :
* STATUS : After indicating SCI_EVT edge triggered IRQ to the host , the
* target can clear SCI_EVT at any time so long as the host can see
* the indication by reading the status register ( EC_SC ) . So the
* host should re - check SCI_EVT after the first time the SCI_EVT
* indication is seen , which is the same time the query request
* ( QR_EC ) is written to the command register ( EC_CMD ) . SCI_EVT set
* at any later time could indicate another event . Normally such
* kind of EC firmware has implemented an event queue and will
* return 0x00 to indicate " no outstanding event " .
* QUERY : After seeing the query request ( QR_EC ) written to the command
* register ( EC_CMD ) by the host and having prepared the responding
* event value in the data register ( EC_DATA ) , the target can safely
* clear SCI_EVT because the target can confirm that the current
* event is being handled by the host . The host then should check
* SCI_EVT right after reading the event response from the data
* register ( EC_DATA ) .
* EVENT : After seeing the event response read from the data register
* ( EC_DATA ) by the host , the target can clear SCI_EVT . As the
* target requires time to notice the change in the data register
* ( EC_DATA ) , the host may be required to wait additional guarding
* time before checking the SCI_EVT again . Such guarding may not be
* necessary if the host is notified via another IRQ .
*/
# define ACPI_EC_EVT_TIMING_STATUS 0x00
# define ACPI_EC_EVT_TIMING_QUERY 0x01
# define ACPI_EC_EVT_TIMING_EVENT 0x02
2006-09-26 19:50:33 +04:00
/* EC commands */
2006-12-07 18:42:17 +03:00
enum ec_command {
2006-12-07 18:42:17 +03:00
ACPI_EC_COMMAND_READ = 0x80 ,
ACPI_EC_COMMAND_WRITE = 0x81 ,
ACPI_EC_BURST_ENABLE = 0x82 ,
ACPI_EC_BURST_DISABLE = 0x83 ,
ACPI_EC_COMMAND_QUERY = 0x84 ,
2006-12-07 18:42:17 +03:00
} ;
2007-05-29 16:43:02 +04:00
2006-12-07 18:42:16 +03:00
# define ACPI_EC_DELAY 500 /* Wait 500ms max. during EC ops */
2006-09-26 19:50:33 +04:00
# define ACPI_EC_UDELAY_GLK 1000 /* Wait 1ms max. to get global lock */
ACPI / EC: Fix and clean up register access guarding logics.
In the polling mode, EC driver shouldn't access the EC registers too
frequently. Though this statement is concluded from the non-root caused
bugs (see links below), we've maintained the register access guarding
logics in the current EC driver. The guarding logics can be found here and
there, makes it hard to root cause real timing issues. This patch collects
the guarding logics into one single function so that all hidden logics
related to this can be seen clearly.
The current guarding related code also has several issues:
1. Per-transaction timestamp prevents inter-transaction guarding from being
implemented in the same place. We have an inter-transaction udelay() in
acpi_ec_transaction_unblocked(), this logic can be merged into ec_poll()
if we can use per-device timestamp. This patch completes such merge to
form a new ec_guard() function and collects all guarding related hidden
logics in it.
One hidden logic is: there is no inter-transaction guarding performed
for non MSI quirk (wait polling mode), this patch skips
inter-transaction guarding before wait_event_timeout() for the wait
polling mode to reveal the hidden logic.
The other hidden logic is: there is msleep() inter-transaction guarding
performed when the GPE storming is observed. As after merging this
commit:
Commit: e1d4d90fc0313d3d58cbd7912c90f8ef24df45ff
Subject: ACPI / EC: Refine command storm prevention support
EC_FLAGS_COMMAND_STORM is ensured to be cleared after invoking
acpi_ec_transaction_unlocked(), the msleep() guard logic will never
happen now. Since no one complains such change, this logic is likely
added during the old times where the EC race issues are not fixed and
the bugs are false root-caused to the timing issue. This patch simply
removes the out-dated logic. We can restore it by stop skipping
inter-transaction guarding for wait polling mode.
Two different delay values are defined for msleep() and udelay() while
they are merged in this patch to 550us.
2. time_after() causes additional delay in the polling mode (can only be
observed in noirq suspend/resume processes where polling mode is always
used) before advance_transaction() is invoked ("wait polling" log is
added before wait_event_timeout()). We can see 2 wait_event_timeout()
invocations. This is because time_after() ensures a ">" validation while
we only need a ">=" validation here:
[ 86.739909] ACPI: Waking up from system sleep state S3
[ 86.742857] ACPI : EC: 2: Increase command
[ 86.742859] ACPI : EC: ***** Command(RD_EC) started *****
[ 86.742861] ACPI : EC: ===== TASK (0) =====
[ 86.742871] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.742873] ACPI : EC: EC_SC(W) = 0x80
[ 86.742876] ACPI : EC: ***** Event started *****
[ 86.742880] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.743972] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.747966] ACPI : EC: ===== TASK (0) =====
[ 86.747977] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.747978] ACPI : EC: EC_DATA(W) = 0x06
[ 86.747981] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.751971] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755969] ACPI : EC: ===== TASK (0) =====
[ 86.755991] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 86.755993] ACPI : EC: EC_DATA(R) = 0x03
[ 86.755994] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755995] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 86.755996] ACPI : EC: 1: Decrease command
This patch corrects this by using time_before() instead in ec_guard():
[ 54.283146] ACPI: Waking up from system sleep state S3
[ 54.285414] ACPI : EC: 2: Increase command
[ 54.285415] ACPI : EC: ***** Command(RD_EC) started *****
[ 54.285416] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.285417] ACPI : EC: ===== TASK (0) =====
[ 54.285424] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.285425] ACPI : EC: EC_SC(W) = 0x80
[ 54.285427] ACPI : EC: ***** Event started *****
[ 54.285429] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.287209] ACPI : EC: ===== TASK (0) =====
[ 54.287218] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.287219] ACPI : EC: EC_DATA(W) = 0x06
[ 54.287222] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291190] ACPI : EC: ===== TASK (0) =====
[ 54.291210] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 54.291213] ACPI : EC: EC_DATA(R) = 0x03
[ 54.291214] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291215] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 54.291216] ACPI : EC: 1: Decrease command
After cleaning up all guarding logics, we have one single function
ec_guard() collecting all old, non-root-caused, hidden logics. Then we can
easily tune the logics in one place to respond to the bug reports.
Except the time_before() change, all other changes do not change the
behavior of the EC driver.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=12011
Link: https://bugzilla.kernel.org/show_bug.cgi?id=20242
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77431
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-15 14:16:42 +08:00
# define ACPI_EC_UDELAY_POLL 550 /* Wait 1ms for EC transaction polling */
ACPI / EC: Clear stale EC events on Samsung systems
A number of Samsung notebooks (530Uxx/535Uxx/540Uxx/550Pxx/900Xxx/etc)
continue to log events during sleep (lid open/close, AC plug/unplug,
battery level change), which accumulate in the EC until a buffer fills.
After the buffer is full (tests suggest it holds 8 events), GPEs stop
being triggered for new events. This state persists on wake or even on
power cycle, and prevents new events from being registered until the EC
is manually polled.
This is the root cause of a number of bugs, including AC not being
detected properly, lid close not triggering suspend, and low ambient
light not triggering the keyboard backlight. The bug also seemed to be
responsible for performance issues on at least one user's machine.
Juan Manuel Cabo found the cause of bug and the workaround of polling
the EC manually on wake.
The loop which clears the stale events is based on an earlier patch by
Lan Tianyu (see referenced attachment).
This patch:
- Adds a function acpi_ec_clear() which polls the EC for stale _Q
events at most ACPI_EC_CLEAR_MAX (currently 100) times. A warning is
logged if this limit is reached.
- Adds a flag EC_FLAGS_CLEAR_ON_RESUME which is set to 1 if the DMI
system vendor is Samsung. This check could be replaced by several
more specific DMI vendor/product pairs, but it's likely that the bug
affects more Samsung products than just the five series mentioned
above. Further, it should not be harmful to run acpi_ec_clear() on
systems without the bug; it will return immediately after finding no
data waiting.
- Runs acpi_ec_clear() on initialisation (boot), from acpi_ec_add()
- Runs acpi_ec_clear() on wake, from acpi_ec_unblock_transactions()
References: https://bugzilla.kernel.org/show_bug.cgi?id=44161
References: https://bugzilla.kernel.org/show_bug.cgi?id=45461
References: https://bugzilla.kernel.org/show_bug.cgi?id=57271
References: https://bugzilla.kernel.org/attachment.cgi?id=126801
Suggested-by: Juan Manuel Cabo <juanmanuel.cabo@gmail.com>
Signed-off-by: Kieran Clancy <clancy.kieran@gmail.com>
Reviewed-by: Lan Tianyu <tianyu.lan@intel.com>
Reviewed-by: Dennis Jansen <dennis.jansen@web.de>
Tested-by: Kieran Clancy <clancy.kieran@gmail.com>
Tested-by: Juan Manuel Cabo <juanmanuel.cabo@gmail.com>
Tested-by: Dennis Jansen <dennis.jansen@web.de>
Tested-by: Maurizio D'Addona <mauritiusdadd@gmail.com>
Tested-by: San Zamoyski <san@plusnet.pl>
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-03-01 00:42:28 +10:30
# define ACPI_EC_CLEAR_MAX 100 / * Maximum number of events to query
* when trying to clear the EC */
2016-08-03 09:00:14 +08:00
# define ACPI_EC_MAX_QUERIES 16 /* Maximum number of parallel queries */
2006-09-26 19:50:33 +04:00
2007-10-22 14:18:30 +04:00
enum {
2016-08-03 16:01:24 +08:00
EC_FLAGS_QUERY_ENABLED , /* Query is enabled */
2019-10-14 16:56:01 +08:00
EC_FLAGS_EVENT_HANDLER_INSTALLED , /* Event handler installed */
2016-03-24 10:42:47 +08:00
EC_FLAGS_EC_HANDLER_INSTALLED , /* OpReg handler installed */
2022-12-08 15:23:35 +01:00
EC_FLAGS_EC_REG_CALLED , /* OpReg ACPI _REG method called */
2019-10-14 16:56:01 +08:00
EC_FLAGS_QUERY_METHODS_INSTALLED , /* _Qxx handlers installed */
2015-02-06 08:57:52 +08:00
EC_FLAGS_STARTED , /* Driver is started */
EC_FLAGS_STOPPED , /* Driver is stopped */
2019-10-14 16:56:01 +08:00
EC_FLAGS_EVENTS_MASKED , /* Events masked */
2005-04-16 15:20:36 -07:00
} ;
2006-09-26 19:50:33 +04:00
2014-06-15 08:41:35 +08:00
# define ACPI_EC_COMMAND_POLL 0x01 /* Available for command byte */
# define ACPI_EC_COMMAND_COMPLETE 0x02 /* Completed last byte */
2010-10-21 18:24:57 +02:00
/* ec.c is compiled in acpi namespace so this shows up as acpi.ec_delay param */
static unsigned int ec_delay __read_mostly = ACPI_EC_DELAY ;
module_param ( ec_delay , uint , 0644 ) ;
MODULE_PARM_DESC ( ec_delay , " Timeout(ms) waited until an EC command completes " ) ;
2016-08-03 09:00:14 +08:00
static unsigned int ec_max_queries __read_mostly = ACPI_EC_MAX_QUERIES ;
module_param ( ec_max_queries , uint , 0644 ) ;
MODULE_PARM_DESC ( ec_max_queries , " Maximum parallel _Qxx evaluations " ) ;
2015-05-15 14:16:48 +08:00
static bool ec_busy_polling __read_mostly ;
module_param ( ec_busy_polling , bool , 0644 ) ;
MODULE_PARM_DESC ( ec_busy_polling , " Use busy polling to advance EC transaction " ) ;
static unsigned int ec_polling_guard __read_mostly = ACPI_EC_UDELAY_POLL ;
module_param ( ec_polling_guard , uint , 0644 ) ;
MODULE_PARM_DESC ( ec_polling_guard , " Guard time(us) between EC accesses in polling modes " ) ;
ACPI / EC: Fix EC_FLAGS_QUERY_HANDSHAKE platforms using new event clearing timing.
It is reported that on several platforms, EC firmware will not respond
non-expected QR_EC (see EC_FLAGS_QUERY_HANDSHAKE, only write QR_EC when
SCI_EVT is set).
Unfortunately, ACPI specification doesn't define when the SCI_EVT should be
cleared by the firmware, thus the original implementation queued up second
QR_EC right after writing QR_EC command and before reading the returned
event value as at that time the SCI_EVT is ensured not cleared. This
behavior is also based on the assumption that the firmware should be able
to return 0x00 to indicate "no outstanding event". This behavior did fix
issues on Samsung platforms where the spurious query value of 0x00 is
supported and didn't break platforms in my test queue.
But recently, specific Acer, Asus, Lenovo platforms keep on blaming this
change.
This patch changes the behavior to re-check the SCI_EVT a bit later and
removes EC_FLAGS_QUERY_HANDSHAKE quirks, hoping this is the Windows
compliant EC driver behavior.
In order to be robust to the possible regressions, instead of removing the
quirk directly, this patch keeps the quirk code, removes the quirk users
and keeps old behavior for Samsung platforms.
Cc: 3.16+ <stable@vger.kernel.org> # 3.16+
Link: https://bugzilla.kernel.org/show_bug.cgi?id=94411
Link: https://bugzilla.kernel.org/show_bug.cgi?id=97381
Link: https://bugzilla.kernel.org/show_bug.cgi?id=98111
Reported-and-tested-by: Gabriele Mazzotta <gabriele.mzt@gmail.com>
Reported-and-tested-by: Tigran Gabrielyan <tigrangab@gmail.com>
Reported-and-tested-by: Adrien D <ghbdtn@openmailbox.org>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-06-11 13:21:45 +08:00
static unsigned int ec_event_clearing __read_mostly = ACPI_EC_EVT_TIMING_QUERY ;
2015-06-11 13:21:38 +08:00
2012-09-28 15:22:00 +08:00
/*
* If the number of false interrupts per one transaction exceeds
* this threshold , will think there is a GPE storm happened and
* will disable the GPE for normal transaction .
*/
static unsigned int ec_storm_threshold __read_mostly = 8 ;
module_param ( ec_storm_threshold , uint , 0644 ) ;
MODULE_PARM_DESC ( ec_storm_threshold , " Maxim false GPE numbers not considered as GPE storm " ) ;
2021-11-03 13:51:19 +08:00
static bool ec_freeze_events __read_mostly ;
ACPI / EC: Add PM operations to improve event handling for suspend process
In the original EC driver, though the event handling is not explicitly
stopped, the EC driver is actually not able to handle events during the
noirq stage as the EC driver is not prepared to handle the EC events in the
polling mode. So if there is no advance_transaction() triggered, the EC
driver couldn't notice the EC events.
However, do we actually need to handle EC events during suspend/resume
stage? EC events are mostly useless for the suspend/resume period (key
strokes and battery/thermal updates, etc.,), and the useful ones (lid
close, power/sleep button press) should have already been delivered to the
OSPM to trigger the power saving operations.
Thus this patch implements acpi_ec_disable_event() to be a reverse call of
acpi_ec_enable_event(), with which, the EC driver is able to stop handling
the EC events in a position before entering the noirq stage.
Since there are actually 2 choices for us:
1. implement event handling in polling mode;
2. stop event handling before entering noirq stage.
And this patch only implements the second choice using .suspend() callback.
Thus this is experimental (first choice is better? or different hook
position is better?). This patch finally keeps the old behavior by default
and prepares a boot parameter to enable this feature.
The differences of the event handling availability between the old behavior
(this patch is not applied) and the new behavior (this patch is applied)
are as follows:
!FreezeEvents FreezeEvents
before suspend Y Y
suspend before EC Y Y
suspend after EC Y N
suspend_late Y N
suspend_noirq Y (actually N) N
resume_noirq Y (actually N) N
resume_late Y (actually N) N
resume before EC Y (actually N) N
resume after EC Y Y
after resume Y Y
Where "actually N" means if there is no EC transactions, the EC driver
is actually not able to notice the pending events.
We can see that FreezeEvents is the only approach now can actually flush
the EC event handling with both query commands and _Qxx evaluations
flushed, other modes can only flush the EC event handling with only query
commands flushed, _Qxx evaluations occurred after stopping the EC driver
may end up failure due to the failure of the EC transaction carried out in
the _Qxx control methods.
We also can see that this feature should be able to trigger some platform
notifications later than resuming other drivers.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-03 16:01:43 +08:00
module_param ( ec_freeze_events , bool , 0644 ) ;
MODULE_PARM_DESC ( ec_freeze_events , " Disabling event handling during suspend/resume " ) ;
ACPI / EC: Add parameter to force disable the GPE on suspend
After commit 8110dd281e15 (ACPI / sleep: EC-based wakeup from
suspend-to-idle on recent systems) the configuration of GPEs,
including the EC one, is not changed during suspend-to-idle on
recent systems. That's in order to make system wakeup events
generated by the EC work, in particular.
However, on some of the systems in question (for example on Dell
XPS13 9365), in addition to generating system wakeup events the
EC generates a heartbeat sequence of interrupts that have nothing
to do with wakeup while suspended, and the Low Power Idle S0 _DSM
interface doesn't change that behavior.
The users of those systems may prefer to disable the EC GPE during
system suspend, for the cost of non-functional power button wakeup
or similar, but currently there is no way to do that.
For this reason, add a new module parameter, ec_no_wakeup, for the
EC driver module that, if set, will cause the EC GPE to be disabled
during system suspend and re-enabled during the subsequent system
resume.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=192591#c106
Amends: 8110dd281e15 (ACPI / sleep: EC-based wakeup from suspend-to-idle on recent systems)
Reported-and-tested-by: Patrik Kullman <patrik.kullman@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-19 14:41:58 +02:00
static bool ec_no_wakeup __read_mostly ;
module_param ( ec_no_wakeup , bool , 0644 ) ;
MODULE_PARM_DESC ( ec_no_wakeup , " Do not wake up from suspend-to-idle " ) ;
2007-05-29 16:43:02 +04:00
struct acpi_ec_query_handler {
struct list_head node ;
acpi_ec_query_func func ;
acpi_handle handle ;
void * data ;
u8 query_bit ;
2015-01-14 19:28:28 +08:00
struct kref kref ;
2007-05-29 16:43:02 +04:00
} ;
2008-09-26 00:54:28 +04:00
struct transaction {
2008-09-25 21:00:31 +04:00
const u8 * wdata ;
u8 * rdata ;
unsigned short irq_count ;
2008-09-26 00:54:28 +04:00
u8 command ;
2008-11-12 01:40:19 +03:00
u8 wi ;
u8 ri ;
2008-09-25 21:00:31 +04:00
u8 wlen ;
u8 rlen ;
2014-06-15 08:41:35 +08:00
u8 flags ;
2008-09-25 21:00:31 +04:00
} ;
2015-08-12 11:12:02 +08:00
struct acpi_ec_query {
struct transaction transaction ;
struct work_struct work ;
struct acpi_ec_query_handler * handler ;
ACPI: EC: Rework flushing of EC work while suspended to idle
The flushing of pending work in the EC driver uses drain_workqueue()
to flush the event handling work that can requeue itself via
advance_transaction(), but this is problematic, because that
work may also be requeued from the query workqueue.
Namely, if an EC transaction is carried out during the execution of
a query handler, it involves calling advance_transaction() which
may queue up the event handling work again. This causes the kernel
to complain about attempts to add a work item to the EC event
workqueue while it is being drained and worst-case it may cause a
valid event to be skipped.
To avoid this problem, introduce two new counters, events_in_progress
and queries_in_progress, incremented when a work item is queued on
the event workqueue or the query workqueue, respectively, and
decremented at the end of the corresponding work function, and make
acpi_ec_dispatch_gpe() the workqueues in a loop until the both of
these counters are zero (or system wakeup is pending) instead of
calling acpi_ec_flush_work().
At the same time, change __acpi_ec_flush_work() to call
flush_workqueue() instead of drain_workqueue() to flush the event
workqueue.
While at it, use the observation that the work item queued in
acpi_ec_query() cannot be pending at that time, because it is used
only once, to simplify the code in there.
Additionally, clean up a comment in acpi_ec_query() and adjust white
space in acpi_ec_event_processor().
Fixes: f0ac20c3f613 ("ACPI: EC: Fix flushing of pending work")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:36:51 +01:00
struct acpi_ec * ec ;
2015-08-12 11:12:02 +08:00
} ;
2021-11-23 19:42:02 +01:00
static int acpi_ec_submit_query ( struct acpi_ec * ec ) ;
2022-02-04 18:40:20 +01:00
static void advance_transaction ( struct acpi_ec * ec , bool interrupt ) ;
2015-08-12 11:12:02 +08:00
static void acpi_ec_event_handler ( struct work_struct * work ) ;
ACPI / EC: Fix issues related to the SCI_EVT handling
This patch fixes 2 issues related to the draining behavior. But it doesn't
implement the draining support, it only cleans up code so that further
draining support is possible.
The draining behavior is expected by some platforms (for example, Samsung)
where SCI_EVT is set only once for a set of events and might be cleared for
the very first QR_EC command issued after SCI_EVT is set. EC firmware on
such platforms will return 0x00 to indicate "no outstanding event". Thus
after seeing an SCI_EVT indication, EC driver need to fetch events until
0x00 returned (see acpi_ec_clear()).
Issue 1 - acpi_ec_submit_query():
It's reported on Samsung laptops that SCI_EVT isn't checked when the
transactions are advanced in ec_poll(), which leads to SCI_EVT triggering
source lost:
If no EC GPE IRQs are arrived after that, EC driver cannot detect this
event and handle it.
See comment 244/247 for kernel bugzilla 44161.
This patch fixes this issue by moving SCI_EVT checks into
advance_transaction(). So that SCI_EVT is checked each time we are going to
handle the EC firmware indications. And this check will happen for both IRQ
context and task context.
Since after doing that, SCI_EVT is also checked after completing a
transaction, ec_check_sci() and ec_check_sci_sync() can be removed.
Issue 2 - acpi_ec_complete_query():
We expect to clear EC_FLAGS_QUERY_PENDING to allow queuing another draining
QR_EC after writing a QR_EC command and before reading the event. After
reading the event, SCI_EVT might be cleared by the firmware, thus it may
not be possible to queue such a draining QR_EC at that time.
But putting the EC_FLAGS_QUERY_PENDING clearing code after
start_transaction() is wrong as there are chances that after
start_transaction(), QR_EC can fail to be sent. If this happens,
EC_FLAG_QUERY_PENDING will be cleared earlier. As a consequence, the
draining QR_EC will also be queued earlier than expected.
This patch also moves this code into advance_transaction() where QR_EC is
just sent (ACPI_EC_COMMAND_POLL flagged) to fix this issue.
Notes:
1. After introducing the 2 SCI_EVT related handlings into
advance_transaction(), a next QR_EC can be queued right after writing
the current QR_EC command and before reading the event. But this still
hasn't implemented the draining behavior as the draining support
requires:
If a previous returned event value isn't 0x00, a draining QR_EC need
to be issued even when SCI_EVT isn't set.
2. In this patch, acpi_os_execute() is also converted into a seperate work
item to avoid invoking kmalloc() in the atomic context. We can do this
because of the previous global lock fix.
3. Originally, EC_FLAGS_EVENT_PENDING is also used to avoid queuing up
multiple work items (created by acpi_os_execute()), this can be covered
by only using a single work item. But this patch still keeps this flag
as there are different usages in the driver initialization steps relying
on this flag.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=44161
Reported-by: Kieran Clancy <clancy.kieran@gmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-14 19:28:47 +08:00
2019-02-01 10:57:01 +01:00
struct acpi_ec * first_ec ;
2010-07-16 13:11:31 +02:00
EXPORT_SYMBOL ( first_ec ) ;
2019-02-01 10:57:01 +01:00
static struct acpi_ec * boot_ec ;
2021-11-03 13:51:19 +08:00
static bool boot_ec_is_ecdt ;
ACPI: EC: Fix flushing of pending work
Commit 016b87ca5c8c ("ACPI: EC: Rework flushing of pending work")
introduced a subtle bug into the flushing of pending EC work while
suspended to idle, which may cause the EC driver to fail to
re-enable the EC GPE after handling a non-wakeup event (like a
battery status change event, for example).
The problem is that the work item flushed by flush_scheduled_work()
in __acpi_ec_flush_work() may disable the EC GPE and schedule another
work item expected to re-enable it, but that new work item is not
flushed, so __acpi_ec_flush_work() returns with the EC GPE disabled
and the CPU running it goes into an idle state subsequently. If all
of the other CPUs are in idle states at that point, the EC GPE won't
be re-enabled until at least one CPU is woken up by another interrupt
source, so system wakeup events that would normally come from the EC
then don't work.
This is reproducible on a Dell XPS13 9360 in my office which
sometimes stops reacting to power button and lid events (triggered
by the EC on that machine) after switching from AC power to battery
power or vice versa while suspended to idle (each of those switches
causes the EC GPE to trigger for several times in a row, but they
are not system wakeup events).
To avoid this problem, it is necessary to drain the workqueue
entirely in __acpi_ec_flush_work(), but that cannot be done with
respect to system_wq, because work items may be added to it from
other places while __acpi_ec_flush_work() is running. For this
reason, make the EC driver use a dedicated workqueue for EC events
processing (let that workqueue be ordered so that EC events are
processed sequentially) and use drain_workqueue() on it in
__acpi_ec_flush_work().
Fixes: 016b87ca5c8c ("ACPI: EC: Rework flushing of pending work")
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-02-11 10:07:43 +01:00
static struct workqueue_struct * ec_wq ;
2016-08-03 09:00:14 +08:00
static struct workqueue_struct * ec_query_wq ;
2006-09-26 19:50:33 +04:00
ACPI 2.0 / ECDT: Remove early namespace reference from EC
All operation region accesses are allowed by AML interpreter when AML is
executed, so actually BIOSen are responsible to avoid the operation region
accesses in AML before OSPM has prepared an operation region driver. This
is done via _REG control method. So AML code normally sets a global named
object REGC to 1 when _REG(3, 1) is evaluated.
Then what is ECDT? Quoting from ACPI spec 6.0, 5.2.15 Embedded Controller
Boot Resources Table (ECDT):
"The presence of this table allows OSPM to provide Embedded Controller
operation region space access before the namespace has been evaluated."
Spec also suggests a compatible mean to indicate the early EC access
availability:
Device (EC)
{
Name (REGC, Ones)
Method (_REG, 2)
{
If (LEqual (Arg0, 3))
{
Store (Arg1, REGC)
}
}
Method (ECAV)
{
If (LEqual (REGC, Ones))
{
If (LGreaterEqual (_REV, 2))
{
Return (One)
}
Else
{
Return (Zero)
}
}
Else
{
Return (REGC)
}
}
}
In this way, it allows EC accesses to happen before EC._REG(3, 1) is
invoked.
But ECAV is not the only way practical BIOSen using to indicate the early
EC access availibility, the known variations include:
1. Setting REGC to One in \_SB._INI when _REV >= 2. Since \_SB._INI is the
first control method evaluated by OSPM during the enumeration, this
allows EC accesses to happen for the entire enumeration process before
the namespace EC is enumerated.
2. Initialize REGC to One by default, this even allows EC accesses to
happen during the table loading.
Linux is now broken around ECDT support during the long term bug fixing
work because it has merged many wrong ECDT bug fixes (see details below).
Linux currently uses namespace EC's settings instead of ECDT settings when
ECDT is detected. This apparently will result in namespace walk and
_CRS/_GPE/_REG evaluations. Such stuffs could only happen after namespace
is ready, while ECDT is purposely to be used before namespace is ready.
The wrong bug fixing story is:
1. Link 1:
At Linux ACPI early stages, "no _Lxx/_Exx/_Qxx evaluation can happen
before the namespace is ready" are not ensured by ACPICA core and Linux.
This is currently ensured by deferred enabling of GPE and defered
registering of EC query methods (acpi_ec_register_query_methods).
2. Link 2:
Reporters reported buggy ECDTs, expecting quirks for the platform.
Originally, the quirk is simple, only doing things with ECDT.
Bug 9399 and 12461 are platforms (Asus L4R, Asus M6R, MSI MS-171F)
reported to have wrong ECDT IO port addresses, the port addresses are
reversed.
Bug 11880 is a platform (Asus X50GL) reported to have 0 valued port
addresses, we can see that all EC accesses are protected by ECAV on
this platform, so actually no early EC accesses is required by this
platform.
3. Link 3:
But when the bug fixing developer was requested to provide a handy and
non-quirk bug fix, he tried to use correct EC settings from namespace
and broke the spec purpose. We can even see that the developer was
suffered from many regrssions. One interesting one is 14086, where the
actual root cause obviously should be: _REG is evaluated too early. But
unfortunately, the bug is fixed in a totally wrong way.
So everything goes wrong from these commits:
Commit: c6cb0e878446c79f42e7833d7bb69ed6bfbb381f
Subject: ACPI: EC: Don't trust ECDT tables from ASUS
Commit: a5032bfdd9c80e0231a6324661e123818eb46ecd
Subject: ACPI: EC: Always parse EC device
This patch reverts Linux behavior to simple ECDT quirk support in order to
stop early _CRS/_GPE/_REG evaluations.
For Bug 9399, 12461, since it is reported that the platforms require early
EC accesses, this patch restores the simple ECDT quirks for them.
For Bug 11880, since it is not reported that the platform requires early EC
accesses and its ACPI tables contain correct ECAV, we choose an ECDT
enumeration failure for this platform.
Link 1: https://bugzilla.kernel.org/show_bug.cgi?id=9916
http://bugzilla.kernel.org/show_bug.cgi?id=10100
https://lkml.org/lkml/2008/2/25/282
Link 2: https://bugzilla.kernel.org/show_bug.cgi?id=9399
https://bugzilla.kernel.org/show_bug.cgi?id=12461
https://bugzilla.kernel.org/show_bug.cgi?id=11880
Link 3: https://bugzilla.kernel.org/show_bug.cgi?id=11884
https://bugzilla.kernel.org/show_bug.cgi?id=14081
https://bugzilla.kernel.org/show_bug.cgi?id=14086
https://bugzilla.kernel.org/show_bug.cgi?id=14446
Link 4: https://bugzilla.kernel.org/show_bug.cgi?id=112911
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-24 10:42:53 +08:00
static int EC_FLAGS_CORRECT_ECDT ; /* Needs ECDT port address correction */
2021-06-21 09:37:27 +08:00
static int EC_FLAGS_TRUST_DSDT_GPE ; /* Needs DSDT GPE as correction setting */
2019-02-01 14:13:41 +08:00
static int EC_FLAGS_CLEAR_ON_RESUME ; /* Needs acpi_ec_clear() on boot/resume */
2009-02-21 12:18:13 -05:00
2015-02-27 14:48:15 +08:00
/* --------------------------------------------------------------------------
* Logging / Debugging
* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
/*
* Splitters used by the developers to track the boundary of the EC
* handling processes .
*/
# ifdef DEBUG
# define EC_DBG_SEP " "
# define EC_DBG_DRV "+++++"
# define EC_DBG_STM "====="
# define EC_DBG_REQ "*****"
# define EC_DBG_EVT "#####"
# else
# define EC_DBG_SEP ""
# define EC_DBG_DRV
# define EC_DBG_STM
# define EC_DBG_REQ
# define EC_DBG_EVT
# endif
# define ec_log_raw(fmt, ...) \
pr_info ( fmt " \n " , # # __VA_ARGS__ )
# define ec_dbg_raw(fmt, ...) \
pr_debug ( fmt " \n " , # # __VA_ARGS__ )
# define ec_log(filter, fmt, ...) \
ec_log_raw ( filter EC_DBG_SEP fmt EC_DBG_SEP filter , # # __VA_ARGS__ )
# define ec_dbg(filter, fmt, ...) \
ec_dbg_raw ( filter EC_DBG_SEP fmt EC_DBG_SEP filter , # # __VA_ARGS__ )
# define ec_log_drv(fmt, ...) \
ec_log ( EC_DBG_DRV , fmt , # # __VA_ARGS__ )
# define ec_dbg_drv(fmt, ...) \
ec_dbg ( EC_DBG_DRV , fmt , # # __VA_ARGS__ )
# define ec_dbg_stm(fmt, ...) \
ec_dbg ( EC_DBG_STM , fmt , # # __VA_ARGS__ )
# define ec_dbg_req(fmt, ...) \
ec_dbg ( EC_DBG_REQ , fmt , # # __VA_ARGS__ )
# define ec_dbg_evt(fmt, ...) \
ec_dbg ( EC_DBG_EVT , fmt , # # __VA_ARGS__ )
2015-02-27 14:48:24 +08:00
# define ec_dbg_ref(ec, fmt, ...) \
ec_dbg_raw ( " %lu: " fmt , ec - > reference_count , # # __VA_ARGS__ )
2015-02-27 14:48:15 +08:00
2015-02-06 08:57:52 +08:00
/* --------------------------------------------------------------------------
* Device Flags
* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
static bool acpi_ec_started ( struct acpi_ec * ec )
{
return test_bit ( EC_FLAGS_STARTED , & ec - > flags ) & &
! test_bit ( EC_FLAGS_STOPPED , & ec - > flags ) ;
}
2016-08-03 16:01:24 +08:00
static bool acpi_ec_event_enabled ( struct acpi_ec * ec )
{
/*
* There is an OSPM early stage logic . During the early stages
* ( boot / resume ) , OSPMs shouldn ' t enable the event handling , only
* the EC transactions are allowed to be performed .
*/
if ( ! test_bit ( EC_FLAGS_QUERY_ENABLED , & ec - > flags ) )
return false ;
/*
ACPI / EC: Add PM operations to improve event handling for suspend process
In the original EC driver, though the event handling is not explicitly
stopped, the EC driver is actually not able to handle events during the
noirq stage as the EC driver is not prepared to handle the EC events in the
polling mode. So if there is no advance_transaction() triggered, the EC
driver couldn't notice the EC events.
However, do we actually need to handle EC events during suspend/resume
stage? EC events are mostly useless for the suspend/resume period (key
strokes and battery/thermal updates, etc.,), and the useful ones (lid
close, power/sleep button press) should have already been delivered to the
OSPM to trigger the power saving operations.
Thus this patch implements acpi_ec_disable_event() to be a reverse call of
acpi_ec_enable_event(), with which, the EC driver is able to stop handling
the EC events in a position before entering the noirq stage.
Since there are actually 2 choices for us:
1. implement event handling in polling mode;
2. stop event handling before entering noirq stage.
And this patch only implements the second choice using .suspend() callback.
Thus this is experimental (first choice is better? or different hook
position is better?). This patch finally keeps the old behavior by default
and prepares a boot parameter to enable this feature.
The differences of the event handling availability between the old behavior
(this patch is not applied) and the new behavior (this patch is applied)
are as follows:
!FreezeEvents FreezeEvents
before suspend Y Y
suspend before EC Y Y
suspend after EC Y N
suspend_late Y N
suspend_noirq Y (actually N) N
resume_noirq Y (actually N) N
resume_late Y (actually N) N
resume before EC Y (actually N) N
resume after EC Y Y
after resume Y Y
Where "actually N" means if there is no EC transactions, the EC driver
is actually not able to notice the pending events.
We can see that FreezeEvents is the only approach now can actually flush
the EC event handling with both query commands and _Qxx evaluations
flushed, other modes can only flush the EC event handling with only query
commands flushed, _Qxx evaluations occurred after stopping the EC driver
may end up failure due to the failure of the EC transaction carried out in
the _Qxx control methods.
We also can see that this feature should be able to trigger some platform
notifications later than resuming other drivers.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-03 16:01:43 +08:00
* However , disabling the event handling is experimental for late
* stage ( suspend ) , and is controlled by the boot parameter of
* " ec_freeze_events " :
* 1. true : The EC event handling is disabled before entering
* the noirq stage .
* 2. false : The EC event handling is automatically disabled as
* soon as the EC driver is stopped .
2016-08-03 16:01:24 +08:00
*/
ACPI / EC: Add PM operations to improve event handling for suspend process
In the original EC driver, though the event handling is not explicitly
stopped, the EC driver is actually not able to handle events during the
noirq stage as the EC driver is not prepared to handle the EC events in the
polling mode. So if there is no advance_transaction() triggered, the EC
driver couldn't notice the EC events.
However, do we actually need to handle EC events during suspend/resume
stage? EC events are mostly useless for the suspend/resume period (key
strokes and battery/thermal updates, etc.,), and the useful ones (lid
close, power/sleep button press) should have already been delivered to the
OSPM to trigger the power saving operations.
Thus this patch implements acpi_ec_disable_event() to be a reverse call of
acpi_ec_enable_event(), with which, the EC driver is able to stop handling
the EC events in a position before entering the noirq stage.
Since there are actually 2 choices for us:
1. implement event handling in polling mode;
2. stop event handling before entering noirq stage.
And this patch only implements the second choice using .suspend() callback.
Thus this is experimental (first choice is better? or different hook
position is better?). This patch finally keeps the old behavior by default
and prepares a boot parameter to enable this feature.
The differences of the event handling availability between the old behavior
(this patch is not applied) and the new behavior (this patch is applied)
are as follows:
!FreezeEvents FreezeEvents
before suspend Y Y
suspend before EC Y Y
suspend after EC Y N
suspend_late Y N
suspend_noirq Y (actually N) N
resume_noirq Y (actually N) N
resume_late Y (actually N) N
resume before EC Y (actually N) N
resume after EC Y Y
after resume Y Y
Where "actually N" means if there is no EC transactions, the EC driver
is actually not able to notice the pending events.
We can see that FreezeEvents is the only approach now can actually flush
the EC event handling with both query commands and _Qxx evaluations
flushed, other modes can only flush the EC event handling with only query
commands flushed, _Qxx evaluations occurred after stopping the EC driver
may end up failure due to the failure of the EC transaction carried out in
the _Qxx control methods.
We also can see that this feature should be able to trigger some platform
notifications later than resuming other drivers.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-03 16:01:43 +08:00
if ( ec_freeze_events )
return acpi_ec_started ( ec ) ;
else
return test_bit ( EC_FLAGS_STARTED , & ec - > flags ) ;
2016-08-03 16:01:24 +08:00
}
2015-02-06 08:57:59 +08:00
static bool acpi_ec_flushed ( struct acpi_ec * ec )
{
return ec - > reference_count = = 1 ;
}
2005-04-16 15:20:36 -07:00
/* --------------------------------------------------------------------------
ACPI / EC: Fix several GPE handling issues by deploying ACPI_GPE_DISPATCH_RAW_HANDLER mode.
This patch switches EC driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode where
the GPE lock is not held for acpi_ec_gpe_handler() and the ACPICA internal
GPE enabling/disabling/clearing operations are bypassed so that further
improvements are possible with the GPE APIs.
There are 2 strong reasons for deploying raw GPE handler mode in the EC
driver:
1. Some hardware logics can control their interrupts via their own
registers, so their interrupts can be disabled/enabled/acknowledged
without using the super IRQ controller provided functions. While there
is no mean (EC commands) for the EC driver to achieve this.
2. During suspending, the EC driver is still working for a while to
complete the platform firmware provided functionailities using ec_poll()
after all GPEs are disabled (see acpi_ec_block_transactions()), which
means the EC driver will drive the EC GPE out of the GPE core's control.
Without deploying the raw GPE handler mode, we can see many races between
the EC driver and the GPE core due to the above restrictions:
1. There is a race condition due to ACPICA internal GPE
disabling/clearing/enabling logics in acpi_ev_gpe_dispatch():
Orignally EC GPE is disabled (EN=0), cleared (STS=0) before invoking a
GPE handler and re-enabled (EN=1) after invoking a GPE handler in
acpi_ev_gpe_dispatch(). When re-enabling appears, GPE may be flagged
(STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() ec_poll()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1
This race condition is the root cause of different issues on different
silicon variations.
A. Silicon variation A:
On some platforms, GPE will be triggered due to "writing 1 to EN when
STS=1". This is because both EN and STS lines are wired to the GPE
trigger line.
1. Issue 1:
We can see no-op acpi_ec_gpe_handler() invoked on such platforms.
This is because:
a. event pending B: An event can arrive after ACPICA's GPE
clearing performed in acpi_ev_gpe_dispatch(), this event may
fail to be detected by EC_SC read that is performed before its
arrival;
b. event handling B: The event can be handled in ec_poll() because
EC lock is released after acpi_ec_gpe_handler() invocation;
c. There is no code in ec_poll() to clear STS but the GPE can
still be triggered by the EN=1 write performed in
acpi_ev_finish_gpe(), this leads to a no-op EC GPE handler
invocation;
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 1:
If we removed GPE disabling/enabling code from
acpi_ev_gpe_dispatch(), we could still see no-op GPE handlers
triggered by the event arriving after the GPE clearing and before
the GPE handling on both silicon variation A and B. This can only
occur if the CPU is very slow (timing slice between STS=0 write
and EC_SC read should be short enough before hardware sets another
GPE indication). Thus this is very rare and is not what we need to
fix.
B. Silicon variation B:
On other platforms, GPE may not be triggered due to "writing 1 to EN
when STS=1". This is because only STS line is wired to the GPE
trigger line.
2. Issue 2:
We can see GPE loss on such platforms. This is because:
a. event pending B vs. event handling A: An event can arrive after
ACPICA's GPE handling performed in acpi_ev_gpe_dispatch(), or
event pending C vs. event handling B: An event can arrive after
Linux's GPE handling performed in ec_poll(),
these events may fail to be detected by EC_SC read that is
performed before their arrival;
b. The GPE cannot be triggered by EN=1 write performed in
acpi_ev_finish_gpe();
c. If no polling mechanism is implemented in the driver for the
pending event (for example, SCI_EVT), this event is lost due to
no GPE being triggered.
Note 2:
On most platforms, there might be another rule that GPE may not be
triggered due to "writing 1 to STS when STS=1 and EN=1".
Then on silicon variation B, an even worse case is if the issue 2
event loss happens, further events may never trigger GPE again on
such platforms due to being blocked by the current STS=1. Unless
someone clears STS, all events have to be polled.
2. There is a race condition due to lacking in GPE status checking in EC
driver:
Originally, GPE status is checked in ACPICA core but not checked in
the GPE handler. Thus since the status checking and handling is not
locked, it can be interrupted by another handling path.
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_detect() ec_poll()
if (EN==1 && STS==1)
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
acpi_ev_gpe_dispatch()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
Unlock(EC)
*****************************************************************
3. Issue 3:
We can see no-op acpi_ec_gpe_handler() invoked on both silicon
variation A and B. This is because:
a. event pending A: An event can arrive to trigger an EC GPE and
ACPICA checks it and is about to invoke the EC GPE handler;
b. event handling A: The event can be handled in ec_poll() because
EC lock is not held after the GPE status checking;
c. event handling B: Then when the EC GPE handler is invoked, it
becomes a no-op GPE handler invocation.
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 3:
This no-op GPE handler invocation is rare because the time between
the IRQ arrival and the acpi_ec_gpe_handler() invocation is less than
the timeout value waited in ec_poll(). So most of the no-op GPE
handler invocations are caused by the reason described in issue 1.
3. There is a race condition due to ACPICA internal GPE clearing logic in
acpi_enable_gpe():
During runtime, acpi_enable_gpe() can be invoked by the EC storming
prevention code. When it is invoked, GPE may be flagged (STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() acpi_ec_transaction()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1 ?
Lock(EC)
Unlock(EC)
=================================================================
(event pending B)
=================================================================
acpi_enable_gpe()
STS=0
EN=1
4. Issue 4:
We can see GPE loss on both silicon variation A and B platforms.
This is because:
a. event pending B: An event can arrive right before ACPICA's GPE
clearing performed in acpi_enable_gpe();
b. If the GPE is cleared when GPE is disabled, then EN=1 write in
acpi_enable_gpe() cannot trigger this GPE;
c. If no polling mechanism is implemented in the driver for this
event (for example, SCI_EVT), this event is lost due to no GPE
being triggered.
Note 4:
Currently we don't have this issue, but after we switch the EC
driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode, we need to take care
of handling this because the EN=1 write in acpi_ev_gpe_dispatch()
will be abandoned.
There might be more race issues for the current GPE handler usages. This is
because the EC IRQ's enabling/disabling/checking/clearing/handling
operations should be locked by a single lock that is under the EC driver's
control to achieve the serialization. Which means we need to invoke GPE
APIs with EC driver's lock held and all ACPICA internal GPE operations
related to the GPE handler should be abandoned. Invoking GPE APIs inside of
the EC driver lock and bypassing ACPICA internal GPE operations requires
the ACPI_GPE_DISPATCH_RAW_HANDLER mode where the same lock used by the APIs
are released prior than invoking the handlers. Otherwise, we can see dead
locks due to circular locking dependencies (see Reference below).
This patch then switches the EC driver into the
ACPI_GPE_DISPATCH_RAW_HANDLER mode so that it can perform correct GPE
operations using the GPE APIs:
1. Bypasses EN modifications performed in acpi_ev_gpe_dispatch() by
using acpi_install_gpe_raw_handler() and invoking all GPE APIs with EC
spin lock held. This can fix issue 1 as it makes a non frequent GPE
enabling/disabling environment.
2. Bypasses STS clearing performed in acpi_enable_gpe() by replacing
acpi_enable_gpe()/acpi_disable_gpe() with acpi_set_gpe(). This can fix
issue 4. And this can also help to fix issue 1 as it makes a no sudden
GPE clearing environment when GPE is frequently enabled/disabled.
3. Ensures STS acknowledged before handling by invoking acpi_clear_gpe()
in advance_transaction(). This can finally fix issue 1 even in a
frequent GPE enabling/disabling environment. And this can also finally
fix issue 3 when issue 2 is fixed.
Note 3:
GPE clearing is edge triggered W1C, which means we can clear it right
before handling it. Since all EC GPE indications are handled in
advance_transaction() by previous commits, we can now move GPE clearing
into it to implement the correct GPE clearing.
Note 4:
We can use acpi_set_gpe() which is not shared GPE safer instead of
acpi_enable_gpe()/acpi_disable_gpe() because EC GPE is not shared by
other hardware, which is mentioned in the ACPI specification 5.0, 12.6
Interrupt Model: "OSPM driver treats this as an edge event (the EC SCI
cannot be shared)". So we can stop using shared GPE safer APIs
acpi_enable_gpe()/acpi_disable_gpe() in the EC driver. Otherwise
cleanups need to be made in acpi_ev_enable_gpe() to bypass the GPE
clearing logic before keeping acpi_enable_gpe().
This patch also invokes advance_transaction() when GPE is re-enabled in the
task context which:
1. Ensures EN=1 can trigger GPE by checking and handling EC status register
right after EN=1 writes. This can fix issue 2.
After applying this patch, without frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() ec_poll()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 1 (event pending B) can arrive as a next GPE
due to the previous IRQ context STS=0 write. And if it is handled by
ec_poll() (event handling B), as it is also acknowledged by ec_poll(), the
event pending for issue 2 (event pending C) can properly arrive as a next
GPE after the task context STS=0 write. So no GPE will be lost and never
triggered due to GPE clearing performed in the wrong position. And since
all GPE handling is performed after a locked GPE status checking, we can
hardly see no-op GPE handler invocations due to issue 1 and 3. We may still
see no-op GPE handler invocations due to "Note 1", but as it is inevitable,
it needn't be fixed.
After applying this patch, with frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() acpi_ec_transaction()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
EN=1
if STS==1
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 2 can be manually handled by
advance_transaction(). And after the STS=0 write performed in the manual
triggered advance_transaction(), GPE can always arrive. So no GPE will be
lost due to frequent GPE disabling/enabling performed in the driver like
issue 4.
Note 5:
It's ideally when EN=1 write occurred, an IRQ thread should be woken up to
handle the GPE when the GPE was raised. But this requires the IRQ thread to
contain the poller code for all EC GPE indications, while currently some of
the indications are handled in the user tasks. It then is very hard for the
code to determine whether a user task should be invoked or the poller work
item should be scheduled. So we have to invoke advance_transaction()
directly now and it leaves us such a restriction for the GPE re-enabling:
it must be performed in the task context to avoid starving the GPEs.
As a conclusion: we can see the EC GPE is always handled in serial after
deploying the raw GPE handler mode:
Lock(EC)
if (STS==1)
STS=0
EC_SC read
EC_SC handled
Unlock(EC)
The EC driver specific lock is responsible to make the EC GPE handling
processes serialized so that EC can handle its GPE from both IRQ and task
contexts and the next IRQ can be ensured to arrive after this process.
Note 6:
We have many EC_FLAGS_MSI qurik users in the current driver. They all seem
to be suffering from unexpected GPE triggering source lost. And they are
false root caused to a timing issue. Since EC communication protocol has
already flow control defined, timing shouldn't be the root cause, while
this fix might be fixing the root cause of the old bugs.
Link: https://lkml.org/lkml/2014/11/4/974
Link: https://lkml.org/lkml/2014/11/18/316
Link: https://www.spinics.net/lists/linux-acpi/msg54340.html
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05 16:27:22 +08:00
* EC Registers
2014-10-14 14:24:01 +08:00
* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
2005-04-16 15:20:36 -07:00
2006-09-26 19:50:33 +04:00
static inline u8 acpi_ec_read_status ( struct acpi_ec * ec )
2005-04-16 15:20:36 -07:00
{
2007-11-21 03:23:26 +03:00
u8 x = inb ( ec - > command_addr ) ;
2014-10-14 14:24:01 +08:00
2015-02-27 14:48:15 +08:00
ec_dbg_raw ( " EC_SC(R) = 0x%2.2x "
" SCI_EVT=%d BURST=%d CMD=%d IBF=%d OBF=%d " ,
x ,
! ! ( x & ACPI_EC_FLAG_SCI ) ,
! ! ( x & ACPI_EC_FLAG_BURST ) ,
! ! ( x & ACPI_EC_FLAG_CMD ) ,
! ! ( x & ACPI_EC_FLAG_IBF ) ,
! ! ( x & ACPI_EC_FLAG_OBF ) ) ;
2007-11-21 03:23:26 +03:00
return x ;
2005-03-19 01:10:05 -05:00
}
2006-09-26 19:50:33 +04:00
static inline u8 acpi_ec_read_data ( struct acpi_ec * ec )
2006-09-26 19:50:33 +04:00
{
2007-11-21 03:23:26 +03:00
u8 x = inb ( ec - > data_addr ) ;
2014-10-14 14:24:01 +08:00
ACPI / EC: Fix and clean up register access guarding logics.
In the polling mode, EC driver shouldn't access the EC registers too
frequently. Though this statement is concluded from the non-root caused
bugs (see links below), we've maintained the register access guarding
logics in the current EC driver. The guarding logics can be found here and
there, makes it hard to root cause real timing issues. This patch collects
the guarding logics into one single function so that all hidden logics
related to this can be seen clearly.
The current guarding related code also has several issues:
1. Per-transaction timestamp prevents inter-transaction guarding from being
implemented in the same place. We have an inter-transaction udelay() in
acpi_ec_transaction_unblocked(), this logic can be merged into ec_poll()
if we can use per-device timestamp. This patch completes such merge to
form a new ec_guard() function and collects all guarding related hidden
logics in it.
One hidden logic is: there is no inter-transaction guarding performed
for non MSI quirk (wait polling mode), this patch skips
inter-transaction guarding before wait_event_timeout() for the wait
polling mode to reveal the hidden logic.
The other hidden logic is: there is msleep() inter-transaction guarding
performed when the GPE storming is observed. As after merging this
commit:
Commit: e1d4d90fc0313d3d58cbd7912c90f8ef24df45ff
Subject: ACPI / EC: Refine command storm prevention support
EC_FLAGS_COMMAND_STORM is ensured to be cleared after invoking
acpi_ec_transaction_unlocked(), the msleep() guard logic will never
happen now. Since no one complains such change, this logic is likely
added during the old times where the EC race issues are not fixed and
the bugs are false root-caused to the timing issue. This patch simply
removes the out-dated logic. We can restore it by stop skipping
inter-transaction guarding for wait polling mode.
Two different delay values are defined for msleep() and udelay() while
they are merged in this patch to 550us.
2. time_after() causes additional delay in the polling mode (can only be
observed in noirq suspend/resume processes where polling mode is always
used) before advance_transaction() is invoked ("wait polling" log is
added before wait_event_timeout()). We can see 2 wait_event_timeout()
invocations. This is because time_after() ensures a ">" validation while
we only need a ">=" validation here:
[ 86.739909] ACPI: Waking up from system sleep state S3
[ 86.742857] ACPI : EC: 2: Increase command
[ 86.742859] ACPI : EC: ***** Command(RD_EC) started *****
[ 86.742861] ACPI : EC: ===== TASK (0) =====
[ 86.742871] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.742873] ACPI : EC: EC_SC(W) = 0x80
[ 86.742876] ACPI : EC: ***** Event started *****
[ 86.742880] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.743972] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.747966] ACPI : EC: ===== TASK (0) =====
[ 86.747977] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.747978] ACPI : EC: EC_DATA(W) = 0x06
[ 86.747981] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.751971] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755969] ACPI : EC: ===== TASK (0) =====
[ 86.755991] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 86.755993] ACPI : EC: EC_DATA(R) = 0x03
[ 86.755994] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755995] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 86.755996] ACPI : EC: 1: Decrease command
This patch corrects this by using time_before() instead in ec_guard():
[ 54.283146] ACPI: Waking up from system sleep state S3
[ 54.285414] ACPI : EC: 2: Increase command
[ 54.285415] ACPI : EC: ***** Command(RD_EC) started *****
[ 54.285416] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.285417] ACPI : EC: ===== TASK (0) =====
[ 54.285424] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.285425] ACPI : EC: EC_SC(W) = 0x80
[ 54.285427] ACPI : EC: ***** Event started *****
[ 54.285429] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.287209] ACPI : EC: ===== TASK (0) =====
[ 54.287218] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.287219] ACPI : EC: EC_DATA(W) = 0x06
[ 54.287222] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291190] ACPI : EC: ===== TASK (0) =====
[ 54.291210] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 54.291213] ACPI : EC: EC_DATA(R) = 0x03
[ 54.291214] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291215] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 54.291216] ACPI : EC: 1: Decrease command
After cleaning up all guarding logics, we have one single function
ec_guard() collecting all old, non-root-caused, hidden logics. Then we can
easily tune the logics in one place to respond to the bug reports.
Except the time_before() change, all other changes do not change the
behavior of the EC driver.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=12011
Link: https://bugzilla.kernel.org/show_bug.cgi?id=20242
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77431
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-15 14:16:42 +08:00
ec - > timestamp = jiffies ;
2015-02-27 14:48:15 +08:00
ec_dbg_raw ( " EC_DATA(R) = 0x%2.2x " , x ) ;
2008-09-25 21:00:31 +04:00
return x ;
2006-09-26 19:50:33 +04:00
}
2006-09-26 19:50:33 +04:00
static inline void acpi_ec_write_cmd ( struct acpi_ec * ec , u8 command )
2005-07-23 04:08:00 -04:00
{
2015-02-27 14:48:15 +08:00
ec_dbg_raw ( " EC_SC(W) = 0x%2.2x " , command ) ;
2006-09-26 19:50:33 +04:00
outb ( command , ec - > command_addr ) ;
ACPI / EC: Fix and clean up register access guarding logics.
In the polling mode, EC driver shouldn't access the EC registers too
frequently. Though this statement is concluded from the non-root caused
bugs (see links below), we've maintained the register access guarding
logics in the current EC driver. The guarding logics can be found here and
there, makes it hard to root cause real timing issues. This patch collects
the guarding logics into one single function so that all hidden logics
related to this can be seen clearly.
The current guarding related code also has several issues:
1. Per-transaction timestamp prevents inter-transaction guarding from being
implemented in the same place. We have an inter-transaction udelay() in
acpi_ec_transaction_unblocked(), this logic can be merged into ec_poll()
if we can use per-device timestamp. This patch completes such merge to
form a new ec_guard() function and collects all guarding related hidden
logics in it.
One hidden logic is: there is no inter-transaction guarding performed
for non MSI quirk (wait polling mode), this patch skips
inter-transaction guarding before wait_event_timeout() for the wait
polling mode to reveal the hidden logic.
The other hidden logic is: there is msleep() inter-transaction guarding
performed when the GPE storming is observed. As after merging this
commit:
Commit: e1d4d90fc0313d3d58cbd7912c90f8ef24df45ff
Subject: ACPI / EC: Refine command storm prevention support
EC_FLAGS_COMMAND_STORM is ensured to be cleared after invoking
acpi_ec_transaction_unlocked(), the msleep() guard logic will never
happen now. Since no one complains such change, this logic is likely
added during the old times where the EC race issues are not fixed and
the bugs are false root-caused to the timing issue. This patch simply
removes the out-dated logic. We can restore it by stop skipping
inter-transaction guarding for wait polling mode.
Two different delay values are defined for msleep() and udelay() while
they are merged in this patch to 550us.
2. time_after() causes additional delay in the polling mode (can only be
observed in noirq suspend/resume processes where polling mode is always
used) before advance_transaction() is invoked ("wait polling" log is
added before wait_event_timeout()). We can see 2 wait_event_timeout()
invocations. This is because time_after() ensures a ">" validation while
we only need a ">=" validation here:
[ 86.739909] ACPI: Waking up from system sleep state S3
[ 86.742857] ACPI : EC: 2: Increase command
[ 86.742859] ACPI : EC: ***** Command(RD_EC) started *****
[ 86.742861] ACPI : EC: ===== TASK (0) =====
[ 86.742871] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.742873] ACPI : EC: EC_SC(W) = 0x80
[ 86.742876] ACPI : EC: ***** Event started *****
[ 86.742880] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.743972] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.747966] ACPI : EC: ===== TASK (0) =====
[ 86.747977] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.747978] ACPI : EC: EC_DATA(W) = 0x06
[ 86.747981] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.751971] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755969] ACPI : EC: ===== TASK (0) =====
[ 86.755991] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 86.755993] ACPI : EC: EC_DATA(R) = 0x03
[ 86.755994] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755995] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 86.755996] ACPI : EC: 1: Decrease command
This patch corrects this by using time_before() instead in ec_guard():
[ 54.283146] ACPI: Waking up from system sleep state S3
[ 54.285414] ACPI : EC: 2: Increase command
[ 54.285415] ACPI : EC: ***** Command(RD_EC) started *****
[ 54.285416] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.285417] ACPI : EC: ===== TASK (0) =====
[ 54.285424] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.285425] ACPI : EC: EC_SC(W) = 0x80
[ 54.285427] ACPI : EC: ***** Event started *****
[ 54.285429] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.287209] ACPI : EC: ===== TASK (0) =====
[ 54.287218] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.287219] ACPI : EC: EC_DATA(W) = 0x06
[ 54.287222] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291190] ACPI : EC: ===== TASK (0) =====
[ 54.291210] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 54.291213] ACPI : EC: EC_DATA(R) = 0x03
[ 54.291214] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291215] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 54.291216] ACPI : EC: 1: Decrease command
After cleaning up all guarding logics, we have one single function
ec_guard() collecting all old, non-root-caused, hidden logics. Then we can
easily tune the logics in one place to respond to the bug reports.
Except the time_before() change, all other changes do not change the
behavior of the EC driver.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=12011
Link: https://bugzilla.kernel.org/show_bug.cgi?id=20242
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77431
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-15 14:16:42 +08:00
ec - > timestamp = jiffies ;
2005-07-23 04:08:00 -04:00
}
2006-09-26 19:50:33 +04:00
static inline void acpi_ec_write_data ( struct acpi_ec * ec , u8 data )
2005-07-23 04:08:00 -04:00
{
2015-02-27 14:48:15 +08:00
ec_dbg_raw ( " EC_DATA(W) = 0x%2.2x " , data ) ;
2006-09-26 19:50:33 +04:00
outb ( data , ec - > data_addr ) ;
ACPI / EC: Fix and clean up register access guarding logics.
In the polling mode, EC driver shouldn't access the EC registers too
frequently. Though this statement is concluded from the non-root caused
bugs (see links below), we've maintained the register access guarding
logics in the current EC driver. The guarding logics can be found here and
there, makes it hard to root cause real timing issues. This patch collects
the guarding logics into one single function so that all hidden logics
related to this can be seen clearly.
The current guarding related code also has several issues:
1. Per-transaction timestamp prevents inter-transaction guarding from being
implemented in the same place. We have an inter-transaction udelay() in
acpi_ec_transaction_unblocked(), this logic can be merged into ec_poll()
if we can use per-device timestamp. This patch completes such merge to
form a new ec_guard() function and collects all guarding related hidden
logics in it.
One hidden logic is: there is no inter-transaction guarding performed
for non MSI quirk (wait polling mode), this patch skips
inter-transaction guarding before wait_event_timeout() for the wait
polling mode to reveal the hidden logic.
The other hidden logic is: there is msleep() inter-transaction guarding
performed when the GPE storming is observed. As after merging this
commit:
Commit: e1d4d90fc0313d3d58cbd7912c90f8ef24df45ff
Subject: ACPI / EC: Refine command storm prevention support
EC_FLAGS_COMMAND_STORM is ensured to be cleared after invoking
acpi_ec_transaction_unlocked(), the msleep() guard logic will never
happen now. Since no one complains such change, this logic is likely
added during the old times where the EC race issues are not fixed and
the bugs are false root-caused to the timing issue. This patch simply
removes the out-dated logic. We can restore it by stop skipping
inter-transaction guarding for wait polling mode.
Two different delay values are defined for msleep() and udelay() while
they are merged in this patch to 550us.
2. time_after() causes additional delay in the polling mode (can only be
observed in noirq suspend/resume processes where polling mode is always
used) before advance_transaction() is invoked ("wait polling" log is
added before wait_event_timeout()). We can see 2 wait_event_timeout()
invocations. This is because time_after() ensures a ">" validation while
we only need a ">=" validation here:
[ 86.739909] ACPI: Waking up from system sleep state S3
[ 86.742857] ACPI : EC: 2: Increase command
[ 86.742859] ACPI : EC: ***** Command(RD_EC) started *****
[ 86.742861] ACPI : EC: ===== TASK (0) =====
[ 86.742871] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.742873] ACPI : EC: EC_SC(W) = 0x80
[ 86.742876] ACPI : EC: ***** Event started *****
[ 86.742880] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.743972] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.747966] ACPI : EC: ===== TASK (0) =====
[ 86.747977] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.747978] ACPI : EC: EC_DATA(W) = 0x06
[ 86.747981] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.751971] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755969] ACPI : EC: ===== TASK (0) =====
[ 86.755991] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 86.755993] ACPI : EC: EC_DATA(R) = 0x03
[ 86.755994] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755995] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 86.755996] ACPI : EC: 1: Decrease command
This patch corrects this by using time_before() instead in ec_guard():
[ 54.283146] ACPI: Waking up from system sleep state S3
[ 54.285414] ACPI : EC: 2: Increase command
[ 54.285415] ACPI : EC: ***** Command(RD_EC) started *****
[ 54.285416] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.285417] ACPI : EC: ===== TASK (0) =====
[ 54.285424] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.285425] ACPI : EC: EC_SC(W) = 0x80
[ 54.285427] ACPI : EC: ***** Event started *****
[ 54.285429] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.287209] ACPI : EC: ===== TASK (0) =====
[ 54.287218] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.287219] ACPI : EC: EC_DATA(W) = 0x06
[ 54.287222] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291190] ACPI : EC: ===== TASK (0) =====
[ 54.291210] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 54.291213] ACPI : EC: EC_DATA(R) = 0x03
[ 54.291214] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291215] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 54.291216] ACPI : EC: 1: Decrease command
After cleaning up all guarding logics, we have one single function
ec_guard() collecting all old, non-root-caused, hidden logics. Then we can
easily tune the logics in one place to respond to the bug reports.
Except the time_before() change, all other changes do not change the
behavior of the EC driver.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=12011
Link: https://bugzilla.kernel.org/show_bug.cgi?id=20242
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77431
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-15 14:16:42 +08:00
ec - > timestamp = jiffies ;
2006-09-26 19:50:33 +04:00
}
2005-07-23 04:08:00 -04:00
2017-06-14 13:59:18 +08:00
# if defined(DEBUG) || defined(CONFIG_DYNAMIC_DEBUG)
2014-10-14 14:23:49 +08:00
static const char * acpi_ec_cmd_string ( u8 cmd )
{
switch ( cmd ) {
case 0x80 :
return " RD_EC " ;
case 0x81 :
return " WR_EC " ;
case 0x82 :
return " BE_EC " ;
case 0x83 :
return " BD_EC " ;
case 0x84 :
return " QR_EC " ;
}
return " UNKNOWN " ;
}
# else
# define acpi_ec_cmd_string(cmd) "UNDEF"
# endif
ACPI / EC: Fix several GPE handling issues by deploying ACPI_GPE_DISPATCH_RAW_HANDLER mode.
This patch switches EC driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode where
the GPE lock is not held for acpi_ec_gpe_handler() and the ACPICA internal
GPE enabling/disabling/clearing operations are bypassed so that further
improvements are possible with the GPE APIs.
There are 2 strong reasons for deploying raw GPE handler mode in the EC
driver:
1. Some hardware logics can control their interrupts via their own
registers, so their interrupts can be disabled/enabled/acknowledged
without using the super IRQ controller provided functions. While there
is no mean (EC commands) for the EC driver to achieve this.
2. During suspending, the EC driver is still working for a while to
complete the platform firmware provided functionailities using ec_poll()
after all GPEs are disabled (see acpi_ec_block_transactions()), which
means the EC driver will drive the EC GPE out of the GPE core's control.
Without deploying the raw GPE handler mode, we can see many races between
the EC driver and the GPE core due to the above restrictions:
1. There is a race condition due to ACPICA internal GPE
disabling/clearing/enabling logics in acpi_ev_gpe_dispatch():
Orignally EC GPE is disabled (EN=0), cleared (STS=0) before invoking a
GPE handler and re-enabled (EN=1) after invoking a GPE handler in
acpi_ev_gpe_dispatch(). When re-enabling appears, GPE may be flagged
(STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() ec_poll()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1
This race condition is the root cause of different issues on different
silicon variations.
A. Silicon variation A:
On some platforms, GPE will be triggered due to "writing 1 to EN when
STS=1". This is because both EN and STS lines are wired to the GPE
trigger line.
1. Issue 1:
We can see no-op acpi_ec_gpe_handler() invoked on such platforms.
This is because:
a. event pending B: An event can arrive after ACPICA's GPE
clearing performed in acpi_ev_gpe_dispatch(), this event may
fail to be detected by EC_SC read that is performed before its
arrival;
b. event handling B: The event can be handled in ec_poll() because
EC lock is released after acpi_ec_gpe_handler() invocation;
c. There is no code in ec_poll() to clear STS but the GPE can
still be triggered by the EN=1 write performed in
acpi_ev_finish_gpe(), this leads to a no-op EC GPE handler
invocation;
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 1:
If we removed GPE disabling/enabling code from
acpi_ev_gpe_dispatch(), we could still see no-op GPE handlers
triggered by the event arriving after the GPE clearing and before
the GPE handling on both silicon variation A and B. This can only
occur if the CPU is very slow (timing slice between STS=0 write
and EC_SC read should be short enough before hardware sets another
GPE indication). Thus this is very rare and is not what we need to
fix.
B. Silicon variation B:
On other platforms, GPE may not be triggered due to "writing 1 to EN
when STS=1". This is because only STS line is wired to the GPE
trigger line.
2. Issue 2:
We can see GPE loss on such platforms. This is because:
a. event pending B vs. event handling A: An event can arrive after
ACPICA's GPE handling performed in acpi_ev_gpe_dispatch(), or
event pending C vs. event handling B: An event can arrive after
Linux's GPE handling performed in ec_poll(),
these events may fail to be detected by EC_SC read that is
performed before their arrival;
b. The GPE cannot be triggered by EN=1 write performed in
acpi_ev_finish_gpe();
c. If no polling mechanism is implemented in the driver for the
pending event (for example, SCI_EVT), this event is lost due to
no GPE being triggered.
Note 2:
On most platforms, there might be another rule that GPE may not be
triggered due to "writing 1 to STS when STS=1 and EN=1".
Then on silicon variation B, an even worse case is if the issue 2
event loss happens, further events may never trigger GPE again on
such platforms due to being blocked by the current STS=1. Unless
someone clears STS, all events have to be polled.
2. There is a race condition due to lacking in GPE status checking in EC
driver:
Originally, GPE status is checked in ACPICA core but not checked in
the GPE handler. Thus since the status checking and handling is not
locked, it can be interrupted by another handling path.
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_detect() ec_poll()
if (EN==1 && STS==1)
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
acpi_ev_gpe_dispatch()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
Unlock(EC)
*****************************************************************
3. Issue 3:
We can see no-op acpi_ec_gpe_handler() invoked on both silicon
variation A and B. This is because:
a. event pending A: An event can arrive to trigger an EC GPE and
ACPICA checks it and is about to invoke the EC GPE handler;
b. event handling A: The event can be handled in ec_poll() because
EC lock is not held after the GPE status checking;
c. event handling B: Then when the EC GPE handler is invoked, it
becomes a no-op GPE handler invocation.
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 3:
This no-op GPE handler invocation is rare because the time between
the IRQ arrival and the acpi_ec_gpe_handler() invocation is less than
the timeout value waited in ec_poll(). So most of the no-op GPE
handler invocations are caused by the reason described in issue 1.
3. There is a race condition due to ACPICA internal GPE clearing logic in
acpi_enable_gpe():
During runtime, acpi_enable_gpe() can be invoked by the EC storming
prevention code. When it is invoked, GPE may be flagged (STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() acpi_ec_transaction()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1 ?
Lock(EC)
Unlock(EC)
=================================================================
(event pending B)
=================================================================
acpi_enable_gpe()
STS=0
EN=1
4. Issue 4:
We can see GPE loss on both silicon variation A and B platforms.
This is because:
a. event pending B: An event can arrive right before ACPICA's GPE
clearing performed in acpi_enable_gpe();
b. If the GPE is cleared when GPE is disabled, then EN=1 write in
acpi_enable_gpe() cannot trigger this GPE;
c. If no polling mechanism is implemented in the driver for this
event (for example, SCI_EVT), this event is lost due to no GPE
being triggered.
Note 4:
Currently we don't have this issue, but after we switch the EC
driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode, we need to take care
of handling this because the EN=1 write in acpi_ev_gpe_dispatch()
will be abandoned.
There might be more race issues for the current GPE handler usages. This is
because the EC IRQ's enabling/disabling/checking/clearing/handling
operations should be locked by a single lock that is under the EC driver's
control to achieve the serialization. Which means we need to invoke GPE
APIs with EC driver's lock held and all ACPICA internal GPE operations
related to the GPE handler should be abandoned. Invoking GPE APIs inside of
the EC driver lock and bypassing ACPICA internal GPE operations requires
the ACPI_GPE_DISPATCH_RAW_HANDLER mode where the same lock used by the APIs
are released prior than invoking the handlers. Otherwise, we can see dead
locks due to circular locking dependencies (see Reference below).
This patch then switches the EC driver into the
ACPI_GPE_DISPATCH_RAW_HANDLER mode so that it can perform correct GPE
operations using the GPE APIs:
1. Bypasses EN modifications performed in acpi_ev_gpe_dispatch() by
using acpi_install_gpe_raw_handler() and invoking all GPE APIs with EC
spin lock held. This can fix issue 1 as it makes a non frequent GPE
enabling/disabling environment.
2. Bypasses STS clearing performed in acpi_enable_gpe() by replacing
acpi_enable_gpe()/acpi_disable_gpe() with acpi_set_gpe(). This can fix
issue 4. And this can also help to fix issue 1 as it makes a no sudden
GPE clearing environment when GPE is frequently enabled/disabled.
3. Ensures STS acknowledged before handling by invoking acpi_clear_gpe()
in advance_transaction(). This can finally fix issue 1 even in a
frequent GPE enabling/disabling environment. And this can also finally
fix issue 3 when issue 2 is fixed.
Note 3:
GPE clearing is edge triggered W1C, which means we can clear it right
before handling it. Since all EC GPE indications are handled in
advance_transaction() by previous commits, we can now move GPE clearing
into it to implement the correct GPE clearing.
Note 4:
We can use acpi_set_gpe() which is not shared GPE safer instead of
acpi_enable_gpe()/acpi_disable_gpe() because EC GPE is not shared by
other hardware, which is mentioned in the ACPI specification 5.0, 12.6
Interrupt Model: "OSPM driver treats this as an edge event (the EC SCI
cannot be shared)". So we can stop using shared GPE safer APIs
acpi_enable_gpe()/acpi_disable_gpe() in the EC driver. Otherwise
cleanups need to be made in acpi_ev_enable_gpe() to bypass the GPE
clearing logic before keeping acpi_enable_gpe().
This patch also invokes advance_transaction() when GPE is re-enabled in the
task context which:
1. Ensures EN=1 can trigger GPE by checking and handling EC status register
right after EN=1 writes. This can fix issue 2.
After applying this patch, without frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() ec_poll()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 1 (event pending B) can arrive as a next GPE
due to the previous IRQ context STS=0 write. And if it is handled by
ec_poll() (event handling B), as it is also acknowledged by ec_poll(), the
event pending for issue 2 (event pending C) can properly arrive as a next
GPE after the task context STS=0 write. So no GPE will be lost and never
triggered due to GPE clearing performed in the wrong position. And since
all GPE handling is performed after a locked GPE status checking, we can
hardly see no-op GPE handler invocations due to issue 1 and 3. We may still
see no-op GPE handler invocations due to "Note 1", but as it is inevitable,
it needn't be fixed.
After applying this patch, with frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() acpi_ec_transaction()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
EN=1
if STS==1
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 2 can be manually handled by
advance_transaction(). And after the STS=0 write performed in the manual
triggered advance_transaction(), GPE can always arrive. So no GPE will be
lost due to frequent GPE disabling/enabling performed in the driver like
issue 4.
Note 5:
It's ideally when EN=1 write occurred, an IRQ thread should be woken up to
handle the GPE when the GPE was raised. But this requires the IRQ thread to
contain the poller code for all EC GPE indications, while currently some of
the indications are handled in the user tasks. It then is very hard for the
code to determine whether a user task should be invoked or the poller work
item should be scheduled. So we have to invoke advance_transaction()
directly now and it leaves us such a restriction for the GPE re-enabling:
it must be performed in the task context to avoid starving the GPEs.
As a conclusion: we can see the EC GPE is always handled in serial after
deploying the raw GPE handler mode:
Lock(EC)
if (STS==1)
STS=0
EC_SC read
EC_SC handled
Unlock(EC)
The EC driver specific lock is responsible to make the EC GPE handling
processes serialized so that EC can handle its GPE from both IRQ and task
contexts and the next IRQ can be ensured to arrive after this process.
Note 6:
We have many EC_FLAGS_MSI qurik users in the current driver. They all seem
to be suffering from unexpected GPE triggering source lost. And they are
false root caused to a timing issue. Since EC communication protocol has
already flow control defined, timing shouldn't be the root cause, while
this fix might be fixing the root cause of the old bugs.
Link: https://lkml.org/lkml/2014/11/4/974
Link: https://lkml.org/lkml/2014/11/18/316
Link: https://www.spinics.net/lists/linux-acpi/msg54340.html
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05 16:27:22 +08:00
/* --------------------------------------------------------------------------
* GPE Registers
* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
2020-11-23 20:00:25 +01:00
static inline bool acpi_ec_gpe_status_set ( struct acpi_ec * ec )
ACPI / EC: Fix several GPE handling issues by deploying ACPI_GPE_DISPATCH_RAW_HANDLER mode.
This patch switches EC driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode where
the GPE lock is not held for acpi_ec_gpe_handler() and the ACPICA internal
GPE enabling/disabling/clearing operations are bypassed so that further
improvements are possible with the GPE APIs.
There are 2 strong reasons for deploying raw GPE handler mode in the EC
driver:
1. Some hardware logics can control their interrupts via their own
registers, so their interrupts can be disabled/enabled/acknowledged
without using the super IRQ controller provided functions. While there
is no mean (EC commands) for the EC driver to achieve this.
2. During suspending, the EC driver is still working for a while to
complete the platform firmware provided functionailities using ec_poll()
after all GPEs are disabled (see acpi_ec_block_transactions()), which
means the EC driver will drive the EC GPE out of the GPE core's control.
Without deploying the raw GPE handler mode, we can see many races between
the EC driver and the GPE core due to the above restrictions:
1. There is a race condition due to ACPICA internal GPE
disabling/clearing/enabling logics in acpi_ev_gpe_dispatch():
Orignally EC GPE is disabled (EN=0), cleared (STS=0) before invoking a
GPE handler and re-enabled (EN=1) after invoking a GPE handler in
acpi_ev_gpe_dispatch(). When re-enabling appears, GPE may be flagged
(STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() ec_poll()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1
This race condition is the root cause of different issues on different
silicon variations.
A. Silicon variation A:
On some platforms, GPE will be triggered due to "writing 1 to EN when
STS=1". This is because both EN and STS lines are wired to the GPE
trigger line.
1. Issue 1:
We can see no-op acpi_ec_gpe_handler() invoked on such platforms.
This is because:
a. event pending B: An event can arrive after ACPICA's GPE
clearing performed in acpi_ev_gpe_dispatch(), this event may
fail to be detected by EC_SC read that is performed before its
arrival;
b. event handling B: The event can be handled in ec_poll() because
EC lock is released after acpi_ec_gpe_handler() invocation;
c. There is no code in ec_poll() to clear STS but the GPE can
still be triggered by the EN=1 write performed in
acpi_ev_finish_gpe(), this leads to a no-op EC GPE handler
invocation;
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 1:
If we removed GPE disabling/enabling code from
acpi_ev_gpe_dispatch(), we could still see no-op GPE handlers
triggered by the event arriving after the GPE clearing and before
the GPE handling on both silicon variation A and B. This can only
occur if the CPU is very slow (timing slice between STS=0 write
and EC_SC read should be short enough before hardware sets another
GPE indication). Thus this is very rare and is not what we need to
fix.
B. Silicon variation B:
On other platforms, GPE may not be triggered due to "writing 1 to EN
when STS=1". This is because only STS line is wired to the GPE
trigger line.
2. Issue 2:
We can see GPE loss on such platforms. This is because:
a. event pending B vs. event handling A: An event can arrive after
ACPICA's GPE handling performed in acpi_ev_gpe_dispatch(), or
event pending C vs. event handling B: An event can arrive after
Linux's GPE handling performed in ec_poll(),
these events may fail to be detected by EC_SC read that is
performed before their arrival;
b. The GPE cannot be triggered by EN=1 write performed in
acpi_ev_finish_gpe();
c. If no polling mechanism is implemented in the driver for the
pending event (for example, SCI_EVT), this event is lost due to
no GPE being triggered.
Note 2:
On most platforms, there might be another rule that GPE may not be
triggered due to "writing 1 to STS when STS=1 and EN=1".
Then on silicon variation B, an even worse case is if the issue 2
event loss happens, further events may never trigger GPE again on
such platforms due to being blocked by the current STS=1. Unless
someone clears STS, all events have to be polled.
2. There is a race condition due to lacking in GPE status checking in EC
driver:
Originally, GPE status is checked in ACPICA core but not checked in
the GPE handler. Thus since the status checking and handling is not
locked, it can be interrupted by another handling path.
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_detect() ec_poll()
if (EN==1 && STS==1)
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
acpi_ev_gpe_dispatch()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
Unlock(EC)
*****************************************************************
3. Issue 3:
We can see no-op acpi_ec_gpe_handler() invoked on both silicon
variation A and B. This is because:
a. event pending A: An event can arrive to trigger an EC GPE and
ACPICA checks it and is about to invoke the EC GPE handler;
b. event handling A: The event can be handled in ec_poll() because
EC lock is not held after the GPE status checking;
c. event handling B: Then when the EC GPE handler is invoked, it
becomes a no-op GPE handler invocation.
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 3:
This no-op GPE handler invocation is rare because the time between
the IRQ arrival and the acpi_ec_gpe_handler() invocation is less than
the timeout value waited in ec_poll(). So most of the no-op GPE
handler invocations are caused by the reason described in issue 1.
3. There is a race condition due to ACPICA internal GPE clearing logic in
acpi_enable_gpe():
During runtime, acpi_enable_gpe() can be invoked by the EC storming
prevention code. When it is invoked, GPE may be flagged (STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() acpi_ec_transaction()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1 ?
Lock(EC)
Unlock(EC)
=================================================================
(event pending B)
=================================================================
acpi_enable_gpe()
STS=0
EN=1
4. Issue 4:
We can see GPE loss on both silicon variation A and B platforms.
This is because:
a. event pending B: An event can arrive right before ACPICA's GPE
clearing performed in acpi_enable_gpe();
b. If the GPE is cleared when GPE is disabled, then EN=1 write in
acpi_enable_gpe() cannot trigger this GPE;
c. If no polling mechanism is implemented in the driver for this
event (for example, SCI_EVT), this event is lost due to no GPE
being triggered.
Note 4:
Currently we don't have this issue, but after we switch the EC
driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode, we need to take care
of handling this because the EN=1 write in acpi_ev_gpe_dispatch()
will be abandoned.
There might be more race issues for the current GPE handler usages. This is
because the EC IRQ's enabling/disabling/checking/clearing/handling
operations should be locked by a single lock that is under the EC driver's
control to achieve the serialization. Which means we need to invoke GPE
APIs with EC driver's lock held and all ACPICA internal GPE operations
related to the GPE handler should be abandoned. Invoking GPE APIs inside of
the EC driver lock and bypassing ACPICA internal GPE operations requires
the ACPI_GPE_DISPATCH_RAW_HANDLER mode where the same lock used by the APIs
are released prior than invoking the handlers. Otherwise, we can see dead
locks due to circular locking dependencies (see Reference below).
This patch then switches the EC driver into the
ACPI_GPE_DISPATCH_RAW_HANDLER mode so that it can perform correct GPE
operations using the GPE APIs:
1. Bypasses EN modifications performed in acpi_ev_gpe_dispatch() by
using acpi_install_gpe_raw_handler() and invoking all GPE APIs with EC
spin lock held. This can fix issue 1 as it makes a non frequent GPE
enabling/disabling environment.
2. Bypasses STS clearing performed in acpi_enable_gpe() by replacing
acpi_enable_gpe()/acpi_disable_gpe() with acpi_set_gpe(). This can fix
issue 4. And this can also help to fix issue 1 as it makes a no sudden
GPE clearing environment when GPE is frequently enabled/disabled.
3. Ensures STS acknowledged before handling by invoking acpi_clear_gpe()
in advance_transaction(). This can finally fix issue 1 even in a
frequent GPE enabling/disabling environment. And this can also finally
fix issue 3 when issue 2 is fixed.
Note 3:
GPE clearing is edge triggered W1C, which means we can clear it right
before handling it. Since all EC GPE indications are handled in
advance_transaction() by previous commits, we can now move GPE clearing
into it to implement the correct GPE clearing.
Note 4:
We can use acpi_set_gpe() which is not shared GPE safer instead of
acpi_enable_gpe()/acpi_disable_gpe() because EC GPE is not shared by
other hardware, which is mentioned in the ACPI specification 5.0, 12.6
Interrupt Model: "OSPM driver treats this as an edge event (the EC SCI
cannot be shared)". So we can stop using shared GPE safer APIs
acpi_enable_gpe()/acpi_disable_gpe() in the EC driver. Otherwise
cleanups need to be made in acpi_ev_enable_gpe() to bypass the GPE
clearing logic before keeping acpi_enable_gpe().
This patch also invokes advance_transaction() when GPE is re-enabled in the
task context which:
1. Ensures EN=1 can trigger GPE by checking and handling EC status register
right after EN=1 writes. This can fix issue 2.
After applying this patch, without frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() ec_poll()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 1 (event pending B) can arrive as a next GPE
due to the previous IRQ context STS=0 write. And if it is handled by
ec_poll() (event handling B), as it is also acknowledged by ec_poll(), the
event pending for issue 2 (event pending C) can properly arrive as a next
GPE after the task context STS=0 write. So no GPE will be lost and never
triggered due to GPE clearing performed in the wrong position. And since
all GPE handling is performed after a locked GPE status checking, we can
hardly see no-op GPE handler invocations due to issue 1 and 3. We may still
see no-op GPE handler invocations due to "Note 1", but as it is inevitable,
it needn't be fixed.
After applying this patch, with frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() acpi_ec_transaction()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
EN=1
if STS==1
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 2 can be manually handled by
advance_transaction(). And after the STS=0 write performed in the manual
triggered advance_transaction(), GPE can always arrive. So no GPE will be
lost due to frequent GPE disabling/enabling performed in the driver like
issue 4.
Note 5:
It's ideally when EN=1 write occurred, an IRQ thread should be woken up to
handle the GPE when the GPE was raised. But this requires the IRQ thread to
contain the poller code for all EC GPE indications, while currently some of
the indications are handled in the user tasks. It then is very hard for the
code to determine whether a user task should be invoked or the poller work
item should be scheduled. So we have to invoke advance_transaction()
directly now and it leaves us such a restriction for the GPE re-enabling:
it must be performed in the task context to avoid starving the GPEs.
As a conclusion: we can see the EC GPE is always handled in serial after
deploying the raw GPE handler mode:
Lock(EC)
if (STS==1)
STS=0
EC_SC read
EC_SC handled
Unlock(EC)
The EC driver specific lock is responsible to make the EC GPE handling
processes serialized so that EC can handle its GPE from both IRQ and task
contexts and the next IRQ can be ensured to arrive after this process.
Note 6:
We have many EC_FLAGS_MSI qurik users in the current driver. They all seem
to be suffering from unexpected GPE triggering source lost. And they are
false root caused to a timing issue. Since EC communication protocol has
already flow control defined, timing shouldn't be the root cause, while
this fix might be fixing the root cause of the old bugs.
Link: https://lkml.org/lkml/2014/11/4/974
Link: https://lkml.org/lkml/2014/11/18/316
Link: https://www.spinics.net/lists/linux-acpi/msg54340.html
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05 16:27:22 +08:00
{
acpi_event_status gpe_status = 0 ;
( void ) acpi_get_gpe_status ( NULL , ec - > gpe , & gpe_status ) ;
2020-11-23 20:00:25 +01:00
return ! ! ( gpe_status & ACPI_EVENT_FLAG_STATUS_SET ) ;
ACPI / EC: Fix several GPE handling issues by deploying ACPI_GPE_DISPATCH_RAW_HANDLER mode.
This patch switches EC driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode where
the GPE lock is not held for acpi_ec_gpe_handler() and the ACPICA internal
GPE enabling/disabling/clearing operations are bypassed so that further
improvements are possible with the GPE APIs.
There are 2 strong reasons for deploying raw GPE handler mode in the EC
driver:
1. Some hardware logics can control their interrupts via their own
registers, so their interrupts can be disabled/enabled/acknowledged
without using the super IRQ controller provided functions. While there
is no mean (EC commands) for the EC driver to achieve this.
2. During suspending, the EC driver is still working for a while to
complete the platform firmware provided functionailities using ec_poll()
after all GPEs are disabled (see acpi_ec_block_transactions()), which
means the EC driver will drive the EC GPE out of the GPE core's control.
Without deploying the raw GPE handler mode, we can see many races between
the EC driver and the GPE core due to the above restrictions:
1. There is a race condition due to ACPICA internal GPE
disabling/clearing/enabling logics in acpi_ev_gpe_dispatch():
Orignally EC GPE is disabled (EN=0), cleared (STS=0) before invoking a
GPE handler and re-enabled (EN=1) after invoking a GPE handler in
acpi_ev_gpe_dispatch(). When re-enabling appears, GPE may be flagged
(STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() ec_poll()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1
This race condition is the root cause of different issues on different
silicon variations.
A. Silicon variation A:
On some platforms, GPE will be triggered due to "writing 1 to EN when
STS=1". This is because both EN and STS lines are wired to the GPE
trigger line.
1. Issue 1:
We can see no-op acpi_ec_gpe_handler() invoked on such platforms.
This is because:
a. event pending B: An event can arrive after ACPICA's GPE
clearing performed in acpi_ev_gpe_dispatch(), this event may
fail to be detected by EC_SC read that is performed before its
arrival;
b. event handling B: The event can be handled in ec_poll() because
EC lock is released after acpi_ec_gpe_handler() invocation;
c. There is no code in ec_poll() to clear STS but the GPE can
still be triggered by the EN=1 write performed in
acpi_ev_finish_gpe(), this leads to a no-op EC GPE handler
invocation;
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 1:
If we removed GPE disabling/enabling code from
acpi_ev_gpe_dispatch(), we could still see no-op GPE handlers
triggered by the event arriving after the GPE clearing and before
the GPE handling on both silicon variation A and B. This can only
occur if the CPU is very slow (timing slice between STS=0 write
and EC_SC read should be short enough before hardware sets another
GPE indication). Thus this is very rare and is not what we need to
fix.
B. Silicon variation B:
On other platforms, GPE may not be triggered due to "writing 1 to EN
when STS=1". This is because only STS line is wired to the GPE
trigger line.
2. Issue 2:
We can see GPE loss on such platforms. This is because:
a. event pending B vs. event handling A: An event can arrive after
ACPICA's GPE handling performed in acpi_ev_gpe_dispatch(), or
event pending C vs. event handling B: An event can arrive after
Linux's GPE handling performed in ec_poll(),
these events may fail to be detected by EC_SC read that is
performed before their arrival;
b. The GPE cannot be triggered by EN=1 write performed in
acpi_ev_finish_gpe();
c. If no polling mechanism is implemented in the driver for the
pending event (for example, SCI_EVT), this event is lost due to
no GPE being triggered.
Note 2:
On most platforms, there might be another rule that GPE may not be
triggered due to "writing 1 to STS when STS=1 and EN=1".
Then on silicon variation B, an even worse case is if the issue 2
event loss happens, further events may never trigger GPE again on
such platforms due to being blocked by the current STS=1. Unless
someone clears STS, all events have to be polled.
2. There is a race condition due to lacking in GPE status checking in EC
driver:
Originally, GPE status is checked in ACPICA core but not checked in
the GPE handler. Thus since the status checking and handling is not
locked, it can be interrupted by another handling path.
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_detect() ec_poll()
if (EN==1 && STS==1)
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
acpi_ev_gpe_dispatch()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
Unlock(EC)
*****************************************************************
3. Issue 3:
We can see no-op acpi_ec_gpe_handler() invoked on both silicon
variation A and B. This is because:
a. event pending A: An event can arrive to trigger an EC GPE and
ACPICA checks it and is about to invoke the EC GPE handler;
b. event handling A: The event can be handled in ec_poll() because
EC lock is not held after the GPE status checking;
c. event handling B: Then when the EC GPE handler is invoked, it
becomes a no-op GPE handler invocation.
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 3:
This no-op GPE handler invocation is rare because the time between
the IRQ arrival and the acpi_ec_gpe_handler() invocation is less than
the timeout value waited in ec_poll(). So most of the no-op GPE
handler invocations are caused by the reason described in issue 1.
3. There is a race condition due to ACPICA internal GPE clearing logic in
acpi_enable_gpe():
During runtime, acpi_enable_gpe() can be invoked by the EC storming
prevention code. When it is invoked, GPE may be flagged (STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() acpi_ec_transaction()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1 ?
Lock(EC)
Unlock(EC)
=================================================================
(event pending B)
=================================================================
acpi_enable_gpe()
STS=0
EN=1
4. Issue 4:
We can see GPE loss on both silicon variation A and B platforms.
This is because:
a. event pending B: An event can arrive right before ACPICA's GPE
clearing performed in acpi_enable_gpe();
b. If the GPE is cleared when GPE is disabled, then EN=1 write in
acpi_enable_gpe() cannot trigger this GPE;
c. If no polling mechanism is implemented in the driver for this
event (for example, SCI_EVT), this event is lost due to no GPE
being triggered.
Note 4:
Currently we don't have this issue, but after we switch the EC
driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode, we need to take care
of handling this because the EN=1 write in acpi_ev_gpe_dispatch()
will be abandoned.
There might be more race issues for the current GPE handler usages. This is
because the EC IRQ's enabling/disabling/checking/clearing/handling
operations should be locked by a single lock that is under the EC driver's
control to achieve the serialization. Which means we need to invoke GPE
APIs with EC driver's lock held and all ACPICA internal GPE operations
related to the GPE handler should be abandoned. Invoking GPE APIs inside of
the EC driver lock and bypassing ACPICA internal GPE operations requires
the ACPI_GPE_DISPATCH_RAW_HANDLER mode where the same lock used by the APIs
are released prior than invoking the handlers. Otherwise, we can see dead
locks due to circular locking dependencies (see Reference below).
This patch then switches the EC driver into the
ACPI_GPE_DISPATCH_RAW_HANDLER mode so that it can perform correct GPE
operations using the GPE APIs:
1. Bypasses EN modifications performed in acpi_ev_gpe_dispatch() by
using acpi_install_gpe_raw_handler() and invoking all GPE APIs with EC
spin lock held. This can fix issue 1 as it makes a non frequent GPE
enabling/disabling environment.
2. Bypasses STS clearing performed in acpi_enable_gpe() by replacing
acpi_enable_gpe()/acpi_disable_gpe() with acpi_set_gpe(). This can fix
issue 4. And this can also help to fix issue 1 as it makes a no sudden
GPE clearing environment when GPE is frequently enabled/disabled.
3. Ensures STS acknowledged before handling by invoking acpi_clear_gpe()
in advance_transaction(). This can finally fix issue 1 even in a
frequent GPE enabling/disabling environment. And this can also finally
fix issue 3 when issue 2 is fixed.
Note 3:
GPE clearing is edge triggered W1C, which means we can clear it right
before handling it. Since all EC GPE indications are handled in
advance_transaction() by previous commits, we can now move GPE clearing
into it to implement the correct GPE clearing.
Note 4:
We can use acpi_set_gpe() which is not shared GPE safer instead of
acpi_enable_gpe()/acpi_disable_gpe() because EC GPE is not shared by
other hardware, which is mentioned in the ACPI specification 5.0, 12.6
Interrupt Model: "OSPM driver treats this as an edge event (the EC SCI
cannot be shared)". So we can stop using shared GPE safer APIs
acpi_enable_gpe()/acpi_disable_gpe() in the EC driver. Otherwise
cleanups need to be made in acpi_ev_enable_gpe() to bypass the GPE
clearing logic before keeping acpi_enable_gpe().
This patch also invokes advance_transaction() when GPE is re-enabled in the
task context which:
1. Ensures EN=1 can trigger GPE by checking and handling EC status register
right after EN=1 writes. This can fix issue 2.
After applying this patch, without frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() ec_poll()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 1 (event pending B) can arrive as a next GPE
due to the previous IRQ context STS=0 write. And if it is handled by
ec_poll() (event handling B), as it is also acknowledged by ec_poll(), the
event pending for issue 2 (event pending C) can properly arrive as a next
GPE after the task context STS=0 write. So no GPE will be lost and never
triggered due to GPE clearing performed in the wrong position. And since
all GPE handling is performed after a locked GPE status checking, we can
hardly see no-op GPE handler invocations due to issue 1 and 3. We may still
see no-op GPE handler invocations due to "Note 1", but as it is inevitable,
it needn't be fixed.
After applying this patch, with frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() acpi_ec_transaction()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
EN=1
if STS==1
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 2 can be manually handled by
advance_transaction(). And after the STS=0 write performed in the manual
triggered advance_transaction(), GPE can always arrive. So no GPE will be
lost due to frequent GPE disabling/enabling performed in the driver like
issue 4.
Note 5:
It's ideally when EN=1 write occurred, an IRQ thread should be woken up to
handle the GPE when the GPE was raised. But this requires the IRQ thread to
contain the poller code for all EC GPE indications, while currently some of
the indications are handled in the user tasks. It then is very hard for the
code to determine whether a user task should be invoked or the poller work
item should be scheduled. So we have to invoke advance_transaction()
directly now and it leaves us such a restriction for the GPE re-enabling:
it must be performed in the task context to avoid starving the GPEs.
As a conclusion: we can see the EC GPE is always handled in serial after
deploying the raw GPE handler mode:
Lock(EC)
if (STS==1)
STS=0
EC_SC read
EC_SC handled
Unlock(EC)
The EC driver specific lock is responsible to make the EC GPE handling
processes serialized so that EC can handle its GPE from both IRQ and task
contexts and the next IRQ can be ensured to arrive after this process.
Note 6:
We have many EC_FLAGS_MSI qurik users in the current driver. They all seem
to be suffering from unexpected GPE triggering source lost. And they are
false root caused to a timing issue. Since EC communication protocol has
already flow control defined, timing shouldn't be the root cause, while
this fix might be fixing the root cause of the old bugs.
Link: https://lkml.org/lkml/2014/11/4/974
Link: https://lkml.org/lkml/2014/11/18/316
Link: https://www.spinics.net/lists/linux-acpi/msg54340.html
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05 16:27:22 +08:00
}
static inline void acpi_ec_enable_gpe ( struct acpi_ec * ec , bool open )
{
if ( open )
acpi_enable_gpe ( NULL , ec - > gpe ) ;
2015-02-06 08:58:05 +08:00
else {
BUG_ON ( ec - > reference_count < 1 ) ;
ACPI / EC: Fix several GPE handling issues by deploying ACPI_GPE_DISPATCH_RAW_HANDLER mode.
This patch switches EC driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode where
the GPE lock is not held for acpi_ec_gpe_handler() and the ACPICA internal
GPE enabling/disabling/clearing operations are bypassed so that further
improvements are possible with the GPE APIs.
There are 2 strong reasons for deploying raw GPE handler mode in the EC
driver:
1. Some hardware logics can control their interrupts via their own
registers, so their interrupts can be disabled/enabled/acknowledged
without using the super IRQ controller provided functions. While there
is no mean (EC commands) for the EC driver to achieve this.
2. During suspending, the EC driver is still working for a while to
complete the platform firmware provided functionailities using ec_poll()
after all GPEs are disabled (see acpi_ec_block_transactions()), which
means the EC driver will drive the EC GPE out of the GPE core's control.
Without deploying the raw GPE handler mode, we can see many races between
the EC driver and the GPE core due to the above restrictions:
1. There is a race condition due to ACPICA internal GPE
disabling/clearing/enabling logics in acpi_ev_gpe_dispatch():
Orignally EC GPE is disabled (EN=0), cleared (STS=0) before invoking a
GPE handler and re-enabled (EN=1) after invoking a GPE handler in
acpi_ev_gpe_dispatch(). When re-enabling appears, GPE may be flagged
(STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() ec_poll()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1
This race condition is the root cause of different issues on different
silicon variations.
A. Silicon variation A:
On some platforms, GPE will be triggered due to "writing 1 to EN when
STS=1". This is because both EN and STS lines are wired to the GPE
trigger line.
1. Issue 1:
We can see no-op acpi_ec_gpe_handler() invoked on such platforms.
This is because:
a. event pending B: An event can arrive after ACPICA's GPE
clearing performed in acpi_ev_gpe_dispatch(), this event may
fail to be detected by EC_SC read that is performed before its
arrival;
b. event handling B: The event can be handled in ec_poll() because
EC lock is released after acpi_ec_gpe_handler() invocation;
c. There is no code in ec_poll() to clear STS but the GPE can
still be triggered by the EN=1 write performed in
acpi_ev_finish_gpe(), this leads to a no-op EC GPE handler
invocation;
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 1:
If we removed GPE disabling/enabling code from
acpi_ev_gpe_dispatch(), we could still see no-op GPE handlers
triggered by the event arriving after the GPE clearing and before
the GPE handling on both silicon variation A and B. This can only
occur if the CPU is very slow (timing slice between STS=0 write
and EC_SC read should be short enough before hardware sets another
GPE indication). Thus this is very rare and is not what we need to
fix.
B. Silicon variation B:
On other platforms, GPE may not be triggered due to "writing 1 to EN
when STS=1". This is because only STS line is wired to the GPE
trigger line.
2. Issue 2:
We can see GPE loss on such platforms. This is because:
a. event pending B vs. event handling A: An event can arrive after
ACPICA's GPE handling performed in acpi_ev_gpe_dispatch(), or
event pending C vs. event handling B: An event can arrive after
Linux's GPE handling performed in ec_poll(),
these events may fail to be detected by EC_SC read that is
performed before their arrival;
b. The GPE cannot be triggered by EN=1 write performed in
acpi_ev_finish_gpe();
c. If no polling mechanism is implemented in the driver for the
pending event (for example, SCI_EVT), this event is lost due to
no GPE being triggered.
Note 2:
On most platforms, there might be another rule that GPE may not be
triggered due to "writing 1 to STS when STS=1 and EN=1".
Then on silicon variation B, an even worse case is if the issue 2
event loss happens, further events may never trigger GPE again on
such platforms due to being blocked by the current STS=1. Unless
someone clears STS, all events have to be polled.
2. There is a race condition due to lacking in GPE status checking in EC
driver:
Originally, GPE status is checked in ACPICA core but not checked in
the GPE handler. Thus since the status checking and handling is not
locked, it can be interrupted by another handling path.
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_detect() ec_poll()
if (EN==1 && STS==1)
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
acpi_ev_gpe_dispatch()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
Unlock(EC)
*****************************************************************
3. Issue 3:
We can see no-op acpi_ec_gpe_handler() invoked on both silicon
variation A and B. This is because:
a. event pending A: An event can arrive to trigger an EC GPE and
ACPICA checks it and is about to invoke the EC GPE handler;
b. event handling A: The event can be handled in ec_poll() because
EC lock is not held after the GPE status checking;
c. event handling B: Then when the EC GPE handler is invoked, it
becomes a no-op GPE handler invocation.
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 3:
This no-op GPE handler invocation is rare because the time between
the IRQ arrival and the acpi_ec_gpe_handler() invocation is less than
the timeout value waited in ec_poll(). So most of the no-op GPE
handler invocations are caused by the reason described in issue 1.
3. There is a race condition due to ACPICA internal GPE clearing logic in
acpi_enable_gpe():
During runtime, acpi_enable_gpe() can be invoked by the EC storming
prevention code. When it is invoked, GPE may be flagged (STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() acpi_ec_transaction()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1 ?
Lock(EC)
Unlock(EC)
=================================================================
(event pending B)
=================================================================
acpi_enable_gpe()
STS=0
EN=1
4. Issue 4:
We can see GPE loss on both silicon variation A and B platforms.
This is because:
a. event pending B: An event can arrive right before ACPICA's GPE
clearing performed in acpi_enable_gpe();
b. If the GPE is cleared when GPE is disabled, then EN=1 write in
acpi_enable_gpe() cannot trigger this GPE;
c. If no polling mechanism is implemented in the driver for this
event (for example, SCI_EVT), this event is lost due to no GPE
being triggered.
Note 4:
Currently we don't have this issue, but after we switch the EC
driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode, we need to take care
of handling this because the EN=1 write in acpi_ev_gpe_dispatch()
will be abandoned.
There might be more race issues for the current GPE handler usages. This is
because the EC IRQ's enabling/disabling/checking/clearing/handling
operations should be locked by a single lock that is under the EC driver's
control to achieve the serialization. Which means we need to invoke GPE
APIs with EC driver's lock held and all ACPICA internal GPE operations
related to the GPE handler should be abandoned. Invoking GPE APIs inside of
the EC driver lock and bypassing ACPICA internal GPE operations requires
the ACPI_GPE_DISPATCH_RAW_HANDLER mode where the same lock used by the APIs
are released prior than invoking the handlers. Otherwise, we can see dead
locks due to circular locking dependencies (see Reference below).
This patch then switches the EC driver into the
ACPI_GPE_DISPATCH_RAW_HANDLER mode so that it can perform correct GPE
operations using the GPE APIs:
1. Bypasses EN modifications performed in acpi_ev_gpe_dispatch() by
using acpi_install_gpe_raw_handler() and invoking all GPE APIs with EC
spin lock held. This can fix issue 1 as it makes a non frequent GPE
enabling/disabling environment.
2. Bypasses STS clearing performed in acpi_enable_gpe() by replacing
acpi_enable_gpe()/acpi_disable_gpe() with acpi_set_gpe(). This can fix
issue 4. And this can also help to fix issue 1 as it makes a no sudden
GPE clearing environment when GPE is frequently enabled/disabled.
3. Ensures STS acknowledged before handling by invoking acpi_clear_gpe()
in advance_transaction(). This can finally fix issue 1 even in a
frequent GPE enabling/disabling environment. And this can also finally
fix issue 3 when issue 2 is fixed.
Note 3:
GPE clearing is edge triggered W1C, which means we can clear it right
before handling it. Since all EC GPE indications are handled in
advance_transaction() by previous commits, we can now move GPE clearing
into it to implement the correct GPE clearing.
Note 4:
We can use acpi_set_gpe() which is not shared GPE safer instead of
acpi_enable_gpe()/acpi_disable_gpe() because EC GPE is not shared by
other hardware, which is mentioned in the ACPI specification 5.0, 12.6
Interrupt Model: "OSPM driver treats this as an edge event (the EC SCI
cannot be shared)". So we can stop using shared GPE safer APIs
acpi_enable_gpe()/acpi_disable_gpe() in the EC driver. Otherwise
cleanups need to be made in acpi_ev_enable_gpe() to bypass the GPE
clearing logic before keeping acpi_enable_gpe().
This patch also invokes advance_transaction() when GPE is re-enabled in the
task context which:
1. Ensures EN=1 can trigger GPE by checking and handling EC status register
right after EN=1 writes. This can fix issue 2.
After applying this patch, without frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() ec_poll()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 1 (event pending B) can arrive as a next GPE
due to the previous IRQ context STS=0 write. And if it is handled by
ec_poll() (event handling B), as it is also acknowledged by ec_poll(), the
event pending for issue 2 (event pending C) can properly arrive as a next
GPE after the task context STS=0 write. So no GPE will be lost and never
triggered due to GPE clearing performed in the wrong position. And since
all GPE handling is performed after a locked GPE status checking, we can
hardly see no-op GPE handler invocations due to issue 1 and 3. We may still
see no-op GPE handler invocations due to "Note 1", but as it is inevitable,
it needn't be fixed.
After applying this patch, with frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() acpi_ec_transaction()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
EN=1
if STS==1
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 2 can be manually handled by
advance_transaction(). And after the STS=0 write performed in the manual
triggered advance_transaction(), GPE can always arrive. So no GPE will be
lost due to frequent GPE disabling/enabling performed in the driver like
issue 4.
Note 5:
It's ideally when EN=1 write occurred, an IRQ thread should be woken up to
handle the GPE when the GPE was raised. But this requires the IRQ thread to
contain the poller code for all EC GPE indications, while currently some of
the indications are handled in the user tasks. It then is very hard for the
code to determine whether a user task should be invoked or the poller work
item should be scheduled. So we have to invoke advance_transaction()
directly now and it leaves us such a restriction for the GPE re-enabling:
it must be performed in the task context to avoid starving the GPEs.
As a conclusion: we can see the EC GPE is always handled in serial after
deploying the raw GPE handler mode:
Lock(EC)
if (STS==1)
STS=0
EC_SC read
EC_SC handled
Unlock(EC)
The EC driver specific lock is responsible to make the EC GPE handling
processes serialized so that EC can handle its GPE from both IRQ and task
contexts and the next IRQ can be ensured to arrive after this process.
Note 6:
We have many EC_FLAGS_MSI qurik users in the current driver. They all seem
to be suffering from unexpected GPE triggering source lost. And they are
false root caused to a timing issue. Since EC communication protocol has
already flow control defined, timing shouldn't be the root cause, while
this fix might be fixing the root cause of the old bugs.
Link: https://lkml.org/lkml/2014/11/4/974
Link: https://lkml.org/lkml/2014/11/18/316
Link: https://www.spinics.net/lists/linux-acpi/msg54340.html
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05 16:27:22 +08:00
acpi_set_gpe ( NULL , ec - > gpe , ACPI_GPE_ENABLE ) ;
2015-02-06 08:58:05 +08:00
}
2020-11-23 20:00:25 +01:00
if ( acpi_ec_gpe_status_set ( ec ) ) {
ACPI / EC: Fix several GPE handling issues by deploying ACPI_GPE_DISPATCH_RAW_HANDLER mode.
This patch switches EC driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode where
the GPE lock is not held for acpi_ec_gpe_handler() and the ACPICA internal
GPE enabling/disabling/clearing operations are bypassed so that further
improvements are possible with the GPE APIs.
There are 2 strong reasons for deploying raw GPE handler mode in the EC
driver:
1. Some hardware logics can control their interrupts via their own
registers, so their interrupts can be disabled/enabled/acknowledged
without using the super IRQ controller provided functions. While there
is no mean (EC commands) for the EC driver to achieve this.
2. During suspending, the EC driver is still working for a while to
complete the platform firmware provided functionailities using ec_poll()
after all GPEs are disabled (see acpi_ec_block_transactions()), which
means the EC driver will drive the EC GPE out of the GPE core's control.
Without deploying the raw GPE handler mode, we can see many races between
the EC driver and the GPE core due to the above restrictions:
1. There is a race condition due to ACPICA internal GPE
disabling/clearing/enabling logics in acpi_ev_gpe_dispatch():
Orignally EC GPE is disabled (EN=0), cleared (STS=0) before invoking a
GPE handler and re-enabled (EN=1) after invoking a GPE handler in
acpi_ev_gpe_dispatch(). When re-enabling appears, GPE may be flagged
(STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() ec_poll()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1
This race condition is the root cause of different issues on different
silicon variations.
A. Silicon variation A:
On some platforms, GPE will be triggered due to "writing 1 to EN when
STS=1". This is because both EN and STS lines are wired to the GPE
trigger line.
1. Issue 1:
We can see no-op acpi_ec_gpe_handler() invoked on such platforms.
This is because:
a. event pending B: An event can arrive after ACPICA's GPE
clearing performed in acpi_ev_gpe_dispatch(), this event may
fail to be detected by EC_SC read that is performed before its
arrival;
b. event handling B: The event can be handled in ec_poll() because
EC lock is released after acpi_ec_gpe_handler() invocation;
c. There is no code in ec_poll() to clear STS but the GPE can
still be triggered by the EN=1 write performed in
acpi_ev_finish_gpe(), this leads to a no-op EC GPE handler
invocation;
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 1:
If we removed GPE disabling/enabling code from
acpi_ev_gpe_dispatch(), we could still see no-op GPE handlers
triggered by the event arriving after the GPE clearing and before
the GPE handling on both silicon variation A and B. This can only
occur if the CPU is very slow (timing slice between STS=0 write
and EC_SC read should be short enough before hardware sets another
GPE indication). Thus this is very rare and is not what we need to
fix.
B. Silicon variation B:
On other platforms, GPE may not be triggered due to "writing 1 to EN
when STS=1". This is because only STS line is wired to the GPE
trigger line.
2. Issue 2:
We can see GPE loss on such platforms. This is because:
a. event pending B vs. event handling A: An event can arrive after
ACPICA's GPE handling performed in acpi_ev_gpe_dispatch(), or
event pending C vs. event handling B: An event can arrive after
Linux's GPE handling performed in ec_poll(),
these events may fail to be detected by EC_SC read that is
performed before their arrival;
b. The GPE cannot be triggered by EN=1 write performed in
acpi_ev_finish_gpe();
c. If no polling mechanism is implemented in the driver for the
pending event (for example, SCI_EVT), this event is lost due to
no GPE being triggered.
Note 2:
On most platforms, there might be another rule that GPE may not be
triggered due to "writing 1 to STS when STS=1 and EN=1".
Then on silicon variation B, an even worse case is if the issue 2
event loss happens, further events may never trigger GPE again on
such platforms due to being blocked by the current STS=1. Unless
someone clears STS, all events have to be polled.
2. There is a race condition due to lacking in GPE status checking in EC
driver:
Originally, GPE status is checked in ACPICA core but not checked in
the GPE handler. Thus since the status checking and handling is not
locked, it can be interrupted by another handling path.
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_detect() ec_poll()
if (EN==1 && STS==1)
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
acpi_ev_gpe_dispatch()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
Unlock(EC)
*****************************************************************
3. Issue 3:
We can see no-op acpi_ec_gpe_handler() invoked on both silicon
variation A and B. This is because:
a. event pending A: An event can arrive to trigger an EC GPE and
ACPICA checks it and is about to invoke the EC GPE handler;
b. event handling A: The event can be handled in ec_poll() because
EC lock is not held after the GPE status checking;
c. event handling B: Then when the EC GPE handler is invoked, it
becomes a no-op GPE handler invocation.
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 3:
This no-op GPE handler invocation is rare because the time between
the IRQ arrival and the acpi_ec_gpe_handler() invocation is less than
the timeout value waited in ec_poll(). So most of the no-op GPE
handler invocations are caused by the reason described in issue 1.
3. There is a race condition due to ACPICA internal GPE clearing logic in
acpi_enable_gpe():
During runtime, acpi_enable_gpe() can be invoked by the EC storming
prevention code. When it is invoked, GPE may be flagged (STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() acpi_ec_transaction()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1 ?
Lock(EC)
Unlock(EC)
=================================================================
(event pending B)
=================================================================
acpi_enable_gpe()
STS=0
EN=1
4. Issue 4:
We can see GPE loss on both silicon variation A and B platforms.
This is because:
a. event pending B: An event can arrive right before ACPICA's GPE
clearing performed in acpi_enable_gpe();
b. If the GPE is cleared when GPE is disabled, then EN=1 write in
acpi_enable_gpe() cannot trigger this GPE;
c. If no polling mechanism is implemented in the driver for this
event (for example, SCI_EVT), this event is lost due to no GPE
being triggered.
Note 4:
Currently we don't have this issue, but after we switch the EC
driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode, we need to take care
of handling this because the EN=1 write in acpi_ev_gpe_dispatch()
will be abandoned.
There might be more race issues for the current GPE handler usages. This is
because the EC IRQ's enabling/disabling/checking/clearing/handling
operations should be locked by a single lock that is under the EC driver's
control to achieve the serialization. Which means we need to invoke GPE
APIs with EC driver's lock held and all ACPICA internal GPE operations
related to the GPE handler should be abandoned. Invoking GPE APIs inside of
the EC driver lock and bypassing ACPICA internal GPE operations requires
the ACPI_GPE_DISPATCH_RAW_HANDLER mode where the same lock used by the APIs
are released prior than invoking the handlers. Otherwise, we can see dead
locks due to circular locking dependencies (see Reference below).
This patch then switches the EC driver into the
ACPI_GPE_DISPATCH_RAW_HANDLER mode so that it can perform correct GPE
operations using the GPE APIs:
1. Bypasses EN modifications performed in acpi_ev_gpe_dispatch() by
using acpi_install_gpe_raw_handler() and invoking all GPE APIs with EC
spin lock held. This can fix issue 1 as it makes a non frequent GPE
enabling/disabling environment.
2. Bypasses STS clearing performed in acpi_enable_gpe() by replacing
acpi_enable_gpe()/acpi_disable_gpe() with acpi_set_gpe(). This can fix
issue 4. And this can also help to fix issue 1 as it makes a no sudden
GPE clearing environment when GPE is frequently enabled/disabled.
3. Ensures STS acknowledged before handling by invoking acpi_clear_gpe()
in advance_transaction(). This can finally fix issue 1 even in a
frequent GPE enabling/disabling environment. And this can also finally
fix issue 3 when issue 2 is fixed.
Note 3:
GPE clearing is edge triggered W1C, which means we can clear it right
before handling it. Since all EC GPE indications are handled in
advance_transaction() by previous commits, we can now move GPE clearing
into it to implement the correct GPE clearing.
Note 4:
We can use acpi_set_gpe() which is not shared GPE safer instead of
acpi_enable_gpe()/acpi_disable_gpe() because EC GPE is not shared by
other hardware, which is mentioned in the ACPI specification 5.0, 12.6
Interrupt Model: "OSPM driver treats this as an edge event (the EC SCI
cannot be shared)". So we can stop using shared GPE safer APIs
acpi_enable_gpe()/acpi_disable_gpe() in the EC driver. Otherwise
cleanups need to be made in acpi_ev_enable_gpe() to bypass the GPE
clearing logic before keeping acpi_enable_gpe().
This patch also invokes advance_transaction() when GPE is re-enabled in the
task context which:
1. Ensures EN=1 can trigger GPE by checking and handling EC status register
right after EN=1 writes. This can fix issue 2.
After applying this patch, without frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() ec_poll()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 1 (event pending B) can arrive as a next GPE
due to the previous IRQ context STS=0 write. And if it is handled by
ec_poll() (event handling B), as it is also acknowledged by ec_poll(), the
event pending for issue 2 (event pending C) can properly arrive as a next
GPE after the task context STS=0 write. So no GPE will be lost and never
triggered due to GPE clearing performed in the wrong position. And since
all GPE handling is performed after a locked GPE status checking, we can
hardly see no-op GPE handler invocations due to issue 1 and 3. We may still
see no-op GPE handler invocations due to "Note 1", but as it is inevitable,
it needn't be fixed.
After applying this patch, with frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() acpi_ec_transaction()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
EN=1
if STS==1
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 2 can be manually handled by
advance_transaction(). And after the STS=0 write performed in the manual
triggered advance_transaction(), GPE can always arrive. So no GPE will be
lost due to frequent GPE disabling/enabling performed in the driver like
issue 4.
Note 5:
It's ideally when EN=1 write occurred, an IRQ thread should be woken up to
handle the GPE when the GPE was raised. But this requires the IRQ thread to
contain the poller code for all EC GPE indications, while currently some of
the indications are handled in the user tasks. It then is very hard for the
code to determine whether a user task should be invoked or the poller work
item should be scheduled. So we have to invoke advance_transaction()
directly now and it leaves us such a restriction for the GPE re-enabling:
it must be performed in the task context to avoid starving the GPEs.
As a conclusion: we can see the EC GPE is always handled in serial after
deploying the raw GPE handler mode:
Lock(EC)
if (STS==1)
STS=0
EC_SC read
EC_SC handled
Unlock(EC)
The EC driver specific lock is responsible to make the EC GPE handling
processes serialized so that EC can handle its GPE from both IRQ and task
contexts and the next IRQ can be ensured to arrive after this process.
Note 6:
We have many EC_FLAGS_MSI qurik users in the current driver. They all seem
to be suffering from unexpected GPE triggering source lost. And they are
false root caused to a timing issue. Since EC communication protocol has
already flow control defined, timing shouldn't be the root cause, while
this fix might be fixing the root cause of the old bugs.
Link: https://lkml.org/lkml/2014/11/4/974
Link: https://lkml.org/lkml/2014/11/18/316
Link: https://www.spinics.net/lists/linux-acpi/msg54340.html
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05 16:27:22 +08:00
/*
* On some platforms , EN = 1 writes cannot trigger GPE . So
* software need to manually trigger a pseudo GPE event on
* EN = 1 writes .
*/
2015-02-27 14:48:15 +08:00
ec_dbg_raw ( " Polling quirk " ) ;
2020-11-13 19:13:17 +01:00
advance_transaction ( ec , false ) ;
ACPI / EC: Fix several GPE handling issues by deploying ACPI_GPE_DISPATCH_RAW_HANDLER mode.
This patch switches EC driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode where
the GPE lock is not held for acpi_ec_gpe_handler() and the ACPICA internal
GPE enabling/disabling/clearing operations are bypassed so that further
improvements are possible with the GPE APIs.
There are 2 strong reasons for deploying raw GPE handler mode in the EC
driver:
1. Some hardware logics can control their interrupts via their own
registers, so their interrupts can be disabled/enabled/acknowledged
without using the super IRQ controller provided functions. While there
is no mean (EC commands) for the EC driver to achieve this.
2. During suspending, the EC driver is still working for a while to
complete the platform firmware provided functionailities using ec_poll()
after all GPEs are disabled (see acpi_ec_block_transactions()), which
means the EC driver will drive the EC GPE out of the GPE core's control.
Without deploying the raw GPE handler mode, we can see many races between
the EC driver and the GPE core due to the above restrictions:
1. There is a race condition due to ACPICA internal GPE
disabling/clearing/enabling logics in acpi_ev_gpe_dispatch():
Orignally EC GPE is disabled (EN=0), cleared (STS=0) before invoking a
GPE handler and re-enabled (EN=1) after invoking a GPE handler in
acpi_ev_gpe_dispatch(). When re-enabling appears, GPE may be flagged
(STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() ec_poll()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1
This race condition is the root cause of different issues on different
silicon variations.
A. Silicon variation A:
On some platforms, GPE will be triggered due to "writing 1 to EN when
STS=1". This is because both EN and STS lines are wired to the GPE
trigger line.
1. Issue 1:
We can see no-op acpi_ec_gpe_handler() invoked on such platforms.
This is because:
a. event pending B: An event can arrive after ACPICA's GPE
clearing performed in acpi_ev_gpe_dispatch(), this event may
fail to be detected by EC_SC read that is performed before its
arrival;
b. event handling B: The event can be handled in ec_poll() because
EC lock is released after acpi_ec_gpe_handler() invocation;
c. There is no code in ec_poll() to clear STS but the GPE can
still be triggered by the EN=1 write performed in
acpi_ev_finish_gpe(), this leads to a no-op EC GPE handler
invocation;
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 1:
If we removed GPE disabling/enabling code from
acpi_ev_gpe_dispatch(), we could still see no-op GPE handlers
triggered by the event arriving after the GPE clearing and before
the GPE handling on both silicon variation A and B. This can only
occur if the CPU is very slow (timing slice between STS=0 write
and EC_SC read should be short enough before hardware sets another
GPE indication). Thus this is very rare and is not what we need to
fix.
B. Silicon variation B:
On other platforms, GPE may not be triggered due to "writing 1 to EN
when STS=1". This is because only STS line is wired to the GPE
trigger line.
2. Issue 2:
We can see GPE loss on such platforms. This is because:
a. event pending B vs. event handling A: An event can arrive after
ACPICA's GPE handling performed in acpi_ev_gpe_dispatch(), or
event pending C vs. event handling B: An event can arrive after
Linux's GPE handling performed in ec_poll(),
these events may fail to be detected by EC_SC read that is
performed before their arrival;
b. The GPE cannot be triggered by EN=1 write performed in
acpi_ev_finish_gpe();
c. If no polling mechanism is implemented in the driver for the
pending event (for example, SCI_EVT), this event is lost due to
no GPE being triggered.
Note 2:
On most platforms, there might be another rule that GPE may not be
triggered due to "writing 1 to STS when STS=1 and EN=1".
Then on silicon variation B, an even worse case is if the issue 2
event loss happens, further events may never trigger GPE again on
such platforms due to being blocked by the current STS=1. Unless
someone clears STS, all events have to be polled.
2. There is a race condition due to lacking in GPE status checking in EC
driver:
Originally, GPE status is checked in ACPICA core but not checked in
the GPE handler. Thus since the status checking and handling is not
locked, it can be interrupted by another handling path.
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_detect() ec_poll()
if (EN==1 && STS==1)
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
acpi_ev_gpe_dispatch()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
Unlock(EC)
*****************************************************************
3. Issue 3:
We can see no-op acpi_ec_gpe_handler() invoked on both silicon
variation A and B. This is because:
a. event pending A: An event can arrive to trigger an EC GPE and
ACPICA checks it and is about to invoke the EC GPE handler;
b. event handling A: The event can be handled in ec_poll() because
EC lock is not held after the GPE status checking;
c. event handling B: Then when the EC GPE handler is invoked, it
becomes a no-op GPE handler invocation.
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 3:
This no-op GPE handler invocation is rare because the time between
the IRQ arrival and the acpi_ec_gpe_handler() invocation is less than
the timeout value waited in ec_poll(). So most of the no-op GPE
handler invocations are caused by the reason described in issue 1.
3. There is a race condition due to ACPICA internal GPE clearing logic in
acpi_enable_gpe():
During runtime, acpi_enable_gpe() can be invoked by the EC storming
prevention code. When it is invoked, GPE may be flagged (STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() acpi_ec_transaction()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1 ?
Lock(EC)
Unlock(EC)
=================================================================
(event pending B)
=================================================================
acpi_enable_gpe()
STS=0
EN=1
4. Issue 4:
We can see GPE loss on both silicon variation A and B platforms.
This is because:
a. event pending B: An event can arrive right before ACPICA's GPE
clearing performed in acpi_enable_gpe();
b. If the GPE is cleared when GPE is disabled, then EN=1 write in
acpi_enable_gpe() cannot trigger this GPE;
c. If no polling mechanism is implemented in the driver for this
event (for example, SCI_EVT), this event is lost due to no GPE
being triggered.
Note 4:
Currently we don't have this issue, but after we switch the EC
driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode, we need to take care
of handling this because the EN=1 write in acpi_ev_gpe_dispatch()
will be abandoned.
There might be more race issues for the current GPE handler usages. This is
because the EC IRQ's enabling/disabling/checking/clearing/handling
operations should be locked by a single lock that is under the EC driver's
control to achieve the serialization. Which means we need to invoke GPE
APIs with EC driver's lock held and all ACPICA internal GPE operations
related to the GPE handler should be abandoned. Invoking GPE APIs inside of
the EC driver lock and bypassing ACPICA internal GPE operations requires
the ACPI_GPE_DISPATCH_RAW_HANDLER mode where the same lock used by the APIs
are released prior than invoking the handlers. Otherwise, we can see dead
locks due to circular locking dependencies (see Reference below).
This patch then switches the EC driver into the
ACPI_GPE_DISPATCH_RAW_HANDLER mode so that it can perform correct GPE
operations using the GPE APIs:
1. Bypasses EN modifications performed in acpi_ev_gpe_dispatch() by
using acpi_install_gpe_raw_handler() and invoking all GPE APIs with EC
spin lock held. This can fix issue 1 as it makes a non frequent GPE
enabling/disabling environment.
2. Bypasses STS clearing performed in acpi_enable_gpe() by replacing
acpi_enable_gpe()/acpi_disable_gpe() with acpi_set_gpe(). This can fix
issue 4. And this can also help to fix issue 1 as it makes a no sudden
GPE clearing environment when GPE is frequently enabled/disabled.
3. Ensures STS acknowledged before handling by invoking acpi_clear_gpe()
in advance_transaction(). This can finally fix issue 1 even in a
frequent GPE enabling/disabling environment. And this can also finally
fix issue 3 when issue 2 is fixed.
Note 3:
GPE clearing is edge triggered W1C, which means we can clear it right
before handling it. Since all EC GPE indications are handled in
advance_transaction() by previous commits, we can now move GPE clearing
into it to implement the correct GPE clearing.
Note 4:
We can use acpi_set_gpe() which is not shared GPE safer instead of
acpi_enable_gpe()/acpi_disable_gpe() because EC GPE is not shared by
other hardware, which is mentioned in the ACPI specification 5.0, 12.6
Interrupt Model: "OSPM driver treats this as an edge event (the EC SCI
cannot be shared)". So we can stop using shared GPE safer APIs
acpi_enable_gpe()/acpi_disable_gpe() in the EC driver. Otherwise
cleanups need to be made in acpi_ev_enable_gpe() to bypass the GPE
clearing logic before keeping acpi_enable_gpe().
This patch also invokes advance_transaction() when GPE is re-enabled in the
task context which:
1. Ensures EN=1 can trigger GPE by checking and handling EC status register
right after EN=1 writes. This can fix issue 2.
After applying this patch, without frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() ec_poll()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 1 (event pending B) can arrive as a next GPE
due to the previous IRQ context STS=0 write. And if it is handled by
ec_poll() (event handling B), as it is also acknowledged by ec_poll(), the
event pending for issue 2 (event pending C) can properly arrive as a next
GPE after the task context STS=0 write. So no GPE will be lost and never
triggered due to GPE clearing performed in the wrong position. And since
all GPE handling is performed after a locked GPE status checking, we can
hardly see no-op GPE handler invocations due to issue 1 and 3. We may still
see no-op GPE handler invocations due to "Note 1", but as it is inevitable,
it needn't be fixed.
After applying this patch, with frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() acpi_ec_transaction()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
EN=1
if STS==1
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 2 can be manually handled by
advance_transaction(). And after the STS=0 write performed in the manual
triggered advance_transaction(), GPE can always arrive. So no GPE will be
lost due to frequent GPE disabling/enabling performed in the driver like
issue 4.
Note 5:
It's ideally when EN=1 write occurred, an IRQ thread should be woken up to
handle the GPE when the GPE was raised. But this requires the IRQ thread to
contain the poller code for all EC GPE indications, while currently some of
the indications are handled in the user tasks. It then is very hard for the
code to determine whether a user task should be invoked or the poller work
item should be scheduled. So we have to invoke advance_transaction()
directly now and it leaves us such a restriction for the GPE re-enabling:
it must be performed in the task context to avoid starving the GPEs.
As a conclusion: we can see the EC GPE is always handled in serial after
deploying the raw GPE handler mode:
Lock(EC)
if (STS==1)
STS=0
EC_SC read
EC_SC handled
Unlock(EC)
The EC driver specific lock is responsible to make the EC GPE handling
processes serialized so that EC can handle its GPE from both IRQ and task
contexts and the next IRQ can be ensured to arrive after this process.
Note 6:
We have many EC_FLAGS_MSI qurik users in the current driver. They all seem
to be suffering from unexpected GPE triggering source lost. And they are
false root caused to a timing issue. Since EC communication protocol has
already flow control defined, timing shouldn't be the root cause, while
this fix might be fixing the root cause of the old bugs.
Link: https://lkml.org/lkml/2014/11/4/974
Link: https://lkml.org/lkml/2014/11/18/316
Link: https://www.spinics.net/lists/linux-acpi/msg54340.html
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05 16:27:22 +08:00
}
}
static inline void acpi_ec_disable_gpe ( struct acpi_ec * ec , bool close )
{
if ( close )
acpi_disable_gpe ( NULL , ec - > gpe ) ;
2015-02-06 08:58:05 +08:00
else {
BUG_ON ( ec - > reference_count < 1 ) ;
ACPI / EC: Fix several GPE handling issues by deploying ACPI_GPE_DISPATCH_RAW_HANDLER mode.
This patch switches EC driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode where
the GPE lock is not held for acpi_ec_gpe_handler() and the ACPICA internal
GPE enabling/disabling/clearing operations are bypassed so that further
improvements are possible with the GPE APIs.
There are 2 strong reasons for deploying raw GPE handler mode in the EC
driver:
1. Some hardware logics can control their interrupts via their own
registers, so their interrupts can be disabled/enabled/acknowledged
without using the super IRQ controller provided functions. While there
is no mean (EC commands) for the EC driver to achieve this.
2. During suspending, the EC driver is still working for a while to
complete the platform firmware provided functionailities using ec_poll()
after all GPEs are disabled (see acpi_ec_block_transactions()), which
means the EC driver will drive the EC GPE out of the GPE core's control.
Without deploying the raw GPE handler mode, we can see many races between
the EC driver and the GPE core due to the above restrictions:
1. There is a race condition due to ACPICA internal GPE
disabling/clearing/enabling logics in acpi_ev_gpe_dispatch():
Orignally EC GPE is disabled (EN=0), cleared (STS=0) before invoking a
GPE handler and re-enabled (EN=1) after invoking a GPE handler in
acpi_ev_gpe_dispatch(). When re-enabling appears, GPE may be flagged
(STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() ec_poll()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1
This race condition is the root cause of different issues on different
silicon variations.
A. Silicon variation A:
On some platforms, GPE will be triggered due to "writing 1 to EN when
STS=1". This is because both EN and STS lines are wired to the GPE
trigger line.
1. Issue 1:
We can see no-op acpi_ec_gpe_handler() invoked on such platforms.
This is because:
a. event pending B: An event can arrive after ACPICA's GPE
clearing performed in acpi_ev_gpe_dispatch(), this event may
fail to be detected by EC_SC read that is performed before its
arrival;
b. event handling B: The event can be handled in ec_poll() because
EC lock is released after acpi_ec_gpe_handler() invocation;
c. There is no code in ec_poll() to clear STS but the GPE can
still be triggered by the EN=1 write performed in
acpi_ev_finish_gpe(), this leads to a no-op EC GPE handler
invocation;
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 1:
If we removed GPE disabling/enabling code from
acpi_ev_gpe_dispatch(), we could still see no-op GPE handlers
triggered by the event arriving after the GPE clearing and before
the GPE handling on both silicon variation A and B. This can only
occur if the CPU is very slow (timing slice between STS=0 write
and EC_SC read should be short enough before hardware sets another
GPE indication). Thus this is very rare and is not what we need to
fix.
B. Silicon variation B:
On other platforms, GPE may not be triggered due to "writing 1 to EN
when STS=1". This is because only STS line is wired to the GPE
trigger line.
2. Issue 2:
We can see GPE loss on such platforms. This is because:
a. event pending B vs. event handling A: An event can arrive after
ACPICA's GPE handling performed in acpi_ev_gpe_dispatch(), or
event pending C vs. event handling B: An event can arrive after
Linux's GPE handling performed in ec_poll(),
these events may fail to be detected by EC_SC read that is
performed before their arrival;
b. The GPE cannot be triggered by EN=1 write performed in
acpi_ev_finish_gpe();
c. If no polling mechanism is implemented in the driver for the
pending event (for example, SCI_EVT), this event is lost due to
no GPE being triggered.
Note 2:
On most platforms, there might be another rule that GPE may not be
triggered due to "writing 1 to STS when STS=1 and EN=1".
Then on silicon variation B, an even worse case is if the issue 2
event loss happens, further events may never trigger GPE again on
such platforms due to being blocked by the current STS=1. Unless
someone clears STS, all events have to be polled.
2. There is a race condition due to lacking in GPE status checking in EC
driver:
Originally, GPE status is checked in ACPICA core but not checked in
the GPE handler. Thus since the status checking and handling is not
locked, it can be interrupted by another handling path.
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_detect() ec_poll()
if (EN==1 && STS==1)
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
acpi_ev_gpe_dispatch()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
Unlock(EC)
*****************************************************************
3. Issue 3:
We can see no-op acpi_ec_gpe_handler() invoked on both silicon
variation A and B. This is because:
a. event pending A: An event can arrive to trigger an EC GPE and
ACPICA checks it and is about to invoke the EC GPE handler;
b. event handling A: The event can be handled in ec_poll() because
EC lock is not held after the GPE status checking;
c. event handling B: Then when the EC GPE handler is invoked, it
becomes a no-op GPE handler invocation.
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 3:
This no-op GPE handler invocation is rare because the time between
the IRQ arrival and the acpi_ec_gpe_handler() invocation is less than
the timeout value waited in ec_poll(). So most of the no-op GPE
handler invocations are caused by the reason described in issue 1.
3. There is a race condition due to ACPICA internal GPE clearing logic in
acpi_enable_gpe():
During runtime, acpi_enable_gpe() can be invoked by the EC storming
prevention code. When it is invoked, GPE may be flagged (STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() acpi_ec_transaction()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1 ?
Lock(EC)
Unlock(EC)
=================================================================
(event pending B)
=================================================================
acpi_enable_gpe()
STS=0
EN=1
4. Issue 4:
We can see GPE loss on both silicon variation A and B platforms.
This is because:
a. event pending B: An event can arrive right before ACPICA's GPE
clearing performed in acpi_enable_gpe();
b. If the GPE is cleared when GPE is disabled, then EN=1 write in
acpi_enable_gpe() cannot trigger this GPE;
c. If no polling mechanism is implemented in the driver for this
event (for example, SCI_EVT), this event is lost due to no GPE
being triggered.
Note 4:
Currently we don't have this issue, but after we switch the EC
driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode, we need to take care
of handling this because the EN=1 write in acpi_ev_gpe_dispatch()
will be abandoned.
There might be more race issues for the current GPE handler usages. This is
because the EC IRQ's enabling/disabling/checking/clearing/handling
operations should be locked by a single lock that is under the EC driver's
control to achieve the serialization. Which means we need to invoke GPE
APIs with EC driver's lock held and all ACPICA internal GPE operations
related to the GPE handler should be abandoned. Invoking GPE APIs inside of
the EC driver lock and bypassing ACPICA internal GPE operations requires
the ACPI_GPE_DISPATCH_RAW_HANDLER mode where the same lock used by the APIs
are released prior than invoking the handlers. Otherwise, we can see dead
locks due to circular locking dependencies (see Reference below).
This patch then switches the EC driver into the
ACPI_GPE_DISPATCH_RAW_HANDLER mode so that it can perform correct GPE
operations using the GPE APIs:
1. Bypasses EN modifications performed in acpi_ev_gpe_dispatch() by
using acpi_install_gpe_raw_handler() and invoking all GPE APIs with EC
spin lock held. This can fix issue 1 as it makes a non frequent GPE
enabling/disabling environment.
2. Bypasses STS clearing performed in acpi_enable_gpe() by replacing
acpi_enable_gpe()/acpi_disable_gpe() with acpi_set_gpe(). This can fix
issue 4. And this can also help to fix issue 1 as it makes a no sudden
GPE clearing environment when GPE is frequently enabled/disabled.
3. Ensures STS acknowledged before handling by invoking acpi_clear_gpe()
in advance_transaction(). This can finally fix issue 1 even in a
frequent GPE enabling/disabling environment. And this can also finally
fix issue 3 when issue 2 is fixed.
Note 3:
GPE clearing is edge triggered W1C, which means we can clear it right
before handling it. Since all EC GPE indications are handled in
advance_transaction() by previous commits, we can now move GPE clearing
into it to implement the correct GPE clearing.
Note 4:
We can use acpi_set_gpe() which is not shared GPE safer instead of
acpi_enable_gpe()/acpi_disable_gpe() because EC GPE is not shared by
other hardware, which is mentioned in the ACPI specification 5.0, 12.6
Interrupt Model: "OSPM driver treats this as an edge event (the EC SCI
cannot be shared)". So we can stop using shared GPE safer APIs
acpi_enable_gpe()/acpi_disable_gpe() in the EC driver. Otherwise
cleanups need to be made in acpi_ev_enable_gpe() to bypass the GPE
clearing logic before keeping acpi_enable_gpe().
This patch also invokes advance_transaction() when GPE is re-enabled in the
task context which:
1. Ensures EN=1 can trigger GPE by checking and handling EC status register
right after EN=1 writes. This can fix issue 2.
After applying this patch, without frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() ec_poll()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 1 (event pending B) can arrive as a next GPE
due to the previous IRQ context STS=0 write. And if it is handled by
ec_poll() (event handling B), as it is also acknowledged by ec_poll(), the
event pending for issue 2 (event pending C) can properly arrive as a next
GPE after the task context STS=0 write. So no GPE will be lost and never
triggered due to GPE clearing performed in the wrong position. And since
all GPE handling is performed after a locked GPE status checking, we can
hardly see no-op GPE handler invocations due to issue 1 and 3. We may still
see no-op GPE handler invocations due to "Note 1", but as it is inevitable,
it needn't be fixed.
After applying this patch, with frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() acpi_ec_transaction()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
EN=1
if STS==1
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 2 can be manually handled by
advance_transaction(). And after the STS=0 write performed in the manual
triggered advance_transaction(), GPE can always arrive. So no GPE will be
lost due to frequent GPE disabling/enabling performed in the driver like
issue 4.
Note 5:
It's ideally when EN=1 write occurred, an IRQ thread should be woken up to
handle the GPE when the GPE was raised. But this requires the IRQ thread to
contain the poller code for all EC GPE indications, while currently some of
the indications are handled in the user tasks. It then is very hard for the
code to determine whether a user task should be invoked or the poller work
item should be scheduled. So we have to invoke advance_transaction()
directly now and it leaves us such a restriction for the GPE re-enabling:
it must be performed in the task context to avoid starving the GPEs.
As a conclusion: we can see the EC GPE is always handled in serial after
deploying the raw GPE handler mode:
Lock(EC)
if (STS==1)
STS=0
EC_SC read
EC_SC handled
Unlock(EC)
The EC driver specific lock is responsible to make the EC GPE handling
processes serialized so that EC can handle its GPE from both IRQ and task
contexts and the next IRQ can be ensured to arrive after this process.
Note 6:
We have many EC_FLAGS_MSI qurik users in the current driver. They all seem
to be suffering from unexpected GPE triggering source lost. And they are
false root caused to a timing issue. Since EC communication protocol has
already flow control defined, timing shouldn't be the root cause, while
this fix might be fixing the root cause of the old bugs.
Link: https://lkml.org/lkml/2014/11/4/974
Link: https://lkml.org/lkml/2014/11/18/316
Link: https://www.spinics.net/lists/linux-acpi/msg54340.html
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05 16:27:22 +08:00
acpi_set_gpe ( NULL , ec - > gpe , ACPI_GPE_DISABLE ) ;
2015-02-06 08:58:05 +08:00
}
ACPI / EC: Fix several GPE handling issues by deploying ACPI_GPE_DISPATCH_RAW_HANDLER mode.
This patch switches EC driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode where
the GPE lock is not held for acpi_ec_gpe_handler() and the ACPICA internal
GPE enabling/disabling/clearing operations are bypassed so that further
improvements are possible with the GPE APIs.
There are 2 strong reasons for deploying raw GPE handler mode in the EC
driver:
1. Some hardware logics can control their interrupts via their own
registers, so their interrupts can be disabled/enabled/acknowledged
without using the super IRQ controller provided functions. While there
is no mean (EC commands) for the EC driver to achieve this.
2. During suspending, the EC driver is still working for a while to
complete the platform firmware provided functionailities using ec_poll()
after all GPEs are disabled (see acpi_ec_block_transactions()), which
means the EC driver will drive the EC GPE out of the GPE core's control.
Without deploying the raw GPE handler mode, we can see many races between
the EC driver and the GPE core due to the above restrictions:
1. There is a race condition due to ACPICA internal GPE
disabling/clearing/enabling logics in acpi_ev_gpe_dispatch():
Orignally EC GPE is disabled (EN=0), cleared (STS=0) before invoking a
GPE handler and re-enabled (EN=1) after invoking a GPE handler in
acpi_ev_gpe_dispatch(). When re-enabling appears, GPE may be flagged
(STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() ec_poll()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1
This race condition is the root cause of different issues on different
silicon variations.
A. Silicon variation A:
On some platforms, GPE will be triggered due to "writing 1 to EN when
STS=1". This is because both EN and STS lines are wired to the GPE
trigger line.
1. Issue 1:
We can see no-op acpi_ec_gpe_handler() invoked on such platforms.
This is because:
a. event pending B: An event can arrive after ACPICA's GPE
clearing performed in acpi_ev_gpe_dispatch(), this event may
fail to be detected by EC_SC read that is performed before its
arrival;
b. event handling B: The event can be handled in ec_poll() because
EC lock is released after acpi_ec_gpe_handler() invocation;
c. There is no code in ec_poll() to clear STS but the GPE can
still be triggered by the EN=1 write performed in
acpi_ev_finish_gpe(), this leads to a no-op EC GPE handler
invocation;
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 1:
If we removed GPE disabling/enabling code from
acpi_ev_gpe_dispatch(), we could still see no-op GPE handlers
triggered by the event arriving after the GPE clearing and before
the GPE handling on both silicon variation A and B. This can only
occur if the CPU is very slow (timing slice between STS=0 write
and EC_SC read should be short enough before hardware sets another
GPE indication). Thus this is very rare and is not what we need to
fix.
B. Silicon variation B:
On other platforms, GPE may not be triggered due to "writing 1 to EN
when STS=1". This is because only STS line is wired to the GPE
trigger line.
2. Issue 2:
We can see GPE loss on such platforms. This is because:
a. event pending B vs. event handling A: An event can arrive after
ACPICA's GPE handling performed in acpi_ev_gpe_dispatch(), or
event pending C vs. event handling B: An event can arrive after
Linux's GPE handling performed in ec_poll(),
these events may fail to be detected by EC_SC read that is
performed before their arrival;
b. The GPE cannot be triggered by EN=1 write performed in
acpi_ev_finish_gpe();
c. If no polling mechanism is implemented in the driver for the
pending event (for example, SCI_EVT), this event is lost due to
no GPE being triggered.
Note 2:
On most platforms, there might be another rule that GPE may not be
triggered due to "writing 1 to STS when STS=1 and EN=1".
Then on silicon variation B, an even worse case is if the issue 2
event loss happens, further events may never trigger GPE again on
such platforms due to being blocked by the current STS=1. Unless
someone clears STS, all events have to be polled.
2. There is a race condition due to lacking in GPE status checking in EC
driver:
Originally, GPE status is checked in ACPICA core but not checked in
the GPE handler. Thus since the status checking and handling is not
locked, it can be interrupted by another handling path.
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_detect() ec_poll()
if (EN==1 && STS==1)
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
acpi_ev_gpe_dispatch()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
Unlock(EC)
*****************************************************************
3. Issue 3:
We can see no-op acpi_ec_gpe_handler() invoked on both silicon
variation A and B. This is because:
a. event pending A: An event can arrive to trigger an EC GPE and
ACPICA checks it and is about to invoke the EC GPE handler;
b. event handling A: The event can be handled in ec_poll() because
EC lock is not held after the GPE status checking;
c. event handling B: Then when the EC GPE handler is invoked, it
becomes a no-op GPE handler invocation.
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 3:
This no-op GPE handler invocation is rare because the time between
the IRQ arrival and the acpi_ec_gpe_handler() invocation is less than
the timeout value waited in ec_poll(). So most of the no-op GPE
handler invocations are caused by the reason described in issue 1.
3. There is a race condition due to ACPICA internal GPE clearing logic in
acpi_enable_gpe():
During runtime, acpi_enable_gpe() can be invoked by the EC storming
prevention code. When it is invoked, GPE may be flagged (STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() acpi_ec_transaction()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1 ?
Lock(EC)
Unlock(EC)
=================================================================
(event pending B)
=================================================================
acpi_enable_gpe()
STS=0
EN=1
4. Issue 4:
We can see GPE loss on both silicon variation A and B platforms.
This is because:
a. event pending B: An event can arrive right before ACPICA's GPE
clearing performed in acpi_enable_gpe();
b. If the GPE is cleared when GPE is disabled, then EN=1 write in
acpi_enable_gpe() cannot trigger this GPE;
c. If no polling mechanism is implemented in the driver for this
event (for example, SCI_EVT), this event is lost due to no GPE
being triggered.
Note 4:
Currently we don't have this issue, but after we switch the EC
driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode, we need to take care
of handling this because the EN=1 write in acpi_ev_gpe_dispatch()
will be abandoned.
There might be more race issues for the current GPE handler usages. This is
because the EC IRQ's enabling/disabling/checking/clearing/handling
operations should be locked by a single lock that is under the EC driver's
control to achieve the serialization. Which means we need to invoke GPE
APIs with EC driver's lock held and all ACPICA internal GPE operations
related to the GPE handler should be abandoned. Invoking GPE APIs inside of
the EC driver lock and bypassing ACPICA internal GPE operations requires
the ACPI_GPE_DISPATCH_RAW_HANDLER mode where the same lock used by the APIs
are released prior than invoking the handlers. Otherwise, we can see dead
locks due to circular locking dependencies (see Reference below).
This patch then switches the EC driver into the
ACPI_GPE_DISPATCH_RAW_HANDLER mode so that it can perform correct GPE
operations using the GPE APIs:
1. Bypasses EN modifications performed in acpi_ev_gpe_dispatch() by
using acpi_install_gpe_raw_handler() and invoking all GPE APIs with EC
spin lock held. This can fix issue 1 as it makes a non frequent GPE
enabling/disabling environment.
2. Bypasses STS clearing performed in acpi_enable_gpe() by replacing
acpi_enable_gpe()/acpi_disable_gpe() with acpi_set_gpe(). This can fix
issue 4. And this can also help to fix issue 1 as it makes a no sudden
GPE clearing environment when GPE is frequently enabled/disabled.
3. Ensures STS acknowledged before handling by invoking acpi_clear_gpe()
in advance_transaction(). This can finally fix issue 1 even in a
frequent GPE enabling/disabling environment. And this can also finally
fix issue 3 when issue 2 is fixed.
Note 3:
GPE clearing is edge triggered W1C, which means we can clear it right
before handling it. Since all EC GPE indications are handled in
advance_transaction() by previous commits, we can now move GPE clearing
into it to implement the correct GPE clearing.
Note 4:
We can use acpi_set_gpe() which is not shared GPE safer instead of
acpi_enable_gpe()/acpi_disable_gpe() because EC GPE is not shared by
other hardware, which is mentioned in the ACPI specification 5.0, 12.6
Interrupt Model: "OSPM driver treats this as an edge event (the EC SCI
cannot be shared)". So we can stop using shared GPE safer APIs
acpi_enable_gpe()/acpi_disable_gpe() in the EC driver. Otherwise
cleanups need to be made in acpi_ev_enable_gpe() to bypass the GPE
clearing logic before keeping acpi_enable_gpe().
This patch also invokes advance_transaction() when GPE is re-enabled in the
task context which:
1. Ensures EN=1 can trigger GPE by checking and handling EC status register
right after EN=1 writes. This can fix issue 2.
After applying this patch, without frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() ec_poll()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 1 (event pending B) can arrive as a next GPE
due to the previous IRQ context STS=0 write. And if it is handled by
ec_poll() (event handling B), as it is also acknowledged by ec_poll(), the
event pending for issue 2 (event pending C) can properly arrive as a next
GPE after the task context STS=0 write. So no GPE will be lost and never
triggered due to GPE clearing performed in the wrong position. And since
all GPE handling is performed after a locked GPE status checking, we can
hardly see no-op GPE handler invocations due to issue 1 and 3. We may still
see no-op GPE handler invocations due to "Note 1", but as it is inevitable,
it needn't be fixed.
After applying this patch, with frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() acpi_ec_transaction()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
EN=1
if STS==1
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 2 can be manually handled by
advance_transaction(). And after the STS=0 write performed in the manual
triggered advance_transaction(), GPE can always arrive. So no GPE will be
lost due to frequent GPE disabling/enabling performed in the driver like
issue 4.
Note 5:
It's ideally when EN=1 write occurred, an IRQ thread should be woken up to
handle the GPE when the GPE was raised. But this requires the IRQ thread to
contain the poller code for all EC GPE indications, while currently some of
the indications are handled in the user tasks. It then is very hard for the
code to determine whether a user task should be invoked or the poller work
item should be scheduled. So we have to invoke advance_transaction()
directly now and it leaves us such a restriction for the GPE re-enabling:
it must be performed in the task context to avoid starving the GPEs.
As a conclusion: we can see the EC GPE is always handled in serial after
deploying the raw GPE handler mode:
Lock(EC)
if (STS==1)
STS=0
EC_SC read
EC_SC handled
Unlock(EC)
The EC driver specific lock is responsible to make the EC GPE handling
processes serialized so that EC can handle its GPE from both IRQ and task
contexts and the next IRQ can be ensured to arrive after this process.
Note 6:
We have many EC_FLAGS_MSI qurik users in the current driver. They all seem
to be suffering from unexpected GPE triggering source lost. And they are
false root caused to a timing issue. Since EC communication protocol has
already flow control defined, timing shouldn't be the root cause, while
this fix might be fixing the root cause of the old bugs.
Link: https://lkml.org/lkml/2014/11/4/974
Link: https://lkml.org/lkml/2014/11/18/316
Link: https://www.spinics.net/lists/linux-acpi/msg54340.html
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05 16:27:22 +08:00
}
/* --------------------------------------------------------------------------
* Transaction Management
* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
2015-02-06 08:57:59 +08:00
static void acpi_ec_submit_request ( struct acpi_ec * ec )
{
ec - > reference_count + + ;
2019-10-14 16:56:01 +08:00
if ( test_bit ( EC_FLAGS_EVENT_HANDLER_INSTALLED , & ec - > flags ) & &
2019-10-14 16:56:02 +08:00
ec - > gpe > = 0 & & ec - > reference_count = = 1 )
2015-02-06 08:57:59 +08:00
acpi_ec_enable_gpe ( ec , true ) ;
}
static void acpi_ec_complete_request ( struct acpi_ec * ec )
{
bool flushed = false ;
ec - > reference_count - - ;
2019-10-14 16:56:01 +08:00
if ( test_bit ( EC_FLAGS_EVENT_HANDLER_INSTALLED , & ec - > flags ) & &
2019-10-14 16:56:02 +08:00
ec - > gpe > = 0 & & ec - > reference_count = = 0 )
2015-02-06 08:57:59 +08:00
acpi_ec_disable_gpe ( ec , true ) ;
flushed = acpi_ec_flushed ( ec ) ;
if ( flushed )
wake_up ( & ec - > wait ) ;
}
2019-10-14 16:56:01 +08:00
static void acpi_ec_mask_events ( struct acpi_ec * ec )
2015-02-06 08:58:05 +08:00
{
2019-10-14 16:56:01 +08:00
if ( ! test_bit ( EC_FLAGS_EVENTS_MASKED , & ec - > flags ) ) {
2019-10-14 16:56:02 +08:00
if ( ec - > gpe > = 0 )
acpi_ec_disable_gpe ( ec , false ) ;
else
disable_irq_nosync ( ec - > irq ) ;
2015-02-27 14:48:15 +08:00
ec_dbg_drv ( " Polling enabled " ) ;
2019-10-14 16:56:01 +08:00
set_bit ( EC_FLAGS_EVENTS_MASKED , & ec - > flags ) ;
2015-02-06 08:58:05 +08:00
}
}
2019-10-14 16:56:01 +08:00
static void acpi_ec_unmask_events ( struct acpi_ec * ec )
2015-02-06 08:58:05 +08:00
{
2019-10-14 16:56:01 +08:00
if ( test_bit ( EC_FLAGS_EVENTS_MASKED , & ec - > flags ) ) {
clear_bit ( EC_FLAGS_EVENTS_MASKED , & ec - > flags ) ;
2019-10-14 16:56:02 +08:00
if ( ec - > gpe > = 0 )
acpi_ec_enable_gpe ( ec , false ) ;
else
enable_irq ( ec - > irq ) ;
2015-02-27 14:48:15 +08:00
ec_dbg_drv ( " Polling disabled " ) ;
2015-02-06 08:58:05 +08:00
}
}
2015-02-06 08:57:59 +08:00
/*
* acpi_ec_submit_flushable_request ( ) - Increase the reference count unless
* the flush operation is not in
* progress
* @ ec : the EC device
*
* This function must be used before taking a new action that should hold
* the reference count . If this function returns false , then the action
* must be discarded or it will prevent the flush operation from being
* completed .
*/
2015-02-11 17:35:05 +01:00
static bool acpi_ec_submit_flushable_request ( struct acpi_ec * ec )
2015-02-06 08:57:59 +08:00
{
2015-02-11 17:35:05 +01:00
if ( ! acpi_ec_started ( ec ) )
return false ;
2015-02-06 08:57:59 +08:00
acpi_ec_submit_request ( ec ) ;
return true ;
}
2022-02-04 18:40:20 +01:00
static void acpi_ec_submit_event ( struct acpi_ec * ec )
ACPI / EC: Fix issues related to the SCI_EVT handling
This patch fixes 2 issues related to the draining behavior. But it doesn't
implement the draining support, it only cleans up code so that further
draining support is possible.
The draining behavior is expected by some platforms (for example, Samsung)
where SCI_EVT is set only once for a set of events and might be cleared for
the very first QR_EC command issued after SCI_EVT is set. EC firmware on
such platforms will return 0x00 to indicate "no outstanding event". Thus
after seeing an SCI_EVT indication, EC driver need to fetch events until
0x00 returned (see acpi_ec_clear()).
Issue 1 - acpi_ec_submit_query():
It's reported on Samsung laptops that SCI_EVT isn't checked when the
transactions are advanced in ec_poll(), which leads to SCI_EVT triggering
source lost:
If no EC GPE IRQs are arrived after that, EC driver cannot detect this
event and handle it.
See comment 244/247 for kernel bugzilla 44161.
This patch fixes this issue by moving SCI_EVT checks into
advance_transaction(). So that SCI_EVT is checked each time we are going to
handle the EC firmware indications. And this check will happen for both IRQ
context and task context.
Since after doing that, SCI_EVT is also checked after completing a
transaction, ec_check_sci() and ec_check_sci_sync() can be removed.
Issue 2 - acpi_ec_complete_query():
We expect to clear EC_FLAGS_QUERY_PENDING to allow queuing another draining
QR_EC after writing a QR_EC command and before reading the event. After
reading the event, SCI_EVT might be cleared by the firmware, thus it may
not be possible to queue such a draining QR_EC at that time.
But putting the EC_FLAGS_QUERY_PENDING clearing code after
start_transaction() is wrong as there are chances that after
start_transaction(), QR_EC can fail to be sent. If this happens,
EC_FLAG_QUERY_PENDING will be cleared earlier. As a consequence, the
draining QR_EC will also be queued earlier than expected.
This patch also moves this code into advance_transaction() where QR_EC is
just sent (ACPI_EC_COMMAND_POLL flagged) to fix this issue.
Notes:
1. After introducing the 2 SCI_EVT related handlings into
advance_transaction(), a next QR_EC can be queued right after writing
the current QR_EC command and before reading the event. But this still
hasn't implemented the draining behavior as the draining support
requires:
If a previous returned event value isn't 0x00, a draining QR_EC need
to be issued even when SCI_EVT isn't set.
2. In this patch, acpi_os_execute() is also converted into a seperate work
item to avoid invoking kmalloc() in the atomic context. We can do this
because of the previous global lock fix.
3. Originally, EC_FLAGS_EVENT_PENDING is also used to avoid queuing up
multiple work items (created by acpi_os_execute()), this can be covered
by only using a single work item. But this patch still keeps this flag
as there are different usages in the driver initialization steps relying
on this flag.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=44161
Reported-by: Kieran Clancy <clancy.kieran@gmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-14 19:28:47 +08:00
{
2022-02-04 18:40:20 +01:00
/*
* It is safe to mask the events here , because acpi_ec_close_event ( )
* will run at least once after this .
*/
2019-10-14 16:56:01 +08:00
acpi_ec_mask_events ( ec ) ;
ACPI: EC: Fix an EC event IRQ storming issue
The EC event IRQ (SCI_EVT) can only be handled by submitting QR_EC. As the
EC driver handles SCI_EVT in a workqueue, after SCI_EVT is flagged and
before QR_EC is submitted, there is a period risking IRQ storming. EC IRQ
must be masked for this period but linux EC driver never does so.
No end user notices the IRQ storming and no developer fixes this known
issue because:
1. The EC IRQ is always edge triggered GPE, and
2. The kernel can execute no-op EC IRQ handler very fast.
For edge-triggered EC GPE platforms, it is only reported of post-resume EC
event lost issues, there won't be an IRQ storming. For level triggered EC
GPE platforms, fortunately the kernel is always fast enough to execute such
a no-op EC IRQ handler so that the IRQ handler won't be accumulated to
starve the task contexts, causing a real IRQ storming.
But the IRQ storming actually can still happen when:
1. The EC IRQ performs like level triggered GPE, and
2. The kernel EC debugging log is turned on but the console is slow enough.
There are more and more platforms using EC GPE as wake GPE where the EC GPE
is likely designed as level triggered. Then when EC debugging log is
enabled, the EC IRQ handler is no longer a no-op but dumps IRQ status to
the consoles. If the consoles are slow enough, the EC IRQs can arrive much
faster than executing the handler. Finally the accumulated EC event IRQ
handlers starve the task contexts, causing the IRQ storming to occur, and
the kernel hangs can be observed during boot/resume.
This patch fixes this issue by masking EC IRQ for this period:
1. Begins when there is an SCI_EVT IRQ pending, and
2. Ends when there is a QR_EC completed (SCI_EVT acknowledged).
Tested-by: Wang Wendy <wendy.wang@intel.com>
Tested-by: Feng Chenzhou <chenzhoux.feng@intel.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-14 13:59:09 +08:00
if ( ! acpi_ec_event_enabled ( ec ) )
2022-02-04 18:40:20 +01:00
return ;
2021-11-23 19:37:39 +01:00
2022-02-04 18:40:55 +01:00
if ( ec - > event_state ! = EC_EVENT_READY )
return ;
ACPI: EC: Make the event work state machine visible
The EC driver uses a relatively simple state machine for the event
work handling, but it is not really straightforward to figure out.
The states are as follows:
"Ready": The event handling work can be submitted.
In this state, the EC_FLAGS_QUERY_PENDING flag is clear.
"In progress": The event handling work is pending or is being
processed. It cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and both the
events_to_process count is nonzero and the EC_FLAGS_QUERY_GUARDING
flag is clear.
"Complete": The event handling work has been completed, but it still
cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and the
events_to_process count is zero or the EC_FLAGS_QUERY_GUARDING
flag is set.
The state changes from "Ready" to "In progress" when new event is
detected by advance_transaction() and acpi_ec_submit_event() is
called by it.
Next, the state can change from "In progress" directly to "Ready" in
the following situations:
* ec_event_clearing is ACPI_EC_EVT_TIMING_STATUS and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes ACPI_EC_COMMAND_POLL.
* ec_event_clearing is ACPI_EC_EVT_TIMING_QUERY and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* ec_event_clearing is either ACPI_EC_EVT_TIMING_STATUS or
ACPI_EC_EVT_TIMING_QUERY and there are no more events to
process (ie. ec->events_to_process becomes 0).
If ec_event_clearing is ACPI_EC_EVT_TIMING_EVENT, however, the
state must change from "In progress" to "Complete" before it
can change to "Ready". The changes from "In progress" to
"Complete" in that case occur in the following situations:
* The state of an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* There are no more events to process (ie. ec->events_to_process
becomes 0).
Finally, the state changes from "Complete" to "Ready" when
advance_transaction() is invoked when the state is "Complete" and
the state of the current transaction is not ACPI_EC_COMMAND_POLL.
To make this state machine visible in the code, add a new
event_state field to struct acpi_ec and modify the code to use
it istead the EC_FLAGS_QUERY_PENDING and EC_FLAGS_QUERY_GUARDING
flags.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:44:46 +01:00
2022-02-04 18:40:55 +01:00
ec_dbg_evt ( " Command(%s) submitted/blocked " ,
acpi_ec_cmd_string ( ACPI_EC_COMMAND_QUERY ) ) ;
2021-11-23 19:43:05 +01:00
2022-02-04 18:40:55 +01:00
ec - > event_state = EC_EVENT_IN_PROGRESS ;
/*
* If events_to_process is greater than 0 at this point , the while ( )
* loop in acpi_ec_event_handler ( ) is still running and incrementing
* events_to_process will cause it to invoke acpi_ec_submit_query ( ) once
* more , so it is not necessary to queue up the event work to start the
* same loop again .
*/
if ( ec - > events_to_process + + > 0 )
return ;
ec - > events_in_progress + + ;
queue_work ( ec_wq , & ec - > work ) ;
ACPI / EC: Fix issues related to the SCI_EVT handling
This patch fixes 2 issues related to the draining behavior. But it doesn't
implement the draining support, it only cleans up code so that further
draining support is possible.
The draining behavior is expected by some platforms (for example, Samsung)
where SCI_EVT is set only once for a set of events and might be cleared for
the very first QR_EC command issued after SCI_EVT is set. EC firmware on
such platforms will return 0x00 to indicate "no outstanding event". Thus
after seeing an SCI_EVT indication, EC driver need to fetch events until
0x00 returned (see acpi_ec_clear()).
Issue 1 - acpi_ec_submit_query():
It's reported on Samsung laptops that SCI_EVT isn't checked when the
transactions are advanced in ec_poll(), which leads to SCI_EVT triggering
source lost:
If no EC GPE IRQs are arrived after that, EC driver cannot detect this
event and handle it.
See comment 244/247 for kernel bugzilla 44161.
This patch fixes this issue by moving SCI_EVT checks into
advance_transaction(). So that SCI_EVT is checked each time we are going to
handle the EC firmware indications. And this check will happen for both IRQ
context and task context.
Since after doing that, SCI_EVT is also checked after completing a
transaction, ec_check_sci() and ec_check_sci_sync() can be removed.
Issue 2 - acpi_ec_complete_query():
We expect to clear EC_FLAGS_QUERY_PENDING to allow queuing another draining
QR_EC after writing a QR_EC command and before reading the event. After
reading the event, SCI_EVT might be cleared by the firmware, thus it may
not be possible to queue such a draining QR_EC at that time.
But putting the EC_FLAGS_QUERY_PENDING clearing code after
start_transaction() is wrong as there are chances that after
start_transaction(), QR_EC can fail to be sent. If this happens,
EC_FLAG_QUERY_PENDING will be cleared earlier. As a consequence, the
draining QR_EC will also be queued earlier than expected.
This patch also moves this code into advance_transaction() where QR_EC is
just sent (ACPI_EC_COMMAND_POLL flagged) to fix this issue.
Notes:
1. After introducing the 2 SCI_EVT related handlings into
advance_transaction(), a next QR_EC can be queued right after writing
the current QR_EC command and before reading the event. But this still
hasn't implemented the draining behavior as the draining support
requires:
If a previous returned event value isn't 0x00, a draining QR_EC need
to be issued even when SCI_EVT isn't set.
2. In this patch, acpi_os_execute() is also converted into a seperate work
item to avoid invoking kmalloc() in the atomic context. We can do this
because of the previous global lock fix.
3. Originally, EC_FLAGS_EVENT_PENDING is also used to avoid queuing up
multiple work items (created by acpi_os_execute()), this can be covered
by only using a single work item. But this patch still keeps this flag
as there are different usages in the driver initialization steps relying
on this flag.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=44161
Reported-by: Kieran Clancy <clancy.kieran@gmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-14 19:28:47 +08:00
}
ACPI: EC: Make the event work state machine visible
The EC driver uses a relatively simple state machine for the event
work handling, but it is not really straightforward to figure out.
The states are as follows:
"Ready": The event handling work can be submitted.
In this state, the EC_FLAGS_QUERY_PENDING flag is clear.
"In progress": The event handling work is pending or is being
processed. It cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and both the
events_to_process count is nonzero and the EC_FLAGS_QUERY_GUARDING
flag is clear.
"Complete": The event handling work has been completed, but it still
cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and the
events_to_process count is zero or the EC_FLAGS_QUERY_GUARDING
flag is set.
The state changes from "Ready" to "In progress" when new event is
detected by advance_transaction() and acpi_ec_submit_event() is
called by it.
Next, the state can change from "In progress" directly to "Ready" in
the following situations:
* ec_event_clearing is ACPI_EC_EVT_TIMING_STATUS and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes ACPI_EC_COMMAND_POLL.
* ec_event_clearing is ACPI_EC_EVT_TIMING_QUERY and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* ec_event_clearing is either ACPI_EC_EVT_TIMING_STATUS or
ACPI_EC_EVT_TIMING_QUERY and there are no more events to
process (ie. ec->events_to_process becomes 0).
If ec_event_clearing is ACPI_EC_EVT_TIMING_EVENT, however, the
state must change from "In progress" to "Complete" before it
can change to "Ready". The changes from "In progress" to
"Complete" in that case occur in the following situations:
* The state of an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* There are no more events to process (ie. ec->events_to_process
becomes 0).
Finally, the state changes from "Complete" to "Ready" when
advance_transaction() is invoked when the state is "Complete" and
the state of the current transaction is not ACPI_EC_COMMAND_POLL.
To make this state machine visible in the code, add a new
event_state field to struct acpi_ec and modify the code to use
it istead the EC_FLAGS_QUERY_PENDING and EC_FLAGS_QUERY_GUARDING
flags.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:44:46 +01:00
static void acpi_ec_complete_event ( struct acpi_ec * ec )
{
if ( ec - > event_state = = EC_EVENT_IN_PROGRESS )
ec - > event_state = EC_EVENT_COMPLETE ;
}
2021-11-23 19:42:02 +01:00
static void acpi_ec_close_event ( struct acpi_ec * ec )
ACPI / EC: Fix issues related to the SCI_EVT handling
This patch fixes 2 issues related to the draining behavior. But it doesn't
implement the draining support, it only cleans up code so that further
draining support is possible.
The draining behavior is expected by some platforms (for example, Samsung)
where SCI_EVT is set only once for a set of events and might be cleared for
the very first QR_EC command issued after SCI_EVT is set. EC firmware on
such platforms will return 0x00 to indicate "no outstanding event". Thus
after seeing an SCI_EVT indication, EC driver need to fetch events until
0x00 returned (see acpi_ec_clear()).
Issue 1 - acpi_ec_submit_query():
It's reported on Samsung laptops that SCI_EVT isn't checked when the
transactions are advanced in ec_poll(), which leads to SCI_EVT triggering
source lost:
If no EC GPE IRQs are arrived after that, EC driver cannot detect this
event and handle it.
See comment 244/247 for kernel bugzilla 44161.
This patch fixes this issue by moving SCI_EVT checks into
advance_transaction(). So that SCI_EVT is checked each time we are going to
handle the EC firmware indications. And this check will happen for both IRQ
context and task context.
Since after doing that, SCI_EVT is also checked after completing a
transaction, ec_check_sci() and ec_check_sci_sync() can be removed.
Issue 2 - acpi_ec_complete_query():
We expect to clear EC_FLAGS_QUERY_PENDING to allow queuing another draining
QR_EC after writing a QR_EC command and before reading the event. After
reading the event, SCI_EVT might be cleared by the firmware, thus it may
not be possible to queue such a draining QR_EC at that time.
But putting the EC_FLAGS_QUERY_PENDING clearing code after
start_transaction() is wrong as there are chances that after
start_transaction(), QR_EC can fail to be sent. If this happens,
EC_FLAG_QUERY_PENDING will be cleared earlier. As a consequence, the
draining QR_EC will also be queued earlier than expected.
This patch also moves this code into advance_transaction() where QR_EC is
just sent (ACPI_EC_COMMAND_POLL flagged) to fix this issue.
Notes:
1. After introducing the 2 SCI_EVT related handlings into
advance_transaction(), a next QR_EC can be queued right after writing
the current QR_EC command and before reading the event. But this still
hasn't implemented the draining behavior as the draining support
requires:
If a previous returned event value isn't 0x00, a draining QR_EC need
to be issued even when SCI_EVT isn't set.
2. In this patch, acpi_os_execute() is also converted into a seperate work
item to avoid invoking kmalloc() in the atomic context. We can do this
because of the previous global lock fix.
3. Originally, EC_FLAGS_EVENT_PENDING is also used to avoid queuing up
multiple work items (created by acpi_os_execute()), this can be covered
by only using a single work item. But this patch still keeps this flag
as there are different usages in the driver initialization steps relying
on this flag.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=44161
Reported-by: Kieran Clancy <clancy.kieran@gmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-14 19:28:47 +08:00
{
ACPI: EC: Make the event work state machine visible
The EC driver uses a relatively simple state machine for the event
work handling, but it is not really straightforward to figure out.
The states are as follows:
"Ready": The event handling work can be submitted.
In this state, the EC_FLAGS_QUERY_PENDING flag is clear.
"In progress": The event handling work is pending or is being
processed. It cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and both the
events_to_process count is nonzero and the EC_FLAGS_QUERY_GUARDING
flag is clear.
"Complete": The event handling work has been completed, but it still
cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and the
events_to_process count is zero or the EC_FLAGS_QUERY_GUARDING
flag is set.
The state changes from "Ready" to "In progress" when new event is
detected by advance_transaction() and acpi_ec_submit_event() is
called by it.
Next, the state can change from "In progress" directly to "Ready" in
the following situations:
* ec_event_clearing is ACPI_EC_EVT_TIMING_STATUS and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes ACPI_EC_COMMAND_POLL.
* ec_event_clearing is ACPI_EC_EVT_TIMING_QUERY and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* ec_event_clearing is either ACPI_EC_EVT_TIMING_STATUS or
ACPI_EC_EVT_TIMING_QUERY and there are no more events to
process (ie. ec->events_to_process becomes 0).
If ec_event_clearing is ACPI_EC_EVT_TIMING_EVENT, however, the
state must change from "In progress" to "Complete" before it
can change to "Ready". The changes from "In progress" to
"Complete" in that case occur in the following situations:
* The state of an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* There are no more events to process (ie. ec->events_to_process
becomes 0).
Finally, the state changes from "Complete" to "Ready" when
advance_transaction() is invoked when the state is "Complete" and
the state of the current transaction is not ACPI_EC_COMMAND_POLL.
To make this state machine visible in the code, add a new
event_state field to struct acpi_ec and modify the code to use
it istead the EC_FLAGS_QUERY_PENDING and EC_FLAGS_QUERY_GUARDING
flags.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:44:46 +01:00
if ( ec - > event_state ! = EC_EVENT_READY )
2015-06-11 13:21:32 +08:00
ec_dbg_evt ( " Command(%s) unblocked " ,
acpi_ec_cmd_string ( ACPI_EC_COMMAND_QUERY ) ) ;
ACPI: EC: Make the event work state machine visible
The EC driver uses a relatively simple state machine for the event
work handling, but it is not really straightforward to figure out.
The states are as follows:
"Ready": The event handling work can be submitted.
In this state, the EC_FLAGS_QUERY_PENDING flag is clear.
"In progress": The event handling work is pending or is being
processed. It cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and both the
events_to_process count is nonzero and the EC_FLAGS_QUERY_GUARDING
flag is clear.
"Complete": The event handling work has been completed, but it still
cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and the
events_to_process count is zero or the EC_FLAGS_QUERY_GUARDING
flag is set.
The state changes from "Ready" to "In progress" when new event is
detected by advance_transaction() and acpi_ec_submit_event() is
called by it.
Next, the state can change from "In progress" directly to "Ready" in
the following situations:
* ec_event_clearing is ACPI_EC_EVT_TIMING_STATUS and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes ACPI_EC_COMMAND_POLL.
* ec_event_clearing is ACPI_EC_EVT_TIMING_QUERY and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* ec_event_clearing is either ACPI_EC_EVT_TIMING_STATUS or
ACPI_EC_EVT_TIMING_QUERY and there are no more events to
process (ie. ec->events_to_process becomes 0).
If ec_event_clearing is ACPI_EC_EVT_TIMING_EVENT, however, the
state must change from "In progress" to "Complete" before it
can change to "Ready". The changes from "In progress" to
"Complete" in that case occur in the following situations:
* The state of an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* There are no more events to process (ie. ec->events_to_process
becomes 0).
Finally, the state changes from "Complete" to "Ready" when
advance_transaction() is invoked when the state is "Complete" and
the state of the current transaction is not ACPI_EC_COMMAND_POLL.
To make this state machine visible in the code, add a new
event_state field to struct acpi_ec and modify the code to use
it istead the EC_FLAGS_QUERY_PENDING and EC_FLAGS_QUERY_GUARDING
flags.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:44:46 +01:00
ec - > event_state = EC_EVENT_READY ;
2019-10-14 16:56:01 +08:00
acpi_ec_unmask_events ( ec ) ;
ACPI / EC: Fix issues related to the SCI_EVT handling
This patch fixes 2 issues related to the draining behavior. But it doesn't
implement the draining support, it only cleans up code so that further
draining support is possible.
The draining behavior is expected by some platforms (for example, Samsung)
where SCI_EVT is set only once for a set of events and might be cleared for
the very first QR_EC command issued after SCI_EVT is set. EC firmware on
such platforms will return 0x00 to indicate "no outstanding event". Thus
after seeing an SCI_EVT indication, EC driver need to fetch events until
0x00 returned (see acpi_ec_clear()).
Issue 1 - acpi_ec_submit_query():
It's reported on Samsung laptops that SCI_EVT isn't checked when the
transactions are advanced in ec_poll(), which leads to SCI_EVT triggering
source lost:
If no EC GPE IRQs are arrived after that, EC driver cannot detect this
event and handle it.
See comment 244/247 for kernel bugzilla 44161.
This patch fixes this issue by moving SCI_EVT checks into
advance_transaction(). So that SCI_EVT is checked each time we are going to
handle the EC firmware indications. And this check will happen for both IRQ
context and task context.
Since after doing that, SCI_EVT is also checked after completing a
transaction, ec_check_sci() and ec_check_sci_sync() can be removed.
Issue 2 - acpi_ec_complete_query():
We expect to clear EC_FLAGS_QUERY_PENDING to allow queuing another draining
QR_EC after writing a QR_EC command and before reading the event. After
reading the event, SCI_EVT might be cleared by the firmware, thus it may
not be possible to queue such a draining QR_EC at that time.
But putting the EC_FLAGS_QUERY_PENDING clearing code after
start_transaction() is wrong as there are chances that after
start_transaction(), QR_EC can fail to be sent. If this happens,
EC_FLAG_QUERY_PENDING will be cleared earlier. As a consequence, the
draining QR_EC will also be queued earlier than expected.
This patch also moves this code into advance_transaction() where QR_EC is
just sent (ACPI_EC_COMMAND_POLL flagged) to fix this issue.
Notes:
1. After introducing the 2 SCI_EVT related handlings into
advance_transaction(), a next QR_EC can be queued right after writing
the current QR_EC command and before reading the event. But this still
hasn't implemented the draining behavior as the draining support
requires:
If a previous returned event value isn't 0x00, a draining QR_EC need
to be issued even when SCI_EVT isn't set.
2. In this patch, acpi_os_execute() is also converted into a seperate work
item to avoid invoking kmalloc() in the atomic context. We can do this
because of the previous global lock fix.
3. Originally, EC_FLAGS_EVENT_PENDING is also used to avoid queuing up
multiple work items (created by acpi_os_execute()), this can be covered
by only using a single work item. But this patch still keeps this flag
as there are different usages in the driver initialization steps relying
on this flag.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=44161
Reported-by: Kieran Clancy <clancy.kieran@gmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-14 19:28:47 +08:00
}
2016-08-03 16:01:24 +08:00
static inline void __acpi_ec_enable_event ( struct acpi_ec * ec )
{
if ( ! test_and_set_bit ( EC_FLAGS_QUERY_ENABLED , & ec - > flags ) )
ec_log_drv ( " event unblocked " ) ;
2017-09-26 16:54:03 +08:00
/*
* Unconditionally invoke this once after enabling the event
* handling mechanism to detect the pending events .
*/
2020-11-13 19:13:17 +01:00
advance_transaction ( ec , false ) ;
2016-08-03 16:01:24 +08:00
}
static inline void __acpi_ec_disable_event ( struct acpi_ec * ec )
{
if ( test_and_clear_bit ( EC_FLAGS_QUERY_ENABLED , & ec - > flags ) )
ec_log_drv ( " event blocked " ) ;
}
2019-02-01 14:13:41 +08:00
/*
* Process _Q events that might have accumulated in the EC .
* Run with locked ec mutex .
*/
static void acpi_ec_clear ( struct acpi_ec * ec )
{
2021-11-23 19:38:21 +01:00
int i ;
2019-02-01 14:13:41 +08:00
for ( i = 0 ; i < ACPI_EC_CLEAR_MAX ; i + + ) {
2021-11-23 19:42:02 +01:00
if ( acpi_ec_submit_query ( ec ) )
2019-02-01 14:13:41 +08:00
break ;
}
if ( unlikely ( i = = ACPI_EC_CLEAR_MAX ) )
pr_warn ( " Warning: Maximum of %d stale EC events cleared \n " , i ) ;
else
pr_info ( " %d stale EC events cleared \n " , i ) ;
}
2016-08-03 16:01:24 +08:00
static void acpi_ec_enable_event ( struct acpi_ec * ec )
{
unsigned long flags ;
spin_lock_irqsave ( & ec - > lock , flags ) ;
if ( acpi_ec_started ( ec ) )
__acpi_ec_enable_event ( ec ) ;
spin_unlock_irqrestore ( & ec - > lock , flags ) ;
2019-02-01 14:13:41 +08:00
/* Drain additional events if hardware requires that */
if ( EC_FLAGS_CLEAR_ON_RESUME )
acpi_ec_clear ( ec ) ;
2016-08-03 16:01:24 +08:00
}
2016-10-05 10:33:16 -07:00
# ifdef CONFIG_PM_SLEEP
ACPI: EC: Rework flushing of pending work
There is a race condition in the ACPI EC driver, between
__acpi_ec_flush_event() and acpi_ec_event_handler(), that may
cause systems to stay in suspended-to-idle forever after a wakeup
event coming from the EC.
Namely, acpi_s2idle_wake() calls acpi_ec_flush_work() to wait until
the delayed work resulting from the handling of the EC GPE in
acpi_ec_dispatch_gpe() is processed, and that function invokes
__acpi_ec_flush_event() which uses wait_event() to wait for
ec->nr_pending_queries to become zero on ec->wait, and that wait
queue may be woken up too early.
Suppose that acpi_ec_dispatch_gpe() has caused acpi_ec_gpe_handler()
to run, so advance_transaction() has been called and it has invoked
acpi_ec_submit_query() to queue up an event work item, so
ec->nr_pending_queries has been incremented (under ec->lock). The
work function of that work item, acpi_ec_event_handler() runs later
and calls acpi_ec_query() to process the event. That function calls
acpi_ec_transaction() which invokes acpi_ec_transaction_unlocked()
and the latter wakes up ec->wait under ec->lock, but it drops that
lock before returning.
When acpi_ec_query() returns, acpi_ec_event_handler() acquires
ec->lock and decrements ec->nr_pending_queries, but at that point
__acpi_ec_flush_event() (woken up previously) may already have
acquired ec->lock, checked the value of ec->nr_pending_queries (and
it would not have been zero then) and decided to go back to sleep.
Next, if ec->nr_pending_queries is equal to zero now, the loop
in acpi_ec_event_handler() terminates, ec->lock is released and
acpi_ec_check_event() is called, but it does nothing unless
ec_event_clearing is equal to ACPI_EC_EVT_TIMING_EVENT (which is
not the case by default). In the end, if no more event work items
have been queued up while executing acpi_ec_transaction_unlocked(),
there is nothing to wake up __acpi_ec_flush_event() again and it
sleeps forever, so the suspend-to-idle loop cannot make progress and
the system is permanently suspended.
To avoid this issue, notice that it actually is not necessary to
wait for ec->nr_pending_queries to become zero in every case in
which __acpi_ec_flush_event() is used.
First, during platform-based system suspend (not suspend-to-idle),
__acpi_ec_flush_event() is called by acpi_ec_disable_event() after
clearing the EC_FLAGS_QUERY_ENABLED flag, which prevents
acpi_ec_submit_query() from submitting any new event work items,
so calling flush_scheduled_work() and flushing ec_query_wq
subsequently (in order to wait until all of the queries in that
queue have been processed) would be sufficient to flush all of
the pending EC work in that case.
Second, the purpose of the flushing of pending EC work while
suspended-to-idle described above really is to wait until the
first event work item coming from acpi_ec_dispatch_gpe() is
complete, because it should produce system wakeup events if
that is a valid EC-based system wakeup, so calling
flush_scheduled_work() followed by flushing ec_query_wq is also
sufficient for that purpose.
Rework the code to follow the above observations.
Fixes: 56b9918490 ("PM: sleep: Simplify suspend-to-idle control flow")
Reported-by: Kenneth R. Crudup <kenny@panix.com>
Tested-by: Kenneth R. Crudup <kenny@panix.com>
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-11-28 23:47:51 +01:00
static void __acpi_ec_flush_work ( void )
ACPI / EC: Add PM operations to improve event handling for suspend process
In the original EC driver, though the event handling is not explicitly
stopped, the EC driver is actually not able to handle events during the
noirq stage as the EC driver is not prepared to handle the EC events in the
polling mode. So if there is no advance_transaction() triggered, the EC
driver couldn't notice the EC events.
However, do we actually need to handle EC events during suspend/resume
stage? EC events are mostly useless for the suspend/resume period (key
strokes and battery/thermal updates, etc.,), and the useful ones (lid
close, power/sleep button press) should have already been delivered to the
OSPM to trigger the power saving operations.
Thus this patch implements acpi_ec_disable_event() to be a reverse call of
acpi_ec_enable_event(), with which, the EC driver is able to stop handling
the EC events in a position before entering the noirq stage.
Since there are actually 2 choices for us:
1. implement event handling in polling mode;
2. stop event handling before entering noirq stage.
And this patch only implements the second choice using .suspend() callback.
Thus this is experimental (first choice is better? or different hook
position is better?). This patch finally keeps the old behavior by default
and prepares a boot parameter to enable this feature.
The differences of the event handling availability between the old behavior
(this patch is not applied) and the new behavior (this patch is applied)
are as follows:
!FreezeEvents FreezeEvents
before suspend Y Y
suspend before EC Y Y
suspend after EC Y N
suspend_late Y N
suspend_noirq Y (actually N) N
resume_noirq Y (actually N) N
resume_late Y (actually N) N
resume before EC Y (actually N) N
resume after EC Y Y
after resume Y Y
Where "actually N" means if there is no EC transactions, the EC driver
is actually not able to notice the pending events.
We can see that FreezeEvents is the only approach now can actually flush
the EC event handling with both query commands and _Qxx evaluations
flushed, other modes can only flush the EC event handling with only query
commands flushed, _Qxx evaluations occurred after stopping the EC driver
may end up failure due to the failure of the EC transaction carried out in
the _Qxx control methods.
We also can see that this feature should be able to trigger some platform
notifications later than resuming other drivers.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-03 16:01:43 +08:00
{
ACPI: EC: Rework flushing of EC work while suspended to idle
The flushing of pending work in the EC driver uses drain_workqueue()
to flush the event handling work that can requeue itself via
advance_transaction(), but this is problematic, because that
work may also be requeued from the query workqueue.
Namely, if an EC transaction is carried out during the execution of
a query handler, it involves calling advance_transaction() which
may queue up the event handling work again. This causes the kernel
to complain about attempts to add a work item to the EC event
workqueue while it is being drained and worst-case it may cause a
valid event to be skipped.
To avoid this problem, introduce two new counters, events_in_progress
and queries_in_progress, incremented when a work item is queued on
the event workqueue or the query workqueue, respectively, and
decremented at the end of the corresponding work function, and make
acpi_ec_dispatch_gpe() the workqueues in a loop until the both of
these counters are zero (or system wakeup is pending) instead of
calling acpi_ec_flush_work().
At the same time, change __acpi_ec_flush_work() to call
flush_workqueue() instead of drain_workqueue() to flush the event
workqueue.
While at it, use the observation that the work item queued in
acpi_ec_query() cannot be pending at that time, because it is used
only once, to simplify the code in there.
Additionally, clean up a comment in acpi_ec_query() and adjust white
space in acpi_ec_event_processor().
Fixes: f0ac20c3f613 ("ACPI: EC: Fix flushing of pending work")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:36:51 +01:00
flush_workqueue ( ec_wq ) ; /* flush ec->work */
ACPI: EC: Rework flushing of pending work
There is a race condition in the ACPI EC driver, between
__acpi_ec_flush_event() and acpi_ec_event_handler(), that may
cause systems to stay in suspended-to-idle forever after a wakeup
event coming from the EC.
Namely, acpi_s2idle_wake() calls acpi_ec_flush_work() to wait until
the delayed work resulting from the handling of the EC GPE in
acpi_ec_dispatch_gpe() is processed, and that function invokes
__acpi_ec_flush_event() which uses wait_event() to wait for
ec->nr_pending_queries to become zero on ec->wait, and that wait
queue may be woken up too early.
Suppose that acpi_ec_dispatch_gpe() has caused acpi_ec_gpe_handler()
to run, so advance_transaction() has been called and it has invoked
acpi_ec_submit_query() to queue up an event work item, so
ec->nr_pending_queries has been incremented (under ec->lock). The
work function of that work item, acpi_ec_event_handler() runs later
and calls acpi_ec_query() to process the event. That function calls
acpi_ec_transaction() which invokes acpi_ec_transaction_unlocked()
and the latter wakes up ec->wait under ec->lock, but it drops that
lock before returning.
When acpi_ec_query() returns, acpi_ec_event_handler() acquires
ec->lock and decrements ec->nr_pending_queries, but at that point
__acpi_ec_flush_event() (woken up previously) may already have
acquired ec->lock, checked the value of ec->nr_pending_queries (and
it would not have been zero then) and decided to go back to sleep.
Next, if ec->nr_pending_queries is equal to zero now, the loop
in acpi_ec_event_handler() terminates, ec->lock is released and
acpi_ec_check_event() is called, but it does nothing unless
ec_event_clearing is equal to ACPI_EC_EVT_TIMING_EVENT (which is
not the case by default). In the end, if no more event work items
have been queued up while executing acpi_ec_transaction_unlocked(),
there is nothing to wake up __acpi_ec_flush_event() again and it
sleeps forever, so the suspend-to-idle loop cannot make progress and
the system is permanently suspended.
To avoid this issue, notice that it actually is not necessary to
wait for ec->nr_pending_queries to become zero in every case in
which __acpi_ec_flush_event() is used.
First, during platform-based system suspend (not suspend-to-idle),
__acpi_ec_flush_event() is called by acpi_ec_disable_event() after
clearing the EC_FLAGS_QUERY_ENABLED flag, which prevents
acpi_ec_submit_query() from submitting any new event work items,
so calling flush_scheduled_work() and flushing ec_query_wq
subsequently (in order to wait until all of the queries in that
queue have been processed) would be sufficient to flush all of
the pending EC work in that case.
Second, the purpose of the flushing of pending EC work while
suspended-to-idle described above really is to wait until the
first event work item coming from acpi_ec_dispatch_gpe() is
complete, because it should produce system wakeup events if
that is a valid EC-based system wakeup, so calling
flush_scheduled_work() followed by flushing ec_query_wq is also
sufficient for that purpose.
Rework the code to follow the above observations.
Fixes: 56b9918490 ("PM: sleep: Simplify suspend-to-idle control flow")
Reported-by: Kenneth R. Crudup <kenny@panix.com>
Tested-by: Kenneth R. Crudup <kenny@panix.com>
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-11-28 23:47:51 +01:00
flush_workqueue ( ec_query_wq ) ; /* flush queries */
ACPI / EC: Add PM operations to improve event handling for suspend process
In the original EC driver, though the event handling is not explicitly
stopped, the EC driver is actually not able to handle events during the
noirq stage as the EC driver is not prepared to handle the EC events in the
polling mode. So if there is no advance_transaction() triggered, the EC
driver couldn't notice the EC events.
However, do we actually need to handle EC events during suspend/resume
stage? EC events are mostly useless for the suspend/resume period (key
strokes and battery/thermal updates, etc.,), and the useful ones (lid
close, power/sleep button press) should have already been delivered to the
OSPM to trigger the power saving operations.
Thus this patch implements acpi_ec_disable_event() to be a reverse call of
acpi_ec_enable_event(), with which, the EC driver is able to stop handling
the EC events in a position before entering the noirq stage.
Since there are actually 2 choices for us:
1. implement event handling in polling mode;
2. stop event handling before entering noirq stage.
And this patch only implements the second choice using .suspend() callback.
Thus this is experimental (first choice is better? or different hook
position is better?). This patch finally keeps the old behavior by default
and prepares a boot parameter to enable this feature.
The differences of the event handling availability between the old behavior
(this patch is not applied) and the new behavior (this patch is applied)
are as follows:
!FreezeEvents FreezeEvents
before suspend Y Y
suspend before EC Y Y
suspend after EC Y N
suspend_late Y N
suspend_noirq Y (actually N) N
resume_noirq Y (actually N) N
resume_late Y (actually N) N
resume before EC Y (actually N) N
resume after EC Y Y
after resume Y Y
Where "actually N" means if there is no EC transactions, the EC driver
is actually not able to notice the pending events.
We can see that FreezeEvents is the only approach now can actually flush
the EC event handling with both query commands and _Qxx evaluations
flushed, other modes can only flush the EC event handling with only query
commands flushed, _Qxx evaluations occurred after stopping the EC driver
may end up failure due to the failure of the EC transaction carried out in
the _Qxx control methods.
We also can see that this feature should be able to trigger some platform
notifications later than resuming other drivers.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-03 16:01:43 +08:00
}
static void acpi_ec_disable_event ( struct acpi_ec * ec )
{
unsigned long flags ;
spin_lock_irqsave ( & ec - > lock , flags ) ;
__acpi_ec_disable_event ( ec ) ;
spin_unlock_irqrestore ( & ec - > lock , flags ) ;
ACPI: EC: Rework flushing of pending work
There is a race condition in the ACPI EC driver, between
__acpi_ec_flush_event() and acpi_ec_event_handler(), that may
cause systems to stay in suspended-to-idle forever after a wakeup
event coming from the EC.
Namely, acpi_s2idle_wake() calls acpi_ec_flush_work() to wait until
the delayed work resulting from the handling of the EC GPE in
acpi_ec_dispatch_gpe() is processed, and that function invokes
__acpi_ec_flush_event() which uses wait_event() to wait for
ec->nr_pending_queries to become zero on ec->wait, and that wait
queue may be woken up too early.
Suppose that acpi_ec_dispatch_gpe() has caused acpi_ec_gpe_handler()
to run, so advance_transaction() has been called and it has invoked
acpi_ec_submit_query() to queue up an event work item, so
ec->nr_pending_queries has been incremented (under ec->lock). The
work function of that work item, acpi_ec_event_handler() runs later
and calls acpi_ec_query() to process the event. That function calls
acpi_ec_transaction() which invokes acpi_ec_transaction_unlocked()
and the latter wakes up ec->wait under ec->lock, but it drops that
lock before returning.
When acpi_ec_query() returns, acpi_ec_event_handler() acquires
ec->lock and decrements ec->nr_pending_queries, but at that point
__acpi_ec_flush_event() (woken up previously) may already have
acquired ec->lock, checked the value of ec->nr_pending_queries (and
it would not have been zero then) and decided to go back to sleep.
Next, if ec->nr_pending_queries is equal to zero now, the loop
in acpi_ec_event_handler() terminates, ec->lock is released and
acpi_ec_check_event() is called, but it does nothing unless
ec_event_clearing is equal to ACPI_EC_EVT_TIMING_EVENT (which is
not the case by default). In the end, if no more event work items
have been queued up while executing acpi_ec_transaction_unlocked(),
there is nothing to wake up __acpi_ec_flush_event() again and it
sleeps forever, so the suspend-to-idle loop cannot make progress and
the system is permanently suspended.
To avoid this issue, notice that it actually is not necessary to
wait for ec->nr_pending_queries to become zero in every case in
which __acpi_ec_flush_event() is used.
First, during platform-based system suspend (not suspend-to-idle),
__acpi_ec_flush_event() is called by acpi_ec_disable_event() after
clearing the EC_FLAGS_QUERY_ENABLED flag, which prevents
acpi_ec_submit_query() from submitting any new event work items,
so calling flush_scheduled_work() and flushing ec_query_wq
subsequently (in order to wait until all of the queries in that
queue have been processed) would be sufficient to flush all of
the pending EC work in that case.
Second, the purpose of the flushing of pending EC work while
suspended-to-idle described above really is to wait until the
first event work item coming from acpi_ec_dispatch_gpe() is
complete, because it should produce system wakeup events if
that is a valid EC-based system wakeup, so calling
flush_scheduled_work() followed by flushing ec_query_wq is also
sufficient for that purpose.
Rework the code to follow the above observations.
Fixes: 56b9918490 ("PM: sleep: Simplify suspend-to-idle control flow")
Reported-by: Kenneth R. Crudup <kenny@panix.com>
Tested-by: Kenneth R. Crudup <kenny@panix.com>
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-11-28 23:47:51 +01:00
/*
* When ec_freeze_events is true , we need to flush events in
* the proper position before entering the noirq stage .
*/
__acpi_ec_flush_work ( ) ;
ACPI / EC: Add PM operations to improve event handling for suspend process
In the original EC driver, though the event handling is not explicitly
stopped, the EC driver is actually not able to handle events during the
noirq stage as the EC driver is not prepared to handle the EC events in the
polling mode. So if there is no advance_transaction() triggered, the EC
driver couldn't notice the EC events.
However, do we actually need to handle EC events during suspend/resume
stage? EC events are mostly useless for the suspend/resume period (key
strokes and battery/thermal updates, etc.,), and the useful ones (lid
close, power/sleep button press) should have already been delivered to the
OSPM to trigger the power saving operations.
Thus this patch implements acpi_ec_disable_event() to be a reverse call of
acpi_ec_enable_event(), with which, the EC driver is able to stop handling
the EC events in a position before entering the noirq stage.
Since there are actually 2 choices for us:
1. implement event handling in polling mode;
2. stop event handling before entering noirq stage.
And this patch only implements the second choice using .suspend() callback.
Thus this is experimental (first choice is better? or different hook
position is better?). This patch finally keeps the old behavior by default
and prepares a boot parameter to enable this feature.
The differences of the event handling availability between the old behavior
(this patch is not applied) and the new behavior (this patch is applied)
are as follows:
!FreezeEvents FreezeEvents
before suspend Y Y
suspend before EC Y Y
suspend after EC Y N
suspend_late Y N
suspend_noirq Y (actually N) N
resume_noirq Y (actually N) N
resume_late Y (actually N) N
resume before EC Y (actually N) N
resume after EC Y Y
after resume Y Y
Where "actually N" means if there is no EC transactions, the EC driver
is actually not able to notice the pending events.
We can see that FreezeEvents is the only approach now can actually flush
the EC event handling with both query commands and _Qxx evaluations
flushed, other modes can only flush the EC event handling with only query
commands flushed, _Qxx evaluations occurred after stopping the EC driver
may end up failure due to the failure of the EC transaction carried out in
the _Qxx control methods.
We also can see that this feature should be able to trigger some platform
notifications later than resuming other drivers.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-03 16:01:43 +08:00
}
2017-07-20 03:43:12 +02:00
void acpi_ec_flush_work ( void )
{
ACPI: EC: Fix flushing of pending work
Commit 016b87ca5c8c ("ACPI: EC: Rework flushing of pending work")
introduced a subtle bug into the flushing of pending EC work while
suspended to idle, which may cause the EC driver to fail to
re-enable the EC GPE after handling a non-wakeup event (like a
battery status change event, for example).
The problem is that the work item flushed by flush_scheduled_work()
in __acpi_ec_flush_work() may disable the EC GPE and schedule another
work item expected to re-enable it, but that new work item is not
flushed, so __acpi_ec_flush_work() returns with the EC GPE disabled
and the CPU running it goes into an idle state subsequently. If all
of the other CPUs are in idle states at that point, the EC GPE won't
be re-enabled until at least one CPU is woken up by another interrupt
source, so system wakeup events that would normally come from the EC
then don't work.
This is reproducible on a Dell XPS13 9360 in my office which
sometimes stops reacting to power button and lid events (triggered
by the EC on that machine) after switching from AC power to battery
power or vice versa while suspended to idle (each of those switches
causes the EC GPE to trigger for several times in a row, but they
are not system wakeup events).
To avoid this problem, it is necessary to drain the workqueue
entirely in __acpi_ec_flush_work(), but that cannot be done with
respect to system_wq, because work items may be added to it from
other places while __acpi_ec_flush_work() is running. For this
reason, make the EC driver use a dedicated workqueue for EC events
processing (let that workqueue be ordered so that EC events are
processed sequentially) and use drain_workqueue() on it in
__acpi_ec_flush_work().
Fixes: 016b87ca5c8c ("ACPI: EC: Rework flushing of pending work")
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-02-11 10:07:43 +01:00
/* Without ec_wq there is nothing to flush. */
if ( ! ec_wq )
ACPI: EC: Rework flushing of pending work
There is a race condition in the ACPI EC driver, between
__acpi_ec_flush_event() and acpi_ec_event_handler(), that may
cause systems to stay in suspended-to-idle forever after a wakeup
event coming from the EC.
Namely, acpi_s2idle_wake() calls acpi_ec_flush_work() to wait until
the delayed work resulting from the handling of the EC GPE in
acpi_ec_dispatch_gpe() is processed, and that function invokes
__acpi_ec_flush_event() which uses wait_event() to wait for
ec->nr_pending_queries to become zero on ec->wait, and that wait
queue may be woken up too early.
Suppose that acpi_ec_dispatch_gpe() has caused acpi_ec_gpe_handler()
to run, so advance_transaction() has been called and it has invoked
acpi_ec_submit_query() to queue up an event work item, so
ec->nr_pending_queries has been incremented (under ec->lock). The
work function of that work item, acpi_ec_event_handler() runs later
and calls acpi_ec_query() to process the event. That function calls
acpi_ec_transaction() which invokes acpi_ec_transaction_unlocked()
and the latter wakes up ec->wait under ec->lock, but it drops that
lock before returning.
When acpi_ec_query() returns, acpi_ec_event_handler() acquires
ec->lock and decrements ec->nr_pending_queries, but at that point
__acpi_ec_flush_event() (woken up previously) may already have
acquired ec->lock, checked the value of ec->nr_pending_queries (and
it would not have been zero then) and decided to go back to sleep.
Next, if ec->nr_pending_queries is equal to zero now, the loop
in acpi_ec_event_handler() terminates, ec->lock is released and
acpi_ec_check_event() is called, but it does nothing unless
ec_event_clearing is equal to ACPI_EC_EVT_TIMING_EVENT (which is
not the case by default). In the end, if no more event work items
have been queued up while executing acpi_ec_transaction_unlocked(),
there is nothing to wake up __acpi_ec_flush_event() again and it
sleeps forever, so the suspend-to-idle loop cannot make progress and
the system is permanently suspended.
To avoid this issue, notice that it actually is not necessary to
wait for ec->nr_pending_queries to become zero in every case in
which __acpi_ec_flush_event() is used.
First, during platform-based system suspend (not suspend-to-idle),
__acpi_ec_flush_event() is called by acpi_ec_disable_event() after
clearing the EC_FLAGS_QUERY_ENABLED flag, which prevents
acpi_ec_submit_query() from submitting any new event work items,
so calling flush_scheduled_work() and flushing ec_query_wq
subsequently (in order to wait until all of the queries in that
queue have been processed) would be sufficient to flush all of
the pending EC work in that case.
Second, the purpose of the flushing of pending EC work while
suspended-to-idle described above really is to wait until the
first event work item coming from acpi_ec_dispatch_gpe() is
complete, because it should produce system wakeup events if
that is a valid EC-based system wakeup, so calling
flush_scheduled_work() followed by flushing ec_query_wq is also
sufficient for that purpose.
Rework the code to follow the above observations.
Fixes: 56b9918490 ("PM: sleep: Simplify suspend-to-idle control flow")
Reported-by: Kenneth R. Crudup <kenny@panix.com>
Tested-by: Kenneth R. Crudup <kenny@panix.com>
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-11-28 23:47:51 +01:00
return ;
2017-07-20 03:43:12 +02:00
ACPI: EC: Rework flushing of pending work
There is a race condition in the ACPI EC driver, between
__acpi_ec_flush_event() and acpi_ec_event_handler(), that may
cause systems to stay in suspended-to-idle forever after a wakeup
event coming from the EC.
Namely, acpi_s2idle_wake() calls acpi_ec_flush_work() to wait until
the delayed work resulting from the handling of the EC GPE in
acpi_ec_dispatch_gpe() is processed, and that function invokes
__acpi_ec_flush_event() which uses wait_event() to wait for
ec->nr_pending_queries to become zero on ec->wait, and that wait
queue may be woken up too early.
Suppose that acpi_ec_dispatch_gpe() has caused acpi_ec_gpe_handler()
to run, so advance_transaction() has been called and it has invoked
acpi_ec_submit_query() to queue up an event work item, so
ec->nr_pending_queries has been incremented (under ec->lock). The
work function of that work item, acpi_ec_event_handler() runs later
and calls acpi_ec_query() to process the event. That function calls
acpi_ec_transaction() which invokes acpi_ec_transaction_unlocked()
and the latter wakes up ec->wait under ec->lock, but it drops that
lock before returning.
When acpi_ec_query() returns, acpi_ec_event_handler() acquires
ec->lock and decrements ec->nr_pending_queries, but at that point
__acpi_ec_flush_event() (woken up previously) may already have
acquired ec->lock, checked the value of ec->nr_pending_queries (and
it would not have been zero then) and decided to go back to sleep.
Next, if ec->nr_pending_queries is equal to zero now, the loop
in acpi_ec_event_handler() terminates, ec->lock is released and
acpi_ec_check_event() is called, but it does nothing unless
ec_event_clearing is equal to ACPI_EC_EVT_TIMING_EVENT (which is
not the case by default). In the end, if no more event work items
have been queued up while executing acpi_ec_transaction_unlocked(),
there is nothing to wake up __acpi_ec_flush_event() again and it
sleeps forever, so the suspend-to-idle loop cannot make progress and
the system is permanently suspended.
To avoid this issue, notice that it actually is not necessary to
wait for ec->nr_pending_queries to become zero in every case in
which __acpi_ec_flush_event() is used.
First, during platform-based system suspend (not suspend-to-idle),
__acpi_ec_flush_event() is called by acpi_ec_disable_event() after
clearing the EC_FLAGS_QUERY_ENABLED flag, which prevents
acpi_ec_submit_query() from submitting any new event work items,
so calling flush_scheduled_work() and flushing ec_query_wq
subsequently (in order to wait until all of the queries in that
queue have been processed) would be sufficient to flush all of
the pending EC work in that case.
Second, the purpose of the flushing of pending EC work while
suspended-to-idle described above really is to wait until the
first event work item coming from acpi_ec_dispatch_gpe() is
complete, because it should produce system wakeup events if
that is a valid EC-based system wakeup, so calling
flush_scheduled_work() followed by flushing ec_query_wq is also
sufficient for that purpose.
Rework the code to follow the above observations.
Fixes: 56b9918490 ("PM: sleep: Simplify suspend-to-idle control flow")
Reported-by: Kenneth R. Crudup <kenny@panix.com>
Tested-by: Kenneth R. Crudup <kenny@panix.com>
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-11-28 23:47:51 +01:00
__acpi_ec_flush_work ( ) ;
2017-07-20 03:43:12 +02:00
}
2016-10-05 10:33:16 -07:00
# endif /* CONFIG_PM_SLEEP */
ACPI / EC: Add PM operations to improve event handling for suspend process
In the original EC driver, though the event handling is not explicitly
stopped, the EC driver is actually not able to handle events during the
noirq stage as the EC driver is not prepared to handle the EC events in the
polling mode. So if there is no advance_transaction() triggered, the EC
driver couldn't notice the EC events.
However, do we actually need to handle EC events during suspend/resume
stage? EC events are mostly useless for the suspend/resume period (key
strokes and battery/thermal updates, etc.,), and the useful ones (lid
close, power/sleep button press) should have already been delivered to the
OSPM to trigger the power saving operations.
Thus this patch implements acpi_ec_disable_event() to be a reverse call of
acpi_ec_enable_event(), with which, the EC driver is able to stop handling
the EC events in a position before entering the noirq stage.
Since there are actually 2 choices for us:
1. implement event handling in polling mode;
2. stop event handling before entering noirq stage.
And this patch only implements the second choice using .suspend() callback.
Thus this is experimental (first choice is better? or different hook
position is better?). This patch finally keeps the old behavior by default
and prepares a boot parameter to enable this feature.
The differences of the event handling availability between the old behavior
(this patch is not applied) and the new behavior (this patch is applied)
are as follows:
!FreezeEvents FreezeEvents
before suspend Y Y
suspend before EC Y Y
suspend after EC Y N
suspend_late Y N
suspend_noirq Y (actually N) N
resume_noirq Y (actually N) N
resume_late Y (actually N) N
resume before EC Y (actually N) N
resume after EC Y Y
after resume Y Y
Where "actually N" means if there is no EC transactions, the EC driver
is actually not able to notice the pending events.
We can see that FreezeEvents is the only approach now can actually flush
the EC event handling with both query commands and _Qxx evaluations
flushed, other modes can only flush the EC event handling with only query
commands flushed, _Qxx evaluations occurred after stopping the EC driver
may end up failure due to the failure of the EC transaction carried out in
the _Qxx control methods.
We also can see that this feature should be able to trigger some platform
notifications later than resuming other drivers.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-03 16:01:43 +08:00
2015-06-11 13:21:38 +08:00
static bool acpi_ec_guard_event ( struct acpi_ec * ec )
{
2015-09-24 14:54:54 +08:00
unsigned long flags ;
ACPI: EC: Make the event work state machine visible
The EC driver uses a relatively simple state machine for the event
work handling, but it is not really straightforward to figure out.
The states are as follows:
"Ready": The event handling work can be submitted.
In this state, the EC_FLAGS_QUERY_PENDING flag is clear.
"In progress": The event handling work is pending or is being
processed. It cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and both the
events_to_process count is nonzero and the EC_FLAGS_QUERY_GUARDING
flag is clear.
"Complete": The event handling work has been completed, but it still
cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and the
events_to_process count is zero or the EC_FLAGS_QUERY_GUARDING
flag is set.
The state changes from "Ready" to "In progress" when new event is
detected by advance_transaction() and acpi_ec_submit_event() is
called by it.
Next, the state can change from "In progress" directly to "Ready" in
the following situations:
* ec_event_clearing is ACPI_EC_EVT_TIMING_STATUS and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes ACPI_EC_COMMAND_POLL.
* ec_event_clearing is ACPI_EC_EVT_TIMING_QUERY and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* ec_event_clearing is either ACPI_EC_EVT_TIMING_STATUS or
ACPI_EC_EVT_TIMING_QUERY and there are no more events to
process (ie. ec->events_to_process becomes 0).
If ec_event_clearing is ACPI_EC_EVT_TIMING_EVENT, however, the
state must change from "In progress" to "Complete" before it
can change to "Ready". The changes from "In progress" to
"Complete" in that case occur in the following situations:
* The state of an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* There are no more events to process (ie. ec->events_to_process
becomes 0).
Finally, the state changes from "Complete" to "Ready" when
advance_transaction() is invoked when the state is "Complete" and
the state of the current transaction is not ACPI_EC_COMMAND_POLL.
To make this state machine visible in the code, add a new
event_state field to struct acpi_ec and modify the code to use
it istead the EC_FLAGS_QUERY_PENDING and EC_FLAGS_QUERY_GUARDING
flags.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:44:46 +01:00
bool guarded ;
2015-09-24 14:54:54 +08:00
spin_lock_irqsave ( & ec - > lock , flags ) ;
/*
* If firmware SCI_EVT clearing timing is " event " , we actually
* don ' t know when the SCI_EVT will be cleared by firmware after
* evaluating _Qxx , so we need to re - check SCI_EVT after waiting an
* acceptable period .
*
ACPI: EC: Make the event work state machine visible
The EC driver uses a relatively simple state machine for the event
work handling, but it is not really straightforward to figure out.
The states are as follows:
"Ready": The event handling work can be submitted.
In this state, the EC_FLAGS_QUERY_PENDING flag is clear.
"In progress": The event handling work is pending or is being
processed. It cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and both the
events_to_process count is nonzero and the EC_FLAGS_QUERY_GUARDING
flag is clear.
"Complete": The event handling work has been completed, but it still
cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and the
events_to_process count is zero or the EC_FLAGS_QUERY_GUARDING
flag is set.
The state changes from "Ready" to "In progress" when new event is
detected by advance_transaction() and acpi_ec_submit_event() is
called by it.
Next, the state can change from "In progress" directly to "Ready" in
the following situations:
* ec_event_clearing is ACPI_EC_EVT_TIMING_STATUS and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes ACPI_EC_COMMAND_POLL.
* ec_event_clearing is ACPI_EC_EVT_TIMING_QUERY and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* ec_event_clearing is either ACPI_EC_EVT_TIMING_STATUS or
ACPI_EC_EVT_TIMING_QUERY and there are no more events to
process (ie. ec->events_to_process becomes 0).
If ec_event_clearing is ACPI_EC_EVT_TIMING_EVENT, however, the
state must change from "In progress" to "Complete" before it
can change to "Ready". The changes from "In progress" to
"Complete" in that case occur in the following situations:
* The state of an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* There are no more events to process (ie. ec->events_to_process
becomes 0).
Finally, the state changes from "Complete" to "Ready" when
advance_transaction() is invoked when the state is "Complete" and
the state of the current transaction is not ACPI_EC_COMMAND_POLL.
To make this state machine visible in the code, add a new
event_state field to struct acpi_ec and modify the code to use
it istead the EC_FLAGS_QUERY_PENDING and EC_FLAGS_QUERY_GUARDING
flags.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:44:46 +01:00
* The guarding period is applicable if the event state is not
* EC_EVENT_READY , but otherwise if the current transaction is of the
* ACPI_EC_COMMAND_QUERY type , the guarding should have elapsed already
* and it should not be applied to let the transaction transition into
* the ACPI_EC_COMMAND_POLL state immediately .
2015-09-24 14:54:54 +08:00
*/
ACPI: EC: Make the event work state machine visible
The EC driver uses a relatively simple state machine for the event
work handling, but it is not really straightforward to figure out.
The states are as follows:
"Ready": The event handling work can be submitted.
In this state, the EC_FLAGS_QUERY_PENDING flag is clear.
"In progress": The event handling work is pending or is being
processed. It cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and both the
events_to_process count is nonzero and the EC_FLAGS_QUERY_GUARDING
flag is clear.
"Complete": The event handling work has been completed, but it still
cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and the
events_to_process count is zero or the EC_FLAGS_QUERY_GUARDING
flag is set.
The state changes from "Ready" to "In progress" when new event is
detected by advance_transaction() and acpi_ec_submit_event() is
called by it.
Next, the state can change from "In progress" directly to "Ready" in
the following situations:
* ec_event_clearing is ACPI_EC_EVT_TIMING_STATUS and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes ACPI_EC_COMMAND_POLL.
* ec_event_clearing is ACPI_EC_EVT_TIMING_QUERY and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* ec_event_clearing is either ACPI_EC_EVT_TIMING_STATUS or
ACPI_EC_EVT_TIMING_QUERY and there are no more events to
process (ie. ec->events_to_process becomes 0).
If ec_event_clearing is ACPI_EC_EVT_TIMING_EVENT, however, the
state must change from "In progress" to "Complete" before it
can change to "Ready". The changes from "In progress" to
"Complete" in that case occur in the following situations:
* The state of an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* There are no more events to process (ie. ec->events_to_process
becomes 0).
Finally, the state changes from "Complete" to "Ready" when
advance_transaction() is invoked when the state is "Complete" and
the state of the current transaction is not ACPI_EC_COMMAND_POLL.
To make this state machine visible in the code, add a new
event_state field to struct acpi_ec and modify the code to use
it istead the EC_FLAGS_QUERY_PENDING and EC_FLAGS_QUERY_GUARDING
flags.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:44:46 +01:00
guarded = ec_event_clearing = = ACPI_EC_EVT_TIMING_EVENT & &
ec - > event_state ! = EC_EVENT_READY & &
( ! ec - > curr | | ec - > curr - > command ! = ACPI_EC_COMMAND_QUERY ) ;
2015-09-24 14:54:54 +08:00
spin_unlock_irqrestore ( & ec - > lock , flags ) ;
return guarded ;
2015-06-11 13:21:38 +08:00
}
ACPI / EC: Fix and clean up register access guarding logics.
In the polling mode, EC driver shouldn't access the EC registers too
frequently. Though this statement is concluded from the non-root caused
bugs (see links below), we've maintained the register access guarding
logics in the current EC driver. The guarding logics can be found here and
there, makes it hard to root cause real timing issues. This patch collects
the guarding logics into one single function so that all hidden logics
related to this can be seen clearly.
The current guarding related code also has several issues:
1. Per-transaction timestamp prevents inter-transaction guarding from being
implemented in the same place. We have an inter-transaction udelay() in
acpi_ec_transaction_unblocked(), this logic can be merged into ec_poll()
if we can use per-device timestamp. This patch completes such merge to
form a new ec_guard() function and collects all guarding related hidden
logics in it.
One hidden logic is: there is no inter-transaction guarding performed
for non MSI quirk (wait polling mode), this patch skips
inter-transaction guarding before wait_event_timeout() for the wait
polling mode to reveal the hidden logic.
The other hidden logic is: there is msleep() inter-transaction guarding
performed when the GPE storming is observed. As after merging this
commit:
Commit: e1d4d90fc0313d3d58cbd7912c90f8ef24df45ff
Subject: ACPI / EC: Refine command storm prevention support
EC_FLAGS_COMMAND_STORM is ensured to be cleared after invoking
acpi_ec_transaction_unlocked(), the msleep() guard logic will never
happen now. Since no one complains such change, this logic is likely
added during the old times where the EC race issues are not fixed and
the bugs are false root-caused to the timing issue. This patch simply
removes the out-dated logic. We can restore it by stop skipping
inter-transaction guarding for wait polling mode.
Two different delay values are defined for msleep() and udelay() while
they are merged in this patch to 550us.
2. time_after() causes additional delay in the polling mode (can only be
observed in noirq suspend/resume processes where polling mode is always
used) before advance_transaction() is invoked ("wait polling" log is
added before wait_event_timeout()). We can see 2 wait_event_timeout()
invocations. This is because time_after() ensures a ">" validation while
we only need a ">=" validation here:
[ 86.739909] ACPI: Waking up from system sleep state S3
[ 86.742857] ACPI : EC: 2: Increase command
[ 86.742859] ACPI : EC: ***** Command(RD_EC) started *****
[ 86.742861] ACPI : EC: ===== TASK (0) =====
[ 86.742871] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.742873] ACPI : EC: EC_SC(W) = 0x80
[ 86.742876] ACPI : EC: ***** Event started *****
[ 86.742880] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.743972] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.747966] ACPI : EC: ===== TASK (0) =====
[ 86.747977] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.747978] ACPI : EC: EC_DATA(W) = 0x06
[ 86.747981] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.751971] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755969] ACPI : EC: ===== TASK (0) =====
[ 86.755991] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 86.755993] ACPI : EC: EC_DATA(R) = 0x03
[ 86.755994] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755995] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 86.755996] ACPI : EC: 1: Decrease command
This patch corrects this by using time_before() instead in ec_guard():
[ 54.283146] ACPI: Waking up from system sleep state S3
[ 54.285414] ACPI : EC: 2: Increase command
[ 54.285415] ACPI : EC: ***** Command(RD_EC) started *****
[ 54.285416] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.285417] ACPI : EC: ===== TASK (0) =====
[ 54.285424] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.285425] ACPI : EC: EC_SC(W) = 0x80
[ 54.285427] ACPI : EC: ***** Event started *****
[ 54.285429] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.287209] ACPI : EC: ===== TASK (0) =====
[ 54.287218] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.287219] ACPI : EC: EC_DATA(W) = 0x06
[ 54.287222] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291190] ACPI : EC: ===== TASK (0) =====
[ 54.291210] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 54.291213] ACPI : EC: EC_DATA(R) = 0x03
[ 54.291214] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291215] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 54.291216] ACPI : EC: 1: Decrease command
After cleaning up all guarding logics, we have one single function
ec_guard() collecting all old, non-root-caused, hidden logics. Then we can
easily tune the logics in one place to respond to the bug reports.
Except the time_before() change, all other changes do not change the
behavior of the EC driver.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=12011
Link: https://bugzilla.kernel.org/show_bug.cgi?id=20242
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77431
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-15 14:16:42 +08:00
static int ec_transaction_polled ( struct acpi_ec * ec )
{
unsigned long flags ;
int ret = 0 ;
spin_lock_irqsave ( & ec - > lock , flags ) ;
if ( ec - > curr & & ( ec - > curr - > flags & ACPI_EC_COMMAND_POLL ) )
ret = 1 ;
spin_unlock_irqrestore ( & ec - > lock , flags ) ;
return ret ;
}
2014-06-15 08:41:35 +08:00
static int ec_transaction_completed ( struct acpi_ec * ec )
2006-09-26 19:50:33 +04:00
{
2008-09-25 21:00:31 +04:00
unsigned long flags ;
int ret = 0 ;
2014-10-14 14:24:01 +08:00
2012-10-23 01:29:27 +02:00
spin_lock_irqsave ( & ec - > lock , flags ) ;
2014-06-15 08:42:07 +08:00
if ( ec - > curr & & ( ec - > curr - > flags & ACPI_EC_COMMAND_COMPLETE ) )
2008-09-25 21:00:31 +04:00
ret = 1 ;
2012-10-23 01:29:27 +02:00
spin_unlock_irqrestore ( & ec - > lock , flags ) ;
2008-09-25 21:00:31 +04:00
return ret ;
2005-07-23 04:08:00 -04:00
}
2005-03-19 01:10:05 -05:00
2015-06-11 13:21:23 +08:00
static inline void ec_transaction_transition ( struct acpi_ec * ec , unsigned long flag )
{
ec - > curr - > flags | = flag ;
ACPI: EC: Make the event work state machine visible
The EC driver uses a relatively simple state machine for the event
work handling, but it is not really straightforward to figure out.
The states are as follows:
"Ready": The event handling work can be submitted.
In this state, the EC_FLAGS_QUERY_PENDING flag is clear.
"In progress": The event handling work is pending or is being
processed. It cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and both the
events_to_process count is nonzero and the EC_FLAGS_QUERY_GUARDING
flag is clear.
"Complete": The event handling work has been completed, but it still
cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and the
events_to_process count is zero or the EC_FLAGS_QUERY_GUARDING
flag is set.
The state changes from "Ready" to "In progress" when new event is
detected by advance_transaction() and acpi_ec_submit_event() is
called by it.
Next, the state can change from "In progress" directly to "Ready" in
the following situations:
* ec_event_clearing is ACPI_EC_EVT_TIMING_STATUS and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes ACPI_EC_COMMAND_POLL.
* ec_event_clearing is ACPI_EC_EVT_TIMING_QUERY and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* ec_event_clearing is either ACPI_EC_EVT_TIMING_STATUS or
ACPI_EC_EVT_TIMING_QUERY and there are no more events to
process (ie. ec->events_to_process becomes 0).
If ec_event_clearing is ACPI_EC_EVT_TIMING_EVENT, however, the
state must change from "In progress" to "Complete" before it
can change to "Ready". The changes from "In progress" to
"Complete" in that case occur in the following situations:
* The state of an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* There are no more events to process (ie. ec->events_to_process
becomes 0).
Finally, the state changes from "Complete" to "Ready" when
advance_transaction() is invoked when the state is "Complete" and
the state of the current transaction is not ACPI_EC_COMMAND_POLL.
To make this state machine visible in the code, add a new
event_state field to struct acpi_ec and modify the code to use
it istead the EC_FLAGS_QUERY_PENDING and EC_FLAGS_QUERY_GUARDING
flags.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:44:46 +01:00
if ( ec - > curr - > command ! = ACPI_EC_COMMAND_QUERY )
return ;
switch ( ec_event_clearing ) {
case ACPI_EC_EVT_TIMING_STATUS :
if ( flag = = ACPI_EC_COMMAND_POLL )
2021-11-23 19:42:02 +01:00
acpi_ec_close_event ( ec ) ;
ACPI: EC: Make the event work state machine visible
The EC driver uses a relatively simple state machine for the event
work handling, but it is not really straightforward to figure out.
The states are as follows:
"Ready": The event handling work can be submitted.
In this state, the EC_FLAGS_QUERY_PENDING flag is clear.
"In progress": The event handling work is pending or is being
processed. It cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and both the
events_to_process count is nonzero and the EC_FLAGS_QUERY_GUARDING
flag is clear.
"Complete": The event handling work has been completed, but it still
cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and the
events_to_process count is zero or the EC_FLAGS_QUERY_GUARDING
flag is set.
The state changes from "Ready" to "In progress" when new event is
detected by advance_transaction() and acpi_ec_submit_event() is
called by it.
Next, the state can change from "In progress" directly to "Ready" in
the following situations:
* ec_event_clearing is ACPI_EC_EVT_TIMING_STATUS and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes ACPI_EC_COMMAND_POLL.
* ec_event_clearing is ACPI_EC_EVT_TIMING_QUERY and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* ec_event_clearing is either ACPI_EC_EVT_TIMING_STATUS or
ACPI_EC_EVT_TIMING_QUERY and there are no more events to
process (ie. ec->events_to_process becomes 0).
If ec_event_clearing is ACPI_EC_EVT_TIMING_EVENT, however, the
state must change from "In progress" to "Complete" before it
can change to "Ready". The changes from "In progress" to
"Complete" in that case occur in the following situations:
* The state of an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* There are no more events to process (ie. ec->events_to_process
becomes 0).
Finally, the state changes from "Complete" to "Ready" when
advance_transaction() is invoked when the state is "Complete" and
the state of the current transaction is not ACPI_EC_COMMAND_POLL.
To make this state machine visible in the code, add a new
event_state field to struct acpi_ec and modify the code to use
it istead the EC_FLAGS_QUERY_PENDING and EC_FLAGS_QUERY_GUARDING
flags.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:44:46 +01:00
return ;
case ACPI_EC_EVT_TIMING_QUERY :
if ( flag = = ACPI_EC_COMMAND_COMPLETE )
2021-11-23 19:42:02 +01:00
acpi_ec_close_event ( ec ) ;
ACPI: EC: Make the event work state machine visible
The EC driver uses a relatively simple state machine for the event
work handling, but it is not really straightforward to figure out.
The states are as follows:
"Ready": The event handling work can be submitted.
In this state, the EC_FLAGS_QUERY_PENDING flag is clear.
"In progress": The event handling work is pending or is being
processed. It cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and both the
events_to_process count is nonzero and the EC_FLAGS_QUERY_GUARDING
flag is clear.
"Complete": The event handling work has been completed, but it still
cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and the
events_to_process count is zero or the EC_FLAGS_QUERY_GUARDING
flag is set.
The state changes from "Ready" to "In progress" when new event is
detected by advance_transaction() and acpi_ec_submit_event() is
called by it.
Next, the state can change from "In progress" directly to "Ready" in
the following situations:
* ec_event_clearing is ACPI_EC_EVT_TIMING_STATUS and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes ACPI_EC_COMMAND_POLL.
* ec_event_clearing is ACPI_EC_EVT_TIMING_QUERY and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* ec_event_clearing is either ACPI_EC_EVT_TIMING_STATUS or
ACPI_EC_EVT_TIMING_QUERY and there are no more events to
process (ie. ec->events_to_process becomes 0).
If ec_event_clearing is ACPI_EC_EVT_TIMING_EVENT, however, the
state must change from "In progress" to "Complete" before it
can change to "Ready". The changes from "In progress" to
"Complete" in that case occur in the following situations:
* The state of an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* There are no more events to process (ie. ec->events_to_process
becomes 0).
Finally, the state changes from "Complete" to "Ready" when
advance_transaction() is invoked when the state is "Complete" and
the state of the current transaction is not ACPI_EC_COMMAND_POLL.
To make this state machine visible in the code, add a new
event_state field to struct acpi_ec and modify the code to use
it istead the EC_FLAGS_QUERY_PENDING and EC_FLAGS_QUERY_GUARDING
flags.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:44:46 +01:00
return ;
case ACPI_EC_EVT_TIMING_EVENT :
if ( flag = = ACPI_EC_COMMAND_COMPLETE )
acpi_ec_complete_event ( ec ) ;
2015-06-11 13:21:23 +08:00
}
}
2020-11-23 20:00:42 +01:00
static void acpi_ec_spurious_interrupt ( struct acpi_ec * ec , struct transaction * t )
{
if ( t - > irq_count < ec_storm_threshold )
+ + t - > irq_count ;
/* Trigger if the threshold is 0 too. */
if ( t - > irq_count = = ec_storm_threshold )
acpi_ec_mask_events ( ec ) ;
}
2022-02-04 18:40:20 +01:00
static void advance_transaction ( struct acpi_ec * ec , bool interrupt )
2006-09-26 19:50:33 +04:00
{
2020-11-23 20:00:34 +01:00
struct transaction * t = ec - > curr ;
2014-06-15 08:42:07 +08:00
bool wakeup = false ;
2020-11-23 20:00:34 +01:00
u8 status ;
2012-10-23 01:29:38 +02:00
2020-11-13 19:13:17 +01:00
ec_dbg_stm ( " %s (%d) " , interrupt ? " IRQ " : " TASK " , smp_processor_id ( ) ) ;
2020-11-23 20:00:18 +01:00
ACPI / EC: Fix several GPE handling issues by deploying ACPI_GPE_DISPATCH_RAW_HANDLER mode.
This patch switches EC driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode where
the GPE lock is not held for acpi_ec_gpe_handler() and the ACPICA internal
GPE enabling/disabling/clearing operations are bypassed so that further
improvements are possible with the GPE APIs.
There are 2 strong reasons for deploying raw GPE handler mode in the EC
driver:
1. Some hardware logics can control their interrupts via their own
registers, so their interrupts can be disabled/enabled/acknowledged
without using the super IRQ controller provided functions. While there
is no mean (EC commands) for the EC driver to achieve this.
2. During suspending, the EC driver is still working for a while to
complete the platform firmware provided functionailities using ec_poll()
after all GPEs are disabled (see acpi_ec_block_transactions()), which
means the EC driver will drive the EC GPE out of the GPE core's control.
Without deploying the raw GPE handler mode, we can see many races between
the EC driver and the GPE core due to the above restrictions:
1. There is a race condition due to ACPICA internal GPE
disabling/clearing/enabling logics in acpi_ev_gpe_dispatch():
Orignally EC GPE is disabled (EN=0), cleared (STS=0) before invoking a
GPE handler and re-enabled (EN=1) after invoking a GPE handler in
acpi_ev_gpe_dispatch(). When re-enabling appears, GPE may be flagged
(STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() ec_poll()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1
This race condition is the root cause of different issues on different
silicon variations.
A. Silicon variation A:
On some platforms, GPE will be triggered due to "writing 1 to EN when
STS=1". This is because both EN and STS lines are wired to the GPE
trigger line.
1. Issue 1:
We can see no-op acpi_ec_gpe_handler() invoked on such platforms.
This is because:
a. event pending B: An event can arrive after ACPICA's GPE
clearing performed in acpi_ev_gpe_dispatch(), this event may
fail to be detected by EC_SC read that is performed before its
arrival;
b. event handling B: The event can be handled in ec_poll() because
EC lock is released after acpi_ec_gpe_handler() invocation;
c. There is no code in ec_poll() to clear STS but the GPE can
still be triggered by the EN=1 write performed in
acpi_ev_finish_gpe(), this leads to a no-op EC GPE handler
invocation;
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 1:
If we removed GPE disabling/enabling code from
acpi_ev_gpe_dispatch(), we could still see no-op GPE handlers
triggered by the event arriving after the GPE clearing and before
the GPE handling on both silicon variation A and B. This can only
occur if the CPU is very slow (timing slice between STS=0 write
and EC_SC read should be short enough before hardware sets another
GPE indication). Thus this is very rare and is not what we need to
fix.
B. Silicon variation B:
On other platforms, GPE may not be triggered due to "writing 1 to EN
when STS=1". This is because only STS line is wired to the GPE
trigger line.
2. Issue 2:
We can see GPE loss on such platforms. This is because:
a. event pending B vs. event handling A: An event can arrive after
ACPICA's GPE handling performed in acpi_ev_gpe_dispatch(), or
event pending C vs. event handling B: An event can arrive after
Linux's GPE handling performed in ec_poll(),
these events may fail to be detected by EC_SC read that is
performed before their arrival;
b. The GPE cannot be triggered by EN=1 write performed in
acpi_ev_finish_gpe();
c. If no polling mechanism is implemented in the driver for the
pending event (for example, SCI_EVT), this event is lost due to
no GPE being triggered.
Note 2:
On most platforms, there might be another rule that GPE may not be
triggered due to "writing 1 to STS when STS=1 and EN=1".
Then on silicon variation B, an even worse case is if the issue 2
event loss happens, further events may never trigger GPE again on
such platforms due to being blocked by the current STS=1. Unless
someone clears STS, all events have to be polled.
2. There is a race condition due to lacking in GPE status checking in EC
driver:
Originally, GPE status is checked in ACPICA core but not checked in
the GPE handler. Thus since the status checking and handling is not
locked, it can be interrupted by another handling path.
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_detect() ec_poll()
if (EN==1 && STS==1)
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
acpi_ev_gpe_dispatch()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
Unlock(EC)
*****************************************************************
3. Issue 3:
We can see no-op acpi_ec_gpe_handler() invoked on both silicon
variation A and B. This is because:
a. event pending A: An event can arrive to trigger an EC GPE and
ACPICA checks it and is about to invoke the EC GPE handler;
b. event handling A: The event can be handled in ec_poll() because
EC lock is not held after the GPE status checking;
c. event handling B: Then when the EC GPE handler is invoked, it
becomes a no-op GPE handler invocation.
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 3:
This no-op GPE handler invocation is rare because the time between
the IRQ arrival and the acpi_ec_gpe_handler() invocation is less than
the timeout value waited in ec_poll(). So most of the no-op GPE
handler invocations are caused by the reason described in issue 1.
3. There is a race condition due to ACPICA internal GPE clearing logic in
acpi_enable_gpe():
During runtime, acpi_enable_gpe() can be invoked by the EC storming
prevention code. When it is invoked, GPE may be flagged (STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() acpi_ec_transaction()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1 ?
Lock(EC)
Unlock(EC)
=================================================================
(event pending B)
=================================================================
acpi_enable_gpe()
STS=0
EN=1
4. Issue 4:
We can see GPE loss on both silicon variation A and B platforms.
This is because:
a. event pending B: An event can arrive right before ACPICA's GPE
clearing performed in acpi_enable_gpe();
b. If the GPE is cleared when GPE is disabled, then EN=1 write in
acpi_enable_gpe() cannot trigger this GPE;
c. If no polling mechanism is implemented in the driver for this
event (for example, SCI_EVT), this event is lost due to no GPE
being triggered.
Note 4:
Currently we don't have this issue, but after we switch the EC
driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode, we need to take care
of handling this because the EN=1 write in acpi_ev_gpe_dispatch()
will be abandoned.
There might be more race issues for the current GPE handler usages. This is
because the EC IRQ's enabling/disabling/checking/clearing/handling
operations should be locked by a single lock that is under the EC driver's
control to achieve the serialization. Which means we need to invoke GPE
APIs with EC driver's lock held and all ACPICA internal GPE operations
related to the GPE handler should be abandoned. Invoking GPE APIs inside of
the EC driver lock and bypassing ACPICA internal GPE operations requires
the ACPI_GPE_DISPATCH_RAW_HANDLER mode where the same lock used by the APIs
are released prior than invoking the handlers. Otherwise, we can see dead
locks due to circular locking dependencies (see Reference below).
This patch then switches the EC driver into the
ACPI_GPE_DISPATCH_RAW_HANDLER mode so that it can perform correct GPE
operations using the GPE APIs:
1. Bypasses EN modifications performed in acpi_ev_gpe_dispatch() by
using acpi_install_gpe_raw_handler() and invoking all GPE APIs with EC
spin lock held. This can fix issue 1 as it makes a non frequent GPE
enabling/disabling environment.
2. Bypasses STS clearing performed in acpi_enable_gpe() by replacing
acpi_enable_gpe()/acpi_disable_gpe() with acpi_set_gpe(). This can fix
issue 4. And this can also help to fix issue 1 as it makes a no sudden
GPE clearing environment when GPE is frequently enabled/disabled.
3. Ensures STS acknowledged before handling by invoking acpi_clear_gpe()
in advance_transaction(). This can finally fix issue 1 even in a
frequent GPE enabling/disabling environment. And this can also finally
fix issue 3 when issue 2 is fixed.
Note 3:
GPE clearing is edge triggered W1C, which means we can clear it right
before handling it. Since all EC GPE indications are handled in
advance_transaction() by previous commits, we can now move GPE clearing
into it to implement the correct GPE clearing.
Note 4:
We can use acpi_set_gpe() which is not shared GPE safer instead of
acpi_enable_gpe()/acpi_disable_gpe() because EC GPE is not shared by
other hardware, which is mentioned in the ACPI specification 5.0, 12.6
Interrupt Model: "OSPM driver treats this as an edge event (the EC SCI
cannot be shared)". So we can stop using shared GPE safer APIs
acpi_enable_gpe()/acpi_disable_gpe() in the EC driver. Otherwise
cleanups need to be made in acpi_ev_enable_gpe() to bypass the GPE
clearing logic before keeping acpi_enable_gpe().
This patch also invokes advance_transaction() when GPE is re-enabled in the
task context which:
1. Ensures EN=1 can trigger GPE by checking and handling EC status register
right after EN=1 writes. This can fix issue 2.
After applying this patch, without frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() ec_poll()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 1 (event pending B) can arrive as a next GPE
due to the previous IRQ context STS=0 write. And if it is handled by
ec_poll() (event handling B), as it is also acknowledged by ec_poll(), the
event pending for issue 2 (event pending C) can properly arrive as a next
GPE after the task context STS=0 write. So no GPE will be lost and never
triggered due to GPE clearing performed in the wrong position. And since
all GPE handling is performed after a locked GPE status checking, we can
hardly see no-op GPE handler invocations due to issue 1 and 3. We may still
see no-op GPE handler invocations due to "Note 1", but as it is inevitable,
it needn't be fixed.
After applying this patch, with frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() acpi_ec_transaction()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
EN=1
if STS==1
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 2 can be manually handled by
advance_transaction(). And after the STS=0 write performed in the manual
triggered advance_transaction(), GPE can always arrive. So no GPE will be
lost due to frequent GPE disabling/enabling performed in the driver like
issue 4.
Note 5:
It's ideally when EN=1 write occurred, an IRQ thread should be woken up to
handle the GPE when the GPE was raised. But this requires the IRQ thread to
contain the poller code for all EC GPE indications, while currently some of
the indications are handled in the user tasks. It then is very hard for the
code to determine whether a user task should be invoked or the poller work
item should be scheduled. So we have to invoke advance_transaction()
directly now and it leaves us such a restriction for the GPE re-enabling:
it must be performed in the task context to avoid starving the GPEs.
As a conclusion: we can see the EC GPE is always handled in serial after
deploying the raw GPE handler mode:
Lock(EC)
if (STS==1)
STS=0
EC_SC read
EC_SC handled
Unlock(EC)
The EC driver specific lock is responsible to make the EC GPE handling
processes serialized so that EC can handle its GPE from both IRQ and task
contexts and the next IRQ can be ensured to arrive after this process.
Note 6:
We have many EC_FLAGS_MSI qurik users in the current driver. They all seem
to be suffering from unexpected GPE triggering source lost. And they are
false root caused to a timing issue. Since EC communication protocol has
already flow control defined, timing shouldn't be the root cause, while
this fix might be fixing the root cause of the old bugs.
Link: https://lkml.org/lkml/2014/11/4/974
Link: https://lkml.org/lkml/2014/11/18/316
Link: https://www.spinics.net/lists/linux-acpi/msg54340.html
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05 16:27:22 +08:00
/*
2020-11-23 20:00:18 +01:00
* Clear GPE_STS upfront to allow subsequent hardware GPE_STS 0 - > 1
* changes to always trigger a GPE interrupt .
*
* GPE STS is a W1C register , which means :
*
* 1. Software can clear it without worrying about clearing the other
* GPEs ' STS bits when the hardware sets them in parallel .
*
* 2. As long as software can ensure only clearing it when it is set ,
* hardware won ' t set it in parallel .
ACPI / EC: Fix several GPE handling issues by deploying ACPI_GPE_DISPATCH_RAW_HANDLER mode.
This patch switches EC driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode where
the GPE lock is not held for acpi_ec_gpe_handler() and the ACPICA internal
GPE enabling/disabling/clearing operations are bypassed so that further
improvements are possible with the GPE APIs.
There are 2 strong reasons for deploying raw GPE handler mode in the EC
driver:
1. Some hardware logics can control their interrupts via their own
registers, so their interrupts can be disabled/enabled/acknowledged
without using the super IRQ controller provided functions. While there
is no mean (EC commands) for the EC driver to achieve this.
2. During suspending, the EC driver is still working for a while to
complete the platform firmware provided functionailities using ec_poll()
after all GPEs are disabled (see acpi_ec_block_transactions()), which
means the EC driver will drive the EC GPE out of the GPE core's control.
Without deploying the raw GPE handler mode, we can see many races between
the EC driver and the GPE core due to the above restrictions:
1. There is a race condition due to ACPICA internal GPE
disabling/clearing/enabling logics in acpi_ev_gpe_dispatch():
Orignally EC GPE is disabled (EN=0), cleared (STS=0) before invoking a
GPE handler and re-enabled (EN=1) after invoking a GPE handler in
acpi_ev_gpe_dispatch(). When re-enabling appears, GPE may be flagged
(STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() ec_poll()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1
This race condition is the root cause of different issues on different
silicon variations.
A. Silicon variation A:
On some platforms, GPE will be triggered due to "writing 1 to EN when
STS=1". This is because both EN and STS lines are wired to the GPE
trigger line.
1. Issue 1:
We can see no-op acpi_ec_gpe_handler() invoked on such platforms.
This is because:
a. event pending B: An event can arrive after ACPICA's GPE
clearing performed in acpi_ev_gpe_dispatch(), this event may
fail to be detected by EC_SC read that is performed before its
arrival;
b. event handling B: The event can be handled in ec_poll() because
EC lock is released after acpi_ec_gpe_handler() invocation;
c. There is no code in ec_poll() to clear STS but the GPE can
still be triggered by the EN=1 write performed in
acpi_ev_finish_gpe(), this leads to a no-op EC GPE handler
invocation;
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 1:
If we removed GPE disabling/enabling code from
acpi_ev_gpe_dispatch(), we could still see no-op GPE handlers
triggered by the event arriving after the GPE clearing and before
the GPE handling on both silicon variation A and B. This can only
occur if the CPU is very slow (timing slice between STS=0 write
and EC_SC read should be short enough before hardware sets another
GPE indication). Thus this is very rare and is not what we need to
fix.
B. Silicon variation B:
On other platforms, GPE may not be triggered due to "writing 1 to EN
when STS=1". This is because only STS line is wired to the GPE
trigger line.
2. Issue 2:
We can see GPE loss on such platforms. This is because:
a. event pending B vs. event handling A: An event can arrive after
ACPICA's GPE handling performed in acpi_ev_gpe_dispatch(), or
event pending C vs. event handling B: An event can arrive after
Linux's GPE handling performed in ec_poll(),
these events may fail to be detected by EC_SC read that is
performed before their arrival;
b. The GPE cannot be triggered by EN=1 write performed in
acpi_ev_finish_gpe();
c. If no polling mechanism is implemented in the driver for the
pending event (for example, SCI_EVT), this event is lost due to
no GPE being triggered.
Note 2:
On most platforms, there might be another rule that GPE may not be
triggered due to "writing 1 to STS when STS=1 and EN=1".
Then on silicon variation B, an even worse case is if the issue 2
event loss happens, further events may never trigger GPE again on
such platforms due to being blocked by the current STS=1. Unless
someone clears STS, all events have to be polled.
2. There is a race condition due to lacking in GPE status checking in EC
driver:
Originally, GPE status is checked in ACPICA core but not checked in
the GPE handler. Thus since the status checking and handling is not
locked, it can be interrupted by another handling path.
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_detect() ec_poll()
if (EN==1 && STS==1)
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
acpi_ev_gpe_dispatch()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
Unlock(EC)
*****************************************************************
3. Issue 3:
We can see no-op acpi_ec_gpe_handler() invoked on both silicon
variation A and B. This is because:
a. event pending A: An event can arrive to trigger an EC GPE and
ACPICA checks it and is about to invoke the EC GPE handler;
b. event handling A: The event can be handled in ec_poll() because
EC lock is not held after the GPE status checking;
c. event handling B: Then when the EC GPE handler is invoked, it
becomes a no-op GPE handler invocation.
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 3:
This no-op GPE handler invocation is rare because the time between
the IRQ arrival and the acpi_ec_gpe_handler() invocation is less than
the timeout value waited in ec_poll(). So most of the no-op GPE
handler invocations are caused by the reason described in issue 1.
3. There is a race condition due to ACPICA internal GPE clearing logic in
acpi_enable_gpe():
During runtime, acpi_enable_gpe() can be invoked by the EC storming
prevention code. When it is invoked, GPE may be flagged (STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() acpi_ec_transaction()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1 ?
Lock(EC)
Unlock(EC)
=================================================================
(event pending B)
=================================================================
acpi_enable_gpe()
STS=0
EN=1
4. Issue 4:
We can see GPE loss on both silicon variation A and B platforms.
This is because:
a. event pending B: An event can arrive right before ACPICA's GPE
clearing performed in acpi_enable_gpe();
b. If the GPE is cleared when GPE is disabled, then EN=1 write in
acpi_enable_gpe() cannot trigger this GPE;
c. If no polling mechanism is implemented in the driver for this
event (for example, SCI_EVT), this event is lost due to no GPE
being triggered.
Note 4:
Currently we don't have this issue, but after we switch the EC
driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode, we need to take care
of handling this because the EN=1 write in acpi_ev_gpe_dispatch()
will be abandoned.
There might be more race issues for the current GPE handler usages. This is
because the EC IRQ's enabling/disabling/checking/clearing/handling
operations should be locked by a single lock that is under the EC driver's
control to achieve the serialization. Which means we need to invoke GPE
APIs with EC driver's lock held and all ACPICA internal GPE operations
related to the GPE handler should be abandoned. Invoking GPE APIs inside of
the EC driver lock and bypassing ACPICA internal GPE operations requires
the ACPI_GPE_DISPATCH_RAW_HANDLER mode where the same lock used by the APIs
are released prior than invoking the handlers. Otherwise, we can see dead
locks due to circular locking dependencies (see Reference below).
This patch then switches the EC driver into the
ACPI_GPE_DISPATCH_RAW_HANDLER mode so that it can perform correct GPE
operations using the GPE APIs:
1. Bypasses EN modifications performed in acpi_ev_gpe_dispatch() by
using acpi_install_gpe_raw_handler() and invoking all GPE APIs with EC
spin lock held. This can fix issue 1 as it makes a non frequent GPE
enabling/disabling environment.
2. Bypasses STS clearing performed in acpi_enable_gpe() by replacing
acpi_enable_gpe()/acpi_disable_gpe() with acpi_set_gpe(). This can fix
issue 4. And this can also help to fix issue 1 as it makes a no sudden
GPE clearing environment when GPE is frequently enabled/disabled.
3. Ensures STS acknowledged before handling by invoking acpi_clear_gpe()
in advance_transaction(). This can finally fix issue 1 even in a
frequent GPE enabling/disabling environment. And this can also finally
fix issue 3 when issue 2 is fixed.
Note 3:
GPE clearing is edge triggered W1C, which means we can clear it right
before handling it. Since all EC GPE indications are handled in
advance_transaction() by previous commits, we can now move GPE clearing
into it to implement the correct GPE clearing.
Note 4:
We can use acpi_set_gpe() which is not shared GPE safer instead of
acpi_enable_gpe()/acpi_disable_gpe() because EC GPE is not shared by
other hardware, which is mentioned in the ACPI specification 5.0, 12.6
Interrupt Model: "OSPM driver treats this as an edge event (the EC SCI
cannot be shared)". So we can stop using shared GPE safer APIs
acpi_enable_gpe()/acpi_disable_gpe() in the EC driver. Otherwise
cleanups need to be made in acpi_ev_enable_gpe() to bypass the GPE
clearing logic before keeping acpi_enable_gpe().
This patch also invokes advance_transaction() when GPE is re-enabled in the
task context which:
1. Ensures EN=1 can trigger GPE by checking and handling EC status register
right after EN=1 writes. This can fix issue 2.
After applying this patch, without frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() ec_poll()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 1 (event pending B) can arrive as a next GPE
due to the previous IRQ context STS=0 write. And if it is handled by
ec_poll() (event handling B), as it is also acknowledged by ec_poll(), the
event pending for issue 2 (event pending C) can properly arrive as a next
GPE after the task context STS=0 write. So no GPE will be lost and never
triggered due to GPE clearing performed in the wrong position. And since
all GPE handling is performed after a locked GPE status checking, we can
hardly see no-op GPE handler invocations due to issue 1 and 3. We may still
see no-op GPE handler invocations due to "Note 1", but as it is inevitable,
it needn't be fixed.
After applying this patch, with frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() acpi_ec_transaction()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
EN=1
if STS==1
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 2 can be manually handled by
advance_transaction(). And after the STS=0 write performed in the manual
triggered advance_transaction(), GPE can always arrive. So no GPE will be
lost due to frequent GPE disabling/enabling performed in the driver like
issue 4.
Note 5:
It's ideally when EN=1 write occurred, an IRQ thread should be woken up to
handle the GPE when the GPE was raised. But this requires the IRQ thread to
contain the poller code for all EC GPE indications, while currently some of
the indications are handled in the user tasks. It then is very hard for the
code to determine whether a user task should be invoked or the poller work
item should be scheduled. So we have to invoke advance_transaction()
directly now and it leaves us such a restriction for the GPE re-enabling:
it must be performed in the task context to avoid starving the GPEs.
As a conclusion: we can see the EC GPE is always handled in serial after
deploying the raw GPE handler mode:
Lock(EC)
if (STS==1)
STS=0
EC_SC read
EC_SC handled
Unlock(EC)
The EC driver specific lock is responsible to make the EC GPE handling
processes serialized so that EC can handle its GPE from both IRQ and task
contexts and the next IRQ can be ensured to arrive after this process.
Note 6:
We have many EC_FLAGS_MSI qurik users in the current driver. They all seem
to be suffering from unexpected GPE triggering source lost. And they are
false root caused to a timing issue. Since EC communication protocol has
already flow control defined, timing shouldn't be the root cause, while
this fix might be fixing the root cause of the old bugs.
Link: https://lkml.org/lkml/2014/11/4/974
Link: https://lkml.org/lkml/2014/11/18/316
Link: https://www.spinics.net/lists/linux-acpi/msg54340.html
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05 16:27:22 +08:00
*/
2020-11-23 20:00:25 +01:00
if ( ec - > gpe > = 0 & & acpi_ec_gpe_status_set ( ec ) )
2020-11-23 20:00:18 +01:00
acpi_clear_gpe ( NULL , ec - > gpe ) ;
2019-10-14 16:56:02 +08:00
2014-06-15 08:41:17 +08:00
status = acpi_ec_read_status ( ec ) ;
2020-11-23 20:00:34 +01:00
2015-06-11 13:21:38 +08:00
/*
* Another IRQ or a guarded polling mode advancement is detected ,
* the next QR_EC submission is then allowed .
*/
if ( ! t | | ! ( t - > flags & ACPI_EC_COMMAND_POLL ) ) {
if ( ec_event_clearing = = ACPI_EC_EVT_TIMING_EVENT & &
ACPI: EC: Make the event work state machine visible
The EC driver uses a relatively simple state machine for the event
work handling, but it is not really straightforward to figure out.
The states are as follows:
"Ready": The event handling work can be submitted.
In this state, the EC_FLAGS_QUERY_PENDING flag is clear.
"In progress": The event handling work is pending or is being
processed. It cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and both the
events_to_process count is nonzero and the EC_FLAGS_QUERY_GUARDING
flag is clear.
"Complete": The event handling work has been completed, but it still
cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and the
events_to_process count is zero or the EC_FLAGS_QUERY_GUARDING
flag is set.
The state changes from "Ready" to "In progress" when new event is
detected by advance_transaction() and acpi_ec_submit_event() is
called by it.
Next, the state can change from "In progress" directly to "Ready" in
the following situations:
* ec_event_clearing is ACPI_EC_EVT_TIMING_STATUS and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes ACPI_EC_COMMAND_POLL.
* ec_event_clearing is ACPI_EC_EVT_TIMING_QUERY and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* ec_event_clearing is either ACPI_EC_EVT_TIMING_STATUS or
ACPI_EC_EVT_TIMING_QUERY and there are no more events to
process (ie. ec->events_to_process becomes 0).
If ec_event_clearing is ACPI_EC_EVT_TIMING_EVENT, however, the
state must change from "In progress" to "Complete" before it
can change to "Ready". The changes from "In progress" to
"Complete" in that case occur in the following situations:
* The state of an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* There are no more events to process (ie. ec->events_to_process
becomes 0).
Finally, the state changes from "Complete" to "Ready" when
advance_transaction() is invoked when the state is "Complete" and
the state of the current transaction is not ACPI_EC_COMMAND_POLL.
To make this state machine visible in the code, add a new
event_state field to struct acpi_ec and modify the code to use
it istead the EC_FLAGS_QUERY_PENDING and EC_FLAGS_QUERY_GUARDING
flags.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:44:46 +01:00
ec - > event_state = = EC_EVENT_COMPLETE )
2021-11-23 19:42:02 +01:00
acpi_ec_close_event ( ec ) ;
ACPI: EC: Make the event work state machine visible
The EC driver uses a relatively simple state machine for the event
work handling, but it is not really straightforward to figure out.
The states are as follows:
"Ready": The event handling work can be submitted.
In this state, the EC_FLAGS_QUERY_PENDING flag is clear.
"In progress": The event handling work is pending or is being
processed. It cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and both the
events_to_process count is nonzero and the EC_FLAGS_QUERY_GUARDING
flag is clear.
"Complete": The event handling work has been completed, but it still
cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and the
events_to_process count is zero or the EC_FLAGS_QUERY_GUARDING
flag is set.
The state changes from "Ready" to "In progress" when new event is
detected by advance_transaction() and acpi_ec_submit_event() is
called by it.
Next, the state can change from "In progress" directly to "Ready" in
the following situations:
* ec_event_clearing is ACPI_EC_EVT_TIMING_STATUS and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes ACPI_EC_COMMAND_POLL.
* ec_event_clearing is ACPI_EC_EVT_TIMING_QUERY and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* ec_event_clearing is either ACPI_EC_EVT_TIMING_STATUS or
ACPI_EC_EVT_TIMING_QUERY and there are no more events to
process (ie. ec->events_to_process becomes 0).
If ec_event_clearing is ACPI_EC_EVT_TIMING_EVENT, however, the
state must change from "In progress" to "Complete" before it
can change to "Ready". The changes from "In progress" to
"Complete" in that case occur in the following situations:
* The state of an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* There are no more events to process (ie. ec->events_to_process
becomes 0).
Finally, the state changes from "Complete" to "Ready" when
advance_transaction() is invoked when the state is "Complete" and
the state of the current transaction is not ACPI_EC_COMMAND_POLL.
To make this state machine visible in the code, add a new
event_state field to struct acpi_ec and modify the code to use
it istead the EC_FLAGS_QUERY_PENDING and EC_FLAGS_QUERY_GUARDING
flags.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:44:46 +01:00
2020-11-23 20:00:34 +01:00
if ( ! t )
goto out ;
2015-06-11 13:21:38 +08:00
}
2020-11-23 20:00:34 +01:00
2014-06-15 08:41:35 +08:00
if ( t - > flags & ACPI_EC_COMMAND_POLL ) {
if ( t - > wlen > t - > wi ) {
2020-11-23 20:01:01 +01:00
if ( ! ( status & ACPI_EC_FLAG_IBF ) )
2014-06-15 08:41:35 +08:00
acpi_ec_write_data ( ec , t - > wdata [ t - > wi + + ] ) ;
2020-11-23 20:00:42 +01:00
else if ( interrupt & & ! ( status & ACPI_EC_FLAG_SCI ) )
acpi_ec_spurious_interrupt ( ec , t ) ;
2014-06-15 08:41:35 +08:00
} else if ( t - > rlen > t - > ri ) {
2020-11-23 20:01:01 +01:00
if ( status & ACPI_EC_FLAG_OBF ) {
2014-06-15 08:41:35 +08:00
t - > rdata [ t - > ri + + ] = acpi_ec_read_data ( ec ) ;
2014-06-15 08:42:07 +08:00
if ( t - > rlen = = t - > ri ) {
2015-06-11 13:21:23 +08:00
ec_transaction_transition ( ec , ACPI_EC_COMMAND_COMPLETE ) ;
2020-11-23 20:01:01 +01:00
wakeup = true ;
2014-08-21 14:41:13 +08:00
if ( t - > command = = ACPI_EC_COMMAND_QUERY )
2015-06-11 13:21:32 +08:00
ec_dbg_evt ( " Command(%s) completed by hardware " ,
acpi_ec_cmd_string ( ACPI_EC_COMMAND_QUERY ) ) ;
2014-06-15 08:42:07 +08:00
}
2020-11-23 20:00:42 +01:00
} else if ( interrupt & & ! ( status & ACPI_EC_FLAG_SCI ) ) {
acpi_ec_spurious_interrupt ( ec , t ) ;
}
2020-11-23 20:01:01 +01:00
} else if ( t - > wlen = = t - > wi & & ! ( status & ACPI_EC_FLAG_IBF ) ) {
2015-06-11 13:21:23 +08:00
ec_transaction_transition ( ec , ACPI_EC_COMMAND_COMPLETE ) ;
2014-06-15 08:42:07 +08:00
wakeup = true ;
}
2020-03-06 00:17:55 +01:00
} else if ( ! ( status & ACPI_EC_FLAG_IBF ) ) {
acpi_ec_write_cmd ( ec , t - > command ) ;
ec_transaction_transition ( ec , ACPI_EC_COMMAND_POLL ) ;
2014-06-15 08:41:35 +08:00
}
2020-11-23 20:00:34 +01:00
2015-01-14 19:28:22 +08:00
out :
ACPI / EC: Fix issues related to the SCI_EVT handling
This patch fixes 2 issues related to the draining behavior. But it doesn't
implement the draining support, it only cleans up code so that further
draining support is possible.
The draining behavior is expected by some platforms (for example, Samsung)
where SCI_EVT is set only once for a set of events and might be cleared for
the very first QR_EC command issued after SCI_EVT is set. EC firmware on
such platforms will return 0x00 to indicate "no outstanding event". Thus
after seeing an SCI_EVT indication, EC driver need to fetch events until
0x00 returned (see acpi_ec_clear()).
Issue 1 - acpi_ec_submit_query():
It's reported on Samsung laptops that SCI_EVT isn't checked when the
transactions are advanced in ec_poll(), which leads to SCI_EVT triggering
source lost:
If no EC GPE IRQs are arrived after that, EC driver cannot detect this
event and handle it.
See comment 244/247 for kernel bugzilla 44161.
This patch fixes this issue by moving SCI_EVT checks into
advance_transaction(). So that SCI_EVT is checked each time we are going to
handle the EC firmware indications. And this check will happen for both IRQ
context and task context.
Since after doing that, SCI_EVT is also checked after completing a
transaction, ec_check_sci() and ec_check_sci_sync() can be removed.
Issue 2 - acpi_ec_complete_query():
We expect to clear EC_FLAGS_QUERY_PENDING to allow queuing another draining
QR_EC after writing a QR_EC command and before reading the event. After
reading the event, SCI_EVT might be cleared by the firmware, thus it may
not be possible to queue such a draining QR_EC at that time.
But putting the EC_FLAGS_QUERY_PENDING clearing code after
start_transaction() is wrong as there are chances that after
start_transaction(), QR_EC can fail to be sent. If this happens,
EC_FLAG_QUERY_PENDING will be cleared earlier. As a consequence, the
draining QR_EC will also be queued earlier than expected.
This patch also moves this code into advance_transaction() where QR_EC is
just sent (ACPI_EC_COMMAND_POLL flagged) to fix this issue.
Notes:
1. After introducing the 2 SCI_EVT related handlings into
advance_transaction(), a next QR_EC can be queued right after writing
the current QR_EC command and before reading the event. But this still
hasn't implemented the draining behavior as the draining support
requires:
If a previous returned event value isn't 0x00, a draining QR_EC need
to be issued even when SCI_EVT isn't set.
2. In this patch, acpi_os_execute() is also converted into a seperate work
item to avoid invoking kmalloc() in the atomic context. We can do this
because of the previous global lock fix.
3. Originally, EC_FLAGS_EVENT_PENDING is also used to avoid queuing up
multiple work items (created by acpi_os_execute()), this can be covered
by only using a single work item. But this patch still keeps this flag
as there are different usages in the driver initialization steps relying
on this flag.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=44161
Reported-by: Kieran Clancy <clancy.kieran@gmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-14 19:28:47 +08:00
if ( status & ACPI_EC_FLAG_SCI )
2022-02-04 18:40:20 +01:00
acpi_ec_submit_event ( ec ) ;
2020-11-23 20:01:01 +01:00
2020-11-13 19:13:17 +01:00
if ( wakeup & & interrupt )
2015-01-14 19:28:22 +08:00
wake_up ( & ec - > wait ) ;
2014-06-15 08:41:35 +08:00
}
2012-10-23 01:30:12 +02:00
2014-06-15 08:41:35 +08:00
static void start_transaction ( struct acpi_ec * ec )
{
ec - > curr - > irq_count = ec - > curr - > wi = ec - > curr - > ri = 0 ;
ec - > curr - > flags = 0 ;
ACPI / EC: Fix and clean up register access guarding logics.
In the polling mode, EC driver shouldn't access the EC registers too
frequently. Though this statement is concluded from the non-root caused
bugs (see links below), we've maintained the register access guarding
logics in the current EC driver. The guarding logics can be found here and
there, makes it hard to root cause real timing issues. This patch collects
the guarding logics into one single function so that all hidden logics
related to this can be seen clearly.
The current guarding related code also has several issues:
1. Per-transaction timestamp prevents inter-transaction guarding from being
implemented in the same place. We have an inter-transaction udelay() in
acpi_ec_transaction_unblocked(), this logic can be merged into ec_poll()
if we can use per-device timestamp. This patch completes such merge to
form a new ec_guard() function and collects all guarding related hidden
logics in it.
One hidden logic is: there is no inter-transaction guarding performed
for non MSI quirk (wait polling mode), this patch skips
inter-transaction guarding before wait_event_timeout() for the wait
polling mode to reveal the hidden logic.
The other hidden logic is: there is msleep() inter-transaction guarding
performed when the GPE storming is observed. As after merging this
commit:
Commit: e1d4d90fc0313d3d58cbd7912c90f8ef24df45ff
Subject: ACPI / EC: Refine command storm prevention support
EC_FLAGS_COMMAND_STORM is ensured to be cleared after invoking
acpi_ec_transaction_unlocked(), the msleep() guard logic will never
happen now. Since no one complains such change, this logic is likely
added during the old times where the EC race issues are not fixed and
the bugs are false root-caused to the timing issue. This patch simply
removes the out-dated logic. We can restore it by stop skipping
inter-transaction guarding for wait polling mode.
Two different delay values are defined for msleep() and udelay() while
they are merged in this patch to 550us.
2. time_after() causes additional delay in the polling mode (can only be
observed in noirq suspend/resume processes where polling mode is always
used) before advance_transaction() is invoked ("wait polling" log is
added before wait_event_timeout()). We can see 2 wait_event_timeout()
invocations. This is because time_after() ensures a ">" validation while
we only need a ">=" validation here:
[ 86.739909] ACPI: Waking up from system sleep state S3
[ 86.742857] ACPI : EC: 2: Increase command
[ 86.742859] ACPI : EC: ***** Command(RD_EC) started *****
[ 86.742861] ACPI : EC: ===== TASK (0) =====
[ 86.742871] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.742873] ACPI : EC: EC_SC(W) = 0x80
[ 86.742876] ACPI : EC: ***** Event started *****
[ 86.742880] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.743972] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.747966] ACPI : EC: ===== TASK (0) =====
[ 86.747977] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.747978] ACPI : EC: EC_DATA(W) = 0x06
[ 86.747981] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.751971] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755969] ACPI : EC: ===== TASK (0) =====
[ 86.755991] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 86.755993] ACPI : EC: EC_DATA(R) = 0x03
[ 86.755994] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755995] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 86.755996] ACPI : EC: 1: Decrease command
This patch corrects this by using time_before() instead in ec_guard():
[ 54.283146] ACPI: Waking up from system sleep state S3
[ 54.285414] ACPI : EC: 2: Increase command
[ 54.285415] ACPI : EC: ***** Command(RD_EC) started *****
[ 54.285416] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.285417] ACPI : EC: ===== TASK (0) =====
[ 54.285424] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.285425] ACPI : EC: EC_SC(W) = 0x80
[ 54.285427] ACPI : EC: ***** Event started *****
[ 54.285429] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.287209] ACPI : EC: ===== TASK (0) =====
[ 54.287218] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.287219] ACPI : EC: EC_DATA(W) = 0x06
[ 54.287222] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291190] ACPI : EC: ===== TASK (0) =====
[ 54.291210] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 54.291213] ACPI : EC: EC_DATA(R) = 0x03
[ 54.291214] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291215] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 54.291216] ACPI : EC: 1: Decrease command
After cleaning up all guarding logics, we have one single function
ec_guard() collecting all old, non-root-caused, hidden logics. Then we can
easily tune the logics in one place to respond to the bug reports.
Except the time_before() change, all other changes do not change the
behavior of the EC driver.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=12011
Link: https://bugzilla.kernel.org/show_bug.cgi?id=20242
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77431
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-15 14:16:42 +08:00
}
static int ec_guard ( struct acpi_ec * ec )
{
2017-01-20 16:42:48 +08:00
unsigned long guard = usecs_to_jiffies ( ec - > polling_guard ) ;
ACPI / EC: Fix and clean up register access guarding logics.
In the polling mode, EC driver shouldn't access the EC registers too
frequently. Though this statement is concluded from the non-root caused
bugs (see links below), we've maintained the register access guarding
logics in the current EC driver. The guarding logics can be found here and
there, makes it hard to root cause real timing issues. This patch collects
the guarding logics into one single function so that all hidden logics
related to this can be seen clearly.
The current guarding related code also has several issues:
1. Per-transaction timestamp prevents inter-transaction guarding from being
implemented in the same place. We have an inter-transaction udelay() in
acpi_ec_transaction_unblocked(), this logic can be merged into ec_poll()
if we can use per-device timestamp. This patch completes such merge to
form a new ec_guard() function and collects all guarding related hidden
logics in it.
One hidden logic is: there is no inter-transaction guarding performed
for non MSI quirk (wait polling mode), this patch skips
inter-transaction guarding before wait_event_timeout() for the wait
polling mode to reveal the hidden logic.
The other hidden logic is: there is msleep() inter-transaction guarding
performed when the GPE storming is observed. As after merging this
commit:
Commit: e1d4d90fc0313d3d58cbd7912c90f8ef24df45ff
Subject: ACPI / EC: Refine command storm prevention support
EC_FLAGS_COMMAND_STORM is ensured to be cleared after invoking
acpi_ec_transaction_unlocked(), the msleep() guard logic will never
happen now. Since no one complains such change, this logic is likely
added during the old times where the EC race issues are not fixed and
the bugs are false root-caused to the timing issue. This patch simply
removes the out-dated logic. We can restore it by stop skipping
inter-transaction guarding for wait polling mode.
Two different delay values are defined for msleep() and udelay() while
they are merged in this patch to 550us.
2. time_after() causes additional delay in the polling mode (can only be
observed in noirq suspend/resume processes where polling mode is always
used) before advance_transaction() is invoked ("wait polling" log is
added before wait_event_timeout()). We can see 2 wait_event_timeout()
invocations. This is because time_after() ensures a ">" validation while
we only need a ">=" validation here:
[ 86.739909] ACPI: Waking up from system sleep state S3
[ 86.742857] ACPI : EC: 2: Increase command
[ 86.742859] ACPI : EC: ***** Command(RD_EC) started *****
[ 86.742861] ACPI : EC: ===== TASK (0) =====
[ 86.742871] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.742873] ACPI : EC: EC_SC(W) = 0x80
[ 86.742876] ACPI : EC: ***** Event started *****
[ 86.742880] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.743972] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.747966] ACPI : EC: ===== TASK (0) =====
[ 86.747977] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.747978] ACPI : EC: EC_DATA(W) = 0x06
[ 86.747981] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.751971] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755969] ACPI : EC: ===== TASK (0) =====
[ 86.755991] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 86.755993] ACPI : EC: EC_DATA(R) = 0x03
[ 86.755994] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755995] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 86.755996] ACPI : EC: 1: Decrease command
This patch corrects this by using time_before() instead in ec_guard():
[ 54.283146] ACPI: Waking up from system sleep state S3
[ 54.285414] ACPI : EC: 2: Increase command
[ 54.285415] ACPI : EC: ***** Command(RD_EC) started *****
[ 54.285416] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.285417] ACPI : EC: ===== TASK (0) =====
[ 54.285424] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.285425] ACPI : EC: EC_SC(W) = 0x80
[ 54.285427] ACPI : EC: ***** Event started *****
[ 54.285429] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.287209] ACPI : EC: ===== TASK (0) =====
[ 54.287218] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.287219] ACPI : EC: EC_DATA(W) = 0x06
[ 54.287222] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291190] ACPI : EC: ===== TASK (0) =====
[ 54.291210] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 54.291213] ACPI : EC: EC_DATA(R) = 0x03
[ 54.291214] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291215] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 54.291216] ACPI : EC: 1: Decrease command
After cleaning up all guarding logics, we have one single function
ec_guard() collecting all old, non-root-caused, hidden logics. Then we can
easily tune the logics in one place to respond to the bug reports.
Except the time_before() change, all other changes do not change the
behavior of the EC driver.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=12011
Link: https://bugzilla.kernel.org/show_bug.cgi?id=20242
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77431
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-15 14:16:42 +08:00
unsigned long timeout = ec - > timestamp + guard ;
2015-09-24 14:54:54 +08:00
/* Ensure guarding period before polling EC status */
ACPI / EC: Fix and clean up register access guarding logics.
In the polling mode, EC driver shouldn't access the EC registers too
frequently. Though this statement is concluded from the non-root caused
bugs (see links below), we've maintained the register access guarding
logics in the current EC driver. The guarding logics can be found here and
there, makes it hard to root cause real timing issues. This patch collects
the guarding logics into one single function so that all hidden logics
related to this can be seen clearly.
The current guarding related code also has several issues:
1. Per-transaction timestamp prevents inter-transaction guarding from being
implemented in the same place. We have an inter-transaction udelay() in
acpi_ec_transaction_unblocked(), this logic can be merged into ec_poll()
if we can use per-device timestamp. This patch completes such merge to
form a new ec_guard() function and collects all guarding related hidden
logics in it.
One hidden logic is: there is no inter-transaction guarding performed
for non MSI quirk (wait polling mode), this patch skips
inter-transaction guarding before wait_event_timeout() for the wait
polling mode to reveal the hidden logic.
The other hidden logic is: there is msleep() inter-transaction guarding
performed when the GPE storming is observed. As after merging this
commit:
Commit: e1d4d90fc0313d3d58cbd7912c90f8ef24df45ff
Subject: ACPI / EC: Refine command storm prevention support
EC_FLAGS_COMMAND_STORM is ensured to be cleared after invoking
acpi_ec_transaction_unlocked(), the msleep() guard logic will never
happen now. Since no one complains such change, this logic is likely
added during the old times where the EC race issues are not fixed and
the bugs are false root-caused to the timing issue. This patch simply
removes the out-dated logic. We can restore it by stop skipping
inter-transaction guarding for wait polling mode.
Two different delay values are defined for msleep() and udelay() while
they are merged in this patch to 550us.
2. time_after() causes additional delay in the polling mode (can only be
observed in noirq suspend/resume processes where polling mode is always
used) before advance_transaction() is invoked ("wait polling" log is
added before wait_event_timeout()). We can see 2 wait_event_timeout()
invocations. This is because time_after() ensures a ">" validation while
we only need a ">=" validation here:
[ 86.739909] ACPI: Waking up from system sleep state S3
[ 86.742857] ACPI : EC: 2: Increase command
[ 86.742859] ACPI : EC: ***** Command(RD_EC) started *****
[ 86.742861] ACPI : EC: ===== TASK (0) =====
[ 86.742871] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.742873] ACPI : EC: EC_SC(W) = 0x80
[ 86.742876] ACPI : EC: ***** Event started *****
[ 86.742880] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.743972] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.747966] ACPI : EC: ===== TASK (0) =====
[ 86.747977] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.747978] ACPI : EC: EC_DATA(W) = 0x06
[ 86.747981] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.751971] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755969] ACPI : EC: ===== TASK (0) =====
[ 86.755991] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 86.755993] ACPI : EC: EC_DATA(R) = 0x03
[ 86.755994] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755995] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 86.755996] ACPI : EC: 1: Decrease command
This patch corrects this by using time_before() instead in ec_guard():
[ 54.283146] ACPI: Waking up from system sleep state S3
[ 54.285414] ACPI : EC: 2: Increase command
[ 54.285415] ACPI : EC: ***** Command(RD_EC) started *****
[ 54.285416] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.285417] ACPI : EC: ===== TASK (0) =====
[ 54.285424] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.285425] ACPI : EC: EC_SC(W) = 0x80
[ 54.285427] ACPI : EC: ***** Event started *****
[ 54.285429] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.287209] ACPI : EC: ===== TASK (0) =====
[ 54.287218] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.287219] ACPI : EC: EC_DATA(W) = 0x06
[ 54.287222] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291190] ACPI : EC: ===== TASK (0) =====
[ 54.291210] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 54.291213] ACPI : EC: EC_DATA(R) = 0x03
[ 54.291214] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291215] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 54.291216] ACPI : EC: 1: Decrease command
After cleaning up all guarding logics, we have one single function
ec_guard() collecting all old, non-root-caused, hidden logics. Then we can
easily tune the logics in one place to respond to the bug reports.
Except the time_before() change, all other changes do not change the
behavior of the EC driver.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=12011
Link: https://bugzilla.kernel.org/show_bug.cgi?id=20242
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77431
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-15 14:16:42 +08:00
do {
2017-01-20 16:42:48 +08:00
if ( ec - > busy_polling ) {
ACPI / EC: Fix and clean up register access guarding logics.
In the polling mode, EC driver shouldn't access the EC registers too
frequently. Though this statement is concluded from the non-root caused
bugs (see links below), we've maintained the register access guarding
logics in the current EC driver. The guarding logics can be found here and
there, makes it hard to root cause real timing issues. This patch collects
the guarding logics into one single function so that all hidden logics
related to this can be seen clearly.
The current guarding related code also has several issues:
1. Per-transaction timestamp prevents inter-transaction guarding from being
implemented in the same place. We have an inter-transaction udelay() in
acpi_ec_transaction_unblocked(), this logic can be merged into ec_poll()
if we can use per-device timestamp. This patch completes such merge to
form a new ec_guard() function and collects all guarding related hidden
logics in it.
One hidden logic is: there is no inter-transaction guarding performed
for non MSI quirk (wait polling mode), this patch skips
inter-transaction guarding before wait_event_timeout() for the wait
polling mode to reveal the hidden logic.
The other hidden logic is: there is msleep() inter-transaction guarding
performed when the GPE storming is observed. As after merging this
commit:
Commit: e1d4d90fc0313d3d58cbd7912c90f8ef24df45ff
Subject: ACPI / EC: Refine command storm prevention support
EC_FLAGS_COMMAND_STORM is ensured to be cleared after invoking
acpi_ec_transaction_unlocked(), the msleep() guard logic will never
happen now. Since no one complains such change, this logic is likely
added during the old times where the EC race issues are not fixed and
the bugs are false root-caused to the timing issue. This patch simply
removes the out-dated logic. We can restore it by stop skipping
inter-transaction guarding for wait polling mode.
Two different delay values are defined for msleep() and udelay() while
they are merged in this patch to 550us.
2. time_after() causes additional delay in the polling mode (can only be
observed in noirq suspend/resume processes where polling mode is always
used) before advance_transaction() is invoked ("wait polling" log is
added before wait_event_timeout()). We can see 2 wait_event_timeout()
invocations. This is because time_after() ensures a ">" validation while
we only need a ">=" validation here:
[ 86.739909] ACPI: Waking up from system sleep state S3
[ 86.742857] ACPI : EC: 2: Increase command
[ 86.742859] ACPI : EC: ***** Command(RD_EC) started *****
[ 86.742861] ACPI : EC: ===== TASK (0) =====
[ 86.742871] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.742873] ACPI : EC: EC_SC(W) = 0x80
[ 86.742876] ACPI : EC: ***** Event started *****
[ 86.742880] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.743972] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.747966] ACPI : EC: ===== TASK (0) =====
[ 86.747977] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.747978] ACPI : EC: EC_DATA(W) = 0x06
[ 86.747981] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.751971] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755969] ACPI : EC: ===== TASK (0) =====
[ 86.755991] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 86.755993] ACPI : EC: EC_DATA(R) = 0x03
[ 86.755994] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755995] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 86.755996] ACPI : EC: 1: Decrease command
This patch corrects this by using time_before() instead in ec_guard():
[ 54.283146] ACPI: Waking up from system sleep state S3
[ 54.285414] ACPI : EC: 2: Increase command
[ 54.285415] ACPI : EC: ***** Command(RD_EC) started *****
[ 54.285416] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.285417] ACPI : EC: ===== TASK (0) =====
[ 54.285424] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.285425] ACPI : EC: EC_SC(W) = 0x80
[ 54.285427] ACPI : EC: ***** Event started *****
[ 54.285429] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.287209] ACPI : EC: ===== TASK (0) =====
[ 54.287218] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.287219] ACPI : EC: EC_DATA(W) = 0x06
[ 54.287222] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291190] ACPI : EC: ===== TASK (0) =====
[ 54.291210] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 54.291213] ACPI : EC: EC_DATA(R) = 0x03
[ 54.291214] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291215] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 54.291216] ACPI : EC: 1: Decrease command
After cleaning up all guarding logics, we have one single function
ec_guard() collecting all old, non-root-caused, hidden logics. Then we can
easily tune the logics in one place to respond to the bug reports.
Except the time_before() change, all other changes do not change the
behavior of the EC driver.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=12011
Link: https://bugzilla.kernel.org/show_bug.cgi?id=20242
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77431
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-15 14:16:42 +08:00
/* Perform busy polling */
if ( ec_transaction_completed ( ec ) )
return 0 ;
udelay ( jiffies_to_usecs ( guard ) ) ;
} else {
/*
* Perform wait polling
2015-09-24 14:54:54 +08:00
* 1. Wait the transaction to be completed by the
* GPE handler after the transaction enters
* ACPI_EC_COMMAND_POLL state .
* 2. A special guarding logic is also required
* for event clearing mode " event " before the
* transaction enters ACPI_EC_COMMAND_POLL
* state .
ACPI / EC: Fix and clean up register access guarding logics.
In the polling mode, EC driver shouldn't access the EC registers too
frequently. Though this statement is concluded from the non-root caused
bugs (see links below), we've maintained the register access guarding
logics in the current EC driver. The guarding logics can be found here and
there, makes it hard to root cause real timing issues. This patch collects
the guarding logics into one single function so that all hidden logics
related to this can be seen clearly.
The current guarding related code also has several issues:
1. Per-transaction timestamp prevents inter-transaction guarding from being
implemented in the same place. We have an inter-transaction udelay() in
acpi_ec_transaction_unblocked(), this logic can be merged into ec_poll()
if we can use per-device timestamp. This patch completes such merge to
form a new ec_guard() function and collects all guarding related hidden
logics in it.
One hidden logic is: there is no inter-transaction guarding performed
for non MSI quirk (wait polling mode), this patch skips
inter-transaction guarding before wait_event_timeout() for the wait
polling mode to reveal the hidden logic.
The other hidden logic is: there is msleep() inter-transaction guarding
performed when the GPE storming is observed. As after merging this
commit:
Commit: e1d4d90fc0313d3d58cbd7912c90f8ef24df45ff
Subject: ACPI / EC: Refine command storm prevention support
EC_FLAGS_COMMAND_STORM is ensured to be cleared after invoking
acpi_ec_transaction_unlocked(), the msleep() guard logic will never
happen now. Since no one complains such change, this logic is likely
added during the old times where the EC race issues are not fixed and
the bugs are false root-caused to the timing issue. This patch simply
removes the out-dated logic. We can restore it by stop skipping
inter-transaction guarding for wait polling mode.
Two different delay values are defined for msleep() and udelay() while
they are merged in this patch to 550us.
2. time_after() causes additional delay in the polling mode (can only be
observed in noirq suspend/resume processes where polling mode is always
used) before advance_transaction() is invoked ("wait polling" log is
added before wait_event_timeout()). We can see 2 wait_event_timeout()
invocations. This is because time_after() ensures a ">" validation while
we only need a ">=" validation here:
[ 86.739909] ACPI: Waking up from system sleep state S3
[ 86.742857] ACPI : EC: 2: Increase command
[ 86.742859] ACPI : EC: ***** Command(RD_EC) started *****
[ 86.742861] ACPI : EC: ===== TASK (0) =====
[ 86.742871] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.742873] ACPI : EC: EC_SC(W) = 0x80
[ 86.742876] ACPI : EC: ***** Event started *****
[ 86.742880] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.743972] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.747966] ACPI : EC: ===== TASK (0) =====
[ 86.747977] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.747978] ACPI : EC: EC_DATA(W) = 0x06
[ 86.747981] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.751971] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755969] ACPI : EC: ===== TASK (0) =====
[ 86.755991] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 86.755993] ACPI : EC: EC_DATA(R) = 0x03
[ 86.755994] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755995] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 86.755996] ACPI : EC: 1: Decrease command
This patch corrects this by using time_before() instead in ec_guard():
[ 54.283146] ACPI: Waking up from system sleep state S3
[ 54.285414] ACPI : EC: 2: Increase command
[ 54.285415] ACPI : EC: ***** Command(RD_EC) started *****
[ 54.285416] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.285417] ACPI : EC: ===== TASK (0) =====
[ 54.285424] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.285425] ACPI : EC: EC_SC(W) = 0x80
[ 54.285427] ACPI : EC: ***** Event started *****
[ 54.285429] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.287209] ACPI : EC: ===== TASK (0) =====
[ 54.287218] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.287219] ACPI : EC: EC_DATA(W) = 0x06
[ 54.287222] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291190] ACPI : EC: ===== TASK (0) =====
[ 54.291210] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 54.291213] ACPI : EC: EC_DATA(R) = 0x03
[ 54.291214] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291215] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 54.291216] ACPI : EC: 1: Decrease command
After cleaning up all guarding logics, we have one single function
ec_guard() collecting all old, non-root-caused, hidden logics. Then we can
easily tune the logics in one place to respond to the bug reports.
Except the time_before() change, all other changes do not change the
behavior of the EC driver.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=12011
Link: https://bugzilla.kernel.org/show_bug.cgi?id=20242
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77431
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-15 14:16:42 +08:00
*/
2015-06-11 13:21:38 +08:00
if ( ! ec_transaction_polled ( ec ) & &
! acpi_ec_guard_event ( ec ) )
ACPI / EC: Fix and clean up register access guarding logics.
In the polling mode, EC driver shouldn't access the EC registers too
frequently. Though this statement is concluded from the non-root caused
bugs (see links below), we've maintained the register access guarding
logics in the current EC driver. The guarding logics can be found here and
there, makes it hard to root cause real timing issues. This patch collects
the guarding logics into one single function so that all hidden logics
related to this can be seen clearly.
The current guarding related code also has several issues:
1. Per-transaction timestamp prevents inter-transaction guarding from being
implemented in the same place. We have an inter-transaction udelay() in
acpi_ec_transaction_unblocked(), this logic can be merged into ec_poll()
if we can use per-device timestamp. This patch completes such merge to
form a new ec_guard() function and collects all guarding related hidden
logics in it.
One hidden logic is: there is no inter-transaction guarding performed
for non MSI quirk (wait polling mode), this patch skips
inter-transaction guarding before wait_event_timeout() for the wait
polling mode to reveal the hidden logic.
The other hidden logic is: there is msleep() inter-transaction guarding
performed when the GPE storming is observed. As after merging this
commit:
Commit: e1d4d90fc0313d3d58cbd7912c90f8ef24df45ff
Subject: ACPI / EC: Refine command storm prevention support
EC_FLAGS_COMMAND_STORM is ensured to be cleared after invoking
acpi_ec_transaction_unlocked(), the msleep() guard logic will never
happen now. Since no one complains such change, this logic is likely
added during the old times where the EC race issues are not fixed and
the bugs are false root-caused to the timing issue. This patch simply
removes the out-dated logic. We can restore it by stop skipping
inter-transaction guarding for wait polling mode.
Two different delay values are defined for msleep() and udelay() while
they are merged in this patch to 550us.
2. time_after() causes additional delay in the polling mode (can only be
observed in noirq suspend/resume processes where polling mode is always
used) before advance_transaction() is invoked ("wait polling" log is
added before wait_event_timeout()). We can see 2 wait_event_timeout()
invocations. This is because time_after() ensures a ">" validation while
we only need a ">=" validation here:
[ 86.739909] ACPI: Waking up from system sleep state S3
[ 86.742857] ACPI : EC: 2: Increase command
[ 86.742859] ACPI : EC: ***** Command(RD_EC) started *****
[ 86.742861] ACPI : EC: ===== TASK (0) =====
[ 86.742871] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.742873] ACPI : EC: EC_SC(W) = 0x80
[ 86.742876] ACPI : EC: ***** Event started *****
[ 86.742880] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.743972] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.747966] ACPI : EC: ===== TASK (0) =====
[ 86.747977] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.747978] ACPI : EC: EC_DATA(W) = 0x06
[ 86.747981] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.751971] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755969] ACPI : EC: ===== TASK (0) =====
[ 86.755991] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 86.755993] ACPI : EC: EC_DATA(R) = 0x03
[ 86.755994] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755995] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 86.755996] ACPI : EC: 1: Decrease command
This patch corrects this by using time_before() instead in ec_guard():
[ 54.283146] ACPI: Waking up from system sleep state S3
[ 54.285414] ACPI : EC: 2: Increase command
[ 54.285415] ACPI : EC: ***** Command(RD_EC) started *****
[ 54.285416] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.285417] ACPI : EC: ===== TASK (0) =====
[ 54.285424] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.285425] ACPI : EC: EC_SC(W) = 0x80
[ 54.285427] ACPI : EC: ***** Event started *****
[ 54.285429] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.287209] ACPI : EC: ===== TASK (0) =====
[ 54.287218] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.287219] ACPI : EC: EC_DATA(W) = 0x06
[ 54.287222] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291190] ACPI : EC: ===== TASK (0) =====
[ 54.291210] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 54.291213] ACPI : EC: EC_DATA(R) = 0x03
[ 54.291214] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291215] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 54.291216] ACPI : EC: 1: Decrease command
After cleaning up all guarding logics, we have one single function
ec_guard() collecting all old, non-root-caused, hidden logics. Then we can
easily tune the logics in one place to respond to the bug reports.
Except the time_before() change, all other changes do not change the
behavior of the EC driver.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=12011
Link: https://bugzilla.kernel.org/show_bug.cgi?id=20242
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77431
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-15 14:16:42 +08:00
break ;
if ( wait_event_timeout ( ec - > wait ,
ec_transaction_completed ( ec ) ,
guard ) )
return 0 ;
}
} while ( time_before ( jiffies , timeout ) ) ;
return - ETIME ;
2008-03-21 17:07:03 +03:00
}
2008-01-23 22:28:34 -05:00
2008-09-25 21:00:31 +04:00
static int ec_poll ( struct acpi_ec * ec )
{
2009-08-30 03:06:14 +04:00
unsigned long flags ;
2013-05-06 03:23:40 +00:00
int repeat = 5 ; /* number of command restarts */
2014-10-14 14:24:01 +08:00
2009-08-30 03:06:14 +04:00
while ( repeat - - ) {
unsigned long delay = jiffies +
2010-10-21 18:24:57 +02:00
msecs_to_jiffies ( ec_delay ) ;
2009-08-30 03:06:14 +04:00
do {
ACPI / EC: Fix and clean up register access guarding logics.
In the polling mode, EC driver shouldn't access the EC registers too
frequently. Though this statement is concluded from the non-root caused
bugs (see links below), we've maintained the register access guarding
logics in the current EC driver. The guarding logics can be found here and
there, makes it hard to root cause real timing issues. This patch collects
the guarding logics into one single function so that all hidden logics
related to this can be seen clearly.
The current guarding related code also has several issues:
1. Per-transaction timestamp prevents inter-transaction guarding from being
implemented in the same place. We have an inter-transaction udelay() in
acpi_ec_transaction_unblocked(), this logic can be merged into ec_poll()
if we can use per-device timestamp. This patch completes such merge to
form a new ec_guard() function and collects all guarding related hidden
logics in it.
One hidden logic is: there is no inter-transaction guarding performed
for non MSI quirk (wait polling mode), this patch skips
inter-transaction guarding before wait_event_timeout() for the wait
polling mode to reveal the hidden logic.
The other hidden logic is: there is msleep() inter-transaction guarding
performed when the GPE storming is observed. As after merging this
commit:
Commit: e1d4d90fc0313d3d58cbd7912c90f8ef24df45ff
Subject: ACPI / EC: Refine command storm prevention support
EC_FLAGS_COMMAND_STORM is ensured to be cleared after invoking
acpi_ec_transaction_unlocked(), the msleep() guard logic will never
happen now. Since no one complains such change, this logic is likely
added during the old times where the EC race issues are not fixed and
the bugs are false root-caused to the timing issue. This patch simply
removes the out-dated logic. We can restore it by stop skipping
inter-transaction guarding for wait polling mode.
Two different delay values are defined for msleep() and udelay() while
they are merged in this patch to 550us.
2. time_after() causes additional delay in the polling mode (can only be
observed in noirq suspend/resume processes where polling mode is always
used) before advance_transaction() is invoked ("wait polling" log is
added before wait_event_timeout()). We can see 2 wait_event_timeout()
invocations. This is because time_after() ensures a ">" validation while
we only need a ">=" validation here:
[ 86.739909] ACPI: Waking up from system sleep state S3
[ 86.742857] ACPI : EC: 2: Increase command
[ 86.742859] ACPI : EC: ***** Command(RD_EC) started *****
[ 86.742861] ACPI : EC: ===== TASK (0) =====
[ 86.742871] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.742873] ACPI : EC: EC_SC(W) = 0x80
[ 86.742876] ACPI : EC: ***** Event started *****
[ 86.742880] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.743972] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.747966] ACPI : EC: ===== TASK (0) =====
[ 86.747977] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.747978] ACPI : EC: EC_DATA(W) = 0x06
[ 86.747981] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.751971] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755969] ACPI : EC: ===== TASK (0) =====
[ 86.755991] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 86.755993] ACPI : EC: EC_DATA(R) = 0x03
[ 86.755994] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755995] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 86.755996] ACPI : EC: 1: Decrease command
This patch corrects this by using time_before() instead in ec_guard():
[ 54.283146] ACPI: Waking up from system sleep state S3
[ 54.285414] ACPI : EC: 2: Increase command
[ 54.285415] ACPI : EC: ***** Command(RD_EC) started *****
[ 54.285416] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.285417] ACPI : EC: ===== TASK (0) =====
[ 54.285424] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.285425] ACPI : EC: EC_SC(W) = 0x80
[ 54.285427] ACPI : EC: ***** Event started *****
[ 54.285429] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.287209] ACPI : EC: ===== TASK (0) =====
[ 54.287218] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.287219] ACPI : EC: EC_DATA(W) = 0x06
[ 54.287222] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291190] ACPI : EC: ===== TASK (0) =====
[ 54.291210] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 54.291213] ACPI : EC: EC_DATA(R) = 0x03
[ 54.291214] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291215] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 54.291216] ACPI : EC: 1: Decrease command
After cleaning up all guarding logics, we have one single function
ec_guard() collecting all old, non-root-caused, hidden logics. Then we can
easily tune the logics in one place to respond to the bug reports.
Except the time_before() change, all other changes do not change the
behavior of the EC driver.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=12011
Link: https://bugzilla.kernel.org/show_bug.cgi?id=20242
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77431
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-15 14:16:42 +08:00
if ( ! ec_guard ( ec ) )
return 0 ;
2014-06-15 08:41:35 +08:00
spin_lock_irqsave ( & ec - > lock , flags ) ;
2020-11-13 19:13:17 +01:00
advance_transaction ( ec , false ) ;
2014-06-15 08:41:35 +08:00
spin_unlock_irqrestore ( & ec - > lock , flags ) ;
2009-08-30 03:06:14 +04:00
} while ( time_before ( jiffies , delay ) ) ;
2013-09-12 03:32:04 -04:00
pr_debug ( " controller reset, restart transaction \n " ) ;
2012-10-23 01:29:27 +02:00
spin_lock_irqsave ( & ec - > lock , flags ) ;
2009-08-30 03:06:14 +04:00
start_transaction ( ec ) ;
2012-10-23 01:29:27 +02:00
spin_unlock_irqrestore ( & ec - > lock , flags ) ;
2006-12-07 18:42:16 +03:00
}
2008-03-21 17:07:15 +03:00
return - ETIME ;
2005-04-16 15:20:36 -07:00
}
2008-09-26 00:54:28 +04:00
static int acpi_ec_transaction_unlocked ( struct acpi_ec * ec ,
2009-08-30 03:06:14 +04:00
struct transaction * t )
2005-07-23 04:08:00 -04:00
{
2008-09-25 21:00:31 +04:00
unsigned long tmp ;
int ret = 0 ;
2014-10-14 14:24:01 +08:00
2008-09-25 21:00:31 +04:00
/* start transaction */
2012-10-23 01:29:27 +02:00
spin_lock_irqsave ( & ec - > lock , tmp ) ;
2015-02-06 08:57:59 +08:00
/* Enable GPE for command processing (IBF=0/OBF=1) */
2015-02-11 17:35:05 +01:00
if ( ! acpi_ec_submit_flushable_request ( ec ) ) {
2015-02-06 08:57:52 +08:00
ret = - EINVAL ;
goto unlock ;
}
2015-02-27 14:48:24 +08:00
ec_dbg_ref ( ec , " Increase command " ) ;
2008-09-25 21:00:31 +04:00
/* following two actions should be kept atomic */
2008-09-26 00:54:28 +04:00
ec - > curr = t ;
2015-02-27 14:48:15 +08:00
ec_dbg_req ( " Command(%s) started " , acpi_ec_cmd_string ( t - > command ) ) ;
2008-11-12 01:40:19 +03:00
start_transaction ( ec ) ;
2014-10-29 11:33:43 +08:00
spin_unlock_irqrestore ( & ec - > lock , tmp ) ;
ACPI / EC: Fix and clean up register access guarding logics.
In the polling mode, EC driver shouldn't access the EC registers too
frequently. Though this statement is concluded from the non-root caused
bugs (see links below), we've maintained the register access guarding
logics in the current EC driver. The guarding logics can be found here and
there, makes it hard to root cause real timing issues. This patch collects
the guarding logics into one single function so that all hidden logics
related to this can be seen clearly.
The current guarding related code also has several issues:
1. Per-transaction timestamp prevents inter-transaction guarding from being
implemented in the same place. We have an inter-transaction udelay() in
acpi_ec_transaction_unblocked(), this logic can be merged into ec_poll()
if we can use per-device timestamp. This patch completes such merge to
form a new ec_guard() function and collects all guarding related hidden
logics in it.
One hidden logic is: there is no inter-transaction guarding performed
for non MSI quirk (wait polling mode), this patch skips
inter-transaction guarding before wait_event_timeout() for the wait
polling mode to reveal the hidden logic.
The other hidden logic is: there is msleep() inter-transaction guarding
performed when the GPE storming is observed. As after merging this
commit:
Commit: e1d4d90fc0313d3d58cbd7912c90f8ef24df45ff
Subject: ACPI / EC: Refine command storm prevention support
EC_FLAGS_COMMAND_STORM is ensured to be cleared after invoking
acpi_ec_transaction_unlocked(), the msleep() guard logic will never
happen now. Since no one complains such change, this logic is likely
added during the old times where the EC race issues are not fixed and
the bugs are false root-caused to the timing issue. This patch simply
removes the out-dated logic. We can restore it by stop skipping
inter-transaction guarding for wait polling mode.
Two different delay values are defined for msleep() and udelay() while
they are merged in this patch to 550us.
2. time_after() causes additional delay in the polling mode (can only be
observed in noirq suspend/resume processes where polling mode is always
used) before advance_transaction() is invoked ("wait polling" log is
added before wait_event_timeout()). We can see 2 wait_event_timeout()
invocations. This is because time_after() ensures a ">" validation while
we only need a ">=" validation here:
[ 86.739909] ACPI: Waking up from system sleep state S3
[ 86.742857] ACPI : EC: 2: Increase command
[ 86.742859] ACPI : EC: ***** Command(RD_EC) started *****
[ 86.742861] ACPI : EC: ===== TASK (0) =====
[ 86.742871] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.742873] ACPI : EC: EC_SC(W) = 0x80
[ 86.742876] ACPI : EC: ***** Event started *****
[ 86.742880] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.743972] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.747966] ACPI : EC: ===== TASK (0) =====
[ 86.747977] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.747978] ACPI : EC: EC_DATA(W) = 0x06
[ 86.747981] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.751971] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755969] ACPI : EC: ===== TASK (0) =====
[ 86.755991] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 86.755993] ACPI : EC: EC_DATA(R) = 0x03
[ 86.755994] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755995] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 86.755996] ACPI : EC: 1: Decrease command
This patch corrects this by using time_before() instead in ec_guard():
[ 54.283146] ACPI: Waking up from system sleep state S3
[ 54.285414] ACPI : EC: 2: Increase command
[ 54.285415] ACPI : EC: ***** Command(RD_EC) started *****
[ 54.285416] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.285417] ACPI : EC: ===== TASK (0) =====
[ 54.285424] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.285425] ACPI : EC: EC_SC(W) = 0x80
[ 54.285427] ACPI : EC: ***** Event started *****
[ 54.285429] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.287209] ACPI : EC: ===== TASK (0) =====
[ 54.287218] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.287219] ACPI : EC: EC_DATA(W) = 0x06
[ 54.287222] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291190] ACPI : EC: ===== TASK (0) =====
[ 54.291210] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 54.291213] ACPI : EC: EC_DATA(R) = 0x03
[ 54.291214] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291215] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 54.291216] ACPI : EC: 1: Decrease command
After cleaning up all guarding logics, we have one single function
ec_guard() collecting all old, non-root-caused, hidden logics. Then we can
easily tune the logics in one place to respond to the bug reports.
Except the time_before() change, all other changes do not change the
behavior of the EC driver.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=12011
Link: https://bugzilla.kernel.org/show_bug.cgi?id=20242
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77431
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-15 14:16:42 +08:00
2014-10-29 11:33:43 +08:00
ret = ec_poll ( ec ) ;
ACPI / EC: Fix and clean up register access guarding logics.
In the polling mode, EC driver shouldn't access the EC registers too
frequently. Though this statement is concluded from the non-root caused
bugs (see links below), we've maintained the register access guarding
logics in the current EC driver. The guarding logics can be found here and
there, makes it hard to root cause real timing issues. This patch collects
the guarding logics into one single function so that all hidden logics
related to this can be seen clearly.
The current guarding related code also has several issues:
1. Per-transaction timestamp prevents inter-transaction guarding from being
implemented in the same place. We have an inter-transaction udelay() in
acpi_ec_transaction_unblocked(), this logic can be merged into ec_poll()
if we can use per-device timestamp. This patch completes such merge to
form a new ec_guard() function and collects all guarding related hidden
logics in it.
One hidden logic is: there is no inter-transaction guarding performed
for non MSI quirk (wait polling mode), this patch skips
inter-transaction guarding before wait_event_timeout() for the wait
polling mode to reveal the hidden logic.
The other hidden logic is: there is msleep() inter-transaction guarding
performed when the GPE storming is observed. As after merging this
commit:
Commit: e1d4d90fc0313d3d58cbd7912c90f8ef24df45ff
Subject: ACPI / EC: Refine command storm prevention support
EC_FLAGS_COMMAND_STORM is ensured to be cleared after invoking
acpi_ec_transaction_unlocked(), the msleep() guard logic will never
happen now. Since no one complains such change, this logic is likely
added during the old times where the EC race issues are not fixed and
the bugs are false root-caused to the timing issue. This patch simply
removes the out-dated logic. We can restore it by stop skipping
inter-transaction guarding for wait polling mode.
Two different delay values are defined for msleep() and udelay() while
they are merged in this patch to 550us.
2. time_after() causes additional delay in the polling mode (can only be
observed in noirq suspend/resume processes where polling mode is always
used) before advance_transaction() is invoked ("wait polling" log is
added before wait_event_timeout()). We can see 2 wait_event_timeout()
invocations. This is because time_after() ensures a ">" validation while
we only need a ">=" validation here:
[ 86.739909] ACPI: Waking up from system sleep state S3
[ 86.742857] ACPI : EC: 2: Increase command
[ 86.742859] ACPI : EC: ***** Command(RD_EC) started *****
[ 86.742861] ACPI : EC: ===== TASK (0) =====
[ 86.742871] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.742873] ACPI : EC: EC_SC(W) = 0x80
[ 86.742876] ACPI : EC: ***** Event started *****
[ 86.742880] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.743972] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.747966] ACPI : EC: ===== TASK (0) =====
[ 86.747977] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.747978] ACPI : EC: EC_DATA(W) = 0x06
[ 86.747981] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.751971] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755969] ACPI : EC: ===== TASK (0) =====
[ 86.755991] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 86.755993] ACPI : EC: EC_DATA(R) = 0x03
[ 86.755994] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755995] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 86.755996] ACPI : EC: 1: Decrease command
This patch corrects this by using time_before() instead in ec_guard():
[ 54.283146] ACPI: Waking up from system sleep state S3
[ 54.285414] ACPI : EC: 2: Increase command
[ 54.285415] ACPI : EC: ***** Command(RD_EC) started *****
[ 54.285416] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.285417] ACPI : EC: ===== TASK (0) =====
[ 54.285424] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.285425] ACPI : EC: EC_SC(W) = 0x80
[ 54.285427] ACPI : EC: ***** Event started *****
[ 54.285429] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.287209] ACPI : EC: ===== TASK (0) =====
[ 54.287218] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.287219] ACPI : EC: EC_DATA(W) = 0x06
[ 54.287222] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291190] ACPI : EC: ===== TASK (0) =====
[ 54.291210] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 54.291213] ACPI : EC: EC_DATA(R) = 0x03
[ 54.291214] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291215] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 54.291216] ACPI : EC: 1: Decrease command
After cleaning up all guarding logics, we have one single function
ec_guard() collecting all old, non-root-caused, hidden logics. Then we can
easily tune the logics in one place to respond to the bug reports.
Except the time_before() change, all other changes do not change the
behavior of the EC driver.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=12011
Link: https://bugzilla.kernel.org/show_bug.cgi?id=20242
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77431
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-15 14:16:42 +08:00
2014-10-29 11:33:43 +08:00
spin_lock_irqsave ( & ec - > lock , tmp ) ;
2015-02-06 08:58:05 +08:00
if ( t - > irq_count = = ec_storm_threshold )
2019-10-14 16:56:01 +08:00
acpi_ec_unmask_events ( ec ) ;
2015-02-27 14:48:15 +08:00
ec_dbg_req ( " Command(%s) stopped " , acpi_ec_cmd_string ( t - > command ) ) ;
2008-09-26 00:54:28 +04:00
ec - > curr = NULL ;
2015-02-06 08:57:59 +08:00
/* Disable GPE for command processing (IBF=0/OBF=1) */
acpi_ec_complete_request ( ec ) ;
2015-02-27 14:48:24 +08:00
ec_dbg_ref ( ec , " Decrease command " ) ;
2015-02-06 08:57:52 +08:00
unlock :
2012-10-23 01:29:27 +02:00
spin_unlock_irqrestore ( & ec - > lock , tmp ) ;
2008-09-25 21:00:31 +04:00
return ret ;
}
2009-08-30 03:06:14 +04:00
static int acpi_ec_transaction ( struct acpi_ec * ec , struct transaction * t )
2005-04-16 15:20:36 -07:00
{
2006-09-05 12:12:24 -04:00
int status ;
2005-08-11 17:32:05 -04:00
u32 glk ;
2014-10-14 14:24:01 +08:00
2008-09-26 00:54:28 +04:00
if ( ! ec | | ( ! t ) | | ( t - > wlen & & ! t - > wdata ) | | ( t - > rlen & & ! t - > rdata ) )
2006-06-27 00:41:40 -04:00
return - EINVAL ;
2008-09-26 00:54:28 +04:00
if ( t - > rdata )
memset ( t - > rdata , 0 , t - > rlen ) ;
ACPI / EC: Fix and clean up register access guarding logics.
In the polling mode, EC driver shouldn't access the EC registers too
frequently. Though this statement is concluded from the non-root caused
bugs (see links below), we've maintained the register access guarding
logics in the current EC driver. The guarding logics can be found here and
there, makes it hard to root cause real timing issues. This patch collects
the guarding logics into one single function so that all hidden logics
related to this can be seen clearly.
The current guarding related code also has several issues:
1. Per-transaction timestamp prevents inter-transaction guarding from being
implemented in the same place. We have an inter-transaction udelay() in
acpi_ec_transaction_unblocked(), this logic can be merged into ec_poll()
if we can use per-device timestamp. This patch completes such merge to
form a new ec_guard() function and collects all guarding related hidden
logics in it.
One hidden logic is: there is no inter-transaction guarding performed
for non MSI quirk (wait polling mode), this patch skips
inter-transaction guarding before wait_event_timeout() for the wait
polling mode to reveal the hidden logic.
The other hidden logic is: there is msleep() inter-transaction guarding
performed when the GPE storming is observed. As after merging this
commit:
Commit: e1d4d90fc0313d3d58cbd7912c90f8ef24df45ff
Subject: ACPI / EC: Refine command storm prevention support
EC_FLAGS_COMMAND_STORM is ensured to be cleared after invoking
acpi_ec_transaction_unlocked(), the msleep() guard logic will never
happen now. Since no one complains such change, this logic is likely
added during the old times where the EC race issues are not fixed and
the bugs are false root-caused to the timing issue. This patch simply
removes the out-dated logic. We can restore it by stop skipping
inter-transaction guarding for wait polling mode.
Two different delay values are defined for msleep() and udelay() while
they are merged in this patch to 550us.
2. time_after() causes additional delay in the polling mode (can only be
observed in noirq suspend/resume processes where polling mode is always
used) before advance_transaction() is invoked ("wait polling" log is
added before wait_event_timeout()). We can see 2 wait_event_timeout()
invocations. This is because time_after() ensures a ">" validation while
we only need a ">=" validation here:
[ 86.739909] ACPI: Waking up from system sleep state S3
[ 86.742857] ACPI : EC: 2: Increase command
[ 86.742859] ACPI : EC: ***** Command(RD_EC) started *****
[ 86.742861] ACPI : EC: ===== TASK (0) =====
[ 86.742871] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.742873] ACPI : EC: EC_SC(W) = 0x80
[ 86.742876] ACPI : EC: ***** Event started *****
[ 86.742880] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.743972] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.747966] ACPI : EC: ===== TASK (0) =====
[ 86.747977] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.747978] ACPI : EC: EC_DATA(W) = 0x06
[ 86.747981] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.751971] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755969] ACPI : EC: ===== TASK (0) =====
[ 86.755991] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 86.755993] ACPI : EC: EC_DATA(R) = 0x03
[ 86.755994] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755995] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 86.755996] ACPI : EC: 1: Decrease command
This patch corrects this by using time_before() instead in ec_guard():
[ 54.283146] ACPI: Waking up from system sleep state S3
[ 54.285414] ACPI : EC: 2: Increase command
[ 54.285415] ACPI : EC: ***** Command(RD_EC) started *****
[ 54.285416] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.285417] ACPI : EC: ===== TASK (0) =====
[ 54.285424] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.285425] ACPI : EC: EC_SC(W) = 0x80
[ 54.285427] ACPI : EC: ***** Event started *****
[ 54.285429] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.287209] ACPI : EC: ===== TASK (0) =====
[ 54.287218] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.287219] ACPI : EC: EC_DATA(W) = 0x06
[ 54.287222] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291190] ACPI : EC: ===== TASK (0) =====
[ 54.291210] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 54.291213] ACPI : EC: EC_DATA(R) = 0x03
[ 54.291214] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291215] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 54.291216] ACPI : EC: 1: Decrease command
After cleaning up all guarding logics, we have one single function
ec_guard() collecting all old, non-root-caused, hidden logics. Then we can
easily tune the logics in one place to respond to the bug reports.
Except the time_before() change, all other changes do not change the
behavior of the EC driver.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=12011
Link: https://bugzilla.kernel.org/show_bug.cgi?id=20242
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77431
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-15 14:16:42 +08:00
2012-10-23 01:29:27 +02:00
mutex_lock ( & ec - > mutex ) ;
2006-09-26 19:50:33 +04:00
if ( ec - > global_lock ) {
2005-04-16 15:20:36 -07:00
status = acpi_acquire_global_lock ( ACPI_EC_UDELAY_GLK , & glk ) ;
2007-02-15 23:16:18 +03:00
if ( ACPI_FAILURE ( status ) ) {
2008-09-25 21:00:31 +04:00
status = - ENODEV ;
goto unlock ;
2007-02-15 23:16:18 +03:00
}
2005-04-16 15:20:36 -07:00
}
2009-12-24 11:34:16 +03:00
2009-08-30 03:06:14 +04:00
status = acpi_ec_transaction_unlocked ( ec , t ) ;
2009-12-24 11:34:16 +03:00
2006-09-26 19:50:33 +04:00
if ( ec - > global_lock )
2005-04-16 15:20:36 -07:00
acpi_release_global_lock ( glk ) ;
2008-09-25 21:00:31 +04:00
unlock :
2012-10-23 01:29:27 +02:00
mutex_unlock ( & ec - > mutex ) ;
2006-06-27 00:41:40 -04:00
return status ;
2005-04-16 15:20:36 -07:00
}
2008-12-09 20:45:30 +01:00
static int acpi_ec_burst_enable ( struct acpi_ec * ec )
2007-03-07 22:28:00 +03:00
{
u8 d ;
2008-09-26 00:54:28 +04:00
struct transaction t = { . command = ACPI_EC_BURST_ENABLE ,
. wdata = NULL , . rdata = & d ,
. wlen = 0 , . rlen = 1 } ;
2009-08-30 03:06:14 +04:00
return acpi_ec_transaction ( ec , & t ) ;
2007-03-07 22:28:00 +03:00
}
2008-12-09 20:45:30 +01:00
static int acpi_ec_burst_disable ( struct acpi_ec * ec )
2007-03-07 22:28:00 +03:00
{
2008-09-26 00:54:28 +04:00
struct transaction t = { . command = ACPI_EC_BURST_DISABLE ,
. wdata = NULL , . rdata = NULL ,
. wlen = 0 , . rlen = 0 } ;
2008-09-25 21:00:31 +04:00
return ( acpi_ec_read_status ( ec ) & ACPI_EC_FLAG_BURST ) ?
2009-08-30 03:06:14 +04:00
acpi_ec_transaction ( ec , & t ) : 0 ;
2007-03-07 22:28:00 +03:00
}
2014-10-14 14:24:01 +08:00
static int acpi_ec_read ( struct acpi_ec * ec , u8 address , u8 * data )
2006-09-26 19:50:33 +04:00
{
int result ;
u8 d ;
2008-09-26 00:54:28 +04:00
struct transaction t = { . command = ACPI_EC_COMMAND_READ ,
. wdata = & address , . rdata = & d ,
. wlen = 1 , . rlen = 1 } ;
2006-09-26 19:50:33 +04:00
2009-08-30 03:06:14 +04:00
result = acpi_ec_transaction ( ec , & t ) ;
2006-09-26 19:50:33 +04:00
* data = d ;
return result ;
}
2006-09-26 19:50:33 +04:00
2006-09-26 19:50:33 +04:00
static int acpi_ec_write ( struct acpi_ec * ec , u8 address , u8 data )
{
2006-12-07 18:42:17 +03:00
u8 wdata [ 2 ] = { address , data } ;
2008-09-26 00:54:28 +04:00
struct transaction t = { . command = ACPI_EC_COMMAND_WRITE ,
. wdata = wdata , . rdata = NULL ,
. wlen = 2 , . rlen = 0 } ;
2009-08-30 03:06:14 +04:00
return acpi_ec_transaction ( ec , & t ) ;
2006-09-26 19:50:33 +04:00
}
2012-10-23 01:29:38 +02:00
int ec_read ( u8 addr , u8 * val )
2005-04-16 15:20:36 -07:00
{
int err ;
2006-09-26 19:50:33 +04:00
u8 temp_data ;
2005-04-16 15:20:36 -07:00
if ( ! first_ec )
return - ENODEV ;
2007-03-07 22:28:00 +03:00
err = acpi_ec_read ( first_ec , addr , & temp_data ) ;
2005-04-16 15:20:36 -07:00
if ( ! err ) {
* val = temp_data ;
return 0 ;
2014-10-14 14:24:01 +08:00
}
return err ;
2005-04-16 15:20:36 -07:00
}
EXPORT_SYMBOL ( ec_read ) ;
2005-08-11 17:32:05 -04:00
int ec_write ( u8 addr , u8 val )
2005-04-16 15:20:36 -07:00
{
if ( ! first_ec )
return - ENODEV ;
2022-08-25 07:27:44 +00:00
return acpi_ec_write ( first_ec , addr , val ) ;
2005-04-16 15:20:36 -07:00
}
EXPORT_SYMBOL ( ec_write ) ;
2006-10-27 01:47:34 -04:00
int ec_transaction ( u8 command ,
2014-10-14 14:24:01 +08:00
const u8 * wdata , unsigned wdata_len ,
u8 * rdata , unsigned rdata_len )
2005-07-23 04:08:00 -04:00
{
2008-09-26 00:54:28 +04:00
struct transaction t = { . command = command ,
. wdata = wdata , . rdata = rdata ,
. wlen = wdata_len , . rlen = rdata_len } ;
2014-10-14 14:24:01 +08:00
2006-09-05 12:12:24 -04:00
if ( ! first_ec )
return - ENODEV ;
2005-07-23 04:08:00 -04:00
2009-08-30 03:06:14 +04:00
return acpi_ec_transaction ( first_ec , & t ) ;
2005-07-23 04:08:00 -04:00
}
2006-10-03 22:49:00 -04:00
EXPORT_SYMBOL ( ec_transaction ) ;
2012-01-18 13:44:08 -06:00
/* Get the handle to the EC device */
acpi_handle ec_get_handle ( void )
{
if ( ! first_ec )
return NULL ;
return first_ec - > handle ;
}
EXPORT_SYMBOL ( ec_get_handle ) ;
2015-02-06 08:57:52 +08:00
static void acpi_ec_start ( struct acpi_ec * ec , bool resuming )
{
unsigned long flags ;
spin_lock_irqsave ( & ec - > lock , flags ) ;
if ( ! test_and_set_bit ( EC_FLAGS_STARTED , & ec - > flags ) ) {
2015-02-27 14:48:15 +08:00
ec_dbg_drv ( " Starting EC " ) ;
2015-02-06 08:57:59 +08:00
/* Enable GPE for event processing (SCI_EVT=1) */
2015-02-27 14:48:24 +08:00
if ( ! resuming ) {
2015-02-06 08:57:59 +08:00
acpi_ec_submit_request ( ec ) ;
2015-02-27 14:48:24 +08:00
ec_dbg_ref ( ec , " Increase driver " ) ;
ACPI / EC: Add PM operations to improve event handling for resume process
This patch makes 2 changes:
1. Restore old behavior
Originally, EC driver stops handling both events and transactions in
acpi_ec_block_transactions(), and restarts to handle transactions in
acpi_ec_unblock_transactions_early(), restarts to handle both events and
transactions in acpi_ec_unblock_transactions().
While currently, EC driver still stops handling both events and
transactions in acpi_ec_block_transactions(), but restarts to handle both
events and transactions in acpi_ec_unblock_transactions_early().
This patch tries to restore the old behavior by dropping
__acpi_ec_enable_event() from acpi_unblock_transactions_early().
2. Improve old behavior
However this still cannot fix the real issue as both of the
acpi_ec_unblock_xxx() functions are invoked in the noirq stage. Since the
EC driver actually doesn't implement the event handling in the polling
mode, re-enabling the event handling too early in the noirq stage could
result in the problem that if there is no triggering source causing
advance_transaction() to be invoked, pending SCI_EVT cannot be detected by
the EC driver and _Qxx cannot be triggered.
It actually makes sense to restart the event handling in any point during
resuming after the noirq stage. Just like the boot stage where the event
handling is enabled in .add(), this patch further moves
acpi_ec_enable_event() to .resume(). After doing that, the following 2
functions can be combined:
acpi_ec_unblock_transactions_early()/acpi_ec_unblock_transactions().
The differences of the event handling availability between the old behavior
(this patch isn't applied) and the new behavior (this patch is applied) are
as follows:
!Applied Applied
before suspend Y Y
suspend before EC Y Y
suspend after EC Y Y
suspend_late Y Y
suspend_noirq Y (actually N) Y (actually N)
resume_noirq Y (actually N) Y (actually N)
resume_late Y (actually N) Y (actually N)
resume before EC Y (actually N) Y (actually N)
resume after EC Y (actually N) Y
after resume Y (actually N) Y
Where "actually N" means if there is no triggering source, the EC driver
is actually not able to notice the pending SCI_EVT occurred in the noirq
stage. So we can clearly see that this patch has improved the situation.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-03 16:01:36 +08:00
}
2015-02-27 14:48:15 +08:00
ec_log_drv ( " EC started " ) ;
2015-02-06 08:57:52 +08:00
}
spin_unlock_irqrestore ( & ec - > lock , flags ) ;
}
2015-02-06 08:57:59 +08:00
static bool acpi_ec_stopped ( struct acpi_ec * ec )
{
unsigned long flags ;
bool flushed ;
spin_lock_irqsave ( & ec - > lock , flags ) ;
flushed = acpi_ec_flushed ( ec ) ;
spin_unlock_irqrestore ( & ec - > lock , flags ) ;
return flushed ;
}
2015-02-06 08:57:52 +08:00
static void acpi_ec_stop ( struct acpi_ec * ec , bool suspending )
{
unsigned long flags ;
spin_lock_irqsave ( & ec - > lock , flags ) ;
if ( acpi_ec_started ( ec ) ) {
2015-02-27 14:48:15 +08:00
ec_dbg_drv ( " Stopping EC " ) ;
2015-02-06 08:57:52 +08:00
set_bit ( EC_FLAGS_STOPPED , & ec - > flags ) ;
2015-02-06 08:57:59 +08:00
spin_unlock_irqrestore ( & ec - > lock , flags ) ;
wait_event ( ec - > wait , acpi_ec_stopped ( ec ) ) ;
spin_lock_irqsave ( & ec - > lock , flags ) ;
/* Disable GPE for event processing (SCI_EVT=1) */
2015-02-27 14:48:24 +08:00
if ( ! suspending ) {
2015-02-06 08:57:59 +08:00
acpi_ec_complete_request ( ec ) ;
2015-02-27 14:48:24 +08:00
ec_dbg_ref ( ec , " Decrease driver " ) ;
ACPI / EC: Add PM operations to improve event handling for suspend process
In the original EC driver, though the event handling is not explicitly
stopped, the EC driver is actually not able to handle events during the
noirq stage as the EC driver is not prepared to handle the EC events in the
polling mode. So if there is no advance_transaction() triggered, the EC
driver couldn't notice the EC events.
However, do we actually need to handle EC events during suspend/resume
stage? EC events are mostly useless for the suspend/resume period (key
strokes and battery/thermal updates, etc.,), and the useful ones (lid
close, power/sleep button press) should have already been delivered to the
OSPM to trigger the power saving operations.
Thus this patch implements acpi_ec_disable_event() to be a reverse call of
acpi_ec_enable_event(), with which, the EC driver is able to stop handling
the EC events in a position before entering the noirq stage.
Since there are actually 2 choices for us:
1. implement event handling in polling mode;
2. stop event handling before entering noirq stage.
And this patch only implements the second choice using .suspend() callback.
Thus this is experimental (first choice is better? or different hook
position is better?). This patch finally keeps the old behavior by default
and prepares a boot parameter to enable this feature.
The differences of the event handling availability between the old behavior
(this patch is not applied) and the new behavior (this patch is applied)
are as follows:
!FreezeEvents FreezeEvents
before suspend Y Y
suspend before EC Y Y
suspend after EC Y N
suspend_late Y N
suspend_noirq Y (actually N) N
resume_noirq Y (actually N) N
resume_late Y (actually N) N
resume before EC Y (actually N) N
resume after EC Y Y
after resume Y Y
Where "actually N" means if there is no EC transactions, the EC driver
is actually not able to notice the pending events.
We can see that FreezeEvents is the only approach now can actually flush
the EC event handling with both query commands and _Qxx evaluations
flushed, other modes can only flush the EC event handling with only query
commands flushed, _Qxx evaluations occurred after stopping the EC driver
may end up failure due to the failure of the EC transaction carried out in
the _Qxx control methods.
We also can see that this feature should be able to trigger some platform
notifications later than resuming other drivers.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-03 16:01:43 +08:00
} else if ( ! ec_freeze_events )
2016-08-03 16:01:24 +08:00
__acpi_ec_disable_event ( ec ) ;
2015-02-06 08:57:52 +08:00
clear_bit ( EC_FLAGS_STARTED , & ec - > flags ) ;
clear_bit ( EC_FLAGS_STOPPED , & ec - > flags ) ;
2015-02-27 14:48:15 +08:00
ec_log_drv ( " EC stopped " ) ;
2015-02-06 08:57:52 +08:00
}
spin_unlock_irqrestore ( & ec - > lock , flags ) ;
}
2017-01-20 16:42:48 +08:00
static void acpi_ec_enter_noirq ( struct acpi_ec * ec )
{
unsigned long flags ;
spin_lock_irqsave ( & ec - > lock , flags ) ;
ec - > busy_polling = true ;
ec - > polling_guard = 0 ;
ec_log_drv ( " interrupt blocked " ) ;
spin_unlock_irqrestore ( & ec - > lock , flags ) ;
}
static void acpi_ec_leave_noirq ( struct acpi_ec * ec )
{
unsigned long flags ;
spin_lock_irqsave ( & ec - > lock , flags ) ;
ec - > busy_polling = ec_busy_polling ;
ec - > polling_guard = ec_polling_guard ;
ec_log_drv ( " interrupt unblocked " ) ;
spin_unlock_irqrestore ( & ec - > lock , flags ) ;
}
2010-04-09 01:40:38 +02:00
void acpi_ec_block_transactions ( void )
2010-03-04 01:52:58 +01:00
{
struct acpi_ec * ec = first_ec ;
if ( ! ec )
return ;
2012-10-23 01:29:27 +02:00
mutex_lock ( & ec - > mutex ) ;
2010-03-04 01:52:58 +01:00
/* Prevent transactions from being carried out */
2015-02-06 08:57:52 +08:00
acpi_ec_stop ( ec , true ) ;
2012-10-23 01:29:27 +02:00
mutex_unlock ( & ec - > mutex ) ;
2010-03-04 01:52:58 +01:00
}
2010-04-09 01:40:38 +02:00
void acpi_ec_unblock_transactions ( void )
2010-04-09 01:39:40 +02:00
{
/*
* Allow transactions to happen again ( this function is called from
* atomic context during wakeup , so we don ' t need to acquire the mutex ) .
*/
if ( first_ec )
2015-02-06 08:57:52 +08:00
acpi_ec_start ( first_ec , true ) ;
2010-04-09 01:39:40 +02:00
}
2005-04-16 15:20:36 -07:00
/* --------------------------------------------------------------------------
Event Management
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
2015-09-24 14:54:48 +08:00
static struct acpi_ec_query_handler *
acpi_ec_get_query_handler_by_value ( struct acpi_ec * ec , u8 value )
{
struct acpi_ec_query_handler * handler ;
mutex_lock ( & ec - > mutex ) ;
list_for_each_entry ( handler , & ec - > list , node ) {
if ( value = = handler - > query_bit ) {
2019-12-27 11:04:21 +01:00
kref_get ( & handler - > kref ) ;
mutex_unlock ( & ec - > mutex ) ;
return handler ;
2015-09-24 14:54:48 +08:00
}
}
mutex_unlock ( & ec - > mutex ) ;
2019-12-27 11:04:21 +01:00
return NULL ;
2015-09-24 14:54:48 +08:00
}
2015-01-14 19:28:28 +08:00
static void acpi_ec_query_handler_release ( struct kref * kref )
{
struct acpi_ec_query_handler * handler =
container_of ( kref , struct acpi_ec_query_handler , kref ) ;
kfree ( handler ) ;
}
static void acpi_ec_put_query_handler ( struct acpi_ec_query_handler * handler )
{
kref_put ( & handler - > kref , acpi_ec_query_handler_release ) ;
}
2007-05-29 16:43:02 +04:00
int acpi_ec_add_query_handler ( struct acpi_ec * ec , u8 query_bit ,
acpi_handle handle , acpi_ec_query_func func ,
void * data )
{
2023-03-24 21:26:26 +01:00
struct acpi_ec_query_handler * handler ;
if ( ! handle & & ! func )
return - EINVAL ;
2014-10-14 14:24:01 +08:00
2023-03-24 21:26:26 +01:00
handler = kzalloc ( sizeof ( * handler ) , GFP_KERNEL ) ;
2007-05-29 16:43:02 +04:00
if ( ! handler )
return - ENOMEM ;
handler - > query_bit = query_bit ;
handler - > handle = handle ;
handler - > func = func ;
handler - > data = data ;
2012-10-23 01:29:27 +02:00
mutex_lock ( & ec - > mutex ) ;
2015-01-14 19:28:28 +08:00
kref_init ( & handler - > kref ) ;
2007-09-26 19:43:22 +04:00
list_add ( & handler - > node , & ec - > list ) ;
2012-10-23 01:29:27 +02:00
mutex_unlock ( & ec - > mutex ) ;
2023-03-24 21:26:26 +01:00
2007-05-29 16:43:02 +04:00
return 0 ;
}
EXPORT_SYMBOL_GPL ( acpi_ec_add_query_handler ) ;
2015-09-24 14:54:48 +08:00
static void acpi_ec_remove_query_handlers ( struct acpi_ec * ec ,
bool remove_all , u8 query_bit )
2007-05-29 16:43:02 +04:00
{
2007-10-24 18:26:00 +02:00
struct acpi_ec_query_handler * handler , * tmp ;
2015-01-14 19:28:28 +08:00
LIST_HEAD ( free_list ) ;
2014-10-14 14:24:01 +08:00
2012-10-23 01:29:27 +02:00
mutex_lock ( & ec - > mutex ) ;
2007-10-24 18:26:00 +02:00
list_for_each_entry_safe ( handler , tmp , & ec - > list , node ) {
2023-03-24 21:26:26 +01:00
/*
* When remove_all is false , only remove custom query handlers
* which have handler - > func set . This is done to preserve query
* handlers discovered thru ACPI , as they should continue handling
* EC queries .
*/
if ( remove_all | | ( handler - > func & & handler - > query_bit = = query_bit ) ) {
2015-01-14 19:28:28 +08:00
list_del_init ( & handler - > node ) ;
list_add ( & handler - > node , & free_list ) ;
2023-03-24 21:26:26 +01:00
2007-05-29 16:43:02 +04:00
}
}
2012-10-23 01:29:27 +02:00
mutex_unlock ( & ec - > mutex ) ;
2015-04-22 00:25:36 +01:00
list_for_each_entry_safe ( handler , tmp , & free_list , node )
2015-01-14 19:28:28 +08:00
acpi_ec_put_query_handler ( handler ) ;
2007-05-29 16:43:02 +04:00
}
2015-09-24 14:54:48 +08:00
void acpi_ec_remove_query_handler ( struct acpi_ec * ec , u8 query_bit )
{
acpi_ec_remove_query_handlers ( ec , false , query_bit ) ;
2023-03-24 21:26:27 +01:00
flush_workqueue ( ec_query_wq ) ;
2015-09-24 14:54:48 +08:00
}
2007-05-29 16:43:02 +04:00
EXPORT_SYMBOL_GPL ( acpi_ec_remove_query_handler ) ;
2005-04-16 15:20:36 -07:00
2015-08-12 11:12:02 +08:00
static void acpi_ec_event_processor ( struct work_struct * work )
{
struct acpi_ec_query * q = container_of ( work , struct acpi_ec_query , work ) ;
struct acpi_ec_query_handler * handler = q - > handler ;
ACPI: EC: Rework flushing of EC work while suspended to idle
The flushing of pending work in the EC driver uses drain_workqueue()
to flush the event handling work that can requeue itself via
advance_transaction(), but this is problematic, because that
work may also be requeued from the query workqueue.
Namely, if an EC transaction is carried out during the execution of
a query handler, it involves calling advance_transaction() which
may queue up the event handling work again. This causes the kernel
to complain about attempts to add a work item to the EC event
workqueue while it is being drained and worst-case it may cause a
valid event to be skipped.
To avoid this problem, introduce two new counters, events_in_progress
and queries_in_progress, incremented when a work item is queued on
the event workqueue or the query workqueue, respectively, and
decremented at the end of the corresponding work function, and make
acpi_ec_dispatch_gpe() the workqueues in a loop until the both of
these counters are zero (or system wakeup is pending) instead of
calling acpi_ec_flush_work().
At the same time, change __acpi_ec_flush_work() to call
flush_workqueue() instead of drain_workqueue() to flush the event
workqueue.
While at it, use the observation that the work item queued in
acpi_ec_query() cannot be pending at that time, because it is used
only once, to simplify the code in there.
Additionally, clean up a comment in acpi_ec_query() and adjust white
space in acpi_ec_event_processor().
Fixes: f0ac20c3f613 ("ACPI: EC: Fix flushing of pending work")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:36:51 +01:00
struct acpi_ec * ec = q - > ec ;
2014-10-14 14:24:01 +08:00
2015-02-27 14:48:15 +08:00
ec_dbg_evt ( " Query(0x%02x) started " , handler - > query_bit ) ;
ACPI: EC: Rework flushing of EC work while suspended to idle
The flushing of pending work in the EC driver uses drain_workqueue()
to flush the event handling work that can requeue itself via
advance_transaction(), but this is problematic, because that
work may also be requeued from the query workqueue.
Namely, if an EC transaction is carried out during the execution of
a query handler, it involves calling advance_transaction() which
may queue up the event handling work again. This causes the kernel
to complain about attempts to add a work item to the EC event
workqueue while it is being drained and worst-case it may cause a
valid event to be skipped.
To avoid this problem, introduce two new counters, events_in_progress
and queries_in_progress, incremented when a work item is queued on
the event workqueue or the query workqueue, respectively, and
decremented at the end of the corresponding work function, and make
acpi_ec_dispatch_gpe() the workqueues in a loop until the both of
these counters are zero (or system wakeup is pending) instead of
calling acpi_ec_flush_work().
At the same time, change __acpi_ec_flush_work() to call
flush_workqueue() instead of drain_workqueue() to flush the event
workqueue.
While at it, use the observation that the work item queued in
acpi_ec_query() cannot be pending at that time, because it is used
only once, to simplify the code in there.
Additionally, clean up a comment in acpi_ec_query() and adjust white
space in acpi_ec_event_processor().
Fixes: f0ac20c3f613 ("ACPI: EC: Fix flushing of pending work")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:36:51 +01:00
2009-12-24 11:34:16 +03:00
if ( handler - > func )
handler - > func ( handler - > data ) ;
else if ( handler - > handle )
acpi_evaluate_object ( handler - > handle , NULL , NULL , NULL ) ;
ACPI: EC: Rework flushing of EC work while suspended to idle
The flushing of pending work in the EC driver uses drain_workqueue()
to flush the event handling work that can requeue itself via
advance_transaction(), but this is problematic, because that
work may also be requeued from the query workqueue.
Namely, if an EC transaction is carried out during the execution of
a query handler, it involves calling advance_transaction() which
may queue up the event handling work again. This causes the kernel
to complain about attempts to add a work item to the EC event
workqueue while it is being drained and worst-case it may cause a
valid event to be skipped.
To avoid this problem, introduce two new counters, events_in_progress
and queries_in_progress, incremented when a work item is queued on
the event workqueue or the query workqueue, respectively, and
decremented at the end of the corresponding work function, and make
acpi_ec_dispatch_gpe() the workqueues in a loop until the both of
these counters are zero (or system wakeup is pending) instead of
calling acpi_ec_flush_work().
At the same time, change __acpi_ec_flush_work() to call
flush_workqueue() instead of drain_workqueue() to flush the event
workqueue.
While at it, use the observation that the work item queued in
acpi_ec_query() cannot be pending at that time, because it is used
only once, to simplify the code in there.
Additionally, clean up a comment in acpi_ec_query() and adjust white
space in acpi_ec_event_processor().
Fixes: f0ac20c3f613 ("ACPI: EC: Fix flushing of pending work")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:36:51 +01:00
2015-02-27 14:48:15 +08:00
ec_dbg_evt ( " Query(0x%02x) stopped " , handler - > query_bit ) ;
ACPI: EC: Rework flushing of EC work while suspended to idle
The flushing of pending work in the EC driver uses drain_workqueue()
to flush the event handling work that can requeue itself via
advance_transaction(), but this is problematic, because that
work may also be requeued from the query workqueue.
Namely, if an EC transaction is carried out during the execution of
a query handler, it involves calling advance_transaction() which
may queue up the event handling work again. This causes the kernel
to complain about attempts to add a work item to the EC event
workqueue while it is being drained and worst-case it may cause a
valid event to be skipped.
To avoid this problem, introduce two new counters, events_in_progress
and queries_in_progress, incremented when a work item is queued on
the event workqueue or the query workqueue, respectively, and
decremented at the end of the corresponding work function, and make
acpi_ec_dispatch_gpe() the workqueues in a loop until the both of
these counters are zero (or system wakeup is pending) instead of
calling acpi_ec_flush_work().
At the same time, change __acpi_ec_flush_work() to call
flush_workqueue() instead of drain_workqueue() to flush the event
workqueue.
While at it, use the observation that the work item queued in
acpi_ec_query() cannot be pending at that time, because it is used
only once, to simplify the code in there.
Additionally, clean up a comment in acpi_ec_query() and adjust white
space in acpi_ec_event_processor().
Fixes: f0ac20c3f613 ("ACPI: EC: Fix flushing of pending work")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:36:51 +01:00
spin_lock_irq ( & ec - > lock ) ;
ec - > queries_in_progress - - ;
spin_unlock_irq ( & ec - > lock ) ;
2021-11-23 19:46:09 +01:00
acpi_ec_put_query_handler ( handler ) ;
kfree ( q ) ;
}
static struct acpi_ec_query * acpi_ec_create_query ( struct acpi_ec * ec , u8 * pval )
{
struct acpi_ec_query * q ;
struct transaction * t ;
q = kzalloc ( sizeof ( struct acpi_ec_query ) , GFP_KERNEL ) ;
if ( ! q )
return NULL ;
INIT_WORK ( & q - > work , acpi_ec_event_processor ) ;
t = & q - > transaction ;
t - > command = ACPI_EC_COMMAND_QUERY ;
t - > rdata = pval ;
t - > rlen = 1 ;
q - > ec = ec ;
return q ;
2009-12-24 11:34:16 +03:00
}
2021-11-23 19:42:02 +01:00
static int acpi_ec_submit_query ( struct acpi_ec * ec )
2009-12-24 11:34:16 +03:00
{
2021-11-23 19:38:21 +01:00
struct acpi_ec_query * q ;
2009-12-24 11:34:16 +03:00
u8 value = 0 ;
2015-01-14 19:28:33 +08:00
int result ;
2015-08-12 11:12:02 +08:00
ACPI: EC: Rework flushing of EC work while suspended to idle
The flushing of pending work in the EC driver uses drain_workqueue()
to flush the event handling work that can requeue itself via
advance_transaction(), but this is problematic, because that
work may also be requeued from the query workqueue.
Namely, if an EC transaction is carried out during the execution of
a query handler, it involves calling advance_transaction() which
may queue up the event handling work again. This causes the kernel
to complain about attempts to add a work item to the EC event
workqueue while it is being drained and worst-case it may cause a
valid event to be skipped.
To avoid this problem, introduce two new counters, events_in_progress
and queries_in_progress, incremented when a work item is queued on
the event workqueue or the query workqueue, respectively, and
decremented at the end of the corresponding work function, and make
acpi_ec_dispatch_gpe() the workqueues in a loop until the both of
these counters are zero (or system wakeup is pending) instead of
calling acpi_ec_flush_work().
At the same time, change __acpi_ec_flush_work() to call
flush_workqueue() instead of drain_workqueue() to flush the event
workqueue.
While at it, use the observation that the work item queued in
acpi_ec_query() cannot be pending at that time, because it is used
only once, to simplify the code in there.
Additionally, clean up a comment in acpi_ec_query() and adjust white
space in acpi_ec_event_processor().
Fixes: f0ac20c3f613 ("ACPI: EC: Fix flushing of pending work")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:36:51 +01:00
q = acpi_ec_create_query ( ec , & value ) ;
2015-08-12 11:12:02 +08:00
if ( ! q )
return - ENOMEM ;
2014-04-30 00:21:20 +09:30
2015-01-14 19:28:53 +08:00
/*
* Query the EC to find out which _Qxx method we need to evaluate .
* Note that successful completion of the query causes the ACPI_EC_SCI
* bit to be cleared ( and thus clearing the interrupt source ) .
*/
2015-08-12 11:12:02 +08:00
result = acpi_ec_transaction ( ec , & q - > transaction ) ;
if ( result )
goto err_exit ;
2014-04-30 00:21:20 +09:30
2021-11-23 19:38:21 +01:00
if ( ! value ) {
result = - ENODATA ;
goto err_exit ;
}
2015-09-24 14:54:48 +08:00
q - > handler = acpi_ec_get_query_handler_by_value ( ec , value ) ;
if ( ! q - > handler ) {
result = - ENODATA ;
goto err_exit ;
}
/*
ACPI: EC: Rework flushing of EC work while suspended to idle
The flushing of pending work in the EC driver uses drain_workqueue()
to flush the event handling work that can requeue itself via
advance_transaction(), but this is problematic, because that
work may also be requeued from the query workqueue.
Namely, if an EC transaction is carried out during the execution of
a query handler, it involves calling advance_transaction() which
may queue up the event handling work again. This causes the kernel
to complain about attempts to add a work item to the EC event
workqueue while it is being drained and worst-case it may cause a
valid event to be skipped.
To avoid this problem, introduce two new counters, events_in_progress
and queries_in_progress, incremented when a work item is queued on
the event workqueue or the query workqueue, respectively, and
decremented at the end of the corresponding work function, and make
acpi_ec_dispatch_gpe() the workqueues in a loop until the both of
these counters are zero (or system wakeup is pending) instead of
calling acpi_ec_flush_work().
At the same time, change __acpi_ec_flush_work() to call
flush_workqueue() instead of drain_workqueue() to flush the event
workqueue.
While at it, use the observation that the work item queued in
acpi_ec_query() cannot be pending at that time, because it is used
only once, to simplify the code in there.
Additionally, clean up a comment in acpi_ec_query() and adjust white
space in acpi_ec_event_processor().
Fixes: f0ac20c3f613 ("ACPI: EC: Fix flushing of pending work")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:36:51 +01:00
* It is reported that _Qxx are evaluated in a parallel way on Windows :
2015-09-24 14:54:48 +08:00
* https : //bugzilla.kernel.org/show_bug.cgi?id=94411
*
ACPI: EC: Rework flushing of EC work while suspended to idle
The flushing of pending work in the EC driver uses drain_workqueue()
to flush the event handling work that can requeue itself via
advance_transaction(), but this is problematic, because that
work may also be requeued from the query workqueue.
Namely, if an EC transaction is carried out during the execution of
a query handler, it involves calling advance_transaction() which
may queue up the event handling work again. This causes the kernel
to complain about attempts to add a work item to the EC event
workqueue while it is being drained and worst-case it may cause a
valid event to be skipped.
To avoid this problem, introduce two new counters, events_in_progress
and queries_in_progress, incremented when a work item is queued on
the event workqueue or the query workqueue, respectively, and
decremented at the end of the corresponding work function, and make
acpi_ec_dispatch_gpe() the workqueues in a loop until the both of
these counters are zero (or system wakeup is pending) instead of
calling acpi_ec_flush_work().
At the same time, change __acpi_ec_flush_work() to call
flush_workqueue() instead of drain_workqueue() to flush the event
workqueue.
While at it, use the observation that the work item queued in
acpi_ec_query() cannot be pending at that time, because it is used
only once, to simplify the code in there.
Additionally, clean up a comment in acpi_ec_query() and adjust white
space in acpi_ec_event_processor().
Fixes: f0ac20c3f613 ("ACPI: EC: Fix flushing of pending work")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:36:51 +01:00
* Put this log entry before queue_work ( ) to make it appear in the log
* before any other messages emitted during workqueue handling .
2015-09-24 14:54:48 +08:00
*/
ec_dbg_evt ( " Query(0x%02x) scheduled " , value ) ;
ACPI: EC: Rework flushing of EC work while suspended to idle
The flushing of pending work in the EC driver uses drain_workqueue()
to flush the event handling work that can requeue itself via
advance_transaction(), but this is problematic, because that
work may also be requeued from the query workqueue.
Namely, if an EC transaction is carried out during the execution of
a query handler, it involves calling advance_transaction() which
may queue up the event handling work again. This causes the kernel
to complain about attempts to add a work item to the EC event
workqueue while it is being drained and worst-case it may cause a
valid event to be skipped.
To avoid this problem, introduce two new counters, events_in_progress
and queries_in_progress, incremented when a work item is queued on
the event workqueue or the query workqueue, respectively, and
decremented at the end of the corresponding work function, and make
acpi_ec_dispatch_gpe() the workqueues in a loop until the both of
these counters are zero (or system wakeup is pending) instead of
calling acpi_ec_flush_work().
At the same time, change __acpi_ec_flush_work() to call
flush_workqueue() instead of drain_workqueue() to flush the event
workqueue.
While at it, use the observation that the work item queued in
acpi_ec_query() cannot be pending at that time, because it is used
only once, to simplify the code in there.
Additionally, clean up a comment in acpi_ec_query() and adjust white
space in acpi_ec_event_processor().
Fixes: f0ac20c3f613 ("ACPI: EC: Fix flushing of pending work")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:36:51 +01:00
spin_lock_irq ( & ec - > lock ) ;
ec - > queries_in_progress + + ;
queue_work ( ec_query_wq , & q - > work ) ;
spin_unlock_irq ( & ec - > lock ) ;
2015-08-12 11:12:02 +08:00
2021-11-23 19:46:09 +01:00
return 0 ;
2015-08-12 11:12:02 +08:00
err_exit :
2021-11-23 19:46:09 +01:00
kfree ( q ) ;
2021-11-23 19:38:21 +01:00
2015-01-14 19:28:33 +08:00
return result ;
2009-12-24 11:34:16 +03:00
}
2015-06-11 13:21:32 +08:00
static void acpi_ec_event_handler ( struct work_struct * work )
2009-12-24 11:34:16 +03:00
{
ACPI / EC: Fix issues related to the SCI_EVT handling
This patch fixes 2 issues related to the draining behavior. But it doesn't
implement the draining support, it only cleans up code so that further
draining support is possible.
The draining behavior is expected by some platforms (for example, Samsung)
where SCI_EVT is set only once for a set of events and might be cleared for
the very first QR_EC command issued after SCI_EVT is set. EC firmware on
such platforms will return 0x00 to indicate "no outstanding event". Thus
after seeing an SCI_EVT indication, EC driver need to fetch events until
0x00 returned (see acpi_ec_clear()).
Issue 1 - acpi_ec_submit_query():
It's reported on Samsung laptops that SCI_EVT isn't checked when the
transactions are advanced in ec_poll(), which leads to SCI_EVT triggering
source lost:
If no EC GPE IRQs are arrived after that, EC driver cannot detect this
event and handle it.
See comment 244/247 for kernel bugzilla 44161.
This patch fixes this issue by moving SCI_EVT checks into
advance_transaction(). So that SCI_EVT is checked each time we are going to
handle the EC firmware indications. And this check will happen for both IRQ
context and task context.
Since after doing that, SCI_EVT is also checked after completing a
transaction, ec_check_sci() and ec_check_sci_sync() can be removed.
Issue 2 - acpi_ec_complete_query():
We expect to clear EC_FLAGS_QUERY_PENDING to allow queuing another draining
QR_EC after writing a QR_EC command and before reading the event. After
reading the event, SCI_EVT might be cleared by the firmware, thus it may
not be possible to queue such a draining QR_EC at that time.
But putting the EC_FLAGS_QUERY_PENDING clearing code after
start_transaction() is wrong as there are chances that after
start_transaction(), QR_EC can fail to be sent. If this happens,
EC_FLAG_QUERY_PENDING will be cleared earlier. As a consequence, the
draining QR_EC will also be queued earlier than expected.
This patch also moves this code into advance_transaction() where QR_EC is
just sent (ACPI_EC_COMMAND_POLL flagged) to fix this issue.
Notes:
1. After introducing the 2 SCI_EVT related handlings into
advance_transaction(), a next QR_EC can be queued right after writing
the current QR_EC command and before reading the event. But this still
hasn't implemented the draining behavior as the draining support
requires:
If a previous returned event value isn't 0x00, a draining QR_EC need
to be issued even when SCI_EVT isn't set.
2. In this patch, acpi_os_execute() is also converted into a seperate work
item to avoid invoking kmalloc() in the atomic context. We can do this
because of the previous global lock fix.
3. Originally, EC_FLAGS_EVENT_PENDING is also used to avoid queuing up
multiple work items (created by acpi_os_execute()), this can be covered
by only using a single work item. But this patch still keeps this flag
as there are different usages in the driver initialization steps relying
on this flag.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=44161
Reported-by: Kieran Clancy <clancy.kieran@gmail.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-01-14 19:28:47 +08:00
struct acpi_ec * ec = container_of ( work , struct acpi_ec , work ) ;
2014-10-14 14:24:01 +08:00
2015-06-11 13:21:32 +08:00
ec_dbg_evt ( " Event started " ) ;
2021-11-23 19:40:50 +01:00
spin_lock_irq ( & ec - > lock ) ;
2021-11-23 19:43:05 +01:00
while ( ec - > events_to_process ) {
2021-11-23 19:40:50 +01:00
spin_unlock_irq ( & ec - > lock ) ;
2021-11-23 19:38:21 +01:00
2021-11-23 19:42:02 +01:00
acpi_ec_submit_query ( ec ) ;
2021-11-23 19:38:21 +01:00
2021-11-23 19:40:50 +01:00
spin_lock_irq ( & ec - > lock ) ;
2022-02-04 18:43:14 +01:00
2021-11-23 19:43:05 +01:00
ec - > events_to_process - - ;
2015-06-11 13:21:32 +08:00
}
2021-11-23 19:40:04 +01:00
/*
* Before exit , make sure that the it will be possible to queue up the
* event handling work again regardless of whether or not the query
* queued up above is processed successfully .
*/
2022-02-04 18:43:14 +01:00
if ( ec_event_clearing = = ACPI_EC_EVT_TIMING_EVENT ) {
bool guard_timeout ;
ACPI: EC: Make the event work state machine visible
The EC driver uses a relatively simple state machine for the event
work handling, but it is not really straightforward to figure out.
The states are as follows:
"Ready": The event handling work can be submitted.
In this state, the EC_FLAGS_QUERY_PENDING flag is clear.
"In progress": The event handling work is pending or is being
processed. It cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and both the
events_to_process count is nonzero and the EC_FLAGS_QUERY_GUARDING
flag is clear.
"Complete": The event handling work has been completed, but it still
cannot be submitted again.
In ths state, the EC_FLAGS_QUERY_PENDING flag is set and the
events_to_process count is zero or the EC_FLAGS_QUERY_GUARDING
flag is set.
The state changes from "Ready" to "In progress" when new event is
detected by advance_transaction() and acpi_ec_submit_event() is
called by it.
Next, the state can change from "In progress" directly to "Ready" in
the following situations:
* ec_event_clearing is ACPI_EC_EVT_TIMING_STATUS and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes ACPI_EC_COMMAND_POLL.
* ec_event_clearing is ACPI_EC_EVT_TIMING_QUERY and the state of
an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* ec_event_clearing is either ACPI_EC_EVT_TIMING_STATUS or
ACPI_EC_EVT_TIMING_QUERY and there are no more events to
process (ie. ec->events_to_process becomes 0).
If ec_event_clearing is ACPI_EC_EVT_TIMING_EVENT, however, the
state must change from "In progress" to "Complete" before it
can change to "Ready". The changes from "In progress" to
"Complete" in that case occur in the following situations:
* The state of an ACPI_EC_COMMAND_QUERY transaction becomes
ACPI_EC_COMMAND_COMPLETE.
* There are no more events to process (ie. ec->events_to_process
becomes 0).
Finally, the state changes from "Complete" to "Ready" when
advance_transaction() is invoked when the state is "Complete" and
the state of the current transaction is not ACPI_EC_COMMAND_POLL.
To make this state machine visible in the code, add a new
event_state field to struct acpi_ec and modify the code to use
it istead the EC_FLAGS_QUERY_PENDING and EC_FLAGS_QUERY_GUARDING
flags.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:44:46 +01:00
acpi_ec_complete_event ( ec ) ;
2021-11-23 19:40:04 +01:00
2022-02-04 18:43:14 +01:00
ec_dbg_evt ( " Event stopped " ) ;
2015-06-11 13:21:32 +08:00
2022-02-04 18:43:14 +01:00
spin_unlock_irq ( & ec - > lock ) ;
guard_timeout = ! ! ec_guard ( ec ) ;
2015-06-11 13:21:38 +08:00
2021-11-23 19:40:50 +01:00
spin_lock_irq ( & ec - > lock ) ;
2021-11-23 19:39:05 +01:00
/* Take care of SCI_EVT unless someone else is doing that. */
2022-02-04 18:43:14 +01:00
if ( guard_timeout & & ! ec - > curr )
2021-11-23 19:39:05 +01:00
advance_transaction ( ec , false ) ;
2022-02-04 18:43:14 +01:00
} else {
acpi_ec_close_event ( ec ) ;
2021-11-23 19:39:05 +01:00
2022-02-04 18:43:14 +01:00
ec_dbg_evt ( " Event stopped " ) ;
2021-11-23 19:39:05 +01:00
}
ACPI: EC: Rework flushing of EC work while suspended to idle
The flushing of pending work in the EC driver uses drain_workqueue()
to flush the event handling work that can requeue itself via
advance_transaction(), but this is problematic, because that
work may also be requeued from the query workqueue.
Namely, if an EC transaction is carried out during the execution of
a query handler, it involves calling advance_transaction() which
may queue up the event handling work again. This causes the kernel
to complain about attempts to add a work item to the EC event
workqueue while it is being drained and worst-case it may cause a
valid event to be skipped.
To avoid this problem, introduce two new counters, events_in_progress
and queries_in_progress, incremented when a work item is queued on
the event workqueue or the query workqueue, respectively, and
decremented at the end of the corresponding work function, and make
acpi_ec_dispatch_gpe() the workqueues in a loop until the both of
these counters are zero (or system wakeup is pending) instead of
calling acpi_ec_flush_work().
At the same time, change __acpi_ec_flush_work() to call
flush_workqueue() instead of drain_workqueue() to flush the event
workqueue.
While at it, use the observation that the work item queued in
acpi_ec_query() cannot be pending at that time, because it is used
only once, to simplify the code in there.
Additionally, clean up a comment in acpi_ec_query() and adjust white
space in acpi_ec_event_processor().
Fixes: f0ac20c3f613 ("ACPI: EC: Fix flushing of pending work")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:36:51 +01:00
ec - > events_in_progress - - ;
2022-02-04 18:43:14 +01:00
2021-11-23 19:40:50 +01:00
spin_unlock_irq ( & ec - > lock ) ;
2005-07-23 04:08:00 -04:00
}
2005-04-16 15:20:36 -07:00
2019-10-14 16:56:02 +08:00
static void acpi_ec_handle_interrupt ( struct acpi_ec * ec )
2005-04-16 15:20:36 -07:00
{
2014-06-15 08:41:35 +08:00
unsigned long flags ;
2008-09-25 21:00:31 +04:00
2014-06-15 08:41:35 +08:00
spin_lock_irqsave ( & ec - > lock , flags ) ;
2020-11-13 19:13:17 +01:00
advance_transaction ( ec , true ) ;
2014-06-15 08:42:07 +08:00
spin_unlock_irqrestore ( & ec - > lock , flags ) ;
2019-10-14 16:56:02 +08:00
}
static u32 acpi_ec_gpe_handler ( acpi_handle gpe_device ,
u32 gpe_number , void * data )
{
acpi_ec_handle_interrupt ( data ) ;
ACPI / EC: Fix several GPE handling issues by deploying ACPI_GPE_DISPATCH_RAW_HANDLER mode.
This patch switches EC driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode where
the GPE lock is not held for acpi_ec_gpe_handler() and the ACPICA internal
GPE enabling/disabling/clearing operations are bypassed so that further
improvements are possible with the GPE APIs.
There are 2 strong reasons for deploying raw GPE handler mode in the EC
driver:
1. Some hardware logics can control their interrupts via their own
registers, so their interrupts can be disabled/enabled/acknowledged
without using the super IRQ controller provided functions. While there
is no mean (EC commands) for the EC driver to achieve this.
2. During suspending, the EC driver is still working for a while to
complete the platform firmware provided functionailities using ec_poll()
after all GPEs are disabled (see acpi_ec_block_transactions()), which
means the EC driver will drive the EC GPE out of the GPE core's control.
Without deploying the raw GPE handler mode, we can see many races between
the EC driver and the GPE core due to the above restrictions:
1. There is a race condition due to ACPICA internal GPE
disabling/clearing/enabling logics in acpi_ev_gpe_dispatch():
Orignally EC GPE is disabled (EN=0), cleared (STS=0) before invoking a
GPE handler and re-enabled (EN=1) after invoking a GPE handler in
acpi_ev_gpe_dispatch(). When re-enabling appears, GPE may be flagged
(STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() ec_poll()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1
This race condition is the root cause of different issues on different
silicon variations.
A. Silicon variation A:
On some platforms, GPE will be triggered due to "writing 1 to EN when
STS=1". This is because both EN and STS lines are wired to the GPE
trigger line.
1. Issue 1:
We can see no-op acpi_ec_gpe_handler() invoked on such platforms.
This is because:
a. event pending B: An event can arrive after ACPICA's GPE
clearing performed in acpi_ev_gpe_dispatch(), this event may
fail to be detected by EC_SC read that is performed before its
arrival;
b. event handling B: The event can be handled in ec_poll() because
EC lock is released after acpi_ec_gpe_handler() invocation;
c. There is no code in ec_poll() to clear STS but the GPE can
still be triggered by the EN=1 write performed in
acpi_ev_finish_gpe(), this leads to a no-op EC GPE handler
invocation;
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 1:
If we removed GPE disabling/enabling code from
acpi_ev_gpe_dispatch(), we could still see no-op GPE handlers
triggered by the event arriving after the GPE clearing and before
the GPE handling on both silicon variation A and B. This can only
occur if the CPU is very slow (timing slice between STS=0 write
and EC_SC read should be short enough before hardware sets another
GPE indication). Thus this is very rare and is not what we need to
fix.
B. Silicon variation B:
On other platforms, GPE may not be triggered due to "writing 1 to EN
when STS=1". This is because only STS line is wired to the GPE
trigger line.
2. Issue 2:
We can see GPE loss on such platforms. This is because:
a. event pending B vs. event handling A: An event can arrive after
ACPICA's GPE handling performed in acpi_ev_gpe_dispatch(), or
event pending C vs. event handling B: An event can arrive after
Linux's GPE handling performed in ec_poll(),
these events may fail to be detected by EC_SC read that is
performed before their arrival;
b. The GPE cannot be triggered by EN=1 write performed in
acpi_ev_finish_gpe();
c. If no polling mechanism is implemented in the driver for the
pending event (for example, SCI_EVT), this event is lost due to
no GPE being triggered.
Note 2:
On most platforms, there might be another rule that GPE may not be
triggered due to "writing 1 to STS when STS=1 and EN=1".
Then on silicon variation B, an even worse case is if the issue 2
event loss happens, further events may never trigger GPE again on
such platforms due to being blocked by the current STS=1. Unless
someone clears STS, all events have to be polled.
2. There is a race condition due to lacking in GPE status checking in EC
driver:
Originally, GPE status is checked in ACPICA core but not checked in
the GPE handler. Thus since the status checking and handling is not
locked, it can be interrupted by another handling path.
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_detect() ec_poll()
if (EN==1 && STS==1)
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
acpi_ev_gpe_dispatch()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
EC_SC read
Unlock(EC)
*****************************************************************
3. Issue 3:
We can see no-op acpi_ec_gpe_handler() invoked on both silicon
variation A and B. This is because:
a. event pending A: An event can arrive to trigger an EC GPE and
ACPICA checks it and is about to invoke the EC GPE handler;
b. event handling A: The event can be handled in ec_poll() because
EC lock is not held after the GPE status checking;
c. event handling B: Then when the EC GPE handler is invoked, it
becomes a no-op GPE handler invocation.
d. As no-op GPE handler invocations are counted by the EC driver
to trigger the command storming conditions, the wrong no-op
GPE handler invocations thus can easily trigger wrong command
storming conditions.
Note 3:
This no-op GPE handler invocation is rare because the time between
the IRQ arrival and the acpi_ec_gpe_handler() invocation is less than
the timeout value waited in ec_poll(). So most of the no-op GPE
handler invocations are caused by the reason described in issue 1.
3. There is a race condition due to ACPICA internal GPE clearing logic in
acpi_enable_gpe():
During runtime, acpi_enable_gpe() can be invoked by the EC storming
prevention code. When it is invoked, GPE may be flagged (STS=1).
=================================================================
(event pending A)
=================================================================
acpi_ev_gpe_dispatch() acpi_ec_transaction()
EN=0
STS=0
acpi_ec_gpe_handler()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
EC_SC read
EC_SC handled
Unlock(EC)
*****************************************************************
EN=1 ?
Lock(EC)
Unlock(EC)
=================================================================
(event pending B)
=================================================================
acpi_enable_gpe()
STS=0
EN=1
4. Issue 4:
We can see GPE loss on both silicon variation A and B platforms.
This is because:
a. event pending B: An event can arrive right before ACPICA's GPE
clearing performed in acpi_enable_gpe();
b. If the GPE is cleared when GPE is disabled, then EN=1 write in
acpi_enable_gpe() cannot trigger this GPE;
c. If no polling mechanism is implemented in the driver for this
event (for example, SCI_EVT), this event is lost due to no GPE
being triggered.
Note 4:
Currently we don't have this issue, but after we switch the EC
driver into ACPI_GPE_DISPATCH_RAW_HANDLER mode, we need to take care
of handling this because the EN=1 write in acpi_ev_gpe_dispatch()
will be abandoned.
There might be more race issues for the current GPE handler usages. This is
because the EC IRQ's enabling/disabling/checking/clearing/handling
operations should be locked by a single lock that is under the EC driver's
control to achieve the serialization. Which means we need to invoke GPE
APIs with EC driver's lock held and all ACPICA internal GPE operations
related to the GPE handler should be abandoned. Invoking GPE APIs inside of
the EC driver lock and bypassing ACPICA internal GPE operations requires
the ACPI_GPE_DISPATCH_RAW_HANDLER mode where the same lock used by the APIs
are released prior than invoking the handlers. Otherwise, we can see dead
locks due to circular locking dependencies (see Reference below).
This patch then switches the EC driver into the
ACPI_GPE_DISPATCH_RAW_HANDLER mode so that it can perform correct GPE
operations using the GPE APIs:
1. Bypasses EN modifications performed in acpi_ev_gpe_dispatch() by
using acpi_install_gpe_raw_handler() and invoking all GPE APIs with EC
spin lock held. This can fix issue 1 as it makes a non frequent GPE
enabling/disabling environment.
2. Bypasses STS clearing performed in acpi_enable_gpe() by replacing
acpi_enable_gpe()/acpi_disable_gpe() with acpi_set_gpe(). This can fix
issue 4. And this can also help to fix issue 1 as it makes a no sudden
GPE clearing environment when GPE is frequently enabled/disabled.
3. Ensures STS acknowledged before handling by invoking acpi_clear_gpe()
in advance_transaction(). This can finally fix issue 1 even in a
frequent GPE enabling/disabling environment. And this can also finally
fix issue 3 when issue 2 is fixed.
Note 3:
GPE clearing is edge triggered W1C, which means we can clear it right
before handling it. Since all EC GPE indications are handled in
advance_transaction() by previous commits, we can now move GPE clearing
into it to implement the correct GPE clearing.
Note 4:
We can use acpi_set_gpe() which is not shared GPE safer instead of
acpi_enable_gpe()/acpi_disable_gpe() because EC GPE is not shared by
other hardware, which is mentioned in the ACPI specification 5.0, 12.6
Interrupt Model: "OSPM driver treats this as an edge event (the EC SCI
cannot be shared)". So we can stop using shared GPE safer APIs
acpi_enable_gpe()/acpi_disable_gpe() in the EC driver. Otherwise
cleanups need to be made in acpi_ev_enable_gpe() to bypass the GPE
clearing logic before keeping acpi_enable_gpe().
This patch also invokes advance_transaction() when GPE is re-enabled in the
task context which:
1. Ensures EN=1 can trigger GPE by checking and handling EC status register
right after EN=1 writes. This can fix issue 2.
After applying this patch, without frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() ec_poll()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 1 (event pending B) can arrive as a next GPE
due to the previous IRQ context STS=0 write. And if it is handled by
ec_poll() (event handling B), as it is also acknowledged by ec_poll(), the
event pending for issue 2 (event pending C) can properly arrive as a next
GPE after the task context STS=0 write. So no GPE will be lost and never
triggered due to GPE clearing performed in the wrong position. And since
all GPE handling is performed after a locked GPE status checking, we can
hardly see no-op GPE handler invocations due to issue 1 and 3. We may still
see no-op GPE handler invocations due to "Note 1", but as it is inevitable,
it needn't be fixed.
After applying this patch, with frequent GPE enablings considered:
=================================================================
(event pending A)
=================================================================
acpi_ec_gpe_handler() acpi_ec_transaction()
*****************************************************************
(event handling A)
Lock(EC)
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending B)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
*****************************************************************
(event handling B)
Lock(EC)
EN=1
if STS==1
advance_transaction()
if STS==1
STS=0
EC_SC read
=================================================================
(event pending C)
=================================================================
EC_SC handled
Unlock(EC)
*****************************************************************
The event pending for issue 2 can be manually handled by
advance_transaction(). And after the STS=0 write performed in the manual
triggered advance_transaction(), GPE can always arrive. So no GPE will be
lost due to frequent GPE disabling/enabling performed in the driver like
issue 4.
Note 5:
It's ideally when EN=1 write occurred, an IRQ thread should be woken up to
handle the GPE when the GPE was raised. But this requires the IRQ thread to
contain the poller code for all EC GPE indications, while currently some of
the indications are handled in the user tasks. It then is very hard for the
code to determine whether a user task should be invoked or the poller work
item should be scheduled. So we have to invoke advance_transaction()
directly now and it leaves us such a restriction for the GPE re-enabling:
it must be performed in the task context to avoid starving the GPEs.
As a conclusion: we can see the EC GPE is always handled in serial after
deploying the raw GPE handler mode:
Lock(EC)
if (STS==1)
STS=0
EC_SC read
EC_SC handled
Unlock(EC)
The EC driver specific lock is responsible to make the EC GPE handling
processes serialized so that EC can handle its GPE from both IRQ and task
contexts and the next IRQ can be ensured to arrive after this process.
Note 6:
We have many EC_FLAGS_MSI qurik users in the current driver. They all seem
to be suffering from unexpected GPE triggering source lost. And they are
false root caused to a timing issue. Since EC communication protocol has
already flow control defined, timing shouldn't be the root cause, while
this fix might be fixing the root cause of the old bugs.
Link: https://lkml.org/lkml/2014/11/4/974
Link: https://lkml.org/lkml/2014/11/18/316
Link: https://www.spinics.net/lists/linux-acpi/msg54340.html
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-02-05 16:27:22 +08:00
return ACPI_INTERRUPT_HANDLED ;
2008-03-21 17:07:03 +03:00
}
2019-10-14 16:56:02 +08:00
static irqreturn_t acpi_ec_irq_handler ( int irq , void * data )
{
acpi_ec_handle_interrupt ( data ) ;
return IRQ_HANDLED ;
}
2005-04-16 15:20:36 -07:00
/* --------------------------------------------------------------------------
2014-10-14 14:24:01 +08:00
* Address Space Management
* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
2005-04-16 15:20:36 -07:00
static acpi_status
2007-05-29 16:42:52 +04:00
acpi_ec_space_handler ( u32 function , acpi_physical_address address ,
2010-03-17 13:14:13 -04:00
u32 bits , u64 * value64 ,
2005-08-11 17:32:05 -04:00
void * handler_context , void * region_context )
2005-04-16 15:20:36 -07:00
{
2007-03-07 22:28:00 +03:00
struct acpi_ec * ec = handler_context ;
2010-03-17 13:14:13 -04:00
int result = 0 , i , bytes = bits / 8 ;
u8 * value = ( u8 * ) value64 ;
2005-04-16 15:20:36 -07:00
if ( ( address > 0xFF ) | | ! value | | ! handler_context )
2006-06-27 00:41:40 -04:00
return AE_BAD_PARAMETER ;
2005-04-16 15:20:36 -07:00
2007-05-29 16:42:52 +04:00
if ( function ! = ACPI_READ & & function ! = ACPI_WRITE )
2006-06-27 00:41:40 -04:00
return AE_BAD_PARAMETER ;
2005-04-16 15:20:36 -07:00
2017-01-20 16:42:48 +08:00
if ( ec - > busy_polling | | bits > 8 )
2009-08-28 23:29:44 +04:00
acpi_ec_burst_enable ( ec ) ;
2008-01-11 02:42:57 +03:00
2010-03-17 13:14:13 -04:00
for ( i = 0 ; i < bytes ; + + i , + + address , + + value )
result = ( function = = ACPI_READ ) ?
acpi_ec_read ( ec , address , value ) :
acpi_ec_write ( ec , address , * value ) ;
2005-04-16 15:20:36 -07:00
2017-01-20 16:42:48 +08:00
if ( ec - > busy_polling | | bits > 8 )
2009-08-28 23:29:44 +04:00
acpi_ec_burst_disable ( ec ) ;
2008-01-11 02:42:57 +03:00
2005-04-16 15:20:36 -07:00
switch ( result ) {
case - EINVAL :
2006-06-27 00:41:40 -04:00
return AE_BAD_PARAMETER ;
2005-04-16 15:20:36 -07:00
case - ENODEV :
2006-06-27 00:41:40 -04:00
return AE_NOT_FOUND ;
2005-04-16 15:20:36 -07:00
case - ETIME :
2006-06-27 00:41:40 -04:00
return AE_TIME ;
2005-04-16 15:20:36 -07:00
default :
2006-06-27 00:41:40 -04:00
return AE_OK ;
2005-04-16 15:20:36 -07:00
}
}
/* --------------------------------------------------------------------------
2014-10-14 14:24:01 +08:00
* Driver Interface
* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */
2007-03-07 22:28:00 +03:00
static acpi_status
ec_parse_io_ports ( struct acpi_resource * resource , void * context ) ;
2016-09-07 16:50:08 +08:00
static void acpi_ec_free ( struct acpi_ec * ec )
{
if ( first_ec = = ec )
first_ec = NULL ;
if ( boot_ec = = ec )
boot_ec = NULL ;
kfree ( ec ) ;
}
static struct acpi_ec * acpi_ec_alloc ( void )
2007-03-07 22:28:00 +03:00
{
struct acpi_ec * ec = kzalloc ( sizeof ( struct acpi_ec ) , GFP_KERNEL ) ;
2014-10-14 14:24:01 +08:00
2007-03-07 22:28:00 +03:00
if ( ! ec )
return NULL ;
2012-10-23 01:29:27 +02:00
mutex_init ( & ec - > mutex ) ;
2007-03-07 22:28:00 +03:00
init_waitqueue_head ( & ec - > wait ) ;
2007-05-29 16:43:02 +04:00
INIT_LIST_HEAD ( & ec - > list ) ;
2012-10-23 01:29:27 +02:00
spin_lock_init ( & ec - > lock ) ;
2015-06-11 13:21:32 +08:00
INIT_WORK ( & ec - > work , acpi_ec_event_handler ) ;
ACPI / EC: Fix and clean up register access guarding logics.
In the polling mode, EC driver shouldn't access the EC registers too
frequently. Though this statement is concluded from the non-root caused
bugs (see links below), we've maintained the register access guarding
logics in the current EC driver. The guarding logics can be found here and
there, makes it hard to root cause real timing issues. This patch collects
the guarding logics into one single function so that all hidden logics
related to this can be seen clearly.
The current guarding related code also has several issues:
1. Per-transaction timestamp prevents inter-transaction guarding from being
implemented in the same place. We have an inter-transaction udelay() in
acpi_ec_transaction_unblocked(), this logic can be merged into ec_poll()
if we can use per-device timestamp. This patch completes such merge to
form a new ec_guard() function and collects all guarding related hidden
logics in it.
One hidden logic is: there is no inter-transaction guarding performed
for non MSI quirk (wait polling mode), this patch skips
inter-transaction guarding before wait_event_timeout() for the wait
polling mode to reveal the hidden logic.
The other hidden logic is: there is msleep() inter-transaction guarding
performed when the GPE storming is observed. As after merging this
commit:
Commit: e1d4d90fc0313d3d58cbd7912c90f8ef24df45ff
Subject: ACPI / EC: Refine command storm prevention support
EC_FLAGS_COMMAND_STORM is ensured to be cleared after invoking
acpi_ec_transaction_unlocked(), the msleep() guard logic will never
happen now. Since no one complains such change, this logic is likely
added during the old times where the EC race issues are not fixed and
the bugs are false root-caused to the timing issue. This patch simply
removes the out-dated logic. We can restore it by stop skipping
inter-transaction guarding for wait polling mode.
Two different delay values are defined for msleep() and udelay() while
they are merged in this patch to 550us.
2. time_after() causes additional delay in the polling mode (can only be
observed in noirq suspend/resume processes where polling mode is always
used) before advance_transaction() is invoked ("wait polling" log is
added before wait_event_timeout()). We can see 2 wait_event_timeout()
invocations. This is because time_after() ensures a ">" validation while
we only need a ">=" validation here:
[ 86.739909] ACPI: Waking up from system sleep state S3
[ 86.742857] ACPI : EC: 2: Increase command
[ 86.742859] ACPI : EC: ***** Command(RD_EC) started *****
[ 86.742861] ACPI : EC: ===== TASK (0) =====
[ 86.742871] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.742873] ACPI : EC: EC_SC(W) = 0x80
[ 86.742876] ACPI : EC: ***** Event started *****
[ 86.742880] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.743972] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.747966] ACPI : EC: ===== TASK (0) =====
[ 86.747977] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 86.747978] ACPI : EC: EC_DATA(W) = 0x06
[ 86.747981] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.751971] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755969] ACPI : EC: ===== TASK (0) =====
[ 86.755991] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 86.755993] ACPI : EC: EC_DATA(R) = 0x03
[ 86.755994] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 86.755995] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 86.755996] ACPI : EC: 1: Decrease command
This patch corrects this by using time_before() instead in ec_guard():
[ 54.283146] ACPI: Waking up from system sleep state S3
[ 54.285414] ACPI : EC: 2: Increase command
[ 54.285415] ACPI : EC: ***** Command(RD_EC) started *****
[ 54.285416] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.285417] ACPI : EC: ===== TASK (0) =====
[ 54.285424] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.285425] ACPI : EC: EC_SC(W) = 0x80
[ 54.285427] ACPI : EC: ***** Event started *****
[ 54.285429] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.287209] ACPI : EC: ===== TASK (0) =====
[ 54.287218] ACPI : EC: EC_SC(R) = 0x20 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=0
[ 54.287219] ACPI : EC: EC_DATA(W) = 0x06
[ 54.287222] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291190] ACPI : EC: ===== TASK (0) =====
[ 54.291210] ACPI : EC: EC_SC(R) = 0x21 SCI_EVT=1 BURST=0 CMD=0 IBF=0 OBF=1
[ 54.291213] ACPI : EC: EC_DATA(R) = 0x03
[ 54.291214] ACPI : EC: ~~~~~ wait polling ~~~~~
[ 54.291215] ACPI : EC: ***** Command(RD_EC) stopped *****
[ 54.291216] ACPI : EC: 1: Decrease command
After cleaning up all guarding logics, we have one single function
ec_guard() collecting all old, non-root-caused, hidden logics. Then we can
easily tune the logics in one place to respond to the bug reports.
Except the time_before() change, all other changes do not change the
behavior of the EC driver.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=12011
Link: https://bugzilla.kernel.org/show_bug.cgi?id=20242
Link: https://bugzilla.kernel.org/show_bug.cgi?id=77431
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-05-15 14:16:42 +08:00
ec - > timestamp = jiffies ;
2017-01-20 16:42:48 +08:00
ec - > busy_polling = true ;
ec - > polling_guard = 0 ;
2019-10-14 16:56:02 +08:00
ec - > gpe = - 1 ;
ec - > irq = - 1 ;
2007-03-07 22:28:00 +03:00
return ec ;
}
2007-05-29 16:43:02 +04:00
2007-08-14 01:03:42 -04:00
static acpi_status
acpi_ec_register_query_methods ( acpi_handle handle , u32 level ,
void * context , void * * return_value )
{
2008-12-16 16:46:12 +08:00
char node_name [ 5 ] ;
struct acpi_buffer buffer = { sizeof ( node_name ) , node_name } ;
2007-08-14 01:03:42 -04:00
struct acpi_ec * ec = context ;
int value = 0 ;
2008-12-16 16:46:12 +08:00
acpi_status status ;
status = acpi_get_name ( handle , ACPI_SINGLE_NAME , & buffer ) ;
2014-10-14 14:24:01 +08:00
if ( ACPI_SUCCESS ( status ) & & sscanf ( node_name , " _Q%x " , & value ) = = 1 )
2007-08-14 01:03:42 -04:00
acpi_ec_add_query_handler ( ec , value , handle , NULL , NULL ) ;
return AE_OK ;
}
2007-08-03 17:52:48 -04:00
static acpi_status
ec_parse_device ( acpi_handle handle , u32 Level , void * context , void * * retval )
2007-05-29 16:43:02 +04:00
{
2007-08-03 17:52:48 -04:00
acpi_status status ;
2008-11-03 14:26:40 -05:00
unsigned long long tmp = 0 ;
2007-08-03 17:52:48 -04:00
struct acpi_ec * ec = context ;
2009-04-01 01:33:15 -04:00
/* clear addr values, ec_parse_io_ports depend on it */
ec - > command_addr = ec - > data_addr = 0 ;
2007-08-03 17:52:48 -04:00
status = acpi_walk_resources ( handle , METHOD_NAME__CRS ,
ec_parse_io_ports , ec ) ;
if ( ACPI_FAILURE ( status ) )
return status ;
2017-06-15 09:41:35 +08:00
if ( ec - > data_addr = = 0 | | ec - > command_addr = = 0 )
return AE_OK ;
2007-05-29 16:43:02 +04:00
ACPI: EC: Drop the EC_FLAGS_IGNORE_DSDT_GPE quirk
It seems that these quirks are no longer necessary since
commit 69b957c26b32 ("ACPI: EC: Fix possible issues related to EC
initialization order"), which has fixed this in a generic manner.
There are 3 commits adding DMI entries with this quirk (adding multiple
DMI entries per commit). 2/3 commits are from before the generic fix.
Which leaves commit 6306f0431914 ("ACPI: EC: Make more Asus laptops
use ECDT _GPE"), which was committed way after the generic fix.
But this was just due to slow upstreaming of it. This commit stems
from Endless from 15 Aug 2017 (committed upstream 20 May 2021):
https://github.com/endlessm/linux/pull/288
The current code should work fine without this:
1. The EC_FLAGS_IGNORE_DSDT_GPE flag is only checked in ec_parse_device(),
like this:
if (boot_ec && boot_ec_is_ecdt && EC_FLAGS_IGNORE_DSDT_GPE) {
ec->gpe = boot_ec->gpe;
} else {
/* parse GPE */
}
2. ec_parse_device() is only called from acpi_ec_add() and
acpi_ec_dsdt_probe()
3. acpi_ec_dsdt_probe() starts with:
if (boot_ec)
return;
so it only calls ec_parse_device() when boot_ec == NULL, meaning that
the quirk never triggers for this call. So only the call in
acpi_ec_add() matters.
4. acpi_ec_add() does the following after the ec_parse_device() call:
if (boot_ec && ec->command_addr == boot_ec->command_addr &&
ec->data_addr == boot_ec->data_addr &&
!EC_FLAGS_TRUST_DSDT_GPE) {
/*
* Trust PNP0C09 namespace location rather than
* ECDT ID. But trust ECDT GPE rather than _GPE
* because of ASUS quirks, so do not change
* boot_ec->gpe to ec->gpe.
*/
boot_ec->handle = ec->handle;
acpi_handle_debug(ec->handle, "duplicated.\n");
acpi_ec_free(ec);
ec = boot_ec;
}
The quirk only matters if boot_ec != NULL and EC_FLAGS_TRUST_DSDT_GPE
is never set at the same time as EC_FLAGS_IGNORE_DSDT_GPE.
That means that if the addresses match we always enter this if block and
then only the ec->handle part of the data stored in ec by ec_parse_device()
is used and the rest is thrown away, after which ec is made to point
to boot_ec, at which point ec->gpe == boot_ec->gpe, so the same result
as with the quirk set, independent of the value of the quirk.
Also note the comment in this block which indicates that the gpe result
from ec_parse_device() is deliberately not taken to deal with buggy
Asus laptops and all DMI quirks setting EC_FLAGS_IGNORE_DSDT_GPE are for
Asus laptops.
Based on the above I believe that unless on some quirked laptops
the ECDT and DSDT EC addresses do not match we can drop the quirk.
I've checked dmesg output to ensure the ECDT and DSDT EC addresses match
for quirked models using https://linux-hardware.org hw-probe reports.
I've been able to confirm that the addresses match for the following
models this way: GL702VMK, X505BA, X505BP, X550VXK, X580VD.
Whereas for the following models I could find any dmesg output:
FX502VD, FX502VE, X542BA, X542BP.
Note the models without dmesg all were submitted in patches with a batch
of models and other models from the same batch checkout ok.
This, combined with that all the code adding the quirks was written before
the generic fix makes me believe that it is safe to remove this quirk now.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Daniel Drake <drake@endlessos.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-06-20 11:25:44 +02:00
/* Get GPE bit assignment (EC events). */
/* TODO: Add support for _GPE returning a package */
status = acpi_evaluate_integer ( handle , " _GPE " , NULL , & tmp ) ;
if ( ACPI_SUCCESS ( status ) )
ec - > gpe = tmp ;
/*
* Errors are non - fatal , allowing for ACPI Reduced Hardware
* platforms which use GpioInt instead of GPE .
*/
2019-10-14 16:56:02 +08:00
2007-05-29 16:43:02 +04:00
/* Use the global lock for all EC transactions? */
2008-11-03 14:26:40 -05:00
tmp = 0 ;
2008-10-10 02:22:59 -04:00
acpi_evaluate_integer ( handle , " _GLK " , NULL , & tmp ) ;
ec - > global_lock = tmp ;
2007-05-29 16:43:02 +04:00
ec - > handle = handle ;
2007-08-03 17:52:48 -04:00
return AE_CTRL_TERMINATE ;
2007-05-29 16:43:02 +04:00
}
2020-03-03 20:15:04 +01:00
static bool install_gpe_event_handler ( struct acpi_ec * ec )
2019-10-14 16:56:02 +08:00
{
2020-03-03 20:15:04 +01:00
acpi_status status ;
2019-10-14 16:56:02 +08:00
2020-03-03 20:15:04 +01:00
status = acpi_install_gpe_raw_handler ( NULL , ec - > gpe ,
ACPI_GPE_EDGE_TRIGGERED ,
& acpi_ec_gpe_handler , ec ) ;
if ( ACPI_FAILURE ( status ) )
return false ;
2019-10-14 16:56:02 +08:00
2020-03-03 20:15:04 +01:00
if ( test_bit ( EC_FLAGS_STARTED , & ec - > flags ) & & ec - > reference_count > = 1 )
acpi_ec_enable_gpe ( ec , true ) ;
2019-10-14 16:56:02 +08:00
2020-03-03 20:15:04 +01:00
return true ;
}
2019-10-14 16:56:02 +08:00
2020-03-03 20:15:04 +01:00
static bool install_gpio_irq_event_handler ( struct acpi_ec * ec )
{
return request_irq ( ec - > irq , acpi_ec_irq_handler , IRQF_SHARED ,
" ACPI EC " , ec ) > = 0 ;
2019-10-14 16:56:02 +08:00
}
2020-03-03 20:15:04 +01:00
/**
* ec_install_handlers - Install service callbacks and register query methods .
* @ ec : Target EC .
* @ device : ACPI device object corresponding to @ ec .
2022-12-08 15:23:35 +01:00
* @ call_reg : If _REG should be called to notify OpRegion availability
2020-03-03 20:15:04 +01:00
*
* Install a handler for the EC address space type unless it has been installed
* already . If @ device is not NULL , also look for EC query methods in the
* namespace and register them , and install an event ( either GPE or GPIO IRQ )
* handler for the EC , if possible .
*
* Return :
* - ENODEV if the address space handler cannot be installed , which means
* " unable to handle transactions " ,
* - EPROBE_DEFER if GPIO IRQ acquisition needs to be deferred ,
* or 0 ( success ) otherwise .
2016-09-07 16:50:08 +08:00
*/
2022-12-08 15:23:35 +01:00
static int ec_install_handlers ( struct acpi_ec * ec , struct acpi_device * device ,
bool call_reg )
2009-06-22 20:41:30 +00:00
{
acpi_status status ;
2014-10-14 14:24:01 +08:00
2015-02-06 08:57:52 +08:00
acpi_ec_start ( ec , false ) ;
2016-03-24 10:42:47 +08:00
if ( ! test_bit ( EC_FLAGS_EC_HANDLER_INSTALLED , & ec - > flags ) ) {
2017-01-20 16:42:48 +08:00
acpi_ec_enter_noirq ( ec ) ;
2022-12-08 15:23:35 +01:00
status = acpi_install_address_space_handler_no_reg ( ec - > handle ,
ACPI_ADR_SPACE_EC ,
& acpi_ec_space_handler ,
NULL , ec ) ;
2016-03-24 10:42:47 +08:00
if ( ACPI_FAILURE ( status ) ) {
2020-02-27 22:56:28 +01:00
acpi_ec_stop ( ec , false ) ;
return - ENODEV ;
2016-03-24 10:42:47 +08:00
}
set_bit ( EC_FLAGS_EC_HANDLER_INSTALLED , & ec - > flags ) ;
2022-12-08 15:23:34 +01:00
ec - > address_space_handler_holder = ec - > handle ;
2016-03-24 10:42:47 +08:00
}
2022-12-08 15:23:35 +01:00
if ( call_reg & & ! test_bit ( EC_FLAGS_EC_REG_CALLED , & ec - > flags ) ) {
acpi_execute_reg_methods ( ec - > handle , ACPI_ADR_SPACE_EC ) ;
set_bit ( EC_FLAGS_EC_REG_CALLED , & ec - > flags ) ;
2016-03-24 10:42:47 +08:00
}
2020-02-27 22:55:22 +01:00
if ( ! device )
2016-09-07 16:50:21 +08:00
return 0 ;
2020-03-03 20:15:04 +01:00
if ( ec - > gpe < 0 ) {
/* ACPI reduced hardware platforms use a GpioInt from _CRS. */
int irq = acpi_dev_gpio_irq_get ( device , 0 ) ;
/*
* Bail out right away for deferred probing or complete the
* initialization regardless of any other errors .
*/
if ( irq = = - EPROBE_DEFER )
return - EPROBE_DEFER ;
else if ( irq > = 0 )
ec - > irq = irq ;
}
2019-10-14 16:56:01 +08:00
if ( ! test_bit ( EC_FLAGS_QUERY_METHODS_INSTALLED , & ec - > flags ) ) {
2016-09-07 16:50:21 +08:00
/* Find and register all query methods */
acpi_walk_namespace ( ACPI_TYPE_METHOD , ec - > handle , 1 ,
acpi_ec_register_query_methods ,
NULL , ec , NULL ) ;
2019-10-14 16:56:01 +08:00
set_bit ( EC_FLAGS_QUERY_METHODS_INSTALLED , & ec - > flags ) ;
2016-09-07 16:50:21 +08:00
}
2019-10-14 16:56:01 +08:00
if ( ! test_bit ( EC_FLAGS_EVENT_HANDLER_INSTALLED , & ec - > flags ) ) {
2020-03-03 20:15:04 +01:00
bool ready = false ;
if ( ec - > gpe > = 0 )
ready = install_gpe_event_handler ( ec ) ;
else if ( ec - > irq > = 0 )
ready = install_gpio_irq_event_handler ( ec ) ;
if ( ready ) {
set_bit ( EC_FLAGS_EVENT_HANDLER_INSTALLED , & ec - > flags ) ;
acpi_ec_leave_noirq ( ec ) ;
2009-06-22 20:41:30 +00:00
}
2020-03-03 20:15:04 +01:00
/*
* Failures to install an event handler are not fatal , because
* the EC can be polled for events .
*/
2009-06-22 20:41:30 +00:00
}
2017-09-26 16:54:03 +08:00
/* EC is fully operational, allow queries */
acpi_ec_enable_event ( ec ) ;
2009-06-22 20:41:30 +00:00
return 0 ;
}
2007-09-05 19:56:38 -04:00
static void ec_remove_handlers ( struct acpi_ec * ec )
{
2016-03-24 10:42:47 +08:00
if ( test_bit ( EC_FLAGS_EC_HANDLER_INSTALLED , & ec - > flags ) ) {
2022-12-08 15:23:34 +01:00
if ( ACPI_FAILURE ( acpi_remove_address_space_handler (
ec - > address_space_handler_holder ,
2016-03-24 10:42:47 +08:00
ACPI_ADR_SPACE_EC , & acpi_ec_space_handler ) ) )
pr_err ( " failed to remove space handler \n " ) ;
clear_bit ( EC_FLAGS_EC_HANDLER_INSTALLED , & ec - > flags ) ;
}
2016-07-08 09:25:05 +08:00
/*
* Stops handling the EC transactions after removing the operation
* region handler . This is required because _REG ( DISCONNECT )
* invoked during the removal can result in new EC transactions .
*
* Flushes the EC requests and thus disables the GPE before
* removing the GPE handler . This is required by the current ACPICA
* GPE core . ACPICA GPE core will automatically disable a GPE when
* it is indicated but there is no way to handle it . So the drivers
* must disable the GPEs prior to removing the GPE handlers .
*/
acpi_ec_stop ( ec , false ) ;
2019-10-14 16:56:01 +08:00
if ( test_bit ( EC_FLAGS_EVENT_HANDLER_INSTALLED , & ec - > flags ) ) {
2019-10-14 16:56:02 +08:00
if ( ec - > gpe > = 0 & &
ACPI_FAILURE ( acpi_remove_gpe_handler ( NULL , ec - > gpe ,
& acpi_ec_gpe_handler ) ) )
2016-03-24 10:42:47 +08:00
pr_err ( " failed to remove gpe handler \n " ) ;
2019-10-14 16:56:02 +08:00
if ( ec - > irq > = 0 )
free_irq ( ec - > irq , ec ) ;
2019-10-14 16:56:01 +08:00
clear_bit ( EC_FLAGS_EVENT_HANDLER_INSTALLED , & ec - > flags ) ;
2016-03-24 10:42:47 +08:00
}
2019-10-14 16:56:01 +08:00
if ( test_bit ( EC_FLAGS_QUERY_METHODS_INSTALLED , & ec - > flags ) ) {
2016-09-07 16:50:21 +08:00
acpi_ec_remove_query_handlers ( ec , true , 0 ) ;
2019-10-14 16:56:01 +08:00
clear_bit ( EC_FLAGS_QUERY_METHODS_INSTALLED , & ec - > flags ) ;
2016-09-07 16:50:21 +08:00
}
2007-09-05 19:56:38 -04:00
}
2022-12-08 15:23:35 +01:00
static int acpi_ec_setup ( struct acpi_ec * ec , struct acpi_device * device , bool call_reg )
2005-04-16 15:20:36 -07:00
{
2016-09-07 16:50:08 +08:00
int ret ;
2007-03-07 22:28:00 +03:00
2022-12-08 15:23:35 +01:00
ret = ec_install_handlers ( ec , device , call_reg ) ;
2016-09-07 16:50:08 +08:00
if ( ret )
return ret ;
/* First EC capable of handling transactions */
2020-02-27 22:51:36 +01:00
if ( ! first_ec )
2016-09-07 16:50:08 +08:00
first_ec = ec ;
2020-02-27 22:51:36 +01:00
pr_info ( " EC_CMD/EC_SC=0x%lx, EC_DATA=0x%lx \n " , ec - > command_addr ,
ec - > data_addr ) ;
2016-09-07 16:50:27 +08:00
2020-02-27 22:51:36 +01:00
if ( test_bit ( EC_FLAGS_EVENT_HANDLER_INSTALLED , & ec - > flags ) ) {
if ( ec - > gpe > = 0 )
pr_info ( " GPE=0x%x \n " , ec - > gpe ) ;
else
pr_info ( " IRQ=%d \n " , ec - > irq ) ;
2009-04-01 01:33:15 -04:00
}
2016-09-07 16:50:27 +08:00
2016-09-07 16:50:08 +08:00
return ret ;
2016-09-07 16:50:27 +08:00
}
2016-06-21 17:42:05 +08:00
static int acpi_ec_add ( struct acpi_device * device )
{
ACPI: EC: Simplify acpi_ec_add()
First, notice that if the device ID in acpi_ec_add() is equal to
ACPI_ECDT_HID, boot_ec_is_ecdt must be set, because this means
that the device object passed to acpi_ec_add() comes from
acpi_ec_ecdt_start() which fails if boot_ec_is_ecdt is unset.
Accordingly, boot_ec_is_ecdt need not be set again in that case,
so drop that redundant update of it from the code.
Next, ec->handle must be a valid ACPI handle right before
returning 0 from acpi_ec_add(), because it either is the handle
of the device object passed to that function, or it is the boot EC
handle coming from acpi_ec_ecdt_start() which fails if it cannot
find a valid handle for the boot EC. Moreover, the object with
that handle is regarded as a valid representation of the EC in all
cases, so there is no reason to avoid the _DEP list update walk if
that handle is the boot EC handle. Accordingly, drop the dep_update
local variable from acpi_ec_add() and call acpi_walk_dep_device_list()
for ec->handle unconditionally before returning 0 from it.
Finally, the ec local variable in acpi_ec_add() need not be
initialized to NULL and the status local variable declaration
can be moved to the block in which it is used, so change the code
in accordance with these observations.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-03-02 11:25:13 +01:00
struct acpi_ec * ec ;
2019-02-01 12:56:03 +01:00
int ret ;
2016-06-21 17:42:05 +08:00
strcpy ( acpi_device_name ( device ) , ACPI_EC_DEVICE_NAME ) ;
strcpy ( acpi_device_class ( device ) , ACPI_EC_CLASS ) ;
2020-04-06 17:40:27 +02:00
if ( boot_ec & & ( boot_ec - > handle = = device - > handle | |
! strcmp ( acpi_device_hid ( device ) , ACPI_ECDT_HID ) ) ) {
ACPI: EC: Simplify acpi_ec_add()
First, notice that if the device ID in acpi_ec_add() is equal to
ACPI_ECDT_HID, boot_ec_is_ecdt must be set, because this means
that the device object passed to acpi_ec_add() comes from
acpi_ec_ecdt_start() which fails if boot_ec_is_ecdt is unset.
Accordingly, boot_ec_is_ecdt need not be set again in that case,
so drop that redundant update of it from the code.
Next, ec->handle must be a valid ACPI handle right before
returning 0 from acpi_ec_add(), because it either is the handle
of the device object passed to that function, or it is the boot EC
handle coming from acpi_ec_ecdt_start() which fails if it cannot
find a valid handle for the boot EC. Moreover, the object with
that handle is regarded as a valid representation of the EC in all
cases, so there is no reason to avoid the _DEP list update walk if
that handle is the boot EC handle. Accordingly, drop the dep_update
local variable from acpi_ec_add() and call acpi_walk_dep_device_list()
for ec->handle unconditionally before returning 0 from it.
Finally, the ec local variable in acpi_ec_add() need not be
initialized to NULL and the status local variable declaration
can be moved to the block in which it is used, so change the code
in accordance with these observations.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-03-02 11:25:13 +01:00
/* Fast path: this device corresponds to the boot EC. */
2017-09-26 16:54:09 +08:00
ec = boot_ec ;
} else {
ACPI: EC: Simplify acpi_ec_add()
First, notice that if the device ID in acpi_ec_add() is equal to
ACPI_ECDT_HID, boot_ec_is_ecdt must be set, because this means
that the device object passed to acpi_ec_add() comes from
acpi_ec_ecdt_start() which fails if boot_ec_is_ecdt is unset.
Accordingly, boot_ec_is_ecdt need not be set again in that case,
so drop that redundant update of it from the code.
Next, ec->handle must be a valid ACPI handle right before
returning 0 from acpi_ec_add(), because it either is the handle
of the device object passed to that function, or it is the boot EC
handle coming from acpi_ec_ecdt_start() which fails if it cannot
find a valid handle for the boot EC. Moreover, the object with
that handle is regarded as a valid representation of the EC in all
cases, so there is no reason to avoid the _DEP list update walk if
that handle is the boot EC handle. Accordingly, drop the dep_update
local variable from acpi_ec_add() and call acpi_walk_dep_device_list()
for ec->handle unconditionally before returning 0 from it.
Finally, the ec local variable in acpi_ec_add() need not be
initialized to NULL and the status local variable declaration
can be moved to the block in which it is used, so change the code
in accordance with these observations.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-03-02 11:25:13 +01:00
acpi_status status ;
2017-09-26 16:54:09 +08:00
ec = acpi_ec_alloc ( ) ;
if ( ! ec )
return - ENOMEM ;
2019-02-01 12:56:03 +01:00
2017-09-26 16:54:09 +08:00
status = ec_parse_device ( device - > handle , 0 , ec , NULL ) ;
if ( status ! = AE_CTRL_TERMINATE ) {
2016-09-07 16:50:14 +08:00
ret = - EINVAL ;
2020-03-03 20:15:04 +01:00
goto err ;
2017-09-26 16:54:09 +08:00
}
2007-09-05 19:56:38 -04:00
2019-02-01 12:56:03 +01:00
if ( boot_ec & & ec - > command_addr = = boot_ec - > command_addr & &
2022-06-20 11:25:45 +02:00
ec - > data_addr = = boot_ec - > data_addr ) {
2017-09-26 16:54:09 +08:00
/*
2022-06-20 11:25:45 +02:00
* Trust PNP0C09 namespace location rather than ECDT ID .
* But trust ECDT GPE rather than _GPE because of ASUS
* quirks . So do not change boot_ec - > gpe to ec - > gpe ,
* except when the TRUST_DSDT_GPE quirk is set .
2017-09-26 16:54:09 +08:00
*/
boot_ec - > handle = ec - > handle ;
2022-06-20 11:25:45 +02:00
if ( EC_FLAGS_TRUST_DSDT_GPE )
boot_ec - > gpe = ec - > gpe ;
2017-09-26 16:54:09 +08:00
acpi_handle_debug ( ec - > handle , " duplicated. \n " ) ;
acpi_ec_free ( ec ) ;
ec = boot_ec ;
}
2019-02-01 12:55:31 +01:00
}
2022-12-08 15:23:35 +01:00
ret = acpi_ec_setup ( ec , device , true ) ;
2016-09-07 16:50:14 +08:00
if ( ret )
2020-03-03 20:15:04 +01:00
goto err ;
2016-09-07 16:50:14 +08:00
2019-02-01 12:55:31 +01:00
if ( ec = = boot_ec )
acpi_handle_info ( boot_ec - > handle ,
2020-03-06 00:15:24 +01:00
" Boot %s EC initialization complete \n " ,
2019-02-01 12:56:03 +01:00
boot_ec_is_ecdt ? " ECDT " : " DSDT " ) ;
2019-02-01 12:55:31 +01:00
2020-03-06 00:15:24 +01:00
acpi_handle_info ( ec - > handle ,
" EC: Used to handle transactions and events \n " ) ;
2008-09-22 14:37:34 -07:00
device - > driver_data = ec ;
2010-07-29 22:08:44 +02:00
2012-02-06 08:17:08 -08:00
ret = ! ! request_region ( ec - > data_addr , 1 , " EC data " ) ;
WARN ( ! ret , " Could not request EC data io port 0x%lx " , ec - > data_addr ) ;
ret = ! ! request_region ( ec - > command_addr , 1 , " EC cmd " ) ;
WARN ( ! ret , " Could not request EC cmd io port 0x%lx " , ec - > command_addr ) ;
2010-07-29 22:08:44 +02:00
ACPI: EC: Simplify acpi_ec_add()
First, notice that if the device ID in acpi_ec_add() is equal to
ACPI_ECDT_HID, boot_ec_is_ecdt must be set, because this means
that the device object passed to acpi_ec_add() comes from
acpi_ec_ecdt_start() which fails if boot_ec_is_ecdt is unset.
Accordingly, boot_ec_is_ecdt need not be set again in that case,
so drop that redundant update of it from the code.
Next, ec->handle must be a valid ACPI handle right before
returning 0 from acpi_ec_add(), because it either is the handle
of the device object passed to that function, or it is the boot EC
handle coming from acpi_ec_ecdt_start() which fails if it cannot
find a valid handle for the boot EC. Moreover, the object with
that handle is regarded as a valid representation of the EC in all
cases, so there is no reason to avoid the _DEP list update walk if
that handle is the boot EC handle. Accordingly, drop the dep_update
local variable from acpi_ec_add() and call acpi_walk_dep_device_list()
for ec->handle unconditionally before returning 0 from it.
Finally, the ec local variable in acpi_ec_add() need not be
initialized to NULL and the status local variable declaration
can be moved to the block in which it is used, so change the code
in accordance with these observations.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-03-02 11:25:13 +01:00
/* Reprobe devices depending on the EC */
2021-06-03 23:40:02 +01:00
acpi_dev_clear_dependencies ( device ) ;
ACPI: EC: Simplify acpi_ec_add()
First, notice that if the device ID in acpi_ec_add() is equal to
ACPI_ECDT_HID, boot_ec_is_ecdt must be set, because this means
that the device object passed to acpi_ec_add() comes from
acpi_ec_ecdt_start() which fails if boot_ec_is_ecdt is unset.
Accordingly, boot_ec_is_ecdt need not be set again in that case,
so drop that redundant update of it from the code.
Next, ec->handle must be a valid ACPI handle right before
returning 0 from acpi_ec_add(), because it either is the handle
of the device object passed to that function, or it is the boot EC
handle coming from acpi_ec_ecdt_start() which fails if it cannot
find a valid handle for the boot EC. Moreover, the object with
that handle is regarded as a valid representation of the EC in all
cases, so there is no reason to avoid the _DEP list update walk if
that handle is the boot EC handle. Accordingly, drop the dep_update
local variable from acpi_ec_add() and call acpi_walk_dep_device_list()
for ec->handle unconditionally before returning 0 from it.
Finally, the ec local variable in acpi_ec_add() need not be
initialized to NULL and the status local variable declaration
can be moved to the block in which it is used, so change the code
in accordance with these observations.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-03-02 11:25:13 +01:00
2016-09-07 16:50:27 +08:00
acpi_handle_debug ( ec - > handle , " enumerated. \n " ) ;
2016-09-07 16:50:14 +08:00
return 0 ;
2020-03-03 20:15:04 +01:00
err :
2016-09-07 16:50:27 +08:00
if ( ec ! = boot_ec )
acpi_ec_free ( ec ) ;
2020-03-03 20:15:04 +01:00
2009-06-22 20:41:30 +00:00
return ret ;
2005-04-16 15:20:36 -07:00
}
2022-11-14 00:26:09 +08:00
static void acpi_ec_remove ( struct acpi_device * device )
2005-04-16 15:20:36 -07:00
{
2007-03-07 22:28:00 +03:00
struct acpi_ec * ec ;
2005-04-16 15:20:36 -07:00
if ( ! device )
2022-11-14 00:26:09 +08:00
return ;
2005-04-16 15:20:36 -07:00
ec = acpi_driver_data ( device ) ;
2010-07-29 22:08:44 +02:00
release_region ( ec - > data_addr , 1 ) ;
release_region ( ec - > command_addr , 1 ) ;
2008-09-22 14:37:34 -07:00
device - > driver_data = NULL ;
2016-09-07 16:50:27 +08:00
if ( ec ! = boot_ec ) {
ec_remove_handlers ( ec ) ;
acpi_ec_free ( ec ) ;
}
2005-04-16 15:20:36 -07:00
}
static acpi_status
2007-03-07 22:28:00 +03:00
ec_parse_io_ports ( struct acpi_resource * resource , void * context )
2005-04-16 15:20:36 -07:00
{
2007-03-07 22:28:00 +03:00
struct acpi_ec * ec = context ;
2005-04-16 15:20:36 -07:00
2007-03-07 22:28:00 +03:00
if ( resource - > type ! = ACPI_RESOURCE_TYPE_IO )
2005-04-16 15:20:36 -07:00
return AE_OK ;
/*
* The first address region returned is the data port , and
* the second address region returned is the status / command
* port .
*/
2010-07-29 22:08:44 +02:00
if ( ec - > data_addr = = 0 )
2006-09-26 19:50:33 +04:00
ec - > data_addr = resource - > data . io . minimum ;
2010-07-29 22:08:44 +02:00
else if ( ec - > command_addr = = 0 )
2006-09-26 19:50:33 +04:00
ec - > command_addr = resource - > data . io . minimum ;
2007-03-07 22:28:00 +03:00
else
2005-04-16 15:20:36 -07:00
return AE_CTRL_TERMINATE ;
return AE_OK ;
}
2016-06-03 10:26:12 +08:00
static const struct acpi_device_id ec_device_ids [ ] = {
{ " PNP0C09 " , 0 } ,
2017-09-26 16:54:09 +08:00
{ ACPI_ECDT_HID , 0 } ,
2016-06-03 10:26:12 +08:00
{ " " , 0 } ,
} ;
ACPI / EC: Add support to skip boot stage DSDT probe
We prepared _INI/_STA methods for \_SB, \_SB.PCI0, \_SB.LID0 and
\_SB.EC, _HID(PNP0C09)/_CRS/_GPE for \_SB.EC to poke Windows behavior
with qemu, we got the following execution sequence:
\_SB._INI
\_SB.PCI0._STA
\_SB.LID0._STA
\_SB.EC._STA
\_SB.PCI0._INI
\_SB.LID0._INI
\_SB.EC._INI
There is no extra DSDT EC device enumeration process occurring before
the main ACPI device enumeration process. That means acpi_ec_dsdt_probe()
is not Windows-compatible.
Tracking back, it was added by the following commit:
Commit: c5279dee26c0e8d7c4200993bfc4b540d2469598
Subject: ACPI: EC: Add some basic check for ECDT data
but that commit was misguided.
Why we shouldn't enumerate DSDT EC before the main ACPI device
enumeration?
The only way to know if the DSDT EC is valid would be to evaluate its
_STA control method, but it's not safe to evaluate this control method
that early and out of the ACPI enumeration process, because _STA may
refer to entities (such as resources or ACPI device objects) that may
not have been initialized before OSPM starts to enumerate them via
the main ACPI device enumeration.
But after we had reverted back to the expected behavior, a regression
was reported. On that platform, there is no ECDT, but the platform
control methods access EC operation region earlier than Linux expects
causing some ACPI method execution errors. For this reason, we just
go back to old behavior to still probe DSDT EC as the boot EC.
However, that turns out to lead to yet another functional breakage
and in order to work around all of the problems, we skip boot stage
DSDT probe when the ECDT exists so that a later quirk can always use
correct ECDT GPE setting.
Link: http://bugzilla.kernel.org/show_bug.cgi?id=11880
Link: http://bugzilla.kernel.org/show_bug.cgi?id=119261
Link: http://bugzilla.kernel.org/show_bug.cgi?id=195651
Tested-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
[ rjw: Changelog & comments massage ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-15 09:41:41 +08:00
/*
* This function is not Windows - compatible as Windows never enumerates the
* namespace EC before the main ACPI device enumeration process . It is
* retained for historical reason and will be deprecated in the future .
*/
2019-01-21 13:07:50 +01:00
void __init acpi_ec_dsdt_probe ( void )
2008-01-01 14:12:55 -05:00
{
2016-06-21 17:42:05 +08:00
struct acpi_ec * ec ;
2019-01-21 13:07:50 +01:00
acpi_status status ;
2016-06-21 17:42:05 +08:00
int ret ;
2016-06-03 10:26:12 +08:00
ACPI / EC: Add support to skip boot stage DSDT probe
We prepared _INI/_STA methods for \_SB, \_SB.PCI0, \_SB.LID0 and
\_SB.EC, _HID(PNP0C09)/_CRS/_GPE for \_SB.EC to poke Windows behavior
with qemu, we got the following execution sequence:
\_SB._INI
\_SB.PCI0._STA
\_SB.LID0._STA
\_SB.EC._STA
\_SB.PCI0._INI
\_SB.LID0._INI
\_SB.EC._INI
There is no extra DSDT EC device enumeration process occurring before
the main ACPI device enumeration process. That means acpi_ec_dsdt_probe()
is not Windows-compatible.
Tracking back, it was added by the following commit:
Commit: c5279dee26c0e8d7c4200993bfc4b540d2469598
Subject: ACPI: EC: Add some basic check for ECDT data
but that commit was misguided.
Why we shouldn't enumerate DSDT EC before the main ACPI device
enumeration?
The only way to know if the DSDT EC is valid would be to evaluate its
_STA control method, but it's not safe to evaluate this control method
that early and out of the ACPI enumeration process, because _STA may
refer to entities (such as resources or ACPI device objects) that may
not have been initialized before OSPM starts to enumerate them via
the main ACPI device enumeration.
But after we had reverted back to the expected behavior, a regression
was reported. On that platform, there is no ECDT, but the platform
control methods access EC operation region earlier than Linux expects
causing some ACPI method execution errors. For this reason, we just
go back to old behavior to still probe DSDT EC as the boot EC.
However, that turns out to lead to yet another functional breakage
and in order to work around all of the problems, we skip boot stage
DSDT probe when the ECDT exists so that a later quirk can always use
correct ECDT GPE setting.
Link: http://bugzilla.kernel.org/show_bug.cgi?id=11880
Link: http://bugzilla.kernel.org/show_bug.cgi?id=119261
Link: http://bugzilla.kernel.org/show_bug.cgi?id=195651
Tested-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
[ rjw: Changelog & comments massage ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-15 09:41:41 +08:00
/*
* If a platform has ECDT , there is no need to proceed as the
* following probe is not a part of the ACPI device enumeration ,
* executing _STA is not safe , and thus this probe may risk of
* picking up an invalid EC device .
*/
if ( boot_ec )
2019-01-21 13:07:50 +01:00
return ;
ACPI / EC: Add support to skip boot stage DSDT probe
We prepared _INI/_STA methods for \_SB, \_SB.PCI0, \_SB.LID0 and
\_SB.EC, _HID(PNP0C09)/_CRS/_GPE for \_SB.EC to poke Windows behavior
with qemu, we got the following execution sequence:
\_SB._INI
\_SB.PCI0._STA
\_SB.LID0._STA
\_SB.EC._STA
\_SB.PCI0._INI
\_SB.LID0._INI
\_SB.EC._INI
There is no extra DSDT EC device enumeration process occurring before
the main ACPI device enumeration process. That means acpi_ec_dsdt_probe()
is not Windows-compatible.
Tracking back, it was added by the following commit:
Commit: c5279dee26c0e8d7c4200993bfc4b540d2469598
Subject: ACPI: EC: Add some basic check for ECDT data
but that commit was misguided.
Why we shouldn't enumerate DSDT EC before the main ACPI device
enumeration?
The only way to know if the DSDT EC is valid would be to evaluate its
_STA control method, but it's not safe to evaluate this control method
that early and out of the ACPI enumeration process, because _STA may
refer to entities (such as resources or ACPI device objects) that may
not have been initialized before OSPM starts to enumerate them via
the main ACPI device enumeration.
But after we had reverted back to the expected behavior, a regression
was reported. On that platform, there is no ECDT, but the platform
control methods access EC operation region earlier than Linux expects
causing some ACPI method execution errors. For this reason, we just
go back to old behavior to still probe DSDT EC as the boot EC.
However, that turns out to lead to yet another functional breakage
and in order to work around all of the problems, we skip boot stage
DSDT probe when the ECDT exists so that a later quirk can always use
correct ECDT GPE setting.
Link: http://bugzilla.kernel.org/show_bug.cgi?id=11880
Link: http://bugzilla.kernel.org/show_bug.cgi?id=119261
Link: http://bugzilla.kernel.org/show_bug.cgi?id=195651
Tested-by: Daniel Drake <drake@endlessm.com>
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
[ rjw: Changelog & comments massage ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-06-15 09:41:41 +08:00
2016-06-21 17:42:05 +08:00
ec = acpi_ec_alloc ( ) ;
if ( ! ec )
2019-01-21 13:07:50 +01:00
return ;
2016-06-03 10:26:12 +08:00
/*
2016-09-07 16:50:21 +08:00
* At this point , the namespace is initialized , so start to find
* the namespace objects .
2016-06-03 10:26:12 +08:00
*/
2019-01-21 13:07:50 +01:00
status = acpi_get_devices ( ec_device_ids [ 0 ] . id , ec_parse_device , ec , NULL ) ;
2016-06-21 17:42:05 +08:00
if ( ACPI_FAILURE ( status ) | | ! ec - > handle ) {
2019-01-21 13:07:50 +01:00
acpi_ec_free ( ec ) ;
return ;
2008-01-01 14:12:55 -05:00
}
2019-01-21 13:07:50 +01:00
2016-09-07 16:50:21 +08:00
/*
* When the DSDT EC is available , always re - configure boot EC to
* have _REG evaluated . _REG can only be evaluated after the
* namespace initialization .
* At this point , the GPE is not fully initialized , so do not to
* handle the events .
*/
2022-12-08 15:23:35 +01:00
ret = acpi_ec_setup ( ec , NULL , true ) ;
2019-02-01 10:59:42 +01:00
if ( ret ) {
2016-09-07 16:50:08 +08:00
acpi_ec_free ( ec ) ;
2019-02-01 10:59:42 +01:00
return ;
}
boot_ec = ec ;
acpi_handle_info ( ec - > handle ,
" Boot DSDT EC used to handle transactions \n " ) ;
2008-01-01 14:12:55 -05:00
}
2016-09-07 16:50:21 +08:00
/*
2020-03-06 00:14:35 +01:00
* acpi_ec_ecdt_start - Finalize the boot ECDT EC initialization .
*
* First , look for an ACPI handle for the boot ECDT EC if acpi_ec_add ( ) has not
* found a matching object in the namespace .
*
* Next , in case the DSDT EC is not functioning , it is still necessary to
* provide a functional ECDT EC to handle events , so add an extra device object
* to represent it ( see https : //bugzilla.kernel.org/show_bug.cgi?id=115021).
*
* This is useful on platforms with valid ECDT and invalid DSDT EC settings ,
* like ASUS X550ZE ( see https : //bugzilla.kernel.org/show_bug.cgi?id=196847).
2016-09-07 16:50:21 +08:00
*/
2020-03-06 00:14:35 +01:00
static void __init acpi_ec_ecdt_start ( void )
2016-09-07 16:50:21 +08:00
{
2020-03-06 00:14:35 +01:00
struct acpi_table_ecdt * ecdt_ptr ;
2016-09-07 16:50:21 +08:00
acpi_handle handle ;
2020-03-06 00:14:35 +01:00
acpi_status status ;
2016-09-07 16:50:21 +08:00
2020-03-06 00:14:35 +01:00
/* Bail out if a matching EC has been found in the namespace. */
if ( ! boot_ec | | boot_ec - > handle ! = ACPI_ROOT_OBJECT )
return ;
2016-09-07 16:50:21 +08:00
2020-03-06 00:14:35 +01:00
/* Look up the object pointed to from the ECDT in the namespace. */
status = acpi_get_table ( ACPI_SIG_ECDT , 1 ,
( struct acpi_table_header * * ) & ecdt_ptr ) ;
if ( ACPI_FAILURE ( status ) )
return ;
2017-09-26 16:54:09 +08:00
2020-03-06 00:14:35 +01:00
status = acpi_get_handle ( NULL , ecdt_ptr - > id , & handle ) ;
2020-05-07 17:09:19 +08:00
if ( ACPI_SUCCESS ( status ) ) {
boot_ec - > handle = handle ;
2016-09-07 16:50:21 +08:00
2020-05-07 17:09:19 +08:00
/* Add a special ACPI device object to represent the boot EC. */
acpi_bus_register_early_device ( ACPI_BUS_TYPE_ECDT_EC ) ;
}
2020-03-06 00:14:35 +01:00
2020-05-07 17:09:19 +08:00
acpi_put_table ( ( struct acpi_table_header * ) ecdt_ptr ) ;
2014-10-29 11:33:49 +08:00
}
2019-02-01 14:13:41 +08:00
/*
* On some hardware it is necessary to clear events accumulated by the EC during
* sleep . These ECs stop reporting GPEs until they are manually polled , if too
* many events are accumulated . ( e . g . Samsung Series 5 / 9 notebooks )
*
* https : //bugzilla.kernel.org/show_bug.cgi?id=44161
*
* Ideally , the EC should also be instructed NOT to accumulate events during
* sleep ( which Windows seems to do somehow ) , but the interface to control this
* behaviour is not known at this time .
*
* Models known to be affected are Samsung 530U xx / 535U xx / 540U xx / 550 Pxx / 900 Xxx ,
* however it is very likely that other Samsung models are affected .
*
* On systems which don ' t accumulate _Q events during sleep , this extra check
* should be harmless .
*/
static int ec_clear_on_resume ( const struct dmi_system_id * id )
{
pr_debug ( " Detected system needing EC poll on resume. \n " ) ;
EC_FLAGS_CLEAR_ON_RESUME = 1 ;
ec_event_clearing = ACPI_EC_EVT_TIMING_STATUS ;
return 0 ;
}
ACPI / EC: Remove wrong ECDT correction quirks
Our Windows probe result shows that EC._REG is evaluated after evaluating
all _INI/_STA control methods.
With boot EC always switched in acpi_ec_dsdt_probe(), we can see that as
long as there is no EC opregion accesses in the MLC (module level code, AML
code out of any control methods) and in _INI/_STA, there is no need to make
sure that ECDT must be correct.
Bugs of 9399/12461 were reported against an order issue that BAT0/1._STA
evaluations contain EC accesses while the ECDT setting is wrong.
>From the acpidump output posted on bug 9399, we can see that it is actually
a different issue. In this table, if EC._REG is not executed, EC accesses
will be done in a platform specific manner. As we've already ensured not to
execute EC._REG during the eary stage, we can remove the quirks for bug
9399.
From the acpidump output posted on bug 12461, we can see that it still
needs the quirk. In this table, EC._REG flags a named object whose default
value is One, thus BAT1._STA surely should invoke EC accesses whatever we
invoke EC._REG or not. We have to keep the quirk for it before we can root
cause the issue.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-06-21 17:42:12 +08:00
/*
* Some ECDTs contain wrong register addresses .
* MSI MS - 171F
* https : //bugzilla.kernel.org/show_bug.cgi?id=12461
*/
ACPI 2.0 / ECDT: Remove early namespace reference from EC
All operation region accesses are allowed by AML interpreter when AML is
executed, so actually BIOSen are responsible to avoid the operation region
accesses in AML before OSPM has prepared an operation region driver. This
is done via _REG control method. So AML code normally sets a global named
object REGC to 1 when _REG(3, 1) is evaluated.
Then what is ECDT? Quoting from ACPI spec 6.0, 5.2.15 Embedded Controller
Boot Resources Table (ECDT):
"The presence of this table allows OSPM to provide Embedded Controller
operation region space access before the namespace has been evaluated."
Spec also suggests a compatible mean to indicate the early EC access
availability:
Device (EC)
{
Name (REGC, Ones)
Method (_REG, 2)
{
If (LEqual (Arg0, 3))
{
Store (Arg1, REGC)
}
}
Method (ECAV)
{
If (LEqual (REGC, Ones))
{
If (LGreaterEqual (_REV, 2))
{
Return (One)
}
Else
{
Return (Zero)
}
}
Else
{
Return (REGC)
}
}
}
In this way, it allows EC accesses to happen before EC._REG(3, 1) is
invoked.
But ECAV is not the only way practical BIOSen using to indicate the early
EC access availibility, the known variations include:
1. Setting REGC to One in \_SB._INI when _REV >= 2. Since \_SB._INI is the
first control method evaluated by OSPM during the enumeration, this
allows EC accesses to happen for the entire enumeration process before
the namespace EC is enumerated.
2. Initialize REGC to One by default, this even allows EC accesses to
happen during the table loading.
Linux is now broken around ECDT support during the long term bug fixing
work because it has merged many wrong ECDT bug fixes (see details below).
Linux currently uses namespace EC's settings instead of ECDT settings when
ECDT is detected. This apparently will result in namespace walk and
_CRS/_GPE/_REG evaluations. Such stuffs could only happen after namespace
is ready, while ECDT is purposely to be used before namespace is ready.
The wrong bug fixing story is:
1. Link 1:
At Linux ACPI early stages, "no _Lxx/_Exx/_Qxx evaluation can happen
before the namespace is ready" are not ensured by ACPICA core and Linux.
This is currently ensured by deferred enabling of GPE and defered
registering of EC query methods (acpi_ec_register_query_methods).
2. Link 2:
Reporters reported buggy ECDTs, expecting quirks for the platform.
Originally, the quirk is simple, only doing things with ECDT.
Bug 9399 and 12461 are platforms (Asus L4R, Asus M6R, MSI MS-171F)
reported to have wrong ECDT IO port addresses, the port addresses are
reversed.
Bug 11880 is a platform (Asus X50GL) reported to have 0 valued port
addresses, we can see that all EC accesses are protected by ECAV on
this platform, so actually no early EC accesses is required by this
platform.
3. Link 3:
But when the bug fixing developer was requested to provide a handy and
non-quirk bug fix, he tried to use correct EC settings from namespace
and broke the spec purpose. We can even see that the developer was
suffered from many regrssions. One interesting one is 14086, where the
actual root cause obviously should be: _REG is evaluated too early. But
unfortunately, the bug is fixed in a totally wrong way.
So everything goes wrong from these commits:
Commit: c6cb0e878446c79f42e7833d7bb69ed6bfbb381f
Subject: ACPI: EC: Don't trust ECDT tables from ASUS
Commit: a5032bfdd9c80e0231a6324661e123818eb46ecd
Subject: ACPI: EC: Always parse EC device
This patch reverts Linux behavior to simple ECDT quirk support in order to
stop early _CRS/_GPE/_REG evaluations.
For Bug 9399, 12461, since it is reported that the platforms require early
EC accesses, this patch restores the simple ECDT quirks for them.
For Bug 11880, since it is not reported that the platform requires early EC
accesses and its ACPI tables contain correct ECAV, we choose an ECDT
enumeration failure for this platform.
Link 1: https://bugzilla.kernel.org/show_bug.cgi?id=9916
http://bugzilla.kernel.org/show_bug.cgi?id=10100
https://lkml.org/lkml/2008/2/25/282
Link 2: https://bugzilla.kernel.org/show_bug.cgi?id=9399
https://bugzilla.kernel.org/show_bug.cgi?id=12461
https://bugzilla.kernel.org/show_bug.cgi?id=11880
Link 3: https://bugzilla.kernel.org/show_bug.cgi?id=11884
https://bugzilla.kernel.org/show_bug.cgi?id=14081
https://bugzilla.kernel.org/show_bug.cgi?id=14086
https://bugzilla.kernel.org/show_bug.cgi?id=14446
Link 4: https://bugzilla.kernel.org/show_bug.cgi?id=112911
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-24 10:42:53 +08:00
static int ec_correct_ecdt ( const struct dmi_system_id * id )
{
pr_debug ( " Detected system needing ECDT address correction. \n " ) ;
EC_FLAGS_CORRECT_ECDT = 1 ;
return 0 ;
}
2021-06-21 09:37:27 +08:00
/*
* Some ECDTs contain wrong GPE setting , but they share the same port addresses
* with DSDT EC , don ' t duplicate the DSDT EC with ECDT EC in this case .
* https : //bugzilla.kernel.org/show_bug.cgi?id=209989
*/
static int ec_honor_dsdt_gpe ( const struct dmi_system_id * id )
{
pr_debug ( " Detected system needing DSDT GPE setting. \n " ) ;
EC_FLAGS_TRUST_DSDT_GPE = 1 ;
return 0 ;
}
2017-09-14 11:59:30 +02:00
static const struct dmi_system_id ec_dmi_table [ ] __initconst = {
2009-10-02 20:21:33 +04:00
{
2022-06-20 11:25:46 +02:00
/*
* MSI MS - 171F
* https : //bugzilla.kernel.org/show_bug.cgi?id=12461
*/
. callback = ec_correct_ecdt ,
. matches = {
DMI_MATCH ( DMI_SYS_VENDOR , " Micro-Star " ) ,
DMI_MATCH ( DMI_PRODUCT_NAME , " MS-171F " ) ,
} ,
} ,
2017-06-15 09:41:47 +08:00
{
2022-06-20 11:25:46 +02:00
/*
* HP Pavilion Gaming Laptop 15 - cx0xxx
* https : //bugzilla.kernel.org/show_bug.cgi?id=209989
*/
. callback = ec_honor_dsdt_gpe ,
. matches = {
DMI_MATCH ( DMI_SYS_VENDOR , " HP " ) ,
DMI_MATCH ( DMI_PRODUCT_NAME , " HP Pavilion Gaming Laptop 15-cx0xxx " ) ,
} ,
} ,
2022-10-30 01:20:08 +03:00
{
/*
* HP Pavilion Gaming Laptop 15 - cx0041ur
*/
. callback = ec_honor_dsdt_gpe ,
. matches = {
DMI_MATCH ( DMI_SYS_VENDOR , " HP " ) ,
DMI_MATCH ( DMI_PRODUCT_NAME , " HP 15-cx0041ur " ) ,
} ,
} ,
2021-06-21 09:37:27 +08:00
{
2022-06-20 11:25:46 +02:00
/*
* Samsung hardware
* https : //bugzilla.kernel.org/show_bug.cgi?id=44161
*/
. callback = ec_clear_on_resume ,
. matches = {
DMI_MATCH ( DMI_SYS_VENDOR , " SAMSUNG ELECTRONICS CO., LTD. " ) ,
} ,
} ,
{ }
2009-10-02 20:21:33 +04:00
} ;
2019-01-21 13:07:50 +01:00
void __init acpi_ec_ecdt_probe ( void )
2007-03-07 22:28:00 +03:00
{
2005-08-11 17:32:05 -04:00
struct acpi_table_ecdt * ecdt_ptr ;
2016-06-21 17:42:05 +08:00
struct acpi_ec * ec ;
2019-01-21 13:07:50 +01:00
acpi_status status ;
int ret ;
2005-07-23 04:08:00 -04:00
2019-01-21 13:07:50 +01:00
/* Generate a boot ec context. */
2009-10-02 20:21:33 +04:00
dmi_check_system ( ec_dmi_table ) ;
2007-02-02 19:48:22 +03:00
status = acpi_get_table ( ACPI_SIG_ECDT , 1 ,
( struct acpi_table_header * * ) & ecdt_ptr ) ;
2019-01-21 13:07:50 +01:00
if ( ACPI_FAILURE ( status ) )
return ;
2009-10-02 20:21:33 +04:00
ACPI 2.0 / ECDT: Remove early namespace reference from EC
All operation region accesses are allowed by AML interpreter when AML is
executed, so actually BIOSen are responsible to avoid the operation region
accesses in AML before OSPM has prepared an operation region driver. This
is done via _REG control method. So AML code normally sets a global named
object REGC to 1 when _REG(3, 1) is evaluated.
Then what is ECDT? Quoting from ACPI spec 6.0, 5.2.15 Embedded Controller
Boot Resources Table (ECDT):
"The presence of this table allows OSPM to provide Embedded Controller
operation region space access before the namespace has been evaluated."
Spec also suggests a compatible mean to indicate the early EC access
availability:
Device (EC)
{
Name (REGC, Ones)
Method (_REG, 2)
{
If (LEqual (Arg0, 3))
{
Store (Arg1, REGC)
}
}
Method (ECAV)
{
If (LEqual (REGC, Ones))
{
If (LGreaterEqual (_REV, 2))
{
Return (One)
}
Else
{
Return (Zero)
}
}
Else
{
Return (REGC)
}
}
}
In this way, it allows EC accesses to happen before EC._REG(3, 1) is
invoked.
But ECAV is not the only way practical BIOSen using to indicate the early
EC access availibility, the known variations include:
1. Setting REGC to One in \_SB._INI when _REV >= 2. Since \_SB._INI is the
first control method evaluated by OSPM during the enumeration, this
allows EC accesses to happen for the entire enumeration process before
the namespace EC is enumerated.
2. Initialize REGC to One by default, this even allows EC accesses to
happen during the table loading.
Linux is now broken around ECDT support during the long term bug fixing
work because it has merged many wrong ECDT bug fixes (see details below).
Linux currently uses namespace EC's settings instead of ECDT settings when
ECDT is detected. This apparently will result in namespace walk and
_CRS/_GPE/_REG evaluations. Such stuffs could only happen after namespace
is ready, while ECDT is purposely to be used before namespace is ready.
The wrong bug fixing story is:
1. Link 1:
At Linux ACPI early stages, "no _Lxx/_Exx/_Qxx evaluation can happen
before the namespace is ready" are not ensured by ACPICA core and Linux.
This is currently ensured by deferred enabling of GPE and defered
registering of EC query methods (acpi_ec_register_query_methods).
2. Link 2:
Reporters reported buggy ECDTs, expecting quirks for the platform.
Originally, the quirk is simple, only doing things with ECDT.
Bug 9399 and 12461 are platforms (Asus L4R, Asus M6R, MSI MS-171F)
reported to have wrong ECDT IO port addresses, the port addresses are
reversed.
Bug 11880 is a platform (Asus X50GL) reported to have 0 valued port
addresses, we can see that all EC accesses are protected by ECAV on
this platform, so actually no early EC accesses is required by this
platform.
3. Link 3:
But when the bug fixing developer was requested to provide a handy and
non-quirk bug fix, he tried to use correct EC settings from namespace
and broke the spec purpose. We can even see that the developer was
suffered from many regrssions. One interesting one is 14086, where the
actual root cause obviously should be: _REG is evaluated too early. But
unfortunately, the bug is fixed in a totally wrong way.
So everything goes wrong from these commits:
Commit: c6cb0e878446c79f42e7833d7bb69ed6bfbb381f
Subject: ACPI: EC: Don't trust ECDT tables from ASUS
Commit: a5032bfdd9c80e0231a6324661e123818eb46ecd
Subject: ACPI: EC: Always parse EC device
This patch reverts Linux behavior to simple ECDT quirk support in order to
stop early _CRS/_GPE/_REG evaluations.
For Bug 9399, 12461, since it is reported that the platforms require early
EC accesses, this patch restores the simple ECDT quirks for them.
For Bug 11880, since it is not reported that the platform requires early EC
accesses and its ACPI tables contain correct ECAV, we choose an ECDT
enumeration failure for this platform.
Link 1: https://bugzilla.kernel.org/show_bug.cgi?id=9916
http://bugzilla.kernel.org/show_bug.cgi?id=10100
https://lkml.org/lkml/2008/2/25/282
Link 2: https://bugzilla.kernel.org/show_bug.cgi?id=9399
https://bugzilla.kernel.org/show_bug.cgi?id=12461
https://bugzilla.kernel.org/show_bug.cgi?id=11880
Link 3: https://bugzilla.kernel.org/show_bug.cgi?id=11884
https://bugzilla.kernel.org/show_bug.cgi?id=14081
https://bugzilla.kernel.org/show_bug.cgi?id=14086
https://bugzilla.kernel.org/show_bug.cgi?id=14446
Link 4: https://bugzilla.kernel.org/show_bug.cgi?id=112911
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-24 10:42:53 +08:00
if ( ! ecdt_ptr - > control . address | | ! ecdt_ptr - > data . address ) {
/*
* Asus X50GL :
* https : //bugzilla.kernel.org/show_bug.cgi?id=11880
*/
2020-05-07 17:09:19 +08:00
goto out ;
2014-07-03 00:35:09 +01:00
}
2009-10-02 20:21:40 +04:00
2019-01-21 13:07:50 +01:00
ec = acpi_ec_alloc ( ) ;
if ( ! ec )
2020-05-07 17:09:19 +08:00
goto out ;
2019-01-21 13:07:50 +01:00
ACPI 2.0 / ECDT: Remove early namespace reference from EC
All operation region accesses are allowed by AML interpreter when AML is
executed, so actually BIOSen are responsible to avoid the operation region
accesses in AML before OSPM has prepared an operation region driver. This
is done via _REG control method. So AML code normally sets a global named
object REGC to 1 when _REG(3, 1) is evaluated.
Then what is ECDT? Quoting from ACPI spec 6.0, 5.2.15 Embedded Controller
Boot Resources Table (ECDT):
"The presence of this table allows OSPM to provide Embedded Controller
operation region space access before the namespace has been evaluated."
Spec also suggests a compatible mean to indicate the early EC access
availability:
Device (EC)
{
Name (REGC, Ones)
Method (_REG, 2)
{
If (LEqual (Arg0, 3))
{
Store (Arg1, REGC)
}
}
Method (ECAV)
{
If (LEqual (REGC, Ones))
{
If (LGreaterEqual (_REV, 2))
{
Return (One)
}
Else
{
Return (Zero)
}
}
Else
{
Return (REGC)
}
}
}
In this way, it allows EC accesses to happen before EC._REG(3, 1) is
invoked.
But ECAV is not the only way practical BIOSen using to indicate the early
EC access availibility, the known variations include:
1. Setting REGC to One in \_SB._INI when _REV >= 2. Since \_SB._INI is the
first control method evaluated by OSPM during the enumeration, this
allows EC accesses to happen for the entire enumeration process before
the namespace EC is enumerated.
2. Initialize REGC to One by default, this even allows EC accesses to
happen during the table loading.
Linux is now broken around ECDT support during the long term bug fixing
work because it has merged many wrong ECDT bug fixes (see details below).
Linux currently uses namespace EC's settings instead of ECDT settings when
ECDT is detected. This apparently will result in namespace walk and
_CRS/_GPE/_REG evaluations. Such stuffs could only happen after namespace
is ready, while ECDT is purposely to be used before namespace is ready.
The wrong bug fixing story is:
1. Link 1:
At Linux ACPI early stages, "no _Lxx/_Exx/_Qxx evaluation can happen
before the namespace is ready" are not ensured by ACPICA core and Linux.
This is currently ensured by deferred enabling of GPE and defered
registering of EC query methods (acpi_ec_register_query_methods).
2. Link 2:
Reporters reported buggy ECDTs, expecting quirks for the platform.
Originally, the quirk is simple, only doing things with ECDT.
Bug 9399 and 12461 are platforms (Asus L4R, Asus M6R, MSI MS-171F)
reported to have wrong ECDT IO port addresses, the port addresses are
reversed.
Bug 11880 is a platform (Asus X50GL) reported to have 0 valued port
addresses, we can see that all EC accesses are protected by ECAV on
this platform, so actually no early EC accesses is required by this
platform.
3. Link 3:
But when the bug fixing developer was requested to provide a handy and
non-quirk bug fix, he tried to use correct EC settings from namespace
and broke the spec purpose. We can even see that the developer was
suffered from many regrssions. One interesting one is 14086, where the
actual root cause obviously should be: _REG is evaluated too early. But
unfortunately, the bug is fixed in a totally wrong way.
So everything goes wrong from these commits:
Commit: c6cb0e878446c79f42e7833d7bb69ed6bfbb381f
Subject: ACPI: EC: Don't trust ECDT tables from ASUS
Commit: a5032bfdd9c80e0231a6324661e123818eb46ecd
Subject: ACPI: EC: Always parse EC device
This patch reverts Linux behavior to simple ECDT quirk support in order to
stop early _CRS/_GPE/_REG evaluations.
For Bug 9399, 12461, since it is reported that the platforms require early
EC accesses, this patch restores the simple ECDT quirks for them.
For Bug 11880, since it is not reported that the platform requires early EC
accesses and its ACPI tables contain correct ECAV, we choose an ECDT
enumeration failure for this platform.
Link 1: https://bugzilla.kernel.org/show_bug.cgi?id=9916
http://bugzilla.kernel.org/show_bug.cgi?id=10100
https://lkml.org/lkml/2008/2/25/282
Link 2: https://bugzilla.kernel.org/show_bug.cgi?id=9399
https://bugzilla.kernel.org/show_bug.cgi?id=12461
https://bugzilla.kernel.org/show_bug.cgi?id=11880
Link 3: https://bugzilla.kernel.org/show_bug.cgi?id=11884
https://bugzilla.kernel.org/show_bug.cgi?id=14081
https://bugzilla.kernel.org/show_bug.cgi?id=14086
https://bugzilla.kernel.org/show_bug.cgi?id=14446
Link 4: https://bugzilla.kernel.org/show_bug.cgi?id=112911
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-03-24 10:42:53 +08:00
if ( EC_FLAGS_CORRECT_ECDT ) {
2016-06-21 17:42:05 +08:00
ec - > command_addr = ecdt_ptr - > data . address ;
ec - > data_addr = ecdt_ptr - > control . address ;
2009-01-14 02:57:53 +03:00
} else {
2016-06-21 17:42:05 +08:00
ec - > command_addr = ecdt_ptr - > control . address ;
ec - > data_addr = ecdt_ptr - > data . address ;
2009-01-14 02:57:53 +03:00
}
2019-10-14 16:56:02 +08:00
/*
* Ignore the GPE value on Reduced Hardware platforms .
* Some products have this set to an erroneous value .
*/
if ( ! acpi_gbl_reduced_hardware )
ec - > gpe = ecdt_ptr - > gpe ;
2019-02-01 10:58:20 +01:00
ec - > handle = ACPI_ROOT_OBJECT ;
2016-09-07 16:50:21 +08:00
/*
* At this point , the namespace is not initialized , so do not find
* the namespace objects , or handle the events .
*/
2022-12-08 15:23:35 +01:00
ret = acpi_ec_setup ( ec , NULL , false ) ;
2019-02-01 10:58:20 +01:00
if ( ret ) {
2016-09-07 16:50:08 +08:00
acpi_ec_free ( ec ) ;
2020-05-07 17:09:19 +08:00
goto out ;
2019-02-01 10:58:20 +01:00
}
boot_ec = ec ;
boot_ec_is_ecdt = true ;
pr_info ( " Boot ECDT EC used to handle transactions \n " ) ;
2020-05-07 17:09:19 +08:00
out :
acpi_put_table ( ( struct acpi_table_header * ) ecdt_ptr ) ;
2005-04-16 15:20:36 -07:00
}
2016-08-03 09:07:58 +08:00
# ifdef CONFIG_PM_SLEEP
ACPI / EC: Add PM operations to improve event handling for suspend process
In the original EC driver, though the event handling is not explicitly
stopped, the EC driver is actually not able to handle events during the
noirq stage as the EC driver is not prepared to handle the EC events in the
polling mode. So if there is no advance_transaction() triggered, the EC
driver couldn't notice the EC events.
However, do we actually need to handle EC events during suspend/resume
stage? EC events are mostly useless for the suspend/resume period (key
strokes and battery/thermal updates, etc.,), and the useful ones (lid
close, power/sleep button press) should have already been delivered to the
OSPM to trigger the power saving operations.
Thus this patch implements acpi_ec_disable_event() to be a reverse call of
acpi_ec_enable_event(), with which, the EC driver is able to stop handling
the EC events in a position before entering the noirq stage.
Since there are actually 2 choices for us:
1. implement event handling in polling mode;
2. stop event handling before entering noirq stage.
And this patch only implements the second choice using .suspend() callback.
Thus this is experimental (first choice is better? or different hook
position is better?). This patch finally keeps the old behavior by default
and prepares a boot parameter to enable this feature.
The differences of the event handling availability between the old behavior
(this patch is not applied) and the new behavior (this patch is applied)
are as follows:
!FreezeEvents FreezeEvents
before suspend Y Y
suspend before EC Y Y
suspend after EC Y N
suspend_late Y N
suspend_noirq Y (actually N) N
resume_noirq Y (actually N) N
resume_late Y (actually N) N
resume before EC Y (actually N) N
resume after EC Y Y
after resume Y Y
Where "actually N" means if there is no EC transactions, the EC driver
is actually not able to notice the pending events.
We can see that FreezeEvents is the only approach now can actually flush
the EC event handling with both query commands and _Qxx evaluations
flushed, other modes can only flush the EC event handling with only query
commands flushed, _Qxx evaluations occurred after stopping the EC driver
may end up failure due to the failure of the EC transaction carried out in
the _Qxx control methods.
We also can see that this feature should be able to trigger some platform
notifications later than resuming other drivers.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-03 16:01:43 +08:00
static int acpi_ec_suspend ( struct device * dev )
{
struct acpi_ec * ec =
acpi_driver_data ( to_acpi_device ( dev ) ) ;
2019-07-31 11:05:42 +02:00
if ( ! pm_suspend_no_platform ( ) & & ec_freeze_events )
ACPI / EC: Add PM operations to improve event handling for suspend process
In the original EC driver, though the event handling is not explicitly
stopped, the EC driver is actually not able to handle events during the
noirq stage as the EC driver is not prepared to handle the EC events in the
polling mode. So if there is no advance_transaction() triggered, the EC
driver couldn't notice the EC events.
However, do we actually need to handle EC events during suspend/resume
stage? EC events are mostly useless for the suspend/resume period (key
strokes and battery/thermal updates, etc.,), and the useful ones (lid
close, power/sleep button press) should have already been delivered to the
OSPM to trigger the power saving operations.
Thus this patch implements acpi_ec_disable_event() to be a reverse call of
acpi_ec_enable_event(), with which, the EC driver is able to stop handling
the EC events in a position before entering the noirq stage.
Since there are actually 2 choices for us:
1. implement event handling in polling mode;
2. stop event handling before entering noirq stage.
And this patch only implements the second choice using .suspend() callback.
Thus this is experimental (first choice is better? or different hook
position is better?). This patch finally keeps the old behavior by default
and prepares a boot parameter to enable this feature.
The differences of the event handling availability between the old behavior
(this patch is not applied) and the new behavior (this patch is applied)
are as follows:
!FreezeEvents FreezeEvents
before suspend Y Y
suspend before EC Y Y
suspend after EC Y N
suspend_late Y N
suspend_noirq Y (actually N) N
resume_noirq Y (actually N) N
resume_late Y (actually N) N
resume before EC Y (actually N) N
resume after EC Y Y
after resume Y Y
Where "actually N" means if there is no EC transactions, the EC driver
is actually not able to notice the pending events.
We can see that FreezeEvents is the only approach now can actually flush
the EC event handling with both query commands and _Qxx evaluations
flushed, other modes can only flush the EC event handling with only query
commands flushed, _Qxx evaluations occurred after stopping the EC driver
may end up failure due to the failure of the EC transaction carried out in
the _Qxx control methods.
We also can see that this feature should be able to trigger some platform
notifications later than resuming other drivers.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-03 16:01:43 +08:00
acpi_ec_disable_event ( ec ) ;
return 0 ;
}
ACPI / EC: Add parameter to force disable the GPE on suspend
After commit 8110dd281e15 (ACPI / sleep: EC-based wakeup from
suspend-to-idle on recent systems) the configuration of GPEs,
including the EC one, is not changed during suspend-to-idle on
recent systems. That's in order to make system wakeup events
generated by the EC work, in particular.
However, on some of the systems in question (for example on Dell
XPS13 9365), in addition to generating system wakeup events the
EC generates a heartbeat sequence of interrupts that have nothing
to do with wakeup while suspended, and the Low Power Idle S0 _DSM
interface doesn't change that behavior.
The users of those systems may prefer to disable the EC GPE during
system suspend, for the cost of non-functional power button wakeup
or similar, but currently there is no way to do that.
For this reason, add a new module parameter, ec_no_wakeup, for the
EC driver module that, if set, will cause the EC GPE to be disabled
during system suspend and re-enabled during the subsequent system
resume.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=192591#c106
Amends: 8110dd281e15 (ACPI / sleep: EC-based wakeup from suspend-to-idle on recent systems)
Reported-and-tested-by: Patrik Kullman <patrik.kullman@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-19 14:41:58 +02:00
static int acpi_ec_suspend_noirq ( struct device * dev )
{
struct acpi_ec * ec = acpi_driver_data ( to_acpi_device ( dev ) ) ;
/*
* The SCI handler doesn ' t run at this point , so the GPE can be
* masked at the low level without side effects .
*/
if ( ec_no_wakeup & & test_bit ( EC_FLAGS_STARTED , & ec - > flags ) & &
2019-10-14 16:56:02 +08:00
ec - > gpe > = 0 & & ec - > reference_count > = 1 )
ACPI / EC: Add parameter to force disable the GPE on suspend
After commit 8110dd281e15 (ACPI / sleep: EC-based wakeup from
suspend-to-idle on recent systems) the configuration of GPEs,
including the EC one, is not changed during suspend-to-idle on
recent systems. That's in order to make system wakeup events
generated by the EC work, in particular.
However, on some of the systems in question (for example on Dell
XPS13 9365), in addition to generating system wakeup events the
EC generates a heartbeat sequence of interrupts that have nothing
to do with wakeup while suspended, and the Low Power Idle S0 _DSM
interface doesn't change that behavior.
The users of those systems may prefer to disable the EC GPE during
system suspend, for the cost of non-functional power button wakeup
or similar, but currently there is no way to do that.
For this reason, add a new module parameter, ec_no_wakeup, for the
EC driver module that, if set, will cause the EC GPE to be disabled
during system suspend and re-enabled during the subsequent system
resume.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=192591#c106
Amends: 8110dd281e15 (ACPI / sleep: EC-based wakeup from suspend-to-idle on recent systems)
Reported-and-tested-by: Patrik Kullman <patrik.kullman@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-19 14:41:58 +02:00
acpi_set_gpe ( NULL , ec - > gpe , ACPI_GPE_DISABLE ) ;
2019-07-31 11:05:33 +02:00
acpi_ec_enter_noirq ( ec ) ;
2018-02-09 22:55:28 +01:00
ACPI / EC: Add parameter to force disable the GPE on suspend
After commit 8110dd281e15 (ACPI / sleep: EC-based wakeup from
suspend-to-idle on recent systems) the configuration of GPEs,
including the EC one, is not changed during suspend-to-idle on
recent systems. That's in order to make system wakeup events
generated by the EC work, in particular.
However, on some of the systems in question (for example on Dell
XPS13 9365), in addition to generating system wakeup events the
EC generates a heartbeat sequence of interrupts that have nothing
to do with wakeup while suspended, and the Low Power Idle S0 _DSM
interface doesn't change that behavior.
The users of those systems may prefer to disable the EC GPE during
system suspend, for the cost of non-functional power button wakeup
or similar, but currently there is no way to do that.
For this reason, add a new module parameter, ec_no_wakeup, for the
EC driver module that, if set, will cause the EC GPE to be disabled
during system suspend and re-enabled during the subsequent system
resume.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=192591#c106
Amends: 8110dd281e15 (ACPI / sleep: EC-based wakeup from suspend-to-idle on recent systems)
Reported-and-tested-by: Patrik Kullman <patrik.kullman@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-19 14:41:58 +02:00
return 0 ;
}
static int acpi_ec_resume_noirq ( struct device * dev )
{
struct acpi_ec * ec = acpi_driver_data ( to_acpi_device ( dev ) ) ;
2019-07-31 11:05:33 +02:00
acpi_ec_leave_noirq ( ec ) ;
2018-02-09 22:55:28 +01:00
ACPI / EC: Add parameter to force disable the GPE on suspend
After commit 8110dd281e15 (ACPI / sleep: EC-based wakeup from
suspend-to-idle on recent systems) the configuration of GPEs,
including the EC one, is not changed during suspend-to-idle on
recent systems. That's in order to make system wakeup events
generated by the EC work, in particular.
However, on some of the systems in question (for example on Dell
XPS13 9365), in addition to generating system wakeup events the
EC generates a heartbeat sequence of interrupts that have nothing
to do with wakeup while suspended, and the Low Power Idle S0 _DSM
interface doesn't change that behavior.
The users of those systems may prefer to disable the EC GPE during
system suspend, for the cost of non-functional power button wakeup
or similar, but currently there is no way to do that.
For this reason, add a new module parameter, ec_no_wakeup, for the
EC driver module that, if set, will cause the EC GPE to be disabled
during system suspend and re-enabled during the subsequent system
resume.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=192591#c106
Amends: 8110dd281e15 (ACPI / sleep: EC-based wakeup from suspend-to-idle on recent systems)
Reported-and-tested-by: Patrik Kullman <patrik.kullman@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-19 14:41:58 +02:00
if ( ec_no_wakeup & & test_bit ( EC_FLAGS_STARTED , & ec - > flags ) & &
2019-10-14 16:56:02 +08:00
ec - > gpe > = 0 & & ec - > reference_count > = 1 )
ACPI / EC: Add parameter to force disable the GPE on suspend
After commit 8110dd281e15 (ACPI / sleep: EC-based wakeup from
suspend-to-idle on recent systems) the configuration of GPEs,
including the EC one, is not changed during suspend-to-idle on
recent systems. That's in order to make system wakeup events
generated by the EC work, in particular.
However, on some of the systems in question (for example on Dell
XPS13 9365), in addition to generating system wakeup events the
EC generates a heartbeat sequence of interrupts that have nothing
to do with wakeup while suspended, and the Low Power Idle S0 _DSM
interface doesn't change that behavior.
The users of those systems may prefer to disable the EC GPE during
system suspend, for the cost of non-functional power button wakeup
or similar, but currently there is no way to do that.
For this reason, add a new module parameter, ec_no_wakeup, for the
EC driver module that, if set, will cause the EC GPE to be disabled
during system suspend and re-enabled during the subsequent system
resume.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=192591#c106
Amends: 8110dd281e15 (ACPI / sleep: EC-based wakeup from suspend-to-idle on recent systems)
Reported-and-tested-by: Patrik Kullman <patrik.kullman@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-19 14:41:58 +02:00
acpi_set_gpe ( NULL , ec - > gpe , ACPI_GPE_ENABLE ) ;
return 0 ;
}
ACPI / EC: Add PM operations to improve event handling for resume process
This patch makes 2 changes:
1. Restore old behavior
Originally, EC driver stops handling both events and transactions in
acpi_ec_block_transactions(), and restarts to handle transactions in
acpi_ec_unblock_transactions_early(), restarts to handle both events and
transactions in acpi_ec_unblock_transactions().
While currently, EC driver still stops handling both events and
transactions in acpi_ec_block_transactions(), but restarts to handle both
events and transactions in acpi_ec_unblock_transactions_early().
This patch tries to restore the old behavior by dropping
__acpi_ec_enable_event() from acpi_unblock_transactions_early().
2. Improve old behavior
However this still cannot fix the real issue as both of the
acpi_ec_unblock_xxx() functions are invoked in the noirq stage. Since the
EC driver actually doesn't implement the event handling in the polling
mode, re-enabling the event handling too early in the noirq stage could
result in the problem that if there is no triggering source causing
advance_transaction() to be invoked, pending SCI_EVT cannot be detected by
the EC driver and _Qxx cannot be triggered.
It actually makes sense to restart the event handling in any point during
resuming after the noirq stage. Just like the boot stage where the event
handling is enabled in .add(), this patch further moves
acpi_ec_enable_event() to .resume(). After doing that, the following 2
functions can be combined:
acpi_ec_unblock_transactions_early()/acpi_ec_unblock_transactions().
The differences of the event handling availability between the old behavior
(this patch isn't applied) and the new behavior (this patch is applied) are
as follows:
!Applied Applied
before suspend Y Y
suspend before EC Y Y
suspend after EC Y Y
suspend_late Y Y
suspend_noirq Y (actually N) Y (actually N)
resume_noirq Y (actually N) Y (actually N)
resume_late Y (actually N) Y (actually N)
resume before EC Y (actually N) Y (actually N)
resume after EC Y (actually N) Y
after resume Y (actually N) Y
Where "actually N" means if there is no triggering source, the EC driver
is actually not able to notice the pending SCI_EVT occurred in the noirq
stage. So we can clearly see that this patch has improved the situation.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-03 16:01:36 +08:00
static int acpi_ec_resume ( struct device * dev )
{
struct acpi_ec * ec =
acpi_driver_data ( to_acpi_device ( dev ) ) ;
acpi_ec_enable_event ( ec ) ;
return 0 ;
}
2019-07-31 11:05:52 +02:00
void acpi_ec_mark_gpe_for_wake ( void )
{
if ( first_ec & & ! ec_no_wakeup )
acpi_mark_gpe_for_wake ( NULL , first_ec - > gpe ) ;
}
EXPORT_SYMBOL_GPL ( acpi_ec_mark_gpe_for_wake ) ;
void acpi_ec_set_gpe_wake_mask ( u8 action )
{
if ( pm_suspend_no_platform ( ) & & first_ec & & ! ec_no_wakeup )
acpi_set_gpe_wake_mask ( NULL , first_ec - > gpe , action ) ;
}
2022-02-04 18:40:20 +01:00
static bool acpi_ec_work_in_progress ( struct acpi_ec * ec )
{
return ec - > events_in_progress + ec - > queries_in_progress > 0 ;
}
2019-07-31 11:05:52 +02:00
bool acpi_ec_dispatch_gpe ( void )
{
2021-11-23 19:37:39 +01:00
bool work_in_progress = false ;
2019-07-31 11:05:52 +02:00
if ( ! first_ec )
ACPI: EC: PM: Avoid premature returns from acpi_s2idle_wake()
If the EC GPE status is not set after checking all of the other GPEs,
acpi_s2idle_wake() returns 'false', to indicate that the SCI event
that has just triggered is not a system wakeup one, but it does that
without canceling the pending wakeup and re-arming the SCI for system
wakeup which is a mistake, because it may cause s2idle_loop() to busy
spin until the next valid wakeup event. [If that happens, the first
spurious wakeup is still pending after acpi_s2idle_wake() has
returned, so s2idle_enter() does nothing, acpi_s2idle_wake()
is called again and it sees that the SCI has triggered, but no GPEs
are active, so 'false' is returned again, and so on.]
Fix that by moving all of the GPE checking logic from
acpi_s2idle_wake() to acpi_ec_dispatch_gpe() and making the
latter return 'true' only if a non-EC GPE has triggered and
'false' otherwise, which will cause acpi_s2idle_wake() to
cancel the pending SCI wakeup and re-arm the SCI for system
wakeup regardless of the EC GPE status.
This also addresses a lockup observed on an Elitegroup EF20EA laptop
after attempting to wake it up from suspend-to-idle by a key press.
Fixes: d5406284ff80 ("ACPI: PM: s2idle: Refine active GPEs check")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=207603
Reported-by: Todd Brandt <todd.e.brandt@linux.intel.com>
Fixes: fdde0ff8590b ("ACPI: PM: s2idle: Prevent spurious SCIs from waking up the system")
Link: https://lore.kernel.org/linux-acpi/CAB4CAwdqo7=MvyG_PE+PGVfeA17AHF5i5JucgaKqqMX6mjArbQ@mail.gmail.com/
Reported-by: Chris Chiu <chiu@endlessm.com>
Tested-by: Chris Chiu <chiu@endlessm.com>
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-05-09 10:44:41 +02:00
return acpi_any_gpe_status_set ( U32_MAX ) ;
/*
* Report wakeup if the status bit is set for any enabled GPE other
* than the EC one .
*/
if ( acpi_any_gpe_status_set ( first_ec - > gpe ) )
return true ;
2022-02-04 18:31:02 +01:00
/*
* Cancel the SCI wakeup and process all pending events in case there
* are any wakeup ones in there .
*
* Note that if any non - EC GPEs are active at this point , the SCI will
* retrigger after the rearming in acpi_s2idle_wake ( ) , so no events
* should be missed by canceling the wakeup here .
*/
pm_system_cancel_wakeup ( ) ;
ACPI: EC: PM: Avoid premature returns from acpi_s2idle_wake()
If the EC GPE status is not set after checking all of the other GPEs,
acpi_s2idle_wake() returns 'false', to indicate that the SCI event
that has just triggered is not a system wakeup one, but it does that
without canceling the pending wakeup and re-arming the SCI for system
wakeup which is a mistake, because it may cause s2idle_loop() to busy
spin until the next valid wakeup event. [If that happens, the first
spurious wakeup is still pending after acpi_s2idle_wake() has
returned, so s2idle_enter() does nothing, acpi_s2idle_wake()
is called again and it sees that the SCI has triggered, but no GPEs
are active, so 'false' is returned again, and so on.]
Fix that by moving all of the GPE checking logic from
acpi_s2idle_wake() to acpi_ec_dispatch_gpe() and making the
latter return 'true' only if a non-EC GPE has triggered and
'false' otherwise, which will cause acpi_s2idle_wake() to
cancel the pending SCI wakeup and re-arm the SCI for system
wakeup regardless of the EC GPE status.
This also addresses a lockup observed on an Elitegroup EF20EA laptop
after attempting to wake it up from suspend-to-idle by a key press.
Fixes: d5406284ff80 ("ACPI: PM: s2idle: Refine active GPEs check")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=207603
Reported-by: Todd Brandt <todd.e.brandt@linux.intel.com>
Fixes: fdde0ff8590b ("ACPI: PM: s2idle: Prevent spurious SCIs from waking up the system")
Link: https://lore.kernel.org/linux-acpi/CAB4CAwdqo7=MvyG_PE+PGVfeA17AHF5i5JucgaKqqMX6mjArbQ@mail.gmail.com/
Reported-by: Chris Chiu <chiu@endlessm.com>
Tested-by: Chris Chiu <chiu@endlessm.com>
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-05-09 10:44:41 +02:00
/*
* Dispatch the EC GPE in - band , but do not report wakeup in any case
* to allow the caller to process events properly after that .
*/
2021-11-23 19:37:39 +01:00
spin_lock_irq ( & first_ec - > lock ) ;
2022-02-01 20:18:56 +01:00
if ( acpi_ec_gpe_status_set ( first_ec ) ) {
pm_pr_dbg ( " ACPI EC GPE status set \n " ) ;
2022-02-04 18:40:20 +01:00
advance_transaction ( first_ec , false ) ;
work_in_progress = acpi_ec_work_in_progress ( first_ec ) ;
2022-02-01 20:18:56 +01:00
}
2021-11-23 19:37:39 +01:00
spin_unlock_irq ( & first_ec - > lock ) ;
if ( ! work_in_progress )
return false ;
pm_pr_dbg ( " ACPI EC GPE dispatched \n " ) ;
ACPI: EC: PM: Avoid premature returns from acpi_s2idle_wake()
If the EC GPE status is not set after checking all of the other GPEs,
acpi_s2idle_wake() returns 'false', to indicate that the SCI event
that has just triggered is not a system wakeup one, but it does that
without canceling the pending wakeup and re-arming the SCI for system
wakeup which is a mistake, because it may cause s2idle_loop() to busy
spin until the next valid wakeup event. [If that happens, the first
spurious wakeup is still pending after acpi_s2idle_wake() has
returned, so s2idle_enter() does nothing, acpi_s2idle_wake()
is called again and it sees that the SCI has triggered, but no GPEs
are active, so 'false' is returned again, and so on.]
Fix that by moving all of the GPE checking logic from
acpi_s2idle_wake() to acpi_ec_dispatch_gpe() and making the
latter return 'true' only if a non-EC GPE has triggered and
'false' otherwise, which will cause acpi_s2idle_wake() to
cancel the pending SCI wakeup and re-arm the SCI for system
wakeup regardless of the EC GPE status.
This also addresses a lockup observed on an Elitegroup EF20EA laptop
after attempting to wake it up from suspend-to-idle by a key press.
Fixes: d5406284ff80 ("ACPI: PM: s2idle: Refine active GPEs check")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=207603
Reported-by: Todd Brandt <todd.e.brandt@linux.intel.com>
Fixes: fdde0ff8590b ("ACPI: PM: s2idle: Prevent spurious SCIs from waking up the system")
Link: https://lore.kernel.org/linux-acpi/CAB4CAwdqo7=MvyG_PE+PGVfeA17AHF5i5JucgaKqqMX6mjArbQ@mail.gmail.com/
Reported-by: Chris Chiu <chiu@endlessm.com>
Tested-by: Chris Chiu <chiu@endlessm.com>
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-05-09 10:44:41 +02:00
ACPI: EC: Rework flushing of EC work while suspended to idle
The flushing of pending work in the EC driver uses drain_workqueue()
to flush the event handling work that can requeue itself via
advance_transaction(), but this is problematic, because that
work may also be requeued from the query workqueue.
Namely, if an EC transaction is carried out during the execution of
a query handler, it involves calling advance_transaction() which
may queue up the event handling work again. This causes the kernel
to complain about attempts to add a work item to the EC event
workqueue while it is being drained and worst-case it may cause a
valid event to be skipped.
To avoid this problem, introduce two new counters, events_in_progress
and queries_in_progress, incremented when a work item is queued on
the event workqueue or the query workqueue, respectively, and
decremented at the end of the corresponding work function, and make
acpi_ec_dispatch_gpe() the workqueues in a loop until the both of
these counters are zero (or system wakeup is pending) instead of
calling acpi_ec_flush_work().
At the same time, change __acpi_ec_flush_work() to call
flush_workqueue() instead of drain_workqueue() to flush the event
workqueue.
While at it, use the observation that the work item queued in
acpi_ec_query() cannot be pending at that time, because it is used
only once, to simplify the code in there.
Additionally, clean up a comment in acpi_ec_query() and adjust white
space in acpi_ec_event_processor().
Fixes: f0ac20c3f613 ("ACPI: EC: Fix flushing of pending work")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:36:51 +01:00
/* Drain EC work. */
do {
acpi_ec_flush_work ( ) ;
pm_pr_dbg ( " ACPI EC work flushed \n " ) ;
spin_lock_irq ( & first_ec - > lock ) ;
2022-02-04 18:40:20 +01:00
work_in_progress = acpi_ec_work_in_progress ( first_ec ) ;
ACPI: EC: Rework flushing of EC work while suspended to idle
The flushing of pending work in the EC driver uses drain_workqueue()
to flush the event handling work that can requeue itself via
advance_transaction(), but this is problematic, because that
work may also be requeued from the query workqueue.
Namely, if an EC transaction is carried out during the execution of
a query handler, it involves calling advance_transaction() which
may queue up the event handling work again. This causes the kernel
to complain about attempts to add a work item to the EC event
workqueue while it is being drained and worst-case it may cause a
valid event to be skipped.
To avoid this problem, introduce two new counters, events_in_progress
and queries_in_progress, incremented when a work item is queued on
the event workqueue or the query workqueue, respectively, and
decremented at the end of the corresponding work function, and make
acpi_ec_dispatch_gpe() the workqueues in a loop until the both of
these counters are zero (or system wakeup is pending) instead of
calling acpi_ec_flush_work().
At the same time, change __acpi_ec_flush_work() to call
flush_workqueue() instead of drain_workqueue() to flush the event
workqueue.
While at it, use the observation that the work item queued in
acpi_ec_query() cannot be pending at that time, because it is used
only once, to simplify the code in there.
Additionally, clean up a comment in acpi_ec_query() and adjust white
space in acpi_ec_event_processor().
Fixes: f0ac20c3f613 ("ACPI: EC: Fix flushing of pending work")
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-11-23 19:36:51 +01:00
spin_unlock_irq ( & first_ec - > lock ) ;
} while ( work_in_progress & & ! pm_wakeup_pending ( ) ) ;
2020-05-15 12:58:19 +02:00
2019-07-31 11:06:00 +02:00
return false ;
2019-07-31 11:05:52 +02:00
}
# endif /* CONFIG_PM_SLEEP */
2016-08-03 09:07:58 +08:00
static const struct dev_pm_ops acpi_ec_pm = {
ACPI / EC: Add parameter to force disable the GPE on suspend
After commit 8110dd281e15 (ACPI / sleep: EC-based wakeup from
suspend-to-idle on recent systems) the configuration of GPEs,
including the EC one, is not changed during suspend-to-idle on
recent systems. That's in order to make system wakeup events
generated by the EC work, in particular.
However, on some of the systems in question (for example on Dell
XPS13 9365), in addition to generating system wakeup events the
EC generates a heartbeat sequence of interrupts that have nothing
to do with wakeup while suspended, and the Low Power Idle S0 _DSM
interface doesn't change that behavior.
The users of those systems may prefer to disable the EC GPE during
system suspend, for the cost of non-functional power button wakeup
or similar, but currently there is no way to do that.
For this reason, add a new module parameter, ec_no_wakeup, for the
EC driver module that, if set, will cause the EC GPE to be disabled
during system suspend and re-enabled during the subsequent system
resume.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=192591#c106
Amends: 8110dd281e15 (ACPI / sleep: EC-based wakeup from suspend-to-idle on recent systems)
Reported-and-tested-by: Patrik Kullman <patrik.kullman@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-07-19 14:41:58 +02:00
SET_NOIRQ_SYSTEM_SLEEP_PM_OPS ( acpi_ec_suspend_noirq , acpi_ec_resume_noirq )
ACPI / EC: Add PM operations to improve event handling for suspend process
In the original EC driver, though the event handling is not explicitly
stopped, the EC driver is actually not able to handle events during the
noirq stage as the EC driver is not prepared to handle the EC events in the
polling mode. So if there is no advance_transaction() triggered, the EC
driver couldn't notice the EC events.
However, do we actually need to handle EC events during suspend/resume
stage? EC events are mostly useless for the suspend/resume period (key
strokes and battery/thermal updates, etc.,), and the useful ones (lid
close, power/sleep button press) should have already been delivered to the
OSPM to trigger the power saving operations.
Thus this patch implements acpi_ec_disable_event() to be a reverse call of
acpi_ec_enable_event(), with which, the EC driver is able to stop handling
the EC events in a position before entering the noirq stage.
Since there are actually 2 choices for us:
1. implement event handling in polling mode;
2. stop event handling before entering noirq stage.
And this patch only implements the second choice using .suspend() callback.
Thus this is experimental (first choice is better? or different hook
position is better?). This patch finally keeps the old behavior by default
and prepares a boot parameter to enable this feature.
The differences of the event handling availability between the old behavior
(this patch is not applied) and the new behavior (this patch is applied)
are as follows:
!FreezeEvents FreezeEvents
before suspend Y Y
suspend before EC Y Y
suspend after EC Y N
suspend_late Y N
suspend_noirq Y (actually N) N
resume_noirq Y (actually N) N
resume_late Y (actually N) N
resume before EC Y (actually N) N
resume after EC Y Y
after resume Y Y
Where "actually N" means if there is no EC transactions, the EC driver
is actually not able to notice the pending events.
We can see that FreezeEvents is the only approach now can actually flush
the EC event handling with both query commands and _Qxx evaluations
flushed, other modes can only flush the EC event handling with only query
commands flushed, _Qxx evaluations occurred after stopping the EC driver
may end up failure due to the failure of the EC transaction carried out in
the _Qxx control methods.
We also can see that this feature should be able to trigger some platform
notifications later than resuming other drivers.
Signed-off-by: Lv Zheng <lv.zheng@intel.com>
Tested-by: Todd E Brandt <todd.e.brandt@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-08-03 16:01:43 +08:00
SET_SYSTEM_SLEEP_PM_OPS ( acpi_ec_suspend , acpi_ec_resume )
2016-08-03 09:07:58 +08:00
} ;
treewide: Fix function prototypes for module_param_call()
Several function prototypes for the set/get functions defined by
module_param_call() have a slightly wrong argument types. This fixes
those in an effort to clean up the calls when running under type-enforced
compiler instrumentation for CFI. This is the result of running the
following semantic patch:
@match_module_param_call_function@
declarer name module_param_call;
identifier _name, _set_func, _get_func;
expression _arg, _mode;
@@
module_param_call(_name, _set_func, _get_func, _arg, _mode);
@fix_set_prototype
depends on match_module_param_call_function@
identifier match_module_param_call_function._set_func;
identifier _val, _param;
type _val_type, _param_type;
@@
int _set_func(
-_val_type _val
+const char * _val
,
-_param_type _param
+const struct kernel_param * _param
) { ... }
@fix_get_prototype
depends on match_module_param_call_function@
identifier match_module_param_call_function._get_func;
identifier _val, _param;
type _val_type, _param_type;
@@
int _get_func(
-_val_type _val
+char * _val
,
-_param_type _param
+const struct kernel_param * _param
) { ... }
Two additional by-hand changes are included for places where the above
Coccinelle script didn't notice them:
drivers/platform/x86/thinkpad_acpi.c
fs/lockd/svc.c
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
2017-10-17 19:04:42 -07:00
static int param_set_event_clearing ( const char * val ,
const struct kernel_param * kp )
2015-06-11 13:21:38 +08:00
{
int result = 0 ;
if ( ! strncmp ( val , " status " , sizeof ( " status " ) - 1 ) ) {
ec_event_clearing = ACPI_EC_EVT_TIMING_STATUS ;
pr_info ( " Assuming SCI_EVT clearing on EC_SC accesses \n " ) ;
} else if ( ! strncmp ( val , " query " , sizeof ( " query " ) - 1 ) ) {
ec_event_clearing = ACPI_EC_EVT_TIMING_QUERY ;
pr_info ( " Assuming SCI_EVT clearing on QR_EC writes \n " ) ;
} else if ( ! strncmp ( val , " event " , sizeof ( " event " ) - 1 ) ) {
ec_event_clearing = ACPI_EC_EVT_TIMING_EVENT ;
pr_info ( " Assuming SCI_EVT clearing on event reads \n " ) ;
} else
result = - EINVAL ;
return result ;
}
treewide: Fix function prototypes for module_param_call()
Several function prototypes for the set/get functions defined by
module_param_call() have a slightly wrong argument types. This fixes
those in an effort to clean up the calls when running under type-enforced
compiler instrumentation for CFI. This is the result of running the
following semantic patch:
@match_module_param_call_function@
declarer name module_param_call;
identifier _name, _set_func, _get_func;
expression _arg, _mode;
@@
module_param_call(_name, _set_func, _get_func, _arg, _mode);
@fix_set_prototype
depends on match_module_param_call_function@
identifier match_module_param_call_function._set_func;
identifier _val, _param;
type _val_type, _param_type;
@@
int _set_func(
-_val_type _val
+const char * _val
,
-_param_type _param
+const struct kernel_param * _param
) { ... }
@fix_get_prototype
depends on match_module_param_call_function@
identifier match_module_param_call_function._get_func;
identifier _val, _param;
type _val_type, _param_type;
@@
int _get_func(
-_val_type _val
+char * _val
,
-_param_type _param
+const struct kernel_param * _param
) { ... }
Two additional by-hand changes are included for places where the above
Coccinelle script didn't notice them:
drivers/platform/x86/thinkpad_acpi.c
fs/lockd/svc.c
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Jessica Yu <jeyu@kernel.org>
2017-10-17 19:04:42 -07:00
static int param_get_event_clearing ( char * buffer ,
const struct kernel_param * kp )
2015-06-11 13:21:38 +08:00
{
switch ( ec_event_clearing ) {
case ACPI_EC_EVT_TIMING_STATUS :
2020-06-16 17:14:08 +08:00
return sprintf ( buffer , " status \n " ) ;
2015-06-11 13:21:38 +08:00
case ACPI_EC_EVT_TIMING_QUERY :
2020-06-16 17:14:08 +08:00
return sprintf ( buffer , " query \n " ) ;
2015-06-11 13:21:38 +08:00
case ACPI_EC_EVT_TIMING_EVENT :
2020-06-16 17:14:08 +08:00
return sprintf ( buffer , " event \n " ) ;
2015-06-11 13:21:38 +08:00
default :
2020-06-16 17:14:08 +08:00
return sprintf ( buffer , " invalid \n " ) ;
2015-06-11 13:21:38 +08:00
}
return 0 ;
}
module_param_call ( ec_event_clearing , param_set_event_clearing , param_get_event_clearing ,
NULL , 0644 ) ;
MODULE_PARM_DESC ( ec_event_clearing , " Assumed SCI_EVT clearing timing " ) ;
2008-03-21 17:07:21 +03:00
static struct acpi_driver acpi_ec_driver = {
. name = " ec " ,
. class = ACPI_EC_CLASS ,
. ids = ec_device_ids ,
. ops = {
. add = acpi_ec_add ,
. remove = acpi_ec_remove ,
} ,
2016-08-03 09:07:58 +08:00
. drv . pm = & acpi_ec_pm ,
2008-03-21 17:07:21 +03:00
} ;
ACPI: EC: Fix flushing of pending work
Commit 016b87ca5c8c ("ACPI: EC: Rework flushing of pending work")
introduced a subtle bug into the flushing of pending EC work while
suspended to idle, which may cause the EC driver to fail to
re-enable the EC GPE after handling a non-wakeup event (like a
battery status change event, for example).
The problem is that the work item flushed by flush_scheduled_work()
in __acpi_ec_flush_work() may disable the EC GPE and schedule another
work item expected to re-enable it, but that new work item is not
flushed, so __acpi_ec_flush_work() returns with the EC GPE disabled
and the CPU running it goes into an idle state subsequently. If all
of the other CPUs are in idle states at that point, the EC GPE won't
be re-enabled until at least one CPU is woken up by another interrupt
source, so system wakeup events that would normally come from the EC
then don't work.
This is reproducible on a Dell XPS13 9360 in my office which
sometimes stops reacting to power button and lid events (triggered
by the EC on that machine) after switching from AC power to battery
power or vice versa while suspended to idle (each of those switches
causes the EC GPE to trigger for several times in a row, but they
are not system wakeup events).
To avoid this problem, it is necessary to drain the workqueue
entirely in __acpi_ec_flush_work(), but that cannot be done with
respect to system_wq, because work items may be added to it from
other places while __acpi_ec_flush_work() is running. For this
reason, make the EC driver use a dedicated workqueue for EC events
processing (let that workqueue be ordered so that EC events are
processed sequentially) and use drain_workqueue() on it in
__acpi_ec_flush_work().
Fixes: 016b87ca5c8c ("ACPI: EC: Rework flushing of pending work")
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-02-11 10:07:43 +01:00
static void acpi_ec_destroy_workqueues ( void )
2016-08-03 09:00:14 +08:00
{
ACPI: EC: Fix flushing of pending work
Commit 016b87ca5c8c ("ACPI: EC: Rework flushing of pending work")
introduced a subtle bug into the flushing of pending EC work while
suspended to idle, which may cause the EC driver to fail to
re-enable the EC GPE after handling a non-wakeup event (like a
battery status change event, for example).
The problem is that the work item flushed by flush_scheduled_work()
in __acpi_ec_flush_work() may disable the EC GPE and schedule another
work item expected to re-enable it, but that new work item is not
flushed, so __acpi_ec_flush_work() returns with the EC GPE disabled
and the CPU running it goes into an idle state subsequently. If all
of the other CPUs are in idle states at that point, the EC GPE won't
be re-enabled until at least one CPU is woken up by another interrupt
source, so system wakeup events that would normally come from the EC
then don't work.
This is reproducible on a Dell XPS13 9360 in my office which
sometimes stops reacting to power button and lid events (triggered
by the EC on that machine) after switching from AC power to battery
power or vice versa while suspended to idle (each of those switches
causes the EC GPE to trigger for several times in a row, but they
are not system wakeup events).
To avoid this problem, it is necessary to drain the workqueue
entirely in __acpi_ec_flush_work(), but that cannot be done with
respect to system_wq, because work items may be added to it from
other places while __acpi_ec_flush_work() is running. For this
reason, make the EC driver use a dedicated workqueue for EC events
processing (let that workqueue be ordered so that EC events are
processed sequentially) and use drain_workqueue() on it in
__acpi_ec_flush_work().
Fixes: 016b87ca5c8c ("ACPI: EC: Rework flushing of pending work")
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-02-11 10:07:43 +01:00
if ( ec_wq ) {
destroy_workqueue ( ec_wq ) ;
ec_wq = NULL ;
2016-08-03 09:00:14 +08:00
}
if ( ec_query_wq ) {
destroy_workqueue ( ec_query_wq ) ;
ec_query_wq = NULL ;
}
}
ACPI: EC: Fix flushing of pending work
Commit 016b87ca5c8c ("ACPI: EC: Rework flushing of pending work")
introduced a subtle bug into the flushing of pending EC work while
suspended to idle, which may cause the EC driver to fail to
re-enable the EC GPE after handling a non-wakeup event (like a
battery status change event, for example).
The problem is that the work item flushed by flush_scheduled_work()
in __acpi_ec_flush_work() may disable the EC GPE and schedule another
work item expected to re-enable it, but that new work item is not
flushed, so __acpi_ec_flush_work() returns with the EC GPE disabled
and the CPU running it goes into an idle state subsequently. If all
of the other CPUs are in idle states at that point, the EC GPE won't
be re-enabled until at least one CPU is woken up by another interrupt
source, so system wakeup events that would normally come from the EC
then don't work.
This is reproducible on a Dell XPS13 9360 in my office which
sometimes stops reacting to power button and lid events (triggered
by the EC on that machine) after switching from AC power to battery
power or vice versa while suspended to idle (each of those switches
causes the EC GPE to trigger for several times in a row, but they
are not system wakeup events).
To avoid this problem, it is necessary to drain the workqueue
entirely in __acpi_ec_flush_work(), but that cannot be done with
respect to system_wq, because work items may be added to it from
other places while __acpi_ec_flush_work() is running. For this
reason, make the EC driver use a dedicated workqueue for EC events
processing (let that workqueue be ordered so that EC events are
processed sequentially) and use drain_workqueue() on it in
__acpi_ec_flush_work().
Fixes: 016b87ca5c8c ("ACPI: EC: Rework flushing of pending work")
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-02-11 10:07:43 +01:00
static int acpi_ec_init_workqueues ( void )
{
if ( ! ec_wq )
ec_wq = alloc_ordered_workqueue ( " kec " , 0 ) ;
if ( ! ec_query_wq )
ec_query_wq = alloc_workqueue ( " kec_query " , 0 , ec_max_queries ) ;
if ( ! ec_wq | | ! ec_query_wq ) {
acpi_ec_destroy_workqueues ( ) ;
return - ENODEV ;
}
return 0 ;
}
2018-06-18 14:17:16 +03:00
static const struct dmi_system_id acpi_ec_no_wakeup [ ] = {
{
. matches = {
DMI_MATCH ( DMI_SYS_VENDOR , " LENOVO " ) ,
2018-07-13 20:50:47 +00:00
DMI_MATCH ( DMI_PRODUCT_FAMILY , " Thinkpad X1 Carbon 6th " ) ,
2018-06-18 14:17:16 +03:00
} ,
} ,
2018-07-31 18:52:39 +08:00
{
. matches = {
DMI_MATCH ( DMI_SYS_VENDOR , " LENOVO " ) ,
DMI_MATCH ( DMI_PRODUCT_FAMILY , " ThinkPad X1 Yoga 3rd " ) ,
} ,
} ,
2021-10-28 21:44:32 +08:00
{
. matches = {
DMI_MATCH ( DMI_SYS_VENDOR , " HP " ) ,
DMI_MATCH ( DMI_PRODUCT_FAMILY , " 103C_5336AN HP ZHAN 66 Pro " ) ,
} ,
} ,
2018-06-18 14:17:16 +03:00
{ } ,
} ;
2020-03-06 00:14:35 +01:00
void __init acpi_ec_init ( void )
2005-04-16 15:20:36 -07:00
{
2016-08-03 09:00:14 +08:00
int result ;
2005-04-16 15:20:36 -07:00
ACPI: EC: Fix flushing of pending work
Commit 016b87ca5c8c ("ACPI: EC: Rework flushing of pending work")
introduced a subtle bug into the flushing of pending EC work while
suspended to idle, which may cause the EC driver to fail to
re-enable the EC GPE after handling a non-wakeup event (like a
battery status change event, for example).
The problem is that the work item flushed by flush_scheduled_work()
in __acpi_ec_flush_work() may disable the EC GPE and schedule another
work item expected to re-enable it, but that new work item is not
flushed, so __acpi_ec_flush_work() returns with the EC GPE disabled
and the CPU running it goes into an idle state subsequently. If all
of the other CPUs are in idle states at that point, the EC GPE won't
be re-enabled until at least one CPU is woken up by another interrupt
source, so system wakeup events that would normally come from the EC
then don't work.
This is reproducible on a Dell XPS13 9360 in my office which
sometimes stops reacting to power button and lid events (triggered
by the EC on that machine) after switching from AC power to battery
power or vice versa while suspended to idle (each of those switches
causes the EC GPE to trigger for several times in a row, but they
are not system wakeup events).
To avoid this problem, it is necessary to drain the workqueue
entirely in __acpi_ec_flush_work(), but that cannot be done with
respect to system_wq, because work items may be added to it from
other places while __acpi_ec_flush_work() is running. For this
reason, make the EC driver use a dedicated workqueue for EC events
processing (let that workqueue be ordered so that EC events are
processed sequentially) and use drain_workqueue() on it in
__acpi_ec_flush_work().
Fixes: 016b87ca5c8c ("ACPI: EC: Rework flushing of pending work")
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-02-11 10:07:43 +01:00
result = acpi_ec_init_workqueues ( ) ;
2016-08-03 09:00:14 +08:00
if ( result )
2020-03-06 00:14:35 +01:00
return ;
2005-04-16 15:20:36 -07:00
2018-06-18 14:17:16 +03:00
/*
* Disable EC wakeup on following systems to prevent periodic
* wakeup from EC GPE .
*/
if ( dmi_check_system ( acpi_ec_no_wakeup ) ) {
ec_no_wakeup = true ;
pr_debug ( " Disabling EC wakeup on suspend-to-idle \n " ) ;
}
2020-03-06 00:14:35 +01:00
/* Driver must be registered after acpi_ec_init_workqueues(). */
acpi_bus_register_driver ( & acpi_ec_driver ) ;
acpi_ec_ecdt_start ( ) ;
2005-04-16 15:20:36 -07:00
}
/* EC driver currently not unloadable */
#if 0
2005-08-11 17:32:05 -04:00
static void __exit acpi_ec_exit ( void )
2005-04-16 15:20:36 -07:00
{
acpi_bus_unregister_driver ( & acpi_ec_driver ) ;
ACPI: EC: Fix flushing of pending work
Commit 016b87ca5c8c ("ACPI: EC: Rework flushing of pending work")
introduced a subtle bug into the flushing of pending EC work while
suspended to idle, which may cause the EC driver to fail to
re-enable the EC GPE after handling a non-wakeup event (like a
battery status change event, for example).
The problem is that the work item flushed by flush_scheduled_work()
in __acpi_ec_flush_work() may disable the EC GPE and schedule another
work item expected to re-enable it, but that new work item is not
flushed, so __acpi_ec_flush_work() returns with the EC GPE disabled
and the CPU running it goes into an idle state subsequently. If all
of the other CPUs are in idle states at that point, the EC GPE won't
be re-enabled until at least one CPU is woken up by another interrupt
source, so system wakeup events that would normally come from the EC
then don't work.
This is reproducible on a Dell XPS13 9360 in my office which
sometimes stops reacting to power button and lid events (triggered
by the EC on that machine) after switching from AC power to battery
power or vice versa while suspended to idle (each of those switches
causes the EC GPE to trigger for several times in a row, but they
are not system wakeup events).
To avoid this problem, it is necessary to drain the workqueue
entirely in __acpi_ec_flush_work(), but that cannot be done with
respect to system_wq, because work items may be added to it from
other places while __acpi_ec_flush_work() is running. For this
reason, make the EC driver use a dedicated workqueue for EC events
processing (let that workqueue be ordered so that EC events are
processed sequentially) and use drain_workqueue() on it in
__acpi_ec_flush_work().
Fixes: 016b87ca5c8c ("ACPI: EC: Rework flushing of pending work")
Cc: 5.4+ <stable@vger.kernel.org> # 5.4+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-02-11 10:07:43 +01:00
acpi_ec_destroy_workqueues ( ) ;
2005-04-16 15:20:36 -07:00
}
2007-10-22 14:18:43 +04:00
# endif /* 0 */