Merge branch 'intel_idle+acpi'
Merge changes updating the ACPI processor driver in order to export
acpi_processor_evaluate_cst() to the code outside of it and adding
ACPI support to the intel_idle driver based on that.

* intel_idle+acpi:
  Documentation: admin-guide: PM: Add intel_idle document
  intel_idle: Use ACPI _CST on server systems
  intel_idle: Add module parameter to prevent ACPI _CST from being used
  intel_idle: Allow ACPI _CST to be used for selected known processors
  cpuidle: Allow idle states to be disabled by default
  intel_idle: Use ACPI _CST for processor models without C-state tables
  intel_idle: Refactor intel_idle_cpuidle_driver_init()
  ACPI: processor: Export acpi_processor_evaluate_cst()
  ACPI: processor: Make ACPI_PROCESSOR_CSTATE depend on ACPI_PROCESSOR
  ACPI: processor: Clean up acpi_processor_evaluate_cst()
  ACPI: processor: Introduce acpi_processor_evaluate_cst()
  ACPI: processor: Export function to claim _CST control
commit e6cf623ba3
@ -196,6 +196,12 @@ Description:
|
||||
does not reflect it. Likewise, if one enables a deep state but a
|
||||
lighter state still is disabled, then this has no effect.
|
||||
|
||||
What: /sys/devices/system/cpu/cpuX/cpuidle/stateN/default_status
|
||||
Date: December 2019
|
||||
KernelVersion: v5.6
|
||||
Contact: Linux power management list <linux-pm@vger.kernel.org>
|
||||
Description:
|
||||
(RO) The default status of this state, "enabled" or "disabled".
|
||||
|
||||
What: /sys/devices/system/cpu/cpuX/cpuidle/stateN/residency
|
||||
Date: March 2014
|
||||
|
@ -506,6 +506,9 @@ object corresponding to it, as follows:
|
||||
``disable``
|
||||
Whether or not this idle state is disabled.
|
||||
|
||||
``default_status``
|
||||
The default status of this state, "enabled" or "disabled".
|
||||
|
||||
``latency``
|
||||
Exit latency of the idle state in microseconds.
|
||||
|
||||
|
246
Documentation/admin-guide/pm/intel_idle.rst
Normal file
@ -0,0 +1,246 @@
|
||||
.. SPDX-License-Identifier: GPL-2.0
|
||||
.. include:: <isonum.txt>
|
||||
|
||||
==============================================
|
||||
``intel_idle`` CPU Idle Time Management Driver
|
||||
==============================================
|
||||
|
||||
:Copyright: |copy| 2020 Intel Corporation
|
||||
|
||||
:Author: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
||||
|
||||
|
||||
General Information
|
||||
===================
|
||||
|
||||
``intel_idle`` is a part of the
|
||||
:doc:`CPU idle time management subsystem <cpuidle>` in the Linux kernel
|
||||
(``CPUIdle``). It is the default CPU idle time management driver for the
|
||||
Nehalem and later generations of Intel processors, but the level of support for
|
||||
a particular processor model in it depends on whether or not it recognizes that
|
||||
processor model and may also depend on information coming from the platform
|
||||
firmware. [To understand ``intel_idle`` it is necessary to know how ``CPUIdle``
|
||||
works in general, so this is the time to get familiar with :doc:`cpuidle` if you
|
||||
have not done that yet.]
|
||||
|
||||
``intel_idle`` uses the ``MWAIT`` instruction to inform the processor that the
|
||||
logical CPU executing it is idle and so it may be possible to put some of the
|
||||
processor's functional blocks into low-power states. That instruction takes two
|
||||
arguments (passed in the ``EAX`` and ``ECX`` registers of the target CPU), the
|
||||
first of which, referred to as a *hint*, can be used by the processor to
|
||||
determine what can be done (for details refer to Intel Software Developer’s
|
||||
Manual [1]_). Accordingly, ``intel_idle`` refuses to work with processors in
|
||||
which the support for the ``MWAIT`` instruction has been disabled (for example,
|
||||
via the platform firmware configuration menu) or which do not support that
|
||||
instruction at all.
|
||||
|
||||
``intel_idle`` is not modular, so it cannot be unloaded, which means that the
|
||||
only way to pass early-configuration-time parameters to it is via the kernel
|
||||
command line.
|
||||
|
||||
|
||||
.. _intel-idle-enumeration-of-states:
|
||||
|
||||
Enumeration of Idle States
|
||||
==========================
|
||||
|
||||
Each ``MWAIT`` hint value is interpreted by the processor as a license to
|
||||
reconfigure itself in a certain way in order to save energy. The processor
|
||||
configurations (with reduced power draw) resulting from that are referred to
|
||||
as C-states (in the ACPI terminology) or idle states. The list of meaningful
|
||||
``MWAIT`` hint values and idle states (i.e. low-power configurations of the
|
||||
processor) corresponding to them depends on the processor model and it may also
|
||||
depend on the configuration of the platform.
|
||||
|
||||
In order to create a list of available idle states required by the ``CPUIdle``
|
||||
subsystem (see :ref:`idle-states-representation` in :doc:`cpuidle`),
|
||||
``intel_idle`` can use two sources of information: static tables of idle states
|
||||
for different processor models included in the driver itself and the ACPI tables
|
||||
of the system. The former are always used if the processor model at hand is
|
||||
recognized by ``intel_idle`` and the latter are used if that is required for
|
||||
the given processor model (which is the case for all server processor models
|
||||
recognized by ``intel_idle``) or if the processor model is not recognized.
|
||||
|
||||
If the ACPI tables are going to be used for building the list of available idle
|
||||
states, ``intel_idle`` first looks for a ``_CST`` object under one of the ACPI
|
||||
objects corresponding to the CPUs in the system (refer to the ACPI specification
|
||||
[2]_ for the description of ``_CST`` and its output package). Because the
|
||||
``CPUIdle`` subsystem expects that the list of idle states supplied by the
|
||||
driver will be suitable for all of the CPUs handled by it and ``intel_idle`` is
|
||||
registered as the ``CPUIdle`` driver for all of the CPUs in the system, the
|
||||
driver looks for the first ``_CST`` object returning at least one valid idle
|
||||
state description and such that all of the idle states included in its return
|
||||
package are of the FFH (Functional Fixed Hardware) type, which means that the
|
||||
``MWAIT`` instruction is expected to be used to tell the processor that it can
|
||||
enter one of them. The return package of that ``_CST`` is then assumed to be
|
||||
applicable to all of the other CPUs in the system and the idle state
|
||||
descriptions extracted from it are stored in a preliminary list of idle states
|
||||
coming from the ACPI tables. [This step is skipped if ``intel_idle`` is
|
||||
configured to ignore the ACPI tables; see `below <intel-idle-parameters_>`_.]
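
The selection rule described in this paragraph can be sketched in plain C as follows. This is only an illustrative user-space model, not the driver's code: ``struct cst_state``, ``struct cst_package``, ``cst_package_usable()`` and ``first_usable_cst()`` are hypothetical names introduced here, and the real driver works on ACPI objects rather than on these simplified structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified entry kinds of one idle state listed by _CST. */
enum cst_entry_kind { CST_ENTRY_FFH, CST_ENTRY_SYSTEM_IO };

/* Hypothetical description of one idle state from a _CST return package. */
struct cst_state {
	enum cst_entry_kind kind;
	bool valid;
};

/* Hypothetical view of the return package of one CPU's _CST object. */
struct cst_package {
	const struct cst_state *states;
	size_t count;
};

/*
 * A package is usable if it holds at least one valid idle state
 * description and every state in it is of the FFH type.
 */
bool cst_package_usable(const struct cst_package *pkg)
{
	size_t i, valid = 0;

	assert(pkg != NULL);

	for (i = 0; i < pkg->count; i++) {
		if (pkg->states[i].kind != CST_ENTRY_FFH)
			return false;
		if (pkg->states[i].valid)
			valid++;
	}

	return valid > 0;
}

/* Return the index of the first usable package, or -1 if there is none. */
int first_usable_cst(const struct cst_package *pkgs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (cst_package_usable(&pkgs[i]))
			return (int)i;

	return -1;
}
```

In this model, the package found by ``first_usable_cst()`` would then be treated as applicable to all of the CPUs in the system.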
|
||||
|
||||
Next, the first (index 0) entry in the list of available idle states is
|
||||
initialized to represent a "polling idle state" (a pseudo-idle state in which
|
||||
the target CPU continuously fetches and executes instructions), and the
|
||||
subsequent (real) idle state entries are populated as follows.
|
||||
|
||||
If the processor model at hand is recognized by ``intel_idle``, there is a
|
||||
(static) table of idle state descriptions for it in the driver. In that case,
|
||||
the "internal" table is the primary source of information on idle states and the
|
||||
information from it is copied to the final list of available idle states. If
|
||||
using the ACPI tables for the enumeration of idle states is not required
|
||||
(depending on the processor model), all of the listed idle states are enabled by
|
||||
default (so all of them will be taken into consideration by ``CPUIdle``
|
||||
governors during CPU idle state selection). Otherwise, some of the listed idle
|
||||
states may not be enabled by default if there are no matching entries in the
|
||||
preliminary list of idle states coming from the ACPI tables. In that case, user
|
||||
space can still enable them later (on a per-CPU basis) with the help of
|
||||
the ``disable`` idle state attribute in ``sysfs`` (see
|
||||
:ref:`idle-states-representation` in :doc:`cpuidle`). This basically means that
|
||||
the idle states "known" to the driver may not be enabled by default if they have
|
||||
not been exposed by the platform firmware (through the ACPI tables).
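
Whether a given idle state is enabled by default can be checked from user space through the ``default_status`` attribute added by this series. The following is a minimal sketch in C, assuming the ``/sys/devices/system/cpu/cpuX/cpuidle/stateN/default_status`` path layout from the ABI document; the helper names ``default_status_path()`` and ``read_sysfs_line()`` are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the path of the default_status attribute for the given CPU and
 * idle state index.  Returns 0 on success, -1 if the buffer is too small.
 */
int default_status_path(char *out, size_t len, unsigned int cpu,
			unsigned int state)
{
	int n;

	assert(out != NULL);

	n = snprintf(out, len,
		     "/sys/devices/system/cpu/cpu%u/cpuidle/state%u/default_status",
		     cpu, state);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Read the first line of a sysfs attribute into buf and strip the
 * trailing newline.  Returns 0 on success, -1 on any failure.
 */
int read_sysfs_line(const char *path, char *buf, size_t len)
{
	FILE *f;

	assert(path != NULL && buf != NULL);

	if (len == 0)
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}
```

On a kernel carrying this change, reading the attribute this way is expected to yield either "enabled" or "disabled", as stated in the ABI description.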
|
||||
|
||||
If the given processor model is not recognized by ``intel_idle``, but it
|
||||
supports ``MWAIT``, the preliminary list of idle states coming from the ACPI
|
||||
tables is used for building the final list that will be supplied to the
|
||||
``CPUIdle`` core during driver registration. For each idle state in that list,
|
||||
the description, ``MWAIT`` hint and exit latency are copied to the corresponding
|
||||
entry in the final list of idle states. The name of the idle state represented
|
||||
by it (to be returned by the ``name`` idle state attribute in ``sysfs``) is
|
||||
"CX_ACPI", where X is the index of that idle state in the final list (note that
|
||||
the minimum value of X is 1, because 0 is reserved for the "polling" state), and
|
||||
its target residency is based on the exit latency value. Specifically, for
|
||||
C1-type idle states the exit latency value is also used as the target residency
|
||||
(for compatibility with the majority of the "internal" tables of idle states for
|
||||
various processor models recognized by ``intel_idle``) and for the other idle
|
||||
state types (C2 and C3) the target residency value is 3 times the exit latency
|
||||
(again, that is because it reflects the target residency to exit latency ratio
|
||||
in the majority of cases for the processor models recognized by ``intel_idle``).
|
||||
All of the idle states in the final list are enabled by default in this case.
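
The naming and target residency rules described above can be summarized with a short C sketch. It is a user-space illustration only, written under the assumption that the rules are exactly as stated in the text; ``enum cx_type``, ``acpi_target_residency_us()`` and ``acpi_state_name()`` are hypothetical names and not part of the driver.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical idle state types mirroring the ACPI C1/C2/C3 classification. */
enum cx_type { CX_TYPE_C1 = 1, CX_TYPE_C2 = 2, CX_TYPE_C3 = 3 };

/*
 * Derive the target residency (in microseconds) from the exit latency:
 * C1-type states reuse the exit latency, the other types use 3 times it.
 */
unsigned int acpi_target_residency_us(enum cx_type type,
				      unsigned int exit_latency_us)
{
	assert(type >= CX_TYPE_C1 && type <= CX_TYPE_C3);

	return type == CX_TYPE_C1 ? exit_latency_us : 3 * exit_latency_us;
}

/*
 * Format the "CX_ACPI" name for the state at the given index in the
 * final list (index 0 is reserved for the polling state).
 * Returns 0 on success, -1 if the buffer is too small.
 */
int acpi_state_name(char *buf, size_t len, unsigned int index)
{
	int n;

	assert(buf != NULL && index >= 1);

	n = snprintf(buf, len, "C%u_ACPI", index);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For instance, under these rules a C3-type state with an exit latency of 100 microseconds gets a target residency of 300 microseconds.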
|
||||
|
||||
|
||||
.. _intel-idle-initialization:
|
||||
|
||||
Initialization
|
||||
==============
|
||||
|
||||
The initialization of ``intel_idle`` starts with checking if the kernel command
|
||||
line options forbid the use of the ``MWAIT`` instruction. If that is the case,
|
||||
an error code is returned right away.
|
||||
|
||||
The next step is to check whether or not the processor model is known to the
|
||||
driver, which determines the idle states enumeration method (see
|
||||
`above <intel-idle-enumeration-of-states_>`_), and whether or not the processor
|
||||
supports ``MWAIT`` (the initialization fails if that is not the case). Then,
|
||||
the ``MWAIT`` support in the processor is enumerated through ``CPUID`` and the
|
||||
driver initialization fails if the level of support is not as expected (for
|
||||
example, if the total number of ``MWAIT`` substates returned is 0).
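
As a rough illustration of this check, the sketch below decodes register values as returned by ``CPUID`` leaf 5. It assumes the layout documented in the Intel Software Developer's Manual (``ECX`` bit 0: MONITOR/MWAIT extensions enumerated, ``ECX`` bit 1: interrupts as break events, ``EDX``: one 4-bit sub-state count per C-state); the function names are hypothetical and the exact set of checks made by the driver may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 5 ECX bits (layout assumed from the Intel SDM). */
#define CPUID5_ECX_EXTENSIONS_SUPPORTED	0x1u
#define CPUID5_ECX_INTERRUPT_BREAK	0x2u

/*
 * Return true if the enumerated level of MWAIT support looks usable:
 * both ECX feature bits are set and EDX reports at least one sub-state.
 */
bool mwait_support_ok(uint32_t ecx, uint32_t edx)
{
	if (!(ecx & CPUID5_ECX_EXTENSIONS_SUPPORTED))
		return false;

	if (!(ecx & CPUID5_ECX_INTERRUPT_BREAK))
		return false;

	return edx != 0;
}

/*
 * Extract the number of MWAIT sub-states reported in EDX for the given
 * C-state index (0 to 7), using one 4-bit field per index.
 */
unsigned int mwait_num_substates(uint32_t edx, unsigned int cstate)
{
	assert(cstate < 8);

	return (edx >> (cstate * 4)) & 0xfu;
}
```

On real hardware the ``ecx`` and ``edx`` arguments would come from executing ``CPUID`` with ``EAX`` set to 5; they are plain parameters here so that the decoding logic can be exercised on any machine.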
|
||||
|
||||
Next, if the driver is not configured to ignore the ACPI tables (see
|
||||
`below <intel-idle-parameters_>`_), the idle states information provided by the
|
||||
platform firmware is extracted from them.
|
||||
|
||||
Then, ``CPUIdle`` device objects are allocated for all CPUs and the list of
|
||||
available idle states is created as explained
|
||||
`above <intel-idle-enumeration-of-states_>`_.
|
||||
|
||||
Finally, ``intel_idle`` is registered with the help of cpuidle_register_driver()
|
||||
as the ``CPUIdle`` driver for all CPUs in the system and a CPU online callback
|
||||
for configuring individual CPUs is registered via cpuhp_setup_state(), which
|
||||
(among other things) causes the callback routine to be invoked for all of the
|
||||
CPUs present in the system at that time (each CPU executes its own instance of
|
||||
the callback routine). That routine registers a ``CPUIdle`` device for the CPU
|
||||
running it (which enables the ``CPUIdle`` subsystem to operate that CPU) and
|
||||
optionally performs some CPU-specific initialization actions that may be
|
||||
required for the given processor model.
|
||||
|
||||
|
||||
.. _intel-idle-parameters:
|
||||
|
||||
Kernel Command Line Options and Module Parameters
|
||||
=================================================
|
||||
|
||||
The *x86* architecture support code recognizes three kernel command line
|
||||
options related to CPU idle time management: ``idle=poll``, ``idle=halt``,
|
||||
and ``idle=nomwait``. If any of them is present in the kernel command line, the
|
||||
``MWAIT`` instruction is not allowed to be used, so the initialization of
|
||||
``intel_idle`` will fail.
|
||||
|
||||
Apart from that there are two module parameters recognized by ``intel_idle``
|
||||
itself that can be set via the kernel command line (they cannot be updated via
|
||||
sysfs, so that is the only way to change their values).
|
||||
|
||||
The ``max_cstate`` parameter value is the maximum idle state index in the list
|
||||
of idle states supplied to the ``CPUIdle`` core during the registration of the
|
||||
driver. It is also the maximum number of regular (non-polling) idle states that
|
||||
can be used by ``intel_idle``, so the enumeration of idle states is terminated
|
||||
after finding that number of usable idle states (the other idle states that
|
||||
potentially might have been used if ``max_cstate`` had been greater are not
|
||||
taken into consideration at all). Setting ``max_cstate`` can prevent
|
||||
``intel_idle`` from exposing idle states that are regarded as "too deep" for
|
||||
some reason to the ``CPUIdle`` core, but it does so by making them effectively
|
||||
invisible until the system is shut down and started again, which may not always
|
||||
be desirable. In practice, it is only really necessary to do that if the idle
|
||||
states in question cannot be enabled during system startup, because in the
|
||||
working state of the system the CPU power management quality of service (PM
|
||||
QoS) feature can be used to prevent ``CPUIdle`` from touching those idle states
|
||||
even if they have been enumerated (see :ref:`cpu-pm-qos` in :doc:`cpuidle`).
|
||||
Setting ``max_cstate`` to 0 causes the ``intel_idle`` initialization to fail.
|
||||
|
||||
The ``noacpi`` module parameter (which is recognized by ``intel_idle`` if the
|
||||
kernel has been configured with ACPI support), can be set to make the driver
|
||||
ignore the system's ACPI tables entirely (it is unset by default).
|
||||
|
||||
|
||||
.. _intel-idle-core-and-package-idle-states:
|
||||
|
||||
Core and Package Levels of Idle States
|
||||
======================================
|
||||
|
||||
Typically, in a processor supporting the ``MWAIT`` instruction there are (at
|
||||
least) two levels of idle states (or C-states). One level, referred to as
|
||||
"core C-states", covers individual cores in the processor, whereas the other
|
||||
level, referred to as "package C-states", covers the entire processor package
|
||||
and it may also involve other components of the system (GPUs, memory
|
||||
controllers, I/O hubs etc.).
|
||||
|
||||
Some of the ``MWAIT`` hint values allow the processor to use core C-states only
|
||||
(most importantly, that is the case for the ``MWAIT`` hint value corresponding
|
||||
to the ``C1`` idle state), but the majority of them give it a license to put
|
||||
the target core (i.e. the core containing the logical CPU executing ``MWAIT``
|
||||
with the given hint value) into a specific core C-state and then (if possible)
|
||||
to enter a specific package C-state at the deeper level. For example, the
|
||||
``MWAIT`` hint value representing the ``C3`` idle state allows the processor to
|
||||
put the target core into the low-power state referred to as "core ``C3``" (or
|
||||
``CC3``), which happens if all of the logical CPUs (SMT siblings) in that core
|
||||
have executed ``MWAIT`` with the ``C3`` hint value (or with a hint value
|
||||
representing a deeper idle state), and in addition to that (in the majority of
|
||||
cases) it gives the processor a license to put the entire package (possibly
|
||||
including some non-CPU components such as a GPU or a memory controller) into the
|
||||
low-power state referred to as "package ``C3``" (or ``PC3``), which happens if
|
||||
all of the cores have gone into the ``CC3`` state and (possibly) some additional
|
||||
conditions are satisfied (for instance, if the GPU is covered by ``PC3``, it may
|
||||
be required to be in a certain GPU-specific low-power state for ``PC3`` to be
|
||||
reachable).
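
The relationship between the hint values requested by logical CPUs and the resulting core and package states can be modeled, very roughly, as taking the shallowest request within each group. The sketch below is a deliberately simplified model and not a description of actual hardware behavior: it assumes idle state depths can be compared as plain integers (a larger value meaning a deeper state), it ignores hint values that are core-level only, and it ignores the additional conditions (such as GPU state) mentioned above. The function name ``shallowest_request()`` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the shallowest (smallest) depth requested by any member of a
 * group.  In this simplified model, a group cannot go deeper than that.
 */
unsigned int shallowest_request(const unsigned int *depths, size_t n)
{
	unsigned int min;
	size_t i;

	assert(depths != NULL && n > 0);

	min = depths[0];
	for (i = 1; i < n; i++)
		if (depths[i] < min)
			min = depths[i];

	return min;
}
```

Applied to the SMT siblings of one core, the result bounds the core C-state; applied to the per-core results, it bounds the package C-state. For example, a core whose two siblings requested depths 3 and 6 can reach depth 3 in this model, whereas siblings requesting 3 and 1 limit the core to depth 1.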
|
||||
|
||||
As a rule, there is no simple way to make the processor use core C-states only
|
||||
if the conditions for entering the corresponding package C-states are met, so
|
||||
the logical CPU executing ``MWAIT`` with a hint value that is not core-level
|
||||
only (like for ``C1``) must always assume that this may cause the processor to
|
||||
enter a package C-state. [That is why the exit latency and target residency
|
||||
values corresponding to the majority of ``MWAIT`` hint values in the "internal"
|
||||
tables of idle states in ``intel_idle`` reflect the properties of package
|
||||
C-states.] If using package C-states is not desirable at all, either
|
||||
:ref:`PM QoS <cpu-pm-qos>` or the ``max_cstate`` module parameter of
|
||||
``intel_idle`` described `above <intel-idle-parameters_>`_ must be used to
|
||||
restrict the range of permissible idle states to the ones with core-level only
|
||||
``MWAIT`` hint values (like ``C1``).
|
||||
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
.. [1] *Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 2B*,
|
||||
https://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-software-developer-vol-2b-manual.html
|
||||
|
||||
.. [2] *Advanced Configuration and Power Interface (ACPI) Specification*,
|
||||
https://uefi.org/specifications
|
@ -8,6 +8,7 @@ Working-State Power Management
|
||||
:maxdepth: 2
|
||||
|
||||
cpuidle
|
||||
intel_idle
|
||||
cpufreq
|
||||
intel_pstate
|
||||
intel_epb
|
||||
|
@ -241,6 +241,7 @@ config ACPI_CPU_FREQ_PSS
|
||||
|
||||
config ACPI_PROCESSOR_CSTATE
|
||||
def_bool y
|
||||
depends on ACPI_PROCESSOR
|
||||
depends on IA64 || X86
|
||||
|
||||
config ACPI_PROCESSOR_IDLE
|
||||
|
@ -705,3 +705,185 @@ void __init acpi_processor_init(void)
|
||||
acpi_scan_add_handler_with_hotplug(&processor_handler, "processor");
|
||||
acpi_scan_add_handler(&processor_container_handler);
|
||||
}
|
||||
|
||||
#ifdef CONFIG_ACPI_PROCESSOR_CSTATE
|
||||
/**
|
||||
* acpi_processor_claim_cst_control - Request _CST control from the platform.
|
||||
*/
|
||||
bool acpi_processor_claim_cst_control(void)
|
||||
{
|
||||
static bool cst_control_claimed;
|
||||
acpi_status status;
|
||||
|
||||
if (!acpi_gbl_FADT.cst_control || cst_control_claimed)
|
||||
return true;
|
||||
|
||||
status = acpi_os_write_port(acpi_gbl_FADT.smi_command,
|
||||
acpi_gbl_FADT.cst_control, 8);
|
||||
if (ACPI_FAILURE(status)) {
|
||||
pr_warn("ACPI: Failed to claim processor _CST control\n");
|
||||
return false;
|
||||
}
|
||||
|
||||
cst_control_claimed = true;
|
||||
return true;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(acpi_processor_claim_cst_control);
|
||||
|
||||
/**
|
||||
* acpi_processor_evaluate_cst - Evaluate the processor _CST control method.
|
||||
* @handle: ACPI handle of the processor object containing the _CST.
|
||||
* @cpu: The numeric ID of the target CPU.
|
||||
* @info: Object to write the C-states information into.
|
||||
*
|
||||
* Extract the C-state information for the given CPU from the output of the _CST
|
||||
* control method under the corresponding ACPI processor object (or processor
|
||||
* device object) and populate @info with it.
|
||||
*
|
||||
* If any ACPI_ADR_SPACE_FIXED_HARDWARE C-states are found, invoke
|
||||
* acpi_processor_ffh_cstate_probe() to verify them and update the
|
||||
* cpu_cstate_entry data for @cpu.
|
||||
*/
|
||||
int acpi_processor_evaluate_cst(acpi_handle handle, u32 cpu,
|
||||
struct acpi_processor_power *info)
|
||||
{
|
||||
struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
|
||||
union acpi_object *cst;
|
||||
acpi_status status;
|
||||
u64 count;
|
||||
int last_index = 0;
|
||||
int i, ret = 0;
|
||||
|
||||
status = acpi_evaluate_object(handle, "_CST", NULL, &buffer);
|
||||
if (ACPI_FAILURE(status)) {
|
||||
acpi_handle_debug(handle, "No _CST\n");
|
||||
return -ENODEV;
|
||||
}
|
||||
|
||||
cst = buffer.pointer;
|
||||
|
||||
/* There must be at least 2 elements. */
|
||||
if (!cst || cst->type != ACPI_TYPE_PACKAGE || cst->package.count < 2) {
|
||||
acpi_handle_warn(handle, "Invalid _CST output\n");
|
||||
ret = -EFAULT;
|
||||
goto end;
|
||||
}
|
||||
|
||||
count = cst->package.elements[0].integer.value;
|
||||
|
||||
/* Validate the number of C-states. */
|
||||
if (count < 1 || count != cst->package.count - 1) {
|
||||
acpi_handle_warn(handle, "Inconsistent _CST data\n");
|
||||
ret = -EFAULT;
|
||||
goto end;
|
||||
}
|
||||
|
||||
for (i = 1; i <= count; i++) {
|
||||
union acpi_object *element;
|
||||
union acpi_object *obj;
|
||||
struct acpi_power_register *reg;
|
||||
struct acpi_processor_cx cx;
|
||||
|
||||
/*
|
||||
* If there is not enough space for all C-states, skip the
|
||||
* excess ones and log a warning.
|
||||
*/
|
||||
if (last_index >= ACPI_PROCESSOR_MAX_POWER - 1) {
|
||||
acpi_handle_warn(handle,
|
||||
"No room for more idle states (limit: %d)\n",
|
||||
ACPI_PROCESSOR_MAX_POWER - 1);
|
||||
break;
|
||||
}
|
||||
|
||||
memset(&cx, 0, sizeof(cx));
|
||||
|
||||
element = &cst->package.elements[i];
|
||||
if (element->type != ACPI_TYPE_PACKAGE)
|
||||
continue;
|
||||
|
||||
if (element->package.count != 4)
|
||||
continue;
|
||||
|
||||
obj = &element->package.elements[0];
|
||||
|
||||
if (obj->type != ACPI_TYPE_BUFFER)
|
||||
continue;
|
||||
|
||||
reg = (struct acpi_power_register *)obj->buffer.pointer;
|
||||
|
||||
obj = &element->package.elements[1];
|
||||
if (obj->type != ACPI_TYPE_INTEGER)
|
||||
continue;
|
||||
|
||||
cx.type = obj->integer.value;
|
||||
/*
|
||||
* There are known cases in which the _CST output does not
|
||||
* contain C1, so if the type of the first state found is not
|
||||
* C1, leave an empty slot for C1 to be filled in later.
|
||||
*/
|
||||
if (i == 1 && cx.type != ACPI_STATE_C1)
|
||||
last_index = 1;
|
||||
|
||||
cx.address = reg->address;
|
||||
cx.index = last_index + 1;
|
||||
|
||||
if (reg->space_id == ACPI_ADR_SPACE_FIXED_HARDWARE) {
|
||||
if (!acpi_processor_ffh_cstate_probe(cpu, &cx, reg)) {
|
||||
/*
|
||||
* In the majority of cases _CST describes C1 as
|
||||
* a FIXED_HARDWARE C-state, but if the command
|
||||
* line forbids using MWAIT, use CSTATE_HALT for
|
||||
* C1 regardless.
|
||||
*/
|
||||
if (cx.type == ACPI_STATE_C1 &&
|
||||
boot_option_idle_override == IDLE_NOMWAIT) {
|
||||
cx.entry_method = ACPI_CSTATE_HALT;
|
||||
snprintf(cx.desc, ACPI_CX_DESC_LEN, "ACPI HLT");
|
||||
} else {
|
||||
cx.entry_method = ACPI_CSTATE_FFH;
|
||||
}
|
||||
} else if (cx.type == ACPI_STATE_C1) {
|
||||
/*
|
||||
* In the special case of C1, FIXED_HARDWARE can
|
||||
* be handled by executing the HLT instruction.
|
||||
*/
|
||||
cx.entry_method = ACPI_CSTATE_HALT;
|
||||
snprintf(cx.desc, ACPI_CX_DESC_LEN, "ACPI HLT");
|
||||
} else {
|
||||
continue;
|
||||
}
|
||||
} else if (reg->space_id == ACPI_ADR_SPACE_SYSTEM_IO) {
|
||||
cx.entry_method = ACPI_CSTATE_SYSTEMIO;
|
||||
snprintf(cx.desc, ACPI_CX_DESC_LEN, "ACPI IOPORT 0x%x",
|
||||
cx.address);
|
||||
} else {
|
||||
continue;
|
||||
}
|
||||
|
||||
if (cx.type == ACPI_STATE_C1)
|
||||
cx.valid = 1;
|
||||
|
||||
obj = &element->package.elements[2];
|
||||
if (obj->type != ACPI_TYPE_INTEGER)
|
||||
continue;
|
||||
|
||||
cx.latency = obj->integer.value;
|
||||
|
||||
obj = &element->package.elements[3];
|
||||
if (obj->type != ACPI_TYPE_INTEGER)
|
||||
continue;
|
||||
|
||||
memcpy(&info->states[++last_index], &cx, sizeof(cx));
|
||||
}
|
||||
|
||||
acpi_handle_info(handle, "Found %d idle states\n", last_index);
|
||||
|
||||
info->count = last_index;
|
||||
|
||||
end:
|
||||
kfree(buffer.pointer);
|
||||
|
||||
return ret;
|
||||
}
|
||||
EXPORT_SYMBOL_GPL(acpi_processor_evaluate_cst);
|
||||
#endif /* CONFIG_ACPI_PROCESSOR_CSTATE */
|
||||
|
@ -299,164 +299,24 @@ static int acpi_processor_get_power_info_default(struct acpi_processor *pr)
|
||||
|
||||
static int acpi_processor_get_power_info_cst(struct acpi_processor *pr)
|
||||
{
|
||||
acpi_status status;
|
||||
u64 count;
|
||||
int current_count;
|
||||
int i, ret = 0;
|
||||
struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL };
|
||||
union acpi_object *cst;
|
||||
int ret;
|
||||
|
||||
if (nocst)
|
||||
return -ENODEV;
|
||||
|
||||
current_count = 0;
|
||||
|
||||
status = acpi_evaluate_object(pr->handle, "_CST", NULL, &buffer);
|
||||
if (ACPI_FAILURE(status)) {
|
||||
ACPI_DEBUG_PRINT((ACPI_DB_INFO, "No _CST, giving up\n"));
|
||||
return -ENODEV;
|
||||
}
|
||||
|
||||
cst = buffer.pointer;
|
||||
|
||||
/* There must be at least 2 elements */
|
||||
if (!cst || (cst->type != ACPI_TYPE_PACKAGE) || cst->package.count < 2) {
|
||||
pr_err("not enough elements in _CST\n");
|
||||
ret = -EFAULT;
|
||||
goto end;
|
||||
}
|
||||
|
||||
count = cst->package.elements[0].integer.value;
|
||||
|
||||
/* Validate number of power states. */
|
||||
if (count < 1 || count != cst->package.count - 1) {
|
||||
pr_err("count given by _CST is not valid\n");
|
||||
ret = -EFAULT;
|
||||
goto end;
|
||||
}
|
||||
|
||||
/* Tell driver that at least _CST is supported. */
|
||||
pr->flags.has_cst = 1;
|
||||
|
||||
for (i = 1; i <= count; i++) {
|
||||
union acpi_object *element;
|
||||
union acpi_object *obj;
|
||||
struct acpi_power_register *reg;
|
||||
struct acpi_processor_cx cx;
|
||||
|
||||
memset(&cx, 0, sizeof(cx));
|
||||
|
||||
element = &(cst->package.elements[i]);
|
||||
if (element->type != ACPI_TYPE_PACKAGE)
|
||||
continue;
|
||||
|
||||
if (element->package.count != 4)
|
||||
continue;
|
||||
|
||||
obj = &(element->package.elements[0]);
|
||||
|
||||
if (obj->type != ACPI_TYPE_BUFFER)
|
||||
continue;
|
||||
|
||||
reg = (struct acpi_power_register *)obj->buffer.pointer;
|
||||
|
||||
if (reg->space_id != ACPI_ADR_SPACE_SYSTEM_IO &&
|
||||
(reg->space_id != ACPI_ADR_SPACE_FIXED_HARDWARE))
|
||||
continue;
|
||||
|
||||
/* There should be an easy way to extract an integer... */
|
||||
obj = &(element->package.elements[1]);
|
||||
if (obj->type != ACPI_TYPE_INTEGER)
|
||||
continue;
|
||||
|
||||
cx.type = obj->integer.value;
|
||||
/*
|
||||
* Some buggy BIOSes won't list C1 in _CST -
|
||||
* Let acpi_processor_get_power_info_default() handle them later
|
||||
*/
|
||||
if (i == 1 && cx.type != ACPI_STATE_C1)
|
||||
current_count++;
|
||||
|
||||
cx.address = reg->address;
|
||||
cx.index = current_count + 1;
|
||||
|
||||
cx.entry_method = ACPI_CSTATE_SYSTEMIO;
|
||||
if (reg->space_id == ACPI_ADR_SPACE_FIXED_HARDWARE) {
|
||||
if (acpi_processor_ffh_cstate_probe
|
||||
(pr->id, &cx, reg) == 0) {
|
||||
cx.entry_method = ACPI_CSTATE_FFH;
|
||||
} else if (cx.type == ACPI_STATE_C1) {
|
||||
/*
|
||||
* C1 is a special case where FIXED_HARDWARE
|
||||
* can be handled in non-MWAIT way as well.
|
||||
* In that case, save this _CST entry info.
|
||||
* Otherwise, ignore this info and continue.
|
||||
*/
|
||||
cx.entry_method = ACPI_CSTATE_HALT;
|
||||
snprintf(cx.desc, ACPI_CX_DESC_LEN, "ACPI HLT");
|
||||
} else {
|
||||
continue;
|
||||
}
|
||||
if (cx.type == ACPI_STATE_C1 &&
|
||||
(boot_option_idle_override == IDLE_NOMWAIT)) {
|
||||
/*
|
||||
* In most cases the C1 space_id obtained from
|
||||
* _CST object is FIXED_HARDWARE access mode.
|
||||
* But when the option of idle=halt is added,
|
||||
* the entry_method type should be changed from
|
||||
* CSTATE_FFH to CSTATE_HALT.
|
||||
* When the option of idle=nomwait is added,
|
||||
* the C1 entry_method type should be
|
||||
* CSTATE_HALT.
|
||||
*/
|
||||
cx.entry_method = ACPI_CSTATE_HALT;
|
||||
snprintf(cx.desc, ACPI_CX_DESC_LEN, "ACPI HLT");
|
||||
}
|
||||
} else {
|
||||
snprintf(cx.desc, ACPI_CX_DESC_LEN, "ACPI IOPORT 0x%x",
|
||||
cx.address);
|
||||
}
|
||||
|
||||
if (cx.type == ACPI_STATE_C1) {
|
||||
cx.valid = 1;
|
||||
}
|
||||
|
||||
obj = &(element->package.elements[2]);
|
||||
if (obj->type != ACPI_TYPE_INTEGER)
|
||||
continue;
|
||||
|
||||
cx.latency = obj->integer.value;
|
||||
|
||||
obj = &(element->package.elements[3]);
|
||||
if (obj->type != ACPI_TYPE_INTEGER)
|
||||
continue;
|
||||
|
||||
current_count++;
|
||||
memcpy(&(pr->power.states[current_count]), &cx, sizeof(cx));
|
||||
|
||||
/*
|
||||
* We support total ACPI_PROCESSOR_MAX_POWER - 1
|
||||
* (From 1 through ACPI_PROCESSOR_MAX_POWER - 1)
|
||||
*/
|
||||
if (current_count >= (ACPI_PROCESSOR_MAX_POWER - 1)) {
|
||||
pr_warn("Limiting number of power states to max (%d)\n",
|
||||
ACPI_PROCESSOR_MAX_POWER);
|
||||
pr_warn("Please increase ACPI_PROCESSOR_MAX_POWER if needed.\n");
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Found %d power states\n",
|
||||
current_count));
|
||||
|
||||
/* Validate number of power states discovered */
|
||||
if (current_count < 2)
|
||||
ret = -EFAULT;
|
||||
|
||||
end:
|
||||
kfree(buffer.pointer);
|
||||
|
||||
ret = acpi_processor_evaluate_cst(pr->handle, pr->id, &pr->power);
|
||||
if (ret)
|
||||
return ret;
|
||||
|
||||
/*
|
||||
* It is expected that there will be at least 2 states, C1 and
|
||||
* something else (C2 or C3), so fail if that is not the case.
|
||||
*/
|
||||
if (pr->power.count < 2)
|
||||
return -EFAULT;
|
||||
|
||||
pr->flags.has_cst = 1;
|
||||
return 0;
|
||||
}
|
||||
|
||||
static void acpi_processor_power_verify_c3(struct acpi_processor *pr,
|
||||
@ -909,7 +769,6 @@ static int acpi_processor_setup_cstates(struct acpi_processor *pr)
|
||||
|
||||
static inline void acpi_processor_cstate_first_run_checks(void)
|
||||
{
|
||||
acpi_status status;
|
||||
static int first_run;
|
||||
|
||||
if (first_run)
|
||||
@ -921,13 +780,10 @@ static inline void acpi_processor_cstate_first_run_checks(void)
|
||||
max_cstate);
|
||||
first_run++;
|
||||
|
||||
if (acpi_gbl_FADT.cst_control && !nocst) {
|
||||
status = acpi_os_write_port(acpi_gbl_FADT.smi_command,
|
||||
acpi_gbl_FADT.cst_control, 8);
|
||||
if (ACPI_FAILURE(status))
|
||||
ACPI_EXCEPTION((AE_INFO, status,
|
||||
"Notifying BIOS of _CST ability failed"));
|
||||
}
|
||||
if (nocst)
|
||||
return;
|
||||
|
||||
acpi_processor_claim_cst_control();
|
||||
}
|
||||
#else
|
||||
|
||||
|
@ -575,10 +575,14 @@ static int __cpuidle_register_device(struct cpuidle_device *dev)
|
||||
if (!try_module_get(drv->owner))
|
||||
return -EINVAL;
|
||||
|
||||
for (i = 0; i < drv->state_count; i++)
|
||||
for (i = 0; i < drv->state_count; i++) {
|
||||
if (drv->states[i].flags & CPUIDLE_FLAG_UNUSABLE)
|
||||
dev->states_usage[i].disable |= CPUIDLE_STATE_DISABLED_BY_DRIVER;
|
||||
|
||||
if (drv->states[i].flags & CPUIDLE_FLAG_OFF)
|
||||
dev->states_usage[i].disable |= CPUIDLE_STATE_DISABLED_BY_USER;
|
||||
}
|
||||
|
||||
per_cpu(cpuidle_devices, dev->cpu) = dev;
|
||||
list_add(&dev->device_list, &cpuidle_detected_devices);
|
||||
|
||||
|
@@ -329,6 +329,14 @@ static ssize_t store_state_disable(struct cpuidle_state *state,
 	return size;
 }

+static ssize_t show_state_default_status(struct cpuidle_state *state,
+					  struct cpuidle_state_usage *state_usage,
+					  char *buf)
+{
+	return sprintf(buf, "%s\n",
+		       state->flags & CPUIDLE_FLAG_OFF ? "disabled" : "enabled");
+}
+
 define_one_state_ro(name, show_state_name);
 define_one_state_ro(desc, show_state_desc);
 define_one_state_ro(latency, show_state_exit_latency);
@@ -339,6 +347,7 @@ define_one_state_ro(time, show_state_time);
 define_one_state_rw(disable, show_state_disable, store_state_disable);
 define_one_state_ro(above, show_state_above);
 define_one_state_ro(below, show_state_below);
+define_one_state_ro(default_status, show_state_default_status);

 static struct attribute *cpuidle_state_default_attrs[] = {
 	&attr_name.attr,
@@ -351,6 +360,7 @@ static struct attribute *cpuidle_state_default_attrs[] = {
 	&attr_disable.attr,
 	&attr_above.attr,
 	&attr_below.attr,
+	&attr_default_status.attr,
 	NULL
 };

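The new read-only `default_status` attribute reports a fixed string derived from the state's flags. As a rough stand-alone sketch (the helper name `default_status_string()` is hypothetical, and the flag value is taken from the `BIT(4)` definition near the end of this diff), the mapping amounts to:

```c
#include <assert.h>
#include <string.h>

#define BIT(n) (1u << (n))
#define CPUIDLE_FLAG_OFF BIT(4) /* disable this state by default */

/* Hypothetical helper mirroring the ternary in show_state_default_status(). */
static const char *default_status_string(unsigned int state_flags)
{
	return (state_flags & CPUIDLE_FLAG_OFF) ? "disabled" : "enabled";
}

/* Usage example: only the OFF flag influences the reported string. */
void default_status_example(void)
{
	assert(strcmp(default_status_string(CPUIDLE_FLAG_OFF), "disabled") == 0);
	assert(strcmp(default_status_string(0), "enabled") == 0);
}
```

Note that the string reflects the driver's default only; it does not change when the state is later enabled or disabled at run time.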
@@ -41,6 +41,7 @@

 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

+#include <linux/acpi.h>
 #include <linux/kernel.h>
 #include <linux/cpuidle.h>
 #include <linux/tick.h>
@@ -79,6 +80,7 @@ struct idle_cpu {
 	unsigned long auto_demotion_disable_flags;
 	bool byt_auto_demotion_disable_flag;
 	bool disable_promotion_to_c1e;
+	bool use_acpi;
 };

 static const struct idle_cpu *icpu;
@@ -89,6 +91,11 @@ static void intel_idle_s2idle(struct cpuidle_device *dev,
 			      struct cpuidle_driver *drv, int index);
 static struct cpuidle_state *cpuidle_state_table;

+/*
+ * Enable this state by default even if the ACPI _CST does not list it.
+ */
+#define CPUIDLE_FLAG_ALWAYS_ENABLE BIT(15)
+
 /*
  * Set this flag for states where the HW flushes the TLB for us
  * and so we don't need cross-calls to keep it consistent.
@@ -124,7 +131,7 @@ static struct cpuidle_state nehalem_cstates[] = {
 	{
 		.name = "C1E",
 		.desc = "MWAIT 0x01",
-		.flags = MWAIT2flg(0x01),
+		.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
 		.exit_latency = 10,
 		.target_residency = 20,
 		.enter = &intel_idle,
@@ -161,7 +168,7 @@ static struct cpuidle_state snb_cstates[] = {
 	{
 		.name = "C1E",
 		.desc = "MWAIT 0x01",
-		.flags = MWAIT2flg(0x01),
+		.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
 		.exit_latency = 10,
 		.target_residency = 20,
 		.enter = &intel_idle,
@@ -296,7 +303,7 @@ static struct cpuidle_state ivb_cstates[] = {
 	{
 		.name = "C1E",
 		.desc = "MWAIT 0x01",
-		.flags = MWAIT2flg(0x01),
+		.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
 		.exit_latency = 10,
 		.target_residency = 20,
 		.enter = &intel_idle,
@@ -341,7 +348,7 @@ static struct cpuidle_state ivt_cstates[] = {
 	{
 		.name = "C1E",
 		.desc = "MWAIT 0x01",
-		.flags = MWAIT2flg(0x01),
+		.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
 		.exit_latency = 10,
 		.target_residency = 80,
 		.enter = &intel_idle,
@@ -378,7 +385,7 @@ static struct cpuidle_state ivt_cstates_4s[] = {
 	{
 		.name = "C1E",
 		.desc = "MWAIT 0x01",
-		.flags = MWAIT2flg(0x01),
+		.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
 		.exit_latency = 10,
 		.target_residency = 250,
 		.enter = &intel_idle,
@@ -415,7 +422,7 @@ static struct cpuidle_state ivt_cstates_8s[] = {
 	{
 		.name = "C1E",
 		.desc = "MWAIT 0x01",
-		.flags = MWAIT2flg(0x01),
+		.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
 		.exit_latency = 10,
 		.target_residency = 500,
 		.enter = &intel_idle,
@@ -452,7 +459,7 @@ static struct cpuidle_state hsw_cstates[] = {
 	{
 		.name = "C1E",
 		.desc = "MWAIT 0x01",
-		.flags = MWAIT2flg(0x01),
+		.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
 		.exit_latency = 10,
 		.target_residency = 20,
 		.enter = &intel_idle,
@@ -520,7 +527,7 @@ static struct cpuidle_state bdw_cstates[] = {
 	{
 		.name = "C1E",
 		.desc = "MWAIT 0x01",
-		.flags = MWAIT2flg(0x01),
+		.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
 		.exit_latency = 10,
 		.target_residency = 20,
 		.enter = &intel_idle,
@@ -589,7 +596,7 @@ static struct cpuidle_state skl_cstates[] = {
 	{
 		.name = "C1E",
 		.desc = "MWAIT 0x01",
-		.flags = MWAIT2flg(0x01),
+		.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
 		.exit_latency = 10,
 		.target_residency = 20,
 		.enter = &intel_idle,
@@ -658,7 +665,7 @@ static struct cpuidle_state skx_cstates[] = {
 	{
 		.name = "C1E",
 		.desc = "MWAIT 0x01",
-		.flags = MWAIT2flg(0x01),
+		.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
 		.exit_latency = 10,
 		.target_residency = 20,
 		.enter = &intel_idle,
@@ -808,7 +815,7 @@ static struct cpuidle_state bxt_cstates[] = {
 	{
 		.name = "C1E",
 		.desc = "MWAIT 0x01",
-		.flags = MWAIT2flg(0x01),
+		.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
 		.exit_latency = 10,
 		.target_residency = 20,
 		.enter = &intel_idle,
@@ -869,7 +876,7 @@ static struct cpuidle_state dnv_cstates[] = {
 	{
 		.name = "C1E",
 		.desc = "MWAIT 0x01",
-		.flags = MWAIT2flg(0x01),
+		.flags = MWAIT2flg(0x01) | CPUIDLE_FLAG_ALWAYS_ENABLE,
 		.exit_latency = 10,
 		.target_residency = 20,
 		.enter = &intel_idle,
@@ -944,6 +951,22 @@ static void intel_idle_s2idle(struct cpuidle_device *dev,
 	mwait_idle_with_hints(eax, ecx);
 }

+static bool intel_idle_verify_cstate(unsigned int mwait_hint)
+{
+	unsigned int mwait_cstate = MWAIT_HINT2CSTATE(mwait_hint) + 1;
+	unsigned int num_substates = (mwait_substates >> mwait_cstate * 4) &
+					MWAIT_SUBSTATE_MASK;
+
+	/* Ignore the C-state if there are NO sub-states in CPUID for it. */
+	if (num_substates == 0)
+		return false;
+
+	if (mwait_cstate > 2 && !boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
+		mark_tsc_unstable("TSC halts in idle states deeper than C2");
+
+	return true;
+}
+
 static void __setup_broadcast_timer(bool on)
 {
 	if (on)
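The sub-state check in `intel_idle_verify_cstate()` above is plain bit-field arithmetic: the upper nibble of the MWAIT hint selects a C-state, and the matching 4-bit field of the CPUID MWAIT sub-state word says how many sub-states exist. The following stand-alone sketch shows that arithmetic only. The macro values are assumed from the x86 MWAIT definitions (they are not part of this diff), the helper name `cstate_has_substates()` and the sample word `0x00001120` are made up, and the TSC handling is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Field layout assumed from the x86 MWAIT definitions; not shown in this diff. */
#define MWAIT_SUBSTATE_MASK	0xfu
#define MWAIT_SUBSTATE_SIZE	4
#define MWAIT_CSTATE_MASK	0xfu
#define MWAIT_HINT2CSTATE(hint)	(((hint) >> MWAIT_SUBSTATE_SIZE) & MWAIT_CSTATE_MASK)

/*
 * Hypothetical helper: true when the sub-state word advertises at least one
 * sub-state for the C-state selected by mwait_hint.
 */
static bool cstate_has_substates(uint32_t mwait_substates, uint32_t mwait_hint)
{
	/* Hint field 0 corresponds to C1, hence the "+ 1". */
	uint32_t mwait_cstate = MWAIT_HINT2CSTATE(mwait_hint) + 1;

	/* Sketch-only guard: a 32-bit word holds eight 4-bit fields (0..7). */
	if (mwait_cstate > 7)
		return false;

	return ((mwait_substates >> (mwait_cstate * 4)) & MWAIT_SUBSTATE_MASK) != 0;
}

/* Usage example with a made-up sub-state word. */
void cstate_has_substates_example(void)
{
	const uint32_t substates = 0x00001120u;

	assert(cstate_has_substates(substates, 0x00));  /* field 1 holds 2 */
	assert(cstate_has_substates(substates, 0x01));  /* same field, sub-state 1 */
	assert(cstate_has_substates(substates, 0x10));  /* field 2 holds 1 */
	assert(!cstate_has_substates(substates, 0x30)); /* field 4 holds 0 */
}
```

The explicit range guard is an addition of this sketch; the kernel code relies on the hints in its static tables staying within the valid range.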
@@ -975,6 +998,13 @@ static const struct idle_cpu idle_cpu_nehalem = {
 	.disable_promotion_to_c1e = true,
 };

+static const struct idle_cpu idle_cpu_nhx = {
+	.state_table = nehalem_cstates,
+	.auto_demotion_disable_flags = NHM_C1_AUTO_DEMOTE | NHM_C3_AUTO_DEMOTE,
+	.disable_promotion_to_c1e = true,
+	.use_acpi = true,
+};
+
 static const struct idle_cpu idle_cpu_atom = {
 	.state_table = atom_cstates,
 };
@@ -993,6 +1023,12 @@ static const struct idle_cpu idle_cpu_snb = {
 	.disable_promotion_to_c1e = true,
 };

+static const struct idle_cpu idle_cpu_snx = {
+	.state_table = snb_cstates,
+	.disable_promotion_to_c1e = true,
+	.use_acpi = true,
+};
+
 static const struct idle_cpu idle_cpu_byt = {
 	.state_table = byt_cstates,
 	.disable_promotion_to_c1e = true,
@@ -1013,6 +1049,7 @@ static const struct idle_cpu idle_cpu_ivb = {
 static const struct idle_cpu idle_cpu_ivt = {
 	.state_table = ivt_cstates,
 	.disable_promotion_to_c1e = true,
+	.use_acpi = true,
 };

 static const struct idle_cpu idle_cpu_hsw = {
@@ -1020,11 +1057,23 @@ static const struct idle_cpu idle_cpu_hsw = {
 	.disable_promotion_to_c1e = true,
 };

+static const struct idle_cpu idle_cpu_hsx = {
+	.state_table = hsw_cstates,
+	.disable_promotion_to_c1e = true,
+	.use_acpi = true,
+};
+
 static const struct idle_cpu idle_cpu_bdw = {
 	.state_table = bdw_cstates,
 	.disable_promotion_to_c1e = true,
 };

+static const struct idle_cpu idle_cpu_bdx = {
+	.state_table = bdw_cstates,
+	.disable_promotion_to_c1e = true,
+	.use_acpi = true,
+};
+
 static const struct idle_cpu idle_cpu_skl = {
 	.state_table = skl_cstates,
 	.disable_promotion_to_c1e = true,
@@ -1033,15 +1082,18 @@ static const struct idle_cpu idle_cpu_skl = {
 static const struct idle_cpu idle_cpu_skx = {
 	.state_table = skx_cstates,
 	.disable_promotion_to_c1e = true,
+	.use_acpi = true,
 };

 static const struct idle_cpu idle_cpu_avn = {
 	.state_table = avn_cstates,
 	.disable_promotion_to_c1e = true,
+	.use_acpi = true,
 };

 static const struct idle_cpu idle_cpu_knl = {
 	.state_table = knl_cstates,
+	.use_acpi = true,
 };

 static const struct idle_cpu idle_cpu_bxt = {
@@ -1052,20 +1104,21 @@ static const struct idle_cpu idle_cpu_bxt = {
 static const struct idle_cpu idle_cpu_dnv = {
 	.state_table = dnv_cstates,
 	.disable_promotion_to_c1e = true,
+	.use_acpi = true,
 };

 static const struct x86_cpu_id intel_idle_ids[] __initconst = {
-	INTEL_CPU_FAM6(NEHALEM_EP, idle_cpu_nehalem),
+	INTEL_CPU_FAM6(NEHALEM_EP, idle_cpu_nhx),
 	INTEL_CPU_FAM6(NEHALEM, idle_cpu_nehalem),
 	INTEL_CPU_FAM6(NEHALEM_G, idle_cpu_nehalem),
 	INTEL_CPU_FAM6(WESTMERE, idle_cpu_nehalem),
-	INTEL_CPU_FAM6(WESTMERE_EP, idle_cpu_nehalem),
-	INTEL_CPU_FAM6(NEHALEM_EX, idle_cpu_nehalem),
+	INTEL_CPU_FAM6(WESTMERE_EP, idle_cpu_nhx),
+	INTEL_CPU_FAM6(NEHALEM_EX, idle_cpu_nhx),
 	INTEL_CPU_FAM6(ATOM_BONNELL, idle_cpu_atom),
 	INTEL_CPU_FAM6(ATOM_BONNELL_MID, idle_cpu_lincroft),
-	INTEL_CPU_FAM6(WESTMERE_EX, idle_cpu_nehalem),
+	INTEL_CPU_FAM6(WESTMERE_EX, idle_cpu_nhx),
 	INTEL_CPU_FAM6(SANDYBRIDGE, idle_cpu_snb),
-	INTEL_CPU_FAM6(SANDYBRIDGE_X, idle_cpu_snb),
+	INTEL_CPU_FAM6(SANDYBRIDGE_X, idle_cpu_snx),
 	INTEL_CPU_FAM6(ATOM_SALTWELL, idle_cpu_atom),
 	INTEL_CPU_FAM6(ATOM_SILVERMONT, idle_cpu_byt),
 	INTEL_CPU_FAM6(ATOM_SILVERMONT_MID, idle_cpu_tangier),
@@ -1073,14 +1126,14 @@ static const struct x86_cpu_id intel_idle_ids[] __initconst = {
 	INTEL_CPU_FAM6(IVYBRIDGE, idle_cpu_ivb),
 	INTEL_CPU_FAM6(IVYBRIDGE_X, idle_cpu_ivt),
 	INTEL_CPU_FAM6(HASWELL, idle_cpu_hsw),
-	INTEL_CPU_FAM6(HASWELL_X, idle_cpu_hsw),
+	INTEL_CPU_FAM6(HASWELL_X, idle_cpu_hsx),
 	INTEL_CPU_FAM6(HASWELL_L, idle_cpu_hsw),
 	INTEL_CPU_FAM6(HASWELL_G, idle_cpu_hsw),
 	INTEL_CPU_FAM6(ATOM_SILVERMONT_D, idle_cpu_avn),
 	INTEL_CPU_FAM6(BROADWELL, idle_cpu_bdw),
 	INTEL_CPU_FAM6(BROADWELL_G, idle_cpu_bdw),
-	INTEL_CPU_FAM6(BROADWELL_X, idle_cpu_bdw),
-	INTEL_CPU_FAM6(BROADWELL_D, idle_cpu_bdw),
+	INTEL_CPU_FAM6(BROADWELL_X, idle_cpu_bdx),
+	INTEL_CPU_FAM6(BROADWELL_D, idle_cpu_bdx),
 	INTEL_CPU_FAM6(SKYLAKE_L, idle_cpu_skl),
 	INTEL_CPU_FAM6(SKYLAKE, idle_cpu_skl),
 	INTEL_CPU_FAM6(KABYLAKE_L, idle_cpu_skl),
@@ -1095,6 +1148,162 @@ static const struct x86_cpu_id intel_idle_ids[] __initconst = {
 	{}
 };

+#define INTEL_CPU_FAM6_MWAIT \
+	{ X86_VENDOR_INTEL, 6, X86_MODEL_ANY, X86_FEATURE_MWAIT, 0 }
+
+static const struct x86_cpu_id intel_mwait_ids[] __initconst = {
+	INTEL_CPU_FAM6_MWAIT,
+	{}
+};
+
+static bool intel_idle_max_cstate_reached(int cstate)
+{
+	if (cstate + 1 > max_cstate) {
+		pr_info("max_cstate %d reached\n", max_cstate);
+		return true;
+	}
+	return false;
+}
+
+#ifdef CONFIG_ACPI_PROCESSOR_CSTATE
+#include <acpi/processor.h>
+
+static bool no_acpi __read_mostly;
+module_param(no_acpi, bool, 0444);
+MODULE_PARM_DESC(no_acpi, "Do not use ACPI _CST for building the idle states list");
+
+static struct acpi_processor_power acpi_state_table;
+
+/**
+ * intel_idle_cst_usable - Check if the _CST information can be used.
+ *
+ * Check if all of the C-states listed by _CST in the max_cstate range are
+ * ACPI_CSTATE_FFH, which means that they should be entered via MWAIT.
+ */
+static bool intel_idle_cst_usable(void)
+{
+	int cstate, limit;
+
+	limit = min_t(int, min_t(int, CPUIDLE_STATE_MAX, max_cstate + 1),
+		      acpi_state_table.count);
+
+	for (cstate = 1; cstate < limit; cstate++) {
+		struct acpi_processor_cx *cx = &acpi_state_table.states[cstate];
+
+		if (cx->entry_method != ACPI_CSTATE_FFH)
+			return false;
+	}
+
+	return true;
+}
+
+static bool intel_idle_acpi_cst_extract(void)
+{
+	unsigned int cpu;
+
+	if (no_acpi) {
+		pr_debug("Not allowed to use ACPI _CST\n");
+		return false;
+	}
+
+	for_each_possible_cpu(cpu) {
+		struct acpi_processor *pr = per_cpu(processors, cpu);
+
+		if (!pr)
+			continue;
+
+		if (acpi_processor_evaluate_cst(pr->handle, cpu, &acpi_state_table))
+			continue;
+
+		acpi_state_table.count++;
+
+		if (!intel_idle_cst_usable())
+			continue;
+
+		if (!acpi_processor_claim_cst_control()) {
+			acpi_state_table.count = 0;
+			return false;
+		}
+
+		return true;
+	}
+
+	pr_debug("ACPI _CST not found or not usable\n");
+	return false;
+}
+
+static void intel_idle_init_cstates_acpi(struct cpuidle_driver *drv)
+{
+	int cstate, limit = min_t(int, CPUIDLE_STATE_MAX, acpi_state_table.count);
+
+	/*
+	 * If limit > 0, intel_idle_cst_usable() has returned 'true', so all of
+	 * the interesting states are ACPI_CSTATE_FFH.
+	 */
+	for (cstate = 1; cstate < limit; cstate++) {
+		struct acpi_processor_cx *cx;
+		struct cpuidle_state *state;
+
+		if (intel_idle_max_cstate_reached(cstate))
+			break;
+
+		cx = &acpi_state_table.states[cstate];
+
+		state = &drv->states[drv->state_count++];
+
+		snprintf(state->name, CPUIDLE_NAME_LEN, "C%d_ACPI", cstate);
+		strlcpy(state->desc, cx->desc, CPUIDLE_DESC_LEN);
+		state->exit_latency = cx->latency;
+		/*
+		 * For C1-type C-states use the same number for both the exit
+		 * latency and target residency, because that is the case for
+		 * C1 in the majority of the static C-states tables above.
+		 * For the other types of C-states, however, set the target
+		 * residency to 3 times the exit latency which should lead to
+		 * a reasonable balance between energy-efficiency and
+		 * performance in the majority of interesting cases.
+		 */
+		state->target_residency = cx->latency;
+		if (cx->type > ACPI_STATE_C1)
+			state->target_residency *= 3;
+
+		state->flags = MWAIT2flg(cx->address);
+		if (cx->type > ACPI_STATE_C2)
+			state->flags |= CPUIDLE_FLAG_TLB_FLUSHED;
+
+		state->enter = intel_idle;
+		state->enter_s2idle = intel_idle_s2idle;
+	}
+}
+
+static bool intel_idle_off_by_default(u32 mwait_hint)
+{
+	int cstate, limit;
+
+	/*
+	 * If there are no _CST C-states, do not disable any C-states by
+	 * default.
+	 */
+	if (!acpi_state_table.count)
+		return false;
+
+	limit = min_t(int, CPUIDLE_STATE_MAX, acpi_state_table.count);
+	/*
+	 * If limit > 0, intel_idle_cst_usable() has returned 'true', so all of
+	 * the interesting states are ACPI_CSTATE_FFH.
+	 */
+	for (cstate = 1; cstate < limit; cstate++) {
+		if (acpi_state_table.states[cstate].address == mwait_hint)
+			return false;
+	}
+	return true;
+}
+#else /* !CONFIG_ACPI_PROCESSOR_CSTATE */
+static inline bool intel_idle_acpi_cst_extract(void) { return false; }
+static inline void intel_idle_init_cstates_acpi(struct cpuidle_driver *drv) { }
+static inline bool intel_idle_off_by_default(u32 mwait_hint) { return false; }
+#endif /* !CONFIG_ACPI_PROCESSOR_CSTATE */
+
 /*
  * intel_idle_probe()
  */
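The comment inside `intel_idle_init_cstates_acpi()` above describes a simple heuristic for deriving a target residency from the latency that `_CST` reports. A minimal sketch of that rule in isolation follows. The function name `acpi_target_residency()` is hypothetical, the `ACPI_STATE_C*` numbering (C1 = 1, C2 = 2, C3 = 3) is assumed rather than shown in this diff, and the latency is assumed to be in microseconds.

```c
#include <assert.h>

/* Assumed ACPI C-state type numbering; not shown in this diff. */
#define ACPI_STATE_C1 1u
#define ACPI_STATE_C2 2u
#define ACPI_STATE_C3 3u

/*
 * Hypothetical helper: C1-type states use the exit latency as the target
 * residency; deeper states use three times the exit latency.
 */
static unsigned int acpi_target_residency(unsigned int type,
					  unsigned int latency_us)
{
	unsigned int residency = latency_us;

	if (type > ACPI_STATE_C1)
		residency *= 3;

	return residency;
}

/* Usage example with made-up latencies. */
void acpi_target_residency_example(void)
{
	assert(acpi_target_residency(ACPI_STATE_C1, 2) == 2);
	assert(acpi_target_residency(ACPI_STATE_C2, 50) == 150);
	assert(acpi_target_residency(ACPI_STATE_C3, 133) == 399);
}
```

The factor of 3 is the driver's own rule of thumb, as its comment states; it is not a value supplied by the platform firmware.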
@@ -1109,18 +1318,16 @@ static int __init intel_idle_probe(void)
 	}

 	id = x86_match_cpu(intel_idle_ids);
-	if (!id) {
-		if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
-		    boot_cpu_data.x86 == 6)
-			pr_debug("does not run on family %d model %d\n",
-				 boot_cpu_data.x86, boot_cpu_data.x86_model);
-		return -ENODEV;
-	}
-
+	if (id) {
 		if (!boot_cpu_has(X86_FEATURE_MWAIT)) {
 			pr_debug("Please enable MWAIT in BIOS SETUP\n");
 			return -ENODEV;
 		}
+	} else {
+		id = x86_match_cpu(intel_mwait_ids);
+		if (!id)
+			return -ENODEV;
+	}

 	if (boot_cpu_data.cpuid_level < CPUID_MWAIT_LEAF)
 		return -ENODEV;
@@ -1135,7 +1342,13 @@ static int __init intel_idle_probe(void)
 	pr_debug("MWAIT substates: 0x%x\n", mwait_substates);

 	icpu = (const struct idle_cpu *)id->driver_data;
+	if (icpu) {
 		cpuidle_state_table = icpu->state_table;
+		if (icpu->use_acpi)
+			intel_idle_acpi_cst_extract();
+	} else if (!intel_idle_acpi_cst_extract()) {
+		return -ENODEV;
+	}

 	pr_debug("v" INTEL_IDLE_VERSION " model 0x%X\n",
 		 boot_cpu_data.x86_model);
@@ -1317,60 +1530,39 @@ static void intel_idle_state_table_update(void)
 	}
 }

-/*
- * intel_idle_cpuidle_driver_init()
- * allocate, initialize cpuidle_states
- */
-static void __init intel_idle_cpuidle_driver_init(void)
+static void intel_idle_init_cstates_icpu(struct cpuidle_driver *drv)
 {
 	int cstate;
-	struct cpuidle_driver *drv = &intel_idle_driver;

-	intel_idle_state_table_update();
-
-	cpuidle_poll_state_init(drv);
-	drv->state_count = 1;
-
 	for (cstate = 0; cstate < CPUIDLE_STATE_MAX; ++cstate) {
-		int num_substates, mwait_hint, mwait_cstate;
+		unsigned int mwait_hint;

-		if ((cpuidle_state_table[cstate].enter == NULL) &&
-		    (cpuidle_state_table[cstate].enter_s2idle == NULL))
+		if (intel_idle_max_cstate_reached(cstate))
 			break;

-		if (cstate + 1 > max_cstate) {
-			pr_info("max_cstate %d reached\n", max_cstate);
+		if (!cpuidle_state_table[cstate].enter &&
+		    !cpuidle_state_table[cstate].enter_s2idle)
 			break;
-		}

-		mwait_hint = flg2MWAIT(cpuidle_state_table[cstate].flags);
-		mwait_cstate = MWAIT_HINT2CSTATE(mwait_hint);
-
-		/* number of sub-states for this state in CPUID.MWAIT */
-		num_substates = (mwait_substates >> ((mwait_cstate + 1) * 4))
-					& MWAIT_SUBSTATE_MASK;
-
-		/* if NO sub-states for this state in CPUID, skip it */
-		if (num_substates == 0)
-			continue;
-
-		/* if state marked as disabled, skip it */
+		/* If marked as unusable, skip this state. */
 		if (cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_UNUSABLE) {
 			pr_debug("state %s is disabled\n",
 				 cpuidle_state_table[cstate].name);
 			continue;
 		}

+		mwait_hint = flg2MWAIT(cpuidle_state_table[cstate].flags);
+		if (!intel_idle_verify_cstate(mwait_hint))
+			continue;
+
-		if (((mwait_cstate + 1) > 2) &&
-			!boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
-			mark_tsc_unstable("TSC halts in idle"
-					" states deeper than C2");
+		/* Structure copy. */
+		drv->states[drv->state_count] = cpuidle_state_table[cstate];

-		drv->states[drv->state_count] = /* structure copy */
-			cpuidle_state_table[cstate];
+		if (icpu->use_acpi && intel_idle_off_by_default(mwait_hint) &&
+		    !(cpuidle_state_table[cstate].flags & CPUIDLE_FLAG_ALWAYS_ENABLE))
+			drv->states[drv->state_count].flags |= CPUIDLE_FLAG_OFF;

-		drv->state_count += 1;
+		drv->state_count++;
 	}

 	if (icpu->byt_auto_demotion_disable_flag) {
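The hunk above marks a table state as off by default when three conditions hold: the processor entry opts in to ACPI, `_CST` returned at least one C-state but none with this MWAIT hint, and the state is not flagged as always enabled. The sketch below restates that decision in user space. It is simplified: the `_CST` hints are passed as a plain array instead of `acpi_state_table`, and the helper names `off_by_default()` and `default_flags()` are hypothetical. The two flag values are the ones shown in this diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BIT(n) (1u << (n))
#define CPUIDLE_FLAG_OFF           BIT(4)  /* disable this state by default */
#define CPUIDLE_FLAG_ALWAYS_ENABLE BIT(15) /* enable even if _CST omits it */

/* Hypothetical helper: true if _CST listed states but none uses this hint. */
static bool off_by_default(const uint32_t *cst_hints, size_t count,
			   uint32_t mwait_hint)
{
	size_t i;

	/* No _CST C-states at all: do not disable anything by default. */
	if (count == 0)
		return false;

	for (i = 0; i < count; i++) {
		if (cst_hints[i] == mwait_hint)
			return false;
	}
	return true;
}

/* Hypothetical helper: flags a table state ends up with after the check. */
static uint32_t default_flags(uint32_t table_flags, bool use_acpi,
			      const uint32_t *cst_hints, size_t count,
			      uint32_t mwait_hint)
{
	if (use_acpi && off_by_default(cst_hints, count, mwait_hint) &&
	    !(table_flags & CPUIDLE_FLAG_ALWAYS_ENABLE))
		table_flags |= CPUIDLE_FLAG_OFF;

	return table_flags;
}

/* Usage example: a made-up _CST exposing MWAIT hints 0x00 and 0x20 only. */
void default_flags_example(void)
{
	const uint32_t cst[] = { 0x00, 0x20 };

	/* Hint 0x10 is missing from _CST, so the state starts disabled. */
	assert(default_flags(0, true, cst, 2, 0x10) == CPUIDLE_FLAG_OFF);
	/* C1E (hint 0x01) is missing too, but ALWAYS_ENABLE keeps it on. */
	assert(default_flags(CPUIDLE_FLAG_ALWAYS_ENABLE, true, cst, 2, 0x01) ==
	       CPUIDLE_FLAG_ALWAYS_ENABLE);
	/* Entries that do not opt in to ACPI are never affected. */
	assert(default_flags(0, false, cst, 2, 0x10) == 0);
}
```

This also suggests why every C1E table entry in this diff gains `CPUIDLE_FLAG_ALWAYS_ENABLE`: C1E is apparently not expected to show up in `_CST`, so without the flag it would start out disabled whenever `_CST` is used.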
@@ -1379,6 +1571,24 @@ static void __init intel_idle_cpuidle_driver_init(void)
 	}
 }

+/*
+ * intel_idle_cpuidle_driver_init()
+ * allocate, initialize cpuidle_states
+ */
+static void __init intel_idle_cpuidle_driver_init(void)
+{
+	struct cpuidle_driver *drv = &intel_idle_driver;
+
+	intel_idle_state_table_update();
+
+	cpuidle_poll_state_init(drv);
+	drv->state_count = 1;
+
+	if (icpu)
+		intel_idle_init_cstates_icpu(drv);
+	else
+		intel_idle_init_cstates_acpi(drv);
+}
+
 /*
  * intel_idle_cpu_init()
@@ -1397,6 +1607,9 @@ static int intel_idle_cpu_init(unsigned int cpu)
 		return -EIO;
 	}

+	if (!icpu)
+		return 0;
+
 	if (icpu->auto_demotion_disable_flags)
 		auto_demotion_disable();

@@ -279,6 +279,21 @@ static inline bool invalid_phys_cpuid(phys_cpuid_t phys_id)

 /* Validate the processor object's proc_id */
 bool acpi_duplicate_processor_id(int proc_id);
+/* Processor _CTS control */
+struct acpi_processor_power;
+
+#ifdef CONFIG_ACPI_PROCESSOR_CSTATE
+bool acpi_processor_claim_cst_control(void);
+int acpi_processor_evaluate_cst(acpi_handle handle, u32 cpu,
+				struct acpi_processor_power *info);
+#else
+static inline bool acpi_processor_claim_cst_control(void) { return false; }
+static inline int acpi_processor_evaluate_cst(acpi_handle handle, u32 cpu,
+					      struct acpi_processor_power *info)
+{
+	return -ENODEV;
+}
+#endif

 #ifdef CONFIG_ACPI_HOTPLUG_CPU
 /* Arch dependent functions for cpu hotplug support */
@@ -77,6 +77,7 @@ struct cpuidle_state {
 #define CPUIDLE_FLAG_COUPLED BIT(1) /* state applies to multiple cpus */
 #define CPUIDLE_FLAG_TIMER_STOP BIT(2) /* timer is stopped on this state */
 #define CPUIDLE_FLAG_UNUSABLE BIT(3) /* avoid using this state */
+#define CPUIDLE_FLAG_OFF BIT(4) /* disable this state by default */

 struct cpuidle_device_kobj;
 struct cpuidle_state_kobj;