linux/drivers/thermal
Rafael J. Wysocki b684682698 thermal: gov_step_wise: Restore passive polling management
Consider a thermal zone with one passive trip point, a cooling device
with 3 states (0, 1, 2) bound to it, passive polling enabled (nonzero
passive_delay_jiffies) and no regular polling (polling_delay_jiffies
equal to 0) that is managed by the Step-Wise governor.  Suppose that
the initial state of the cooling device is 0 and the zone temperature
is below the trip point to start with.

When the trip point is crossed, tz->passive is incremented by the
thermal core and the governor's .manage() callback is invoked.  It
sets 'throttle' to 'true' for the trip in question and
get_target_state() returns 1 for the instance corresponding to the
cooling device (say that 'upper' and 'lower' are set to 2 and 0 for
it, respectively), so its state changes to 1.

Passive polling is still active for the zone, so next time the
temperature is updated, the governor's .manage() callback will be
invoked again.  If the temperature is still rising, it will change
the state of the cooling device to 2.

Now suppose that next time the zone temperature is updated, it falls
below the trip point, so tz->passive is decremented for the zone (say
it becomes 0 then) and the governor's .manage() callback runs.

It finds that the temperature trend for the zone is 'falling' and
'throttle' will be set to 'false' for the trip in question, so the
cooling device's state will be changed to 1.  However, because
tz->passive is 0 for the zone, the governor's .manage() callback
may not be invoked again for a long time and the cooling device's
state will not be reset back to 0.
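
The sequence above can be replayed with a small user-space model.  This
is only a sketch under stated assumptions: struct instance, next_target(),
manage() and run_scenario() are hypothetical stand-ins, not the kernel's
definitions, and the stepping rule is a simplified reading of what
get_target_state() is described as doing in this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
#define NO_TARGET (-1)

enum trend { TREND_RAISING, TREND_DROPPING };

struct instance {
	int lower, upper;	/* state limits for this binding */
	int target;		/* NO_TARGET or the requested state */
	int cdev_state;		/* current cooling device state */
};

static int clamp_state(int v, int lo, int hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/* Simplified rule: step up while throttling on a rising trend,
 * step down on a falling trend once throttling has stopped. */
static int next_target(const struct instance *in, enum trend t, bool throttle)
{
	if (throttle) {
		if (t == TREND_RAISING)
			return clamp_state(in->cdev_state + 1, in->lower, in->upper);
		return in->target;
	}
	if (t == TREND_DROPPING) {
		if (in->cdev_state <= in->lower)
			return NO_TARGET;
		return clamp_state(in->cdev_state - 1, in->lower, in->upper);
	}
	return in->target;
}

/* One modelled governor invocation. */
static void manage(struct instance *in, enum trend t, bool throttle)
{
	in->target = next_target(in, t, throttle);
	if (in->target != NO_TARGET)
		in->cdev_state = in->target;
}

/* Replays the example; extra_calls models further .manage() invocations
 * that only happen if something keeps polling the zone. */
static int run_scenario(int extra_calls)
{
	struct instance in = { .lower = 0, .upper = 2,
			       .target = NO_TARGET, .cdev_state = 0 };
	int passive = 0;

	passive++;				/* trip crossed on the way up */
	manage(&in, TREND_RAISING, true);	/* state 0 -> 1 */
	manage(&in, TREND_RAISING, true);	/* state 1 -> 2 */
	passive--;				/* temperature below the trip */
	manage(&in, TREND_DROPPING, false);	/* state 2 -> 1 */
	assert(passive == 0);			/* nothing keeps polling now */

	for (int i = 0; i < extra_calls; i++)
		manage(&in, TREND_DROPPING, false);

	return in.cdev_state;
}
```

In this model run_scenario(0) leaves the cooling device in state 1,
while one more invocation, run_scenario(1), brings it back to 0.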

This can happen because commit 042a3d80f1 ("thermal: core: Move
passive polling management to the core") removed passive polling
management from the Step-Wise governor.

Before that change, thermal_zone_trip_update() would bump up
tz->passive when changing the target state for a thermal instance
from "no target" to a specific value and it would drop tz->passive
when changing it back to "no target", which would cause passive
polling to be active for the zone until the governor has reset the
states of all cooling devices.  In particular, in the example above
tz->passive would be incremented when changing the state of the
cooling device from 0 to 1 and then it would still be nonzero when
the state of the cooling device was changed from 2 to 1.
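
The accounting described above can be sketched as follows.  This is an
illustrative model only: struct zone, account_target_change() and
passive_after() are hypothetical names, and the model tracks just the
governor's contribution to the counter, ignoring the increments and
decrements the thermal core makes when trips are crossed.

```c
#include <assert.h>

/* Hypothetical stand-in for "no target"; not the kernel's definition. */
#define NO_TARGET (-1)

struct zone {
	int passive;	/* models the governor's share of tz->passive */
};

/* Bump the counter when an instance gets a target, drop it when the
 * target is cleared, and leave it alone for target-to-target changes. */
static void account_target_change(struct zone *tz, int old_target,
				  int new_target)
{
	if (old_target == NO_TARGET && new_target != NO_TARGET)
		tz->passive++;
	else if (old_target != NO_TARGET && new_target == NO_TARGET)
		tz->passive--;
}

/* Replay a sequence of targets for one instance, starting from
 * "no target", and return the resulting counter value. */
static int passive_after(const int *targets, int n)
{
	struct zone tz = { .passive = 0 };
	int old = NO_TARGET;

	for (int i = 0; i < n; i++) {
		account_target_change(&tz, old, targets[i]);
		old = targets[i];
	}
	assert(tz.passive >= 0);	/* counter must never underflow */
	return tz.passive;
}
```

For the target sequence 1, 2, 1 from the example the modelled counter
is still 1, so polling would continue; it only returns to 0 once the
sequence ends with "no target", as in 1, 2, 1, 0, NO_TARGET.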

To prevent this problem from occurring, restore the passive polling
management in the Step-Wise governor by partially reverting the
commit in question and update the comment in the restored code
to explain its role more clearly.

Fixes: 042a3d80f1 ("thermal: core: Move passive polling management to the core")
Closes: https://lore.kernel.org/linux-pm/ZmVfcEOxmjUHZTSX@hovoldconsulting.com
Reported-by: Johan Hovold <johan+linaro@kernel.org>
Tested-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-06-11 21:00:44 +02:00
broadcom thermal: ns: Convert to platform remove callback returning void 2023-09-29 12:34:16 +02:00
intel Driver core changes for 6.10-rc1 2024-05-22 12:13:40 -07:00
mediatek thermal/drivers/mediatek/lvts_thermal: Fix wrong lvts_ctrl index 2024-05-06 10:33:26 +02:00
qcom thermal/drivers/qcom: Remove some unused fields in struct qpnp_tm_chip 2024-04-23 12:40:29 +02:00
samsung thermal/drivers/exynos: Use set_trips ops 2024-01-02 09:33:19 +01:00
st thermal: Drop spaces before TABs 2024-03-11 17:14:46 +01:00
tegra thermal: tegra-bpmp: Convert to platform remove callback returning void 2023-10-02 14:24:13 +02:00
ti-soc-thermal thermal: ti-bandgap: Convert to platform remove callback returning void 2023-10-02 14:24:15 +02:00
amlogic_thermal.c thermal/drivers/amlogic: Support A1 SoC family Thermal Sensor controller 2024-04-23 12:40:29 +02:00
armada_thermal.c thermal/drivers/armada: Simplify name sanitization 2024-04-23 12:40:29 +02:00
cpufreq_cooling.c thermal/cpufreq: Remove arch_update_thermal_pressure() 2024-04-24 12:08:00 +02:00
cpuidle_cooling.c thermal: cpuidle_cooling: fix kernel-doc warning and a spello 2023-12-21 12:05:48 +01:00
da9062-thermal.c thermal: core: Eliminate writable trip points masks 2024-02-27 12:04:38 +01:00
db8500_thermal.c
devfreq_cooling.c thermal: devfreq_cooling: Fix perf state when calculate dfc res_util 2024-03-27 16:27:39 +01:00
dove_thermal.c thermal: dove: Convert to platform remove callback returning void 2023-09-29 12:34:16 +02:00
gov_bang_bang.c thermal: gov_bang_bang: Fold thermal_zone_trip_update() into its caller 2024-04-23 20:38:26 +02:00
gov_fair_share.c thermal: gov_fair_share: Eliminate unnecessary integer divisions 2024-04-24 10:15:08 +02:00
gov_power_allocator.c thermal: core: Move passive polling management to the core 2024-04-30 21:16:13 +02:00
gov_step_wise.c thermal: gov_step_wise: Restore passive polling management 2024-06-11 21:00:44 +02:00
gov_user_space.c thermal: gov_user_space: Use .trip_crossed() instead of .throttle() 2024-04-24 20:42:10 +02:00
hisi_thermal.c thermal: hisi: Convert to platform remove callback returning void 2023-09-29 12:34:16 +02:00
imx8mm_thermal.c thermal/drivers/imx8mm_thermal: Fix function pointer declaration by adding identifier name 2023-10-15 23:40:09 +02:00
imx_sc_thermal.c
imx_thermal.c thermal: core: Eliminate writable trip points masks 2024-02-27 12:04:38 +01:00
k3_bandgap.c thermal/drivers/k3_bandgap: Remove some unused fields in struct k3_bandgap 2024-04-23 12:40:29 +02:00
k3_j72xx_bandgap.c thermal: k3_j72xx_bandgap: Convert to platform remove callback returning void 2023-09-29 12:34:17 +02:00
Kconfig thermal: Get rid of CONFIG_THERMAL_WRITABLE_TRIPS 2024-02-23 18:24:48 +01:00
khadas_mcu_fan.c
kirkwood_thermal.c thermal: kirkwood: Convert to platform remove callback returning void 2023-09-29 12:34:17 +02:00
loongson2_thermal.c thermal/drivers/loongson2: Add Loongson-2K2000 support 2024-04-23 12:40:30 +02:00
Makefile thermal: Drop spaces before TABs 2024-03-11 17:14:46 +01:00
max77620_thermal.c thermal/drivers/max77620: Remove duplicate error message 2023-10-15 23:40:10 +02:00
qoriq_thermal.c thermal/drivers/qoriq: Fix getting tmu range 2024-03-11 17:14:46 +01:00
rcar_gen3_thermal.c thermal/drivers/rcar_gen3: Update temperature approximation calculation 2024-04-23 12:40:29 +02:00
rcar_thermal.c thermal: core: Eliminate writable trip points masks 2024-02-27 12:04:38 +01:00
rockchip_thermal.c thermal: rockchip: Convert to platform remove callback returning void 2023-10-02 14:23:30 +02:00
rzg2l_thermal.c thermal: rzg2l: Convert to platform remove callback returning void 2023-10-02 14:23:51 +02:00
spear_thermal.c thermal: spear: Convert to platform remove callback returning void 2023-10-02 14:24:06 +02:00
sprd_thermal.c thermal: sprd: Convert to platform remove callback returning void 2023-10-02 14:24:08 +02:00
sun8i_thermal.c thermal/drivers/sun8i: Don't fail probe due to zone registration failure 2024-03-11 17:14:46 +01:00
thermal_core.c thermal: core: Do not fail cdev registration because of invalid initial state 2024-06-07 13:51:51 +02:00
thermal_core.h thermal: trip: Trigger trip down notifications when trips involved in mitigation become invalid 2024-05-27 13:00:00 +02:00
thermal_debugfs.c thermal/debugfs: Allow tze_seq_show() to print statistics for invalid trips 2024-05-27 13:00:00 +02:00
thermal_debugfs.h thermal/debugfs: Pass cooling device state to thermal_debug_cdev_add() 2024-04-26 15:01:56 +02:00
thermal_helpers.c thermal: core: Move threshold out of struct thermal_trip 2024-04-08 16:01:20 +02:00
thermal_hwmon.c thermal: core: Store zone ops in struct thermal_zone_device 2024-02-23 18:24:48 +01:00
thermal_hwmon.h
thermal_mmio.c
thermal_netlink.c Merge branch 'thermal-intel' into thermal 2024-04-15 15:45:32 +02:00
thermal_netlink.h thermal: netlink: Add genetlink bind/unbind notifications 2024-03-27 14:50:26 +01:00
thermal_of.c thermal/of: Assume polling-delay(-passive) 0 when absent 2024-03-11 17:14:46 +01:00
thermal_sysfs.c thermal: core: Move threshold out of struct thermal_trip 2024-04-08 16:01:20 +02:00
thermal_trace_ipa.h thermal: core: Make struct thermal_zone_device definition internal 2024-04-08 16:01:20 +02:00
thermal_trace.h tracing/treewide: Remove second parameter of __assign_str() 2024-05-22 20:14:47 -04:00
thermal_trip.c thermal: trip: Trigger trip down notifications when trips involved in mitigation become invalid 2024-05-27 13:00:00 +02:00
thermal-generic-adc.c
uniphier_thermal.c thermal: uniphier: Convert to platform remove callback returning void 2023-10-02 14:24:17 +02:00